Sunday, March 16, 2014

Scaling a coffee shop (web application)

I own a small coffee shop.

My expense is proportional to resources
100 square feet of built up area with utilities, 1 barista, 1 cup coffee maker.

My shop's capacity
Serves 1 customer at a time, takes 3 minutes to brew a cup of coffee, a total of 5 minutes to serve a customer.

Since my barista works without breaks and the German-made coffee maker never breaks down,
my shop's maximum throughput = 12 customers per hour.
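A quick sanity check of that number (a Python sketch, purely for illustration):

```python
MINUTES_PER_HOUR = 60
service_time = 5  # minutes per customer: 3 to brew plus 2 for everything else
throughput = MINUTES_PER_HOUR // service_time
print(throughput)  # 12 customers per hour
```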


Web server

Customers walk away during peak hours. We only serve one customer at a time. There is no space to wait.

I upgrade the shop. My new setup is better!

Expenses
Same area and utilities, 3 baristas, 2 cup coffee maker, 2 chairs

Capacity
3 minutes to brew 2 cups of coffee, ~7 minutes to serve 3 concurrent customers, 2 additional customers can wait in queue on chairs.

Concurrent customers = 3, Customer capacity = 5


Scaling vertically

Business is booming. Time to upgrade again. Bigger is better!

Expenses
200 square feet and utilities, 5 baristas, 4 cup coffee maker, 3 chairs

Capacity goes up proportionally. Things are good.

In summer, business drops. This is normal for coffee shops. I want to go back to a smaller setup for some time. But my landlord won't let me scale down that way.

Scaling vertically is expensive for my up and coming coffee shop. Bigger is not always better.


Scaling horizontally with a Load Balancer

My landlord is willing to scale up and down in terms of smaller 3-barista setups. He can add a setup or take one away if I give advance notice.

If only I could manage multiple setups with the same storefront...

There is a special counter in the market which does just that!
It allows several customers to talk to a barista at once. Actually, the customer-facing employee need not be a barista, just someone who takes and delivers orders. And the baristas making coffee do not have to deal directly with pesky customers.

Now I have an advantage. If I need to scale, I can lease another 3 barista setup (which happens to be a standard with coffee landlords), and hook it up to my special counter. And do the reverse to scale down.

While expenses go up, I can handle capacity better too. I can ramp up and down horizontally.
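The special counter is doing what a round-robin load balancer does: hand each incoming order to the next available setup. A toy sketch in Python (the setup names are made up for illustration):

```python
from itertools import cycle

class Counter:
    """The 'special counter': hands each incoming order to the next setup."""
    def __init__(self, setups):
        self._next = cycle(setups)  # endless round-robin over the setups

    def place_order(self, order):
        setup = next(self._next)
        return f"{setup} handles {order}"

counter = Counter(["setup-1", "setup-2"])
print(counter.place_order("espresso"))  # setup-1 handles espresso
print(counter.place_order("latte"))     # setup-2 handles latte
print(counter.place_order("mocha"))     # setup-1 handles mocha
```

Scaling up is just a longer list of setups; a real load balancer also deals with health checks and draining, which the sketch ignores.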


Resource intensive processing

My coffee maker is really a general purpose food maker. Many customers tell me they would love to buy freshly baked bread. I add it to the menu.

I have a problem. The 2 cup coffee maker I use can make only 1 pound of bread at a time. Moreover, it takes twice as long to make bread.

In terms of time,
1 pound of bread = 4 cups of coffee

Sometimes bread orders clog my system! Coffee ordering customers are not happy with the wait. Word spreads about my inefficiency.

I need a way of segmenting my customer orders by load, while using my resources optimally.


Asynchronous Queue based processing

I introduce a token based queue system.

Customers come in, place their order, get a token number and wait.
The order taker places the order on 2 input queues - bread and coffee.
Baristas look at the queues and all other units and decide if they should pick up the next coffee or the next bread order.
Once a coffee or bread is ready, it is placed on an output tray and the order taker shouts out the token number. The customer picks up the order.

- The input queues and output tray are new. Otherwise the resources are the same, only working in different ways.

- From a customer's point of view, the interaction with the system is now quite different.

- The expense and capacity calculations are complicated. The system's complexity went up too. If any problem crops up, debugging and fixing it could be problematic.

- If the customers accept this asynchronous system, and we can manage the complexity, it provides a way to scale capacity and product variety. I can intimidate my next-door competitor.
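A minimal sketch of the token-and-queue flow, in Python (a single barista step is modeled, and the names are illustrative):

```python
from collections import deque

coffee_q, bread_q = deque(), deque()
output_tray = []

def place_order(token, item):
    """The order taker puts the order on the right input queue."""
    (bread_q if item == "bread" else coffee_q).append((token, item))

def barista_step():
    """A barista picks the next order: coffee first, since it is quick."""
    queue = coffee_q or bread_q
    if queue:
        token, item = queue.popleft()
        output_tray.append((token, item))  # shout out the token number

place_order(1, "coffee")
place_order(2, "bread")
place_order(3, "coffee")
for _ in range(3):
    barista_step()
print(output_tray)  # [(1, 'coffee'), (3, 'coffee'), (2, 'bread')]
```

Note how the slow bread order no longer blocks the coffee orders behind it; it just waits on its own queue.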


Scaling beyond

We are reaching the limits of our web server driven, load balanced, queue based asynchronous system. What next?

My already weak coffee shop analogy has broken down completely at this point.

Search for DNS round robin and other techniques of scaling if you are interested to know more.

If you are new to web application scaling, do the same for each of the topics mentioned on this page.

My coffee shop analogy was an over-simplification, meant to spark interest in the topic of scaling web applications.

If you really want to learn, tinker with these systems and discuss them with someone who has practical knowledge.

Sunday, December 29, 2013

Interview Question: Design an Object Oriented System

I was asked this question in a couple of software engineer job interviews recently. After the interviews, I wondered if there is a general way to approach this question for any system X.

What is the question?

The system to design can be of two types:

1. The system manifests as software processes. Examples would be a rule engine, an app store and so on.
An app store selling software runs as processes in a network of computers. Its interfaces with the outside world can be defined by a handful of software interfaces.

2. The system manifests in the world outside computers. Examples would be an elevator, a parking lot, a DVD lending library and so on.
These systems can be viewed in a couple of ways:

2a. Real time software drives pieces of the physical system. Other software coordinates the pieces. As an example, in a parking lot, the system to design could be the software controlling and monitoring the parts of the parking lot. The software interfaces with the actual hardware using some well defined interfaces.

2b. A simulation of the entire physical system as software processes. Consider 2 examples:
i. A card game, where we model decks of cards, timers and game rules as software processes with a graphical user interface to make the game useful to users.
ii. A parking lot, where we simulate the entire system within software processes. The user interface allows the user to get a view into the system, and perhaps vary system parameters. Reasons to do this may be to study the system's characteristics in a cost effective way or create a virtual reality game.

In a type 2b system, we are essentially converting a type 2 system into a type 1 system.

Before attempting design, I think it is better to agree on a point of view with the interviewer.

If it is a physical system, it may be easier to deal with a model which simulates the system in software in its entirety, rather than one which attempts to design pieces of real time systems. Unless designing an embedded software system complements the job you are interviewing for.

Data structures and algorithms

It may appear that OO design questions can be answered while avoiding data structures and algorithms. This is not true. An interface or class may need to use an abstract data type or a well known algorithm in its implementation.
As an example assume the OO system to design is a syntax checker. You may define an interface which checks for balanced parentheses. The interviewer may want to know how you would implement the interface.
Even if you don’t know the answer off the bat, you could ask the interviewer for a hint or think of a naive answer which might lead to a good solution. The key is to know about the stack abstract data type and its operations. If you don’t know anything about stacks, this could become a difficult problem and raise a red flag with the interviewer.
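For reference, here is the kind of stack-based check the interviewer would be fishing for (a Python sketch):

```python
def balanced(text):
    """Return True if (), [] and {} nest correctly in text."""
    pairs = {"(": ")", "[": "]", "{": "}"}
    stack = []
    for ch in text:
        if ch in pairs:
            stack.append(pairs[ch])          # remember the closer we expect
        elif ch in pairs.values():
            if not stack or stack.pop() != ch:
                return False                 # wrong closer, or nothing open
    return not stack                         # anything left unclosed?

print(balanced("f(a[0], {b: 1})"))  # True
print(balanced("f(a[0)]"))          # False
```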

It is good to know how to estimate time and space complexity of algorithms since interviewers may want to know if you can use this knowledge for design trade-offs.

Some ways to improve knowledge of data structures and algorithms:
Books
Internet articles
Exercises
Perusing the standard library of your language
Working on projects

Heuristic

Some approaches to the object oriented job interview question:

1. In my interviews, I wrote down interfaces and base classes which occurred to me as I described use cases to the interviewers. Then I began writing out the interactions between the objects. I asked questions when I faced ambiguity in my assumptions. This approach allowed me to engage the interviewer in the design process. However, the method felt vague once we got into the details and the interviewers started adding and modifying requirements.

2. A more precise way is to write down use cases first. Based on the use cases, come up with the objects. Then use a method like CRC cards to flesh out the interactions between the objects. Use a message sequence diagram to refine the interactions. Finally use a deployment diagram to show the various processes which host the objects as they run on computers and network elements. While this is a methodical way of doing object oriented design, it may not be the best approach within the constraints of a job interview.

Message Queue

Here is another way. Assume:
- A central priority message queue where any object can send a request for something to be done. The message queue automatically routes the message to the destination object.
- Unless there is a specific need, objects are created automatically and their lifetime is infinite (at least for the time of the interview).

So we assume an event loop, to which objects can send messages. The event loop routes the message to a suitable destination object, creating it if necessary.
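A toy version of such an event loop, sketched in Python (the Elevator class and the message format are hypothetical placeholders, not the real design):

```python
import queue

registry = {}            # destination name -> live object, created on demand
events = queue.Queue()   # the central message queue

class Elevator:
    def __init__(self, name):
        self.name = name
    def handle(self, msg):
        return f"{self.name}: {msg}"

def send(dest, msg):
    events.put((dest, msg))

def run_event_loop():
    """Route each message to its destination object, creating it if needed."""
    log = []
    while not events.empty():
        dest, msg = events.get()
        obj = registry.setdefault(dest, Elevator(dest))
        log.append(obj.handle(msg))
    return log

send("elevator-1", "go to floor 3")
send("elevator-2", "open door")
print(run_event_loop())  # ['elevator-1: go to floor 3', 'elevator-2: open door']
```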

Consider part of the design for an elevator bank.

This is not very different from the other methods given in the previous section; it just provides a compressed approach for the interview.

Events, Async Calls and Collections

In case of an asynchronous call or a call to a collection of objects, it helps to introduce a controller object which serializes the async events or helps choose an item from the collection. We did this in the call to the elevator bank controller, which chooses the optimal elevator to send in response to a passenger's request message.

Source Code

The point of the design is to clarify the domain and make good choices while solving the problem. At this point, we should have enough to switch over to code. This helps make the discussion more concrete.

Obviously we want to use the language which we are most comfortable with, unless the system to design is best expressed in a specific language. For example, if the system to design is a widget which is hosted within a web browser, JavaScript would be a good choice, even if Scheme is your favorite language. In this case it is likely the job involves some client side browser programming, and you should expect to use JavaScript at work.

Using higher level constructs like interfaces in the initial design allows for more flexibility while dealing with modified requirements. But remember such constructs usually imply indirection and you have to deal with the added complexity in the discussion with the interviewer.

Object Oriented Concepts

The main objective is to design a system. A well prepared interviewer will introduce additional requirements into the design, and this is a cue for you to leverage OO design concepts. Object oriented design concepts like inheritance, polymorphism, encapsulation and dynamic dispatch should be used judiciously (and perhaps sparingly) in the design process. In the elevator design example, the interviewer might ask for a service elevator which is somewhat different from normal elevators to be added to the elevator bank. In case of a rule engine, the interviewer might add additional rule types which you may not have considered in the design.

A Builder

If the interviewer asks how the processes are set up, we can introduce a builder which translates a configuration into the actual objects at startup time.
For example, consider an XML configuration for elevators:

<elevatorbank>
    <elevator id="1">
        <door id="1">
        </door>
        <allowedfloors>
            <floor id="1">
            </floor>
            <floor id="2">
            </floor>
        </allowedfloors>
    </elevator>
    <elevator id="2">
        <!-- etc. -->
    </elevator>
<!-- etc. -->
</elevatorbank>

The builder would just read the configuration file at startup and instantiate the corresponding objects. It may be good to think about a mechanism where the system behavior can be re-configured at runtime too.
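A minimal builder along these lines might look like this Python sketch (the configuration and class names are illustrative, not the real design):

```python
import xml.etree.ElementTree as ET

CONFIG = """
<elevatorbank>
  <elevator id="1"><allowedfloors><floor id="1"/><floor id="2"/></allowedfloors></elevator>
  <elevator id="2"><allowedfloors><floor id="1"/></allowedfloors></elevator>
</elevatorbank>
"""

class Elevator:
    def __init__(self, eid, floors):
        self.id, self.floors = eid, floors

def build(xml_text):
    """Translate the configuration into live objects at startup."""
    bank = []
    for node in ET.fromstring(xml_text).findall("elevator"):
        floors = [f.get("id") for f in node.iter("floor")]
        bank.append(Elevator(node.get("id"), floors))
    return bank

bank = build(CONFIG)
print([(e.id, e.floors) for e in bank])  # [('1', ['1', '2']), ('2', ['1'])]
```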

Message Queue Again

With the interfaces and objects in place, we now need a way to put things in motion. We can introduce a priority message queue again. So we start an event loop which serializes messages and routes them to their destination. We can draw parallels with the way load balancers, web servers or operating systems deal with events which typically start or end on external actors.

Using a priority queue helps deal with messages which have to be executed ahead of the order in which they arrive. For example, a maintenance message from a technician to the elevator bank should take precedence over a normal call originating from a passenger.
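The priority scheme can be sketched with a standard heap-based priority queue (Python's heapq here; the priority levels and messages are made up for illustration):

```python
import heapq

MAINTENANCE, NORMAL = 0, 1  # lower number = higher priority
mq = []
heapq.heappush(mq, (NORMAL, 1, "passenger call: floor 5"))
heapq.heappush(mq, (MAINTENANCE, 2, "technician: service elevator 2"))
heapq.heappush(mq, (NORMAL, 3, "passenger call: floor 1"))

# Messages pop in priority order, with arrival order as a tie-breaker.
served = [heapq.heappop(mq)[2] for _ in range(len(mq))]
print(served)
```

The technician's message jumps the queue even though it arrived second, while the two passenger calls keep their arrival order.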

Design Patterns

Making design patterns more important than the design might signal to the interviewer that we want to complicate the design rather than simplify it.

If I chose a design pattern, I would clarify if it is okay with the interviewer. Where possible I would prefer a well known data type or algorithm which may be available as a language library over a design pattern. And when I choose a design pattern, it would be one I have used before in practice, not just read about in a book.

Generally avoid buzzwords and concepts you are not clear about.

Functional Programming Design? Procedural Programming Design?

I think it is valid to use alternate design methodologies to answer the question. A couple of things to consider:

- Does the group or company you are interviewing with have an environment where solving problems takes precedence over the language and technology being used? There may be good reasons for this question to be answered either way, and you may want to find out beforehand using the internet and your contacts.
- Confirm with the interviewer if it is okay to approach the problem using an alternate design method.

Some Tips

In the design process, it helps to keep in mind:
1. The system to be designed is most important; any concepts used in the process should be to help in this design.
2. Articulate any design trade-offs you recognize to the interviewer.
3. Avoid buzzwords or terminology which you don’t understand well.
4. If you are faced with ambiguity, discuss a resolution with the interviewer.
5. At some point, reason with code, it can help focus the design process. Choose a language you are comfortable with.
6. Assume that data structures and algorithms cannot be avoided.
7. Assume that the interviewer will modify the constraints and requirements you have assumed in the initial design process.
8. With an open ended question like object oriented design, you have as much of an opportunity to gauge your future colleagues and work environment as they have to judge you. Use it as an input while deciding if you want to take the job.

Sunday, September 29, 2013

Iteration - C, Scheme and C#

An iterator enables element-by-element processing of a collection. Before we look at iterators which operate over collection objects, let us examine more fundamental iteration statements.

The C for loop - under the hood

The for loop is a basic building block in languages which derive their syntax from C.

char i;
for (i = 'a'; i < 'z'; i++) {
  printf("%c\n", i);
}
Looking at the assembly generated for the loop (using Visual Studio/Windows running on an Intel processor):

;...
; char i;
; for (i = 'a'; i < 'z'; i++) {
00A81A2E  mov         byte ptr [i],61h  
00A81A32  jmp         main+2Ch (0A81A3Ch)  
;A. Start of loop
00A81A34  mov         al,byte ptr [i]  
00A81A37  add         al,1  
00A81A39  mov         byte ptr [i],al  
00A81A3C  movsx       eax,byte ptr [i]
;B. Compare and jump to the next iteration or out of the loop
00A81A40  cmp         eax,7Ah  
00A81A43  jge         main+53h (0A81A63h)  
;... printf("%c\n", i); is implemented here
; }
;C. Jump back to start of loop
00A81A61  jmp         main+24h (0A81A34h)  
;}
00A81A63  xor         eax,eax  
;...
The compiled code is straightforward. Essentially a pair of jumps (cmp/jge and jmp marked as B. and C. in comments) are used to move to the next loop iteration and eventually terminate the loop.

An iterator in Scheme

An iterator is a mechanism which:
1. Operates on a collection of objects (usually all of the same type, but not necessarily).
2. Provides a way to apply an arbitrary procedure to the current object.
3. Provides a way to move to the next object.
4. Provides a uniform interface for the collection, independent of the type of the objects in the collection.

It is straightforward to implement a simple list iterator in Scheme:

(define (for-each function collection)
  (cond ((null? collection) #t)
  (else (let ((rest (cdr collection)))
          (function (car collection))
            (for-each function rest)))))
               
;Use for-each to print a list of symbols
(for-each print (list 'hello 'world '!))

;Use for-each to square a list of numbers
(for-each 
    (lambda (x)
        (print (* x x)))
    (list 1 2 3))
Our for-each takes an arbitrary function of 1 argument and a collection. It applies the function to each item in the collection till there are no elements left.

We used a recursive function to implement for-each. This looks inefficient: each recursive call would use up a stack frame, and a large enough collection would cause a stack overflow.

However, our for-each uses a special case of recursion. Since the recursive call is the very last operation, we have what is called tail recursion. With tail recursion, the Scheme compiler can avoid pushing the recursive calls onto the stack. Instead it can optimize the generated code into something similar to the jump mechanism we saw in the assembly generated for the C for loop. Whether this optimization leads to better performance in practice is another issue which we will not bother about in this post.*

Iterators in C#

That leads us to iterators in C#. One common mechanism consists of a pair of complementary constructs:
IEnumerable interface: Has one method GetEnumerator().
foreach statement: operates on an IEnumerable and allows us to iterate over its objects.


using ( EnumeratedFile file = new EnumeratedFile( "c:\\temp\\test.txt" ) ) {
  foreach ( var line in file ) {
    Console.WriteLine( line );
  }
}
Here EnumeratedFile implements IEnumerable. foreach applies whatever procedure we provide on EnumeratedFile (here just a Console.WriteLine).

But for this to work, there has to be an enumerator object which IEnumerable.GetEnumerator() provides, which actually iterates over the file. C# provides another interface IEnumerator for this purpose. The enumerator which implements IEnumerator does the real work of giving access to the current object and moving to the next.

IEnumerator has 3 methods whose intentions are obvious from their names: Current (actually a property), MoveNext() and Reset().

Note that the same object can implement both the IEnumerable and IEnumerator interfaces. For example, here is an implementation of EnumeratedFile:

class EnumeratedFile : IEnumerable, IEnumerator, IDisposable
{
  string line;
  StreamReader stream;

  //Constructor
  public EnumeratedFile( string path ) {
    line = string.Empty;
    if ( File.Exists( path ) ) {
      stream = new StreamReader( path );
    }
    else {
      stream = null;
    }
  }

  IEnumerator IEnumerable.GetEnumerator() {
    return this;
  }

  public object Current {
    get { return line; }
  }

  public bool MoveNext() {
    if ( stream == null ) return false;
    return (line = stream.ReadLine()) != null;
  }

  public void Reset() {
    stream.BaseStream.Position = 0;
    stream.DiscardBufferedData();
  }

  //We also implement IDisposable to clean up the StreamReader.
  public void Dispose() {
    if ( stream != null ) {
      stream.Dispose();
    }
  }
}
So foreach provides a shorthand for the following:

using ( EnumeratedFile file = new EnumeratedFile( "c:\\temp\\test.txt" ) ) {
  var line = string.Empty;
  while ( file.MoveNext() ) {
    line = (string)file.Current;
    Console.WriteLine( line );
  }
}
foreach can also work with array and dynamic types without an explicit IEnumerable implementation. The C# language spec has details on the behavior of foreach.

C# provides a simpler way to implement an iterator. Instead of implementing Current, MoveNext() and Reset(); we could use the yield keyword to simplify things.

Using yield, EnumeratedFile can be re-written as:

class EnumeratedFile : IEnumerable, IDisposable
{
  string line;
  StreamReader stream;

  //Constructor
  public EnumeratedFile( string path ) {
    line = string.Empty;
    if ( File.Exists( path ) ) {
      stream = new StreamReader( path );
    }
    else {
      stream = null;
    }
  }

  public IEnumerator GetEnumerator() {
    if (stream == null) yield break;
    while ( (line = stream.ReadLine()) != null ) {
      yield return line;
    }
    yield break;
  }

  //We also implement IDisposable to clean up the StreamReader.
  public void Dispose() {
    if ( stream != null ) {
      stream.Dispose();
    }
  }
}
yield does the heavy lifting of generating Current/MoveNext() and maintaining state between iterations. Raymond Chen and Jon Skeet have explained the code which is generated by the compiler. We could also look at the generated code directly using the Ildasm disassembler, or indirectly using a .NET decompiler. IEnumerable and IEnumerator also have generic counterparts, IEnumerable<T> and IEnumerator<T>.

What is the big deal?

What is the use of iterator abstraction, if it boils down eventually to a pair of jump statements?

Two things:
1. Iteration is a very common operation when automating work with computers: we model real world objects, create several instances of them, and apply procedures repeatedly to all the instances**. Common enough that many languages provide the iterator as a built-in construct, or make it easy to write the construct ourselves.

2. By making iterators a part of the language, the language designers provide us users a 'conceptual framework'*** to build higher levels of abstraction. For example, IEnumerable in C# lays the groundwork for transforming sequences to sequences, which is one of the underlying ideas behind the Linq framework.


* See this post for tail optimization with respect to C#

** Iteration is not the only way to apply a procedure on a group of objects. Set based operations are another way. In fact, it is a common programming error for application programmers new to databases to iterate over each object in a set, instead of operating over the sets of data, and encounter performance problems as the data set grows.

*** From SICP, on map and similar operations over lists: ...The difference between the two definitions is not that the computer is performing a different process (it isn't) but that we think about the process differently. In effect, map helps establish an abstraction barrier that isolates the implementation of procedures that transform lists from the details of how the elements of the list are extracted and combined. Like the barriers shown in figure 2.1, this abstraction gives us the flexibility to change the low-level details of how sequences are implemented, while preserving the conceptual framework of operations that transform sequences to sequences.

Tuesday, September 24, 2013

Anonymous Methods and Closures in C#

In the last two posts we looked at Action, Func, delegate and callbacks in C#. In this post we look at anonymous methods and closures.

Anonymous Methods

A method is anonymous if it is not bound to a name. A couple of ways to use anonymous methods in C#:

1. The 'new' way, using lambda expressions:

System.Func< int, int > doubler = x => x + x;
2. The 'old' way, using delegate explicitly:

delegate int DoublerDelegate( int d );
class Test
{
  public DoublerDelegate Foo()
  {
    return delegate( int x ) { return x + x; };
  }
}

//In the client, we can call the anonymous method:
Test t = new Test();
int i = (t.Foo())( 2 );
As expected, both ways give the same result.

Under the hood

Looking at the IL in the Ildasm disassembler for the 'old' way (explicitly using delegates), we see:

As expected, a DoublerDelegate which extends System.MulticastDelegate.
And inside the Test class, the compiler generated an anonymous static method corresponding to our anonymous method:

.field private static class anonymous.DoublerDelegate 
'CS$<>9__CachedAnonymousMethodDelegate1'
//...

.custom instance void 
[mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() 
= ( 01 00 00 00 )
 
.method private hidebysig static int32 '<Foo>b__0'(int32 x) cil managed
{
  //... Implements our anonymous doubler method
} // end of method Test::'<Foo>b__0'

In the DoublerDelegate instance, the static method is invoked.

IL_0009:  ldftn    int32 anonymous.Test::'<Foo>b__0'(int32)

This static method gets called in main, in lieu of our anonymous method.

If we looked at the IL for the 'new' way with Func, we would see that the IL generated for the anonymous method is similar: A static method is generated and called from the Func.

Non-local variables and closure(?)

Our anonymous doubler methods only accessed the variable x. x is local to the anonymous method and not influenced by the world outside the method once it is bound to a value.
This is why the compiler can generate just a static method for the anonymous method. The method does not need external state.

Let us introduce a variable which is non-local to the anonymous method.

public DoublerDelegate Foo()
{
  int i = 10;

  return delegate( int x ) {
    i = i + x;
    return i * x; 
  };
}

i is non-local to the anonymous method, but used by it (in other words, i is a 'free variable'). Now the compiler would have to account for the variable i, whenever the anonymous method is called.

It does this by generating a new class.

.class auto ansi sealed nested private 
beforefieldinit '<>c__DisplayClass1'
       extends [mscorlib]System.Object
{
  .custom instance void 
  [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() 
  = ( 01 00 00 00 ) 
} // end of class '<>c__DisplayClass1'
This generated class encapsulates both the non-local variable i, and the generated anonymous method which uses it.

Raymond Chen's post from years ago elegantly describes how anonymous methods are generated.

BTW, there seems to be some confusion among people who know better than me about whether the example above is a true 'lexical closure'. For example, in Scheme the closure would consist of the anonymous function and the environment it references. In C#, this is simulated by generating classes and methods.

From my point of view, as just another application programmer, the above code is an example of a closure in C#.

Garbage Collection is important

In the example, the variable i has to live even after the enclosing method Foo() has returned. Why? Because the anonymous method references it, and it could live on based on the client code's intentions. We saw the compiler generated an internal class which maintains the non-local variable i. But who frees the generated class? The answer is obvious - the garbage collector. The internal object would be marked for deletion whenever the referencing anonymous method is freed. While this is a trivial question in C#, it would not be so simple if the closure would have to be freed in the absence of an omnipresent GC.

Gotchas

In C# anonymous methods, ref and out parameters cannot be captured automatically. The GC essentially works on the heap, not the stack. Reference parameters are not moved to the heap in C#; they live on the stack and are not captured in the closure. If we do need ref parameters, we have to make local copies ourselves. In other words, we have to handle the copy-in/copy-out.

This post from one of the implementers of anonymous methods in C# explains the copy-in/out semantics. There are other posts on his blog which explain deeper nuances of anonymous methods in C#. The matter of ref and out parameters, stack and heap are also explained in detail by Eric Lippert.

In Practice

Delegates which simulate functions-as-first-class-citizens are a useful feature in C#. They allow us to better express verbs (functions) in what is essentially a kingdom of nouns (classes). This helps an application programmer in terms of expressiveness, composability and terseness.

But how useful are anonymous methods and closures to just another application programmer like me?

Anyone who has used Linq, threading or task library in C# in some depth would be familiar with this topic. Here we use an anonymous method which forms a closure over the non-local variable 'extra'.

var extra = 10;
var result = myList.Select( x => x + extra );

Consider another example. We format a string based on some inputs. We need to insert a comma into the string based on the inputs.

public string SerializeToStr( string subject, string status )
{
  StringBuilder output = new StringBuilder();
  output.Append( "{'ticket':{" );
  bool aFieldExists = false;

  Action< string, string, string > Append = ( string item, string header, string trailer ) => {
    if ( item != null ) {
      if ( aFieldExists ) {
        output.Append( ", " );
      }
      aFieldExists = true;
      output.Append( header ).Append( item ).Append( trailer );
    }
  };

  Append( subject, "'subject':'" + DateTime.Today.ToString() + " ", "'" );
  Append( status, "'status':'", "'" );
  output.Append( "}}" );

  return output.ToString();
}

This needlessly long example could as well have been implemented with a helper method and a state variable for aFieldExists. However the anonymous method does provide for better locality of reference from an application code's point of view.

Anonymous methods and closures seem to be syntactic sugar in C# from a user's point of view. However they do improve expressiveness, and can be quite addictive to use.


Thursday, September 19, 2013

C# Delegate: A closer look

C# Delegate and Multicast

In the previous post, we looked at the C# callback mechanism using delegates, and the syntactic sugar provided by Func and Action.

Can we dig a little into the C# delegate implementation and learn more?

As we know, the delegate is really a function (method) pointer. But it does more - we can assign multiple methods to a delegate. When the delegate is invoked, it calls the methods assigned to it in order.

Consider 2 logging classes which implement their respective log methods:

public class ConsoleClass
{
//...
  public void LogToConsole( string mesg )
  {
    Console.WriteLine( mesg );
  }
}

public class FileClass
{
//...
  public void LogToFile( string mesg )
  {
    string file = "syslog.txt";
    using ( TextWriter f = File.CreateText( file ) ) {
      f.WriteLine( mesg );
    }
  }
}
The log methods have the same signature. So we can assign them to one delegate and invoke it.

//Test client code
public delegate void SyslogDelegate( string mesg );

private static void TestDelegates()
{
  ConsoleClass console = new ConsoleClass();
  SyslogDelegate d = console.LogToConsole;
  FileClass file = new FileClass();
  //Add second pointer to the delegate (multicast).
  d += file.LogToFile;

  string mesg = "hello syslog";
  d.Invoke( mesg ); //Shorthand: d( mesg );
}
As expected, invoking the delegate calls both the console and file log methods in order.

What is the C# delegate?

Looking at some methods/properties the delegate class itself exposes:

int i = 1;
Console.WriteLine( "Methods which were referred to by " +
                    d.GetType().Name + ": " );
foreach ( var delegated in d.GetInvocationList() ) {
    Console.Write( i.ToString() + ". " );
    Console.WriteLine( delegated.Target.GetType().Name + 
                       "->" + delegated.Method.Name );
    i++;
}
This prints:

Methods which were referred to by SyslogDelegate:
1. ConsoleClass->LogToConsole
2. FileClass->LogToFile
We can infer from the delegate's public GetInvocationList() method that the delegate simply holds a list of (target object, method) pairs for the referred methods. If we look at the IL for our SyslogDelegate in the Ildasm disassembler tool:

.class public auto ansi sealed Delegates.SyslogDelegate
       extends [mscorlib]System.MulticastDelegate
{
} // end of class Delegates.SyslogDelegate
we can see that the delegate keyword generates a class which derives from System.MulticastDelegate.

Looking at the System.MulticastDelegate class in a disassembler and reading MSDN, we can gather a couple of things:
1. System.MulticastDelegate is something the compiler uses; we cannot normally inherit from it ourselves.
2. Since the delegate keeps a reference to the method, at least a part of the class which contains the method is alive as long as the delegate is alive.

The second point implies that as long as our delegate lives, the GC will keep the referred class instance around, at least if the method the delegate refers to uses other variables in the scope of the class**. We can test this easily with a test program and a memory profiler.

This has implications wherever delegates are used, especially when dealing with higher level abstractions. For example, if we forget to unsubscribe from long-lived events, the referred objects stay alive as long as the event source does, leading to a memory leak.
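A sketch of the leak pattern (Publisher and Subscriber are hypothetical names): the publisher's event holds a delegate whose Target is the subscriber, so the subscriber stays reachable until the handler is removed.

```csharp
public class Publisher
{
  //Imagine this is long-lived, e.g. held by a static or a singleton
  public event EventHandler SomethingHappened;
}

public class Subscriber
{
  //Some sizeable state we would like the GC to reclaim
  private readonly byte[] bigBuffer = new byte[10 * 1024 * 1024];

  public void Attach( Publisher p )
  {
    //The publisher's event now references this Subscriber instance
    p.SomethingHappened += OnSomethingHappened;
  }

  public void Detach( Publisher p )
  {
    //Without this, the Subscriber (and its 10MB buffer) lives
    //as long as the Publisher does
    p.SomethingHappened -= OnSomethingHappened;
  }

  private void OnSomethingHappened( object sender, EventArgs e ) { }
}
```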

Simulating delegate

To understand the C# delegate mechanism better, let us try to write a naive and limited mechanism using reflection.

public class MyDelegate
{
  public MyDelegate(object target, string method) {
    this.Target = target;
    this.Method = method;
  }
  public object Target { get; set; }
  public string Method { get; set; }
  public void Invoke(Object[] paramArray) {
    MethodInfo m = Target.GetType().GetMethod( Method );
    m.Invoke( Target, paramArray );
  }
}

public class MyMulticastDelegate
{
  public List< MyDelegate > InvocationList { get; set; }
  public MyMulticastDelegate()
  {
    InvocationList = new List< MyDelegate >();
  }
  public void Add( object target, string method )
  {
    InvocationList.Add( new MyDelegate( target, method ) );
  }
  public void Remove()
  {
    if ( InvocationList.Count > 0 ) {
      InvocationList.RemoveAt( InvocationList.Count - 1 );
    }
  }
  public void Invoke( object[] paramArray )
  {
    for ( int i = 0; i < InvocationList.Count; i++ ) {
      InvocationList[i].Invoke( paramArray );
    }
  }
}
In the test client code:

MyMulticastDelegate m = new MyMulticastDelegate();
//Add method 'references' to our delegate
m.Add( new ConsoleClass(), "LogToConsole" );
m.Add( new FileClass(), "LogToFile" );

//Call delegate
m.Invoke( new object[] { "Test log message" } );
This would call both the methods like the implementation with real delegates.

Of course, this is a naive implementation. What the C# delegate mechanism provides is:
- The ability to invoke a method at runtime based on its signature. Note that our implementation does not handle return types in any way.
- A type safe way of invoking multicast function pointers. So we get compile time checks (and with Visual Studio as-you-type checking). This is useful, otherwise we can easily cause runtime errors like we could in a reflection based invoke mechanism.
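The difference shows up directly in code. Building on the SyslogDelegate and MyDelegate classes above (NoSuchMethod is deliberately wrong):

```csharp
//Compile-time check with the real delegate:
SyslogDelegate d = new ConsoleClass().LogToConsole; //OK, signature matches
//SyslogDelegate bad = new ConsoleClass().GetHashCode; //compile error: int() vs void(string)

//Runtime failure with the reflection based MyDelegate:
MyDelegate m = new MyDelegate( new ConsoleClass(), "NoSuchMethod" );
try {
  m.Invoke( new object[] { "hello" } ); //compiles fine...
} catch ( NullReferenceException ) {
  //...GetMethod returned null, so we only find out at runtime
  Console.WriteLine( "runtime failure" );
}
```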

Let us explore closures and delegates in the next post.
Source Code

Side note: Could unsafe pointers and pinning be used to implement a naive delegate like mechanism 'from scratch'?


** If it were a method with no side-effects, it could in theory be generated as a static method which does not really require the class instance. BTW if the method actually uses variables outside its local scope but in the enclosing class's scope, we have something similar to a closure. More on this in the next post.

Sunday, September 15, 2013

Callbacks, Delegates, Action and Func in C#

C# delegates and generics

We can consider C# delegates as type safe pointers to methods.

Take a simple delegate SomeAction which takes a generic type and returns nothing.

delegate void SomeAction< T >( List< T > input);
We can use this delegate to refer to any method which has the same method signature.
In the code below an instance of SomeAction refers to PrintList

static void PrintList< T >( List< T > inList )
{
  inList.ForEach( ( i ) => { Console.Write( i + " " ); } );
  Console.WriteLine();
}

List< int > test = new List< int >(){1, 2, 3, 4, 5};
SomeAction< int > PrintIntList = PrintList;
PrintIntList( test );
//Output: 1 2 3 4 5

A map method in C# using delegate

Consider a map function, which takes as arguments a list and another function. It applies the function to all members of the list, to produce an output list. A naive implementation in Scheme could be:

(define (map f inlist)
  (if (null? inlist)
    '()
    (cons (f (car inlist)) (map f (cdr inlist)))))
So, if we want to apply a function which doubles a list:

(map (lambda (i) (* 2 i)) '(1 2 3 4 5))
;output => '(2 4 6 8 10)
(A note on implementing map in Scheme is here)

To implement a similar map method in C#, which operates on a List:

delegate T GenericDelegate< T >( T args );

static List< T > Map< T >( List< T > inList, GenericDelegate< T > f )
{
  List< T > outList = new List< T >();
  for ( int i = 0; i < inList.Count; i++ ) {
    outList.Add( f( inList[i] ) );
  }
  return outList;
}

List< int > test = new List< int >(){1, 2, 3, 4, 5};
List< int > doubleTest = Map( test, ( i ) => { return 2 * i; } );
doubleTest.ForEach( ( i ) => { Console.Write( i + " " ); } );
//2 4 6 8 10 

Action and Func

These are just syntactic sugar in C# for generic delegates.
Func could be used in place of the generic delegate we declared in the previous section. For one argument, it is declared in the framework as:

delegate TResult Func< in T, out TResult >(T arg);
So we could implement the map method using a Func overload as:

static List< T > MapWithFunc< T >( List< T > inList, Func< T, T > f )
{
  List< T > outList = new List< T >();
  for ( int i = 0; i < inList.Count; i++ ) {
    outList.Add( f( inList[i] ) );
  }
  return outList;
}

List< int > test = new List< int >(){1, 2, 3, 4, 5};
List< int > doubleTest = MapWithFunc< int >( test, ( i ) => { return 2 * i; } );
doubleTest.ForEach( ( i ) => { Console.Write( i + " " ); } );
//2 4 6 8 10 
Action is a limited form of Func; it represents a delegate which takes argument(s) and does not return anything. For example, for one argument:

delegate void Action< in T > (T arg);
C# provides Func and Action overloads for 0 to 16 arguments, which the language designers think should be enough for most cases.
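For example, a two-argument Func and a one-argument Action using the framework-provided types:

```csharp
//Func: last type parameter is the return type
Func< int, int, int > add = ( a, b ) => a + b;
//Action: takes arguments, returns nothing
Action< string > log = ( mesg ) => Console.WriteLine( mesg );

Console.WriteLine( add( 2, 3 ) ); //5
log( "done" );
```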

C# Action, practical use

Consider a solution with a hierarchy of projects

Solution
|
|
---Common
|
|
---Web (References Common)

At times, we would like classes in Common to call methods in the Web project. However, by design, we do not want Common to reference the Web project. We can solve this problem by using callbacks, which in C# are Action or Func (or delegates).

Let us say there is a class in the Common, CommonUtil with a method UsefulMethod

namespace Common {
  public class CommonUtil {
    //...
    public void UsefulMethod(string someArg, Action< string > syslogOnError) {
      //...
      syslogOnError( "Some common error in the context of the caller" );
Now in the Web project, we call the UsefulMethod from a Controller class:

using Common;
namespace Web {
  public class TestController : BaseController {
    public void SomeAction() {
      CommonUtil util = new CommonUtil();
      util.UsefulMethod( "abc", new SyslogControllerDelegate( this ).Syslog() );
      //...
Where the Syslog method would use the controller to add extra context to a syslog message:

namespace Web {
  public class SyslogControllerDelegate {
    public BaseController controller;
    public SyslogControllerDelegate( BaseController controller ) {
      this.controller = controller;
    }
    public Action< string > Syslog() {
      Action< string > syslogAction = delegate( string mesg ) {
        //Writes controller context in addition to mesg to the syslog
        new SyslogBuilder( controller ).Info( mesg );
      };
      return syslogAction;
    }
  }
}
So when UsefulMethod calls its syslogOnError, in addition to the Common error we would also get whatever extra context SyslogBuilder wants to print about the calling TestController. Note that Common did not have to reference the Web project to do this, it just used a callback provided by the calling Web class.

Conclusion

We can consider Action and Func as syntactic sugar for generic delegates in C#. They are a simple and type safe way to implement callbacks (function pointer like functionality) in C#.

Wednesday, September 4, 2013

Back to basics: A demo http server and the TCP/IP stack on Windows

Let us build a web server

At work, I spend a lot of time on trivial things like 'how to fix html/css to make this page work for IE7'. I realized it has been a long time since I had to look at how the web server which powers our product really works under the hood. The best way to refresh my memory was to implement a web server and learn how it works bottom up. I used the language and platform I currently use at work: C# and Windows.

Webserver implementation

Source code in GitHub
I updated code someone left on pastebin. The link is gone now, so I can't reference it. This is a toy implementation; error handling and performance are not considered. It uses the .NET TCP library and implements a small subset of the http server side protocol. Features:
- Simple threaded handling of requests
- Logging
- URL re-write
- Implement GET and POST
- CGI meta-variables
- Powershell as a CGI extension
- Process Cgi and File GET
- Implement some error codes - 400, 403 and 404
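The core accept loop of such a toy server can be sketched with the .NET TCP library. This is a minimal, hypothetical version (one hardcoded response, naive request reading, no error handling, thread-per-request like the implementation above):

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

class TinyHttpServer
{
  public static void Run()
  {
    var listener = new TcpListener( IPAddress.Loopback, 8080 );
    listener.Start();
    while ( true ) {
      TcpClient client = listener.AcceptTcpClient();
      //One thread per request: simple, not scalable
      new Thread( () => Handle( client ) ).Start();
    }
  }

  static void Handle( TcpClient client )
  {
    using ( NetworkStream s = client.GetStream() ) {
      byte[] buf = new byte[4096];
      s.Read( buf, 0, buf.Length ); //naively read the request in one go

      string body = "<html>hello</html>";
      string response = "HTTP/1.1 200 OK\r\n" +
                        "Content-Type: text/html\r\n" +
                        "Content-Length: " + body.Length + "\r\n" +
                        "\r\n" + body;
      byte[] bytes = Encoding.ASCII.GetBytes( response );
      s.Write( bytes, 0, bytes.Length );
    }
    client.Close();
  }
}
```

Note how the response follows the Status-Line / headers / CRLF / body structure from RFC 2616, described below.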


TCP/IP

If I had to implement any of the protocol layers, the RFCs and standards corresponding to that layer would be a good starting point.
Diagram modified from: http://www.codeproject.com/KB/IP/serversocket.aspx

Windows TCP/IP stack

OSI Layers 1 and 2

The driver interface is generally implemented by a Network Interface Card (NIC) vendor. To program a driver, you need to get the Windows Dev Kit (WDK), ndis.lib (Network Driver Interface Specification library). Good places to start would be the WDK 'passthru' sample and source code for the WinPcap (Windows Packet Capture) library. C or C++ are the languages to program with the NDIS.

OSI Layers 3 and 4

TCP/IP is implemented in the protocol driver TCPIP.sys, as part of the Windows kernel. At this layer, AFAIK, there is limited programmability. Windows does provide several configuration settings in the registry: TcpNumConnections, TcpWindowSize and others. There are command line programs which operate at this level: arp, route, netstat, tracert, netsh, nslookup, ipconfig. I have read about people who actually patch TCPIP.sys, but I don't know how this is done.

OSI Layers 5-7

At this layer we are above the kernel in the user space. Various dlls implement different services:
TCP and UDP sockets - ws2_32.dll, winsock.dll
WinInet (FTP, HTTP 1.0, HTTP 1.1) - wininet.dll
RPC - rpc*.dll
NetBIOS - netapi32.dll
WNet - mpr.dll
Diagram modified from: http://i.msdn.microsoft.com/dynimg/IC72693.gif

Protocol

Notes on protocol specs for the web server implementation.

HTTP 1.1 – RFC 2616

- The HTTP protocol is a request/response protocol.
- A client sends a request to the server in the form of a request method, URI, and protocol version, followed by a MIME-like message containing request modifiers, client information, and possible body content over a connection with a server.
- The server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity meta-information, and possible entity-body content.

MIME – RFC 2045 etc.

- Multipurpose Internet Mail Extension
- Defines how to send data over the internet
- Content-types: text, multipart, message, application, image, audio, video

Diagram from: http://docs.oracle.com/cd/E19957-01/816-6028-10/asd3j.htm

HTTP Communication

request chain ------------------------>
<---------------------- response chain
UserAgent------------- Origin server
example
Browser---------------- Webserver


request chain -------------------------------------->
<------------------------------------- response chain
UserAgent --------A-------B------- Origin server
example
Browser----Firewall----Router----- Webserver

Http Request:

HTTP-message = Request | Response
Request = Request-Line
*(( general-header
| request-header
| entity-header ) CRLF)
CRLF
[ message-body ]
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
Method = "OPTIONS" | "GET" | "HEAD" | "POST" | "PUT" | "DELETE" | "TRACE" | "CONNECT" | extension-method
...

Http Response:

Response = Status-Line
*(( general-header
| response-header
| entity-header ) CRLF)
CRLF
[ message-body ]
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
Status-Code = "100"| "101"| "200"| ...
Reason-Phrase = *&lt;TEXT, excluding CR, LF&gt;
...

CGI – RFC 3875

Common Gateway Interface
How an HTTP server and a script/program share responsibility to respond to a client request.
Request meta-variables:
- Contain data about the request, passed from the server to the script.
- Can be accessed by the script in a system-defined manner.
CGI Meta variables: "AUTH_TYPE" | "CONTENT_LENGTH" | "CONTENT_TYPE" | "GATEWAY_INTERFACE" | "PATH_INFO" | "PATH_TRANSLATED" | "QUERY_STRING" | "REMOTE_ADDR" | "REMOTE_HOST" | "REMOTE_IDENT" | "REMOTE_USER" | "REQUEST_METHOD" | "SCRIPT_NAME" | "SERVER_NAME" | "SERVER_PORT" | "SERVER_PROTOCOL" | "SERVER_SOFTWARE" | scheme | protocol-var-name | extension-var-name
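One way a Windows server can pass these meta-variables to a CGI program is through the child process environment. A sketch (the script name and variable values are illustrative, not the toy server's actual code):

```csharp
using System;
using System.Diagnostics;

var psi = new ProcessStartInfo {
  FileName = "powershell.exe",
  Arguments = "-File script.ps1",   //hypothetical CGI script
  UseShellExecute = false,          //required to set environment variables
  RedirectStandardOutput = true
};
//CGI meta-variables from RFC 3875, passed via the environment
psi.EnvironmentVariables["REQUEST_METHOD"] = "GET";
psi.EnvironmentVariables["QUERY_STRING"] = "id=42";
psi.EnvironmentVariables["SERVER_PROTOCOL"] = "HTTP/1.1";

using ( Process p = Process.Start( psi ) ) {
  //The script's stdout becomes the response body
  string cgiOutput = p.StandardOutput.ReadToEnd();
  p.WaitForExit();
}
```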


Tuesday, January 22, 2013

Logging for fun and profit

The problem

I work for a startup, and the typical users of our web based application are school administrators and teachers.

Sometime back, my manager presented interesting data related to user activity that he had collected with Google Analytics. From the questions our non-engineering colleagues asked during the presentation, it seemed like this was not enough for non-developers to answer business questions like:

  • Did user X in school district Y edit student Z's test information?
  • User A has raised a vague help desk ticket. What did A really do?
  • Which is the most active school district for today? For this month?

Getting the others to make queries against the audit database didn't seem like a viable solution. First we'd need to give them an intro to SQL, then get them familiar with the schema, and then...

Another option was to extend our Google Analytics integration. Since we preferred to have more control over the data, we thought of doing it differently - implement a log analysis mechanism to allow non-developers to get quantitative answers to business questions. Some of the aims while implementing a solution were:
  1. Low cost, at least till the solution has paid for itself. As a startup, this aim is usually a necessity.
  2. Have minimal impact on the application.
  3. At a later time, be able to feed back this data into the features in the main application.

The message format

Layer 1

If a standard has already been defined as an IETF RFC, it is a good idea to use it.
I based the message format on the syslog protocol, and simplified the format prescribed in the RFC to allow simpler parsing/indexing of the message data.

SYSLOG-MSG = STRUCTURED-DATA [MSG]
STRUCTURED-DATA = "[" "TIME="TIMESTAMP "PRIORITY="PRIORITY-VALUE "APP="APP-VALUE "UUID="UUIDID [SD-NAME"="SD-VALUE]* "]"
(More in the RFC)

To keep the messages short, I constrained them to a max of 2048 characters.

Layer 2

My guess was that searching the data would eventually work better if the applications formatted messages in a natural-language-like way. For this I chose a subject-verb-object format.

[USERDATA] [SUBJECT] [VERB] [OBJECT] [FOR] [ADDITIONAL PARAMETERS]* [MESSAGE]

So an example message could be:
[ TIME="2013-01-21 16:30:22:51" PRIORITY="INFO" APP="MyAppNAme" UUID="a2383776-9bfa-ced3-b861-1349808ddba6" CONTROLLER="StudentHistory" USER="john" USERID="1090" DISTRICT="Chester County Schools" STATE="SC" SUBJ="STUDENT" VERB="VIEW" OBJ="HISTORY" FOR="EDIT" STUDENTID="248" STUDENT="Jane Doe" ] This does not really need a message.

The protocol

I decided to use UDP based on past experience. As compared to TCP, UDP is:
  1. Non-streaming i.e. 'Fire and forget'. By design, the client (i.e. our production system) is more decoupled from the logging server when compared to TCP. If there are issues with the logging server, there is less likelihood of it hurting the web servers.
  2. Faster. Minimal overhead (no 3 way handshake). Smaller header size. No recovery options etc.
  3. Works well for small messages, which is true by design for our logging.
SNMP, the popular network management protocol, uses UDP as transport.

On the other hand UDP is unreliable and lacks congestion control. However these weren't big issues for us, since we didn't care if say 1% of the logging data is lost, at least not currently. And since we planned to constrain the amount of data being logged in the application, we didn't think congestion would be an issue.

Implementation

The C# implementation (our web servers run on IIS/ASP.NET/Windows) is quite simple. All that is needed is a UDP library, string manipulation and optionally a dictionary data structure for the 'structured data'.

The parts consisted of:
A configuration section with server(s) name and port.
Layer 1: Syslog class which takes a dictionary of 'structured data', formats it into a log message and sends it over UDP to the server(s)
Layer 2: A builder class which allows log messages to be created using method chaining.
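For reference, the core of such a Layer 1 sender reduces to a few lines with UdpClient. This is an illustrative sketch (the class name and truncation policy are mine, not the production code):

```csharp
using System.Net.Sockets;
using System.Text;

public class SyslogSender
{
  const int MaxMessageLength = 2048; //per our format constraint

  public static void Send( string host, int port, string message )
  {
    if ( message.Length > MaxMessageLength ) {
      message = message.Substring( 0, MaxMessageLength );
    }
    byte[] bytes = Encoding.UTF8.GetBytes( message );
    using ( var udp = new UdpClient() ) {
      //Fire and forget: no connection, no delivery guarantee
      udp.Send( bytes, bytes.Length, host, port );
    }
  }
}
```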

The details aren't particularly interesting, so I won't describe them here. The end result is that in the consumer code, an example log call would be:


new SyslogBuilder( this )
    .Subj( "USER" )
    .Verb( "VIEW" )
    .Obj( "SWITCHDISTRICT" )
    .AddData("DISTRICTNAME", district.districtName)
    .Info( Session.SessionID );



The remaining work was to insert the logging code in the application.

The log server

It was tempting to implement a UDP listener which parses the log messages, creates an indexing scheme using a non-SQL store, and builds a UI which then... However, with 10 other pending tasks, this didn't seem like the way to go. Moreover, mine would likely be a terrible implementation of a problem more intelligent programmers have solved elegantly.
(sometimes I look back at my previous job in a big company, when I had time to do more fun stuff like that - but usually ended up browsing the web :)

So I chose Splunk, an excellent tool I had used before for log analysis. The free version has:
  1. A UDP listener
  2. A schema-less system which indexes text data as it ingests it.
  3. An elegant search language based on UNIX command pipeline.
  4. A web UI with ability to create dashboards and applications.
  5. Free for up to 500MB of log data a day. Based on some quick calculations, that was more than enough until we can move on to a real license of Splunk (i.e. we make lots of money, get taken over, or some such happy scenario).
I uninstalled software unnecessary for my purpose from an Amazon EC2 Ubuntu server, installed the free version of Splunk, set it up to listen on the UDP port of choice and configured the firewall for our webservers.

The free version of Splunk does not provide access control for its web UI, so I had to set up Apache as a reverse proxy to provide this (adding some firewall rules is probably not a bad idea too).

Starting it up

After making sure the firewall rules in our data-center were okay, it was time to send some test data. Perhaps with a Powershell script, but my laziness demanded something simpler: copy over nc.exe (netcat) from my Cygwin installation to a temp directory on the server. Running it complained about a missing cygwin1.dll; I copied that over and we were ready to test.

nc -uvv mysplunkserver 514
Hello world

Doing a search on the Splunk server gave:
12/16/12
11:44:44.000 AM
hello world
host=mysplunkserver   sourcetype=syslog   source=mysourceipaddress.mydomain.com

Ready to turn production data on!

With a couple of days of data, I was able to create some drill-down reports. This was useful to present the tool to the non-developers. A few screenshots:





The pretty graphs are useful for some scenarios, but the real power is usually in the search.  Splunk makes searching easy since it indexes the data based on its inherent format.

e.g. to look for the top users in a particular district:
DISTRICT="Chester County*" | stats count by USER
which I think is a simple enough language to learn for non-developers.

Soon our customer support person had used this to solve a real-life problem. Our founder, who does not have an engineering background, has thought up more uses for the tool. IMO this is one of the better things about working in a startup: your implementation's usefulness is quickly verified. It can be disheartening when it fails, but great when it is occasionally useful to others.

At some point, we hope to use the data (via Splunk's RESTful API) to feed back into certain features in our main application.

Would like to hear how you implement logging in your system!


Boston, MA, United States