RavenDB is a bad choice for an ASP.NET Session State Provider

I was considering using RavenDb as a custom session state provider in a web application.  Essentially, I am looking for a document database that works as a session state provider and also serves other application needs.  NServiceBus bundles this database, so I thought I would not have to add one more document database.

This is my first encounter with RavenDb.  I have read a lot of good things about this database, and I wanted to give it a try to see if it fits my requirements.

With a little bit of googling, you can find a few session state provider implementations for RavenDb here, and here.  All of these implementations are based on Microsoft’s ODBC session state provider sample here.

All the important methods you need to implement for your own provider are explained here.

A session state provider is a little tricky to implement.  As far as session is concerned, ASP.NET can serve three types of page requests.

  • Pages that require no session
  • Pages that require read-only session
  • Pages that require writable session

Thanks to ajax calls, all of these requests can reach your provider concurrently, and all three types have the potential to update the session document at the same time.

Pages with a writable session use the ‘LockId’ and ‘Locked’ properties to protect the session data from corruption by concurrent writes.

Pages that require no session data just extend the expiry time of the session.  Pages that require a read-only session should be able to retrieve the session data given a session id, and they wait for any exclusive lock to be released.

All of the above RavenDb session state provider implementations buckle under concurrent user load.  These providers try to work around the RavenDb limitations that matter in this scenario.

Queries require Indexes

You cannot use RavenDb queries to get the session document(s).  This is because RavenDb queries require indexes, and these indexes are updated in the background.  Under heavy load the indexes lag behind and queries return stale documents.  And if you wait for the indexing to finish, you will run into timeouts.

No Find And Modify Support

RavenDb does not support find-and-modify operations over a collection, so you are left with loading the session by its id and then modifying parts of it after examining its properties.  The following type of query is not possible.

UPDATE Sessions SET Locked = true WHERE Id = 'xyz' AND Locked = false

Without indexes, and without support for atomic operations like the one above, you are forced to write the following code.

var doc = sessionStateDoc.Load("xyz");
if(!doc.Locked){
// other code
}
This kind of code increases the chances of concurrency conflicts, because another request can change the document between the load and the check.
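To make the race concrete, here is a minimal in-memory sketch in JavaScript.  It is not the RavenDb client API; the store and the function names (load, save, naiveCheck, naiveCommit, tryLock) are all made up for illustration.  It interleaves two requests by hand and compares load-then-check with a single compare-and-set step.

```javascript
// Hypothetical in-memory document store; none of these names come from the RavenDb client.
const store = new Map();

function load(id) {
  return { ...store.get(id) }; // a copy, as if freshly loaded from the server
}

function save(id, doc) {
  store.set(id, { ...doc });
}

// Step 1 of load-then-check: read the document and decide.
function naiveCheck(id) {
  const doc = load(id);
  return doc.Locked ? null : doc;
}

// Step 2 of load-then-check: write the lock back.
function naiveCommit(id, doc, lockId) {
  save(id, { ...doc, Locked: true, LockId: lockId });
  return true;
}

// Compare-and-set: the check and the write are one indivisible step.
function tryLock(id, lockId) {
  const doc = store.get(id);
  if (!doc || doc.Locked) return false;
  store.set(id, { ...doc, Locked: true, LockId: lockId });
  return true;
}

// Two requests interleave: both check before either commits.
save("xyz", { Locked: false, LockId: 0 });
const seenByA = naiveCheck("xyz");
const seenByB = naiveCheck("xyz");
const naiveResults = [
  seenByA !== null && naiveCommit("xyz", seenByA, 1),
  seenByB !== null && naiveCommit("xyz", seenByB, 2),
];

// The same two requests with the atomic version.
save("xyz", { Locked: false, LockId: 0 });
const atomicResults = [tryLock("xyz", 1), tryLock("xyz", 2)];

console.log(naiveResults);  // both requests believe they hold the lock
console.log(atomicResults); // only the first request gets the lock
```

With load-then-check both requests walk away thinking they own the session, which is exactly the corruption the ‘Locked’ flag was supposed to prevent.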
 
Optimistic Concurrency
 
Pages that don’t require session attempt to update the “Expires” property of the session document, while at the same time a page that requires a writable session might attempt to update the “LockId” property of the same document.
Even though these are totally different properties of the same document, we are forced to deal with the concurrency conflict.  There might be a way to do fine-grained concurrency based on specific fields, but I find it far more trouble than it should be.
When the session module calls the GetItemExclusive method and it runs into a concurrency problem, we can simply return null to tell the session module to retry getting the document.  But how many times should we do this?  Every retry slows down the page.
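The retry question is easier to reason about with a small model.  The following JavaScript sketch assumes a made-up VersionedStore that rejects a save when the version has changed since the load; it only mimics optimistic concurrency in general and is not how RavenDb implements it.  The beforeSave hook exists only to inject a competing write for the demonstration.

```javascript
// Hypothetical versioned store: every save must present the version it loaded.
class VersionedStore {
  constructor() {
    this.docs = new Map();
  }
  put(id, data) {
    this.docs.set(id, { version: 1, data: { ...data } });
  }
  load(id) {
    const entry = this.docs.get(id);
    return { version: entry.version, data: { ...entry.data } };
  }
  // Returns false on a version mismatch instead of overwriting.
  save(id, version, data) {
    const entry = this.docs.get(id);
    if (entry.version !== version) return false;
    this.docs.set(id, { version: version + 1, data: { ...data } });
    return true;
  }
}

// Applies `change`, reloading and retrying on conflict, at most maxAttempts times.
function updateWithRetry(store, id, change, maxAttempts, beforeSave) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { version, data } = store.load(id);
    change(data);
    if (beforeSave) beforeSave(attempt); // demo hook: lets a competing write sneak in
    if (store.save(id, version, data)) return attempt;
  }
  return null; // give up, much like returning null to the session module
}

const sessions = new VersionedStore();
sessions.put("xyz", { Expires: 0, LockId: 0 });

// A competing request bumps Expires while the first attempt is in flight.
const attempts = updateWithRetry(
  sessions,
  "xyz",
  (doc) => { doc.LockId = 7; },
  3,
  (attempt) => {
    if (attempt === 1) {
      const other = sessions.load("xyz");
      other.data.Expires = 20;
      sessions.save("xyz", other.version, other.data);
    }
  }
);

console.log(attempts, sessions.load("xyz").data);
```

The writer to “LockId” loses its first attempt to a writer that only touched “Expires”, and it succeeds on the second attempt.  Capping maxAttempts bounds the delay, but every failed attempt is still a wasted round trip.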

You can probably use the RavenDb patch command to update the “Expires” property, thus avoiding concurrency conflicts.  But you will soon find that this idea fails once you try to use the RavenDb expiration bundle.

Expiring documents

RavenDb comes with an expiration bundle that allows you to remove expired session documents.  To make this bundle work, you need to set document metadata like the following.  Unfortunately, you must include this setter as part of the unit of work.

db.Advanced.GetMetadataFor(session)["Raven-Expiration-Date"] = DateTime.UtcNow.AddMinutes(20);
sessionStateDoc.Expires = DateTime.UtcNow.AddMinutes(20);
sessionStateDoc.SaveChanges();

This prevents us from doing partial updates to the document.  The metadata update must be done every time you update the document.

Other minor but annoying issues  

  • As of build #2261 there are still bugs.
  • Master-Master replication won’t work when you use API Keys.
  • Expiration bundle randomly deletes session documents.
  • Raven Studio doesn’t give you a comfortable feeling of using a professional grade database.

The following changes gave me a relatively stable implementation of a RavenDb session provider under higher loads.

  • Do not use the expiration bundle.  Use a server-side trigger or a scheduled task to expire documents.  This allows us to use a patch command to update the “Expires” property in the “ResetTimeOut” method.
  • Do not use concurrency checks while removing the item, saving the session data, or releasing exclusive locks.  These calls must succeed; if they fail, you might get logic errors in the app.
  • Use the optimistic concurrency check only in the GetItemExclusive/GetItem routines.  If the concurrency check fails, simply return null; this forces the session module to call these methods again.
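The first recommendation can be pictured with a tiny JavaScript model.  Everything here is hypothetical: the sessions map stands in for the session collection, resetTimeout stands in for a single-field patch, and sweepExpired is what a scheduled task might run.  It is a sketch of the idea, not RavenDb code.

```javascript
// Hypothetical in-memory session table; Expires is a millisecond timestamp.
const sessions = new Map([
  ["a", { Expires: 1000, Items: "cart" }],
  ["b", { Expires: 5000, Items: "profile" }],
]);

// Sliding expiration: touch only the Expires field, like a patch would.
function resetTimeout(id, now, timeoutMinutes) {
  const doc = sessions.get(id);
  if (doc) doc.Expires = now + timeoutMinutes * 60 * 1000;
}

// What a scheduled task would run: delete every document whose Expires has passed.
function sweepExpired(now) {
  const removed = [];
  for (const [id, doc] of sessions) {
    if (doc.Expires <= now) {
      sessions.delete(id);
      removed.push(id);
    }
  }
  return removed;
}

resetTimeout("b", 2000, 20);        // "b" now expires 20 minutes after t=2000
const removed = sweepExpired(6000); // "a" expired at 1000, "b" is still alive
console.log(removed, [...sessions.keys()]);
```

Because expiry is decided by an ordinary field rather than by metadata, extending the timeout never has to rewrite the whole document.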
All in all, I am not happy about the friction.  I smell maintenance headaches.
I would like to try another document database to see if it fits this use case better.

TypeScript is neat

JavaScript is not picky about line endings.  The following code looks fine but causes trouble.

function func() {
  return
  {
    greet: "Hello World"
  };
}
console.log(func().greet);

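Line 7 does not print the greeting.  func() returns undefined, so reading greet from the result throws a TypeError.

This is because the return statement on line 2 is followed by a line ending, so JavaScript inserts a semicolon right after return.  TypeScript marks the usage of greet on line 7 as an error.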

 Error    1    Expected var, class, interface, or module   …TypeScript1TypeScript1app.ts    7    13    app.ts

Maybe they could do better with the error message.

This problem can be fixed by moving the opening curly brace from line 3 to line 2.

function func() {
  return {
    greet: "Hello World"
  };
}
console.log(func().greet);

Now the error goes away, and you also get IntelliSense for the string property “greet” after typing func(). on line 6.
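The two variants can be compared side by side in plain JavaScript.  The function names broken and fixed are made up for this sketch; the behavior shown is standard automatic semicolon insertion.

```javascript
// Line break after `return`: a semicolon is inserted there, and the braces
// that follow are parsed as a block containing a labeled statement.
function broken() {
  return
  {
    greet: "Hello World"
  };
}

// Brace on the same line as `return`: the object literal is returned.
function fixed() {
  return {
    greet: "Hello World"
  };
}

console.log(broken());      // undefined
console.log(fixed().greet); // Hello World
```

Plain JavaScript accepts both versions without complaint, which is why a compile-time error from TypeScript is so helpful here.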

Tips for windows developer working on Mac–Tip # 2

Many times you want to run a text editor from the terminal.  Maybe you want to edit a .bashrc file or a .gitignore file.  I use my favorite editor that works on both Windows and Mac, Sublime Text.

ln -s "/Applications/Sublime Text 2.app/Contents/SharedSupport/bin/subl" /usr/local/bin

Once you have run that command in the terminal, you can open files in Sublime directly from the terminal by typing

subl ~/.bashrc

Posted in: mac |

Tips for windows developer working on Mac–Tip # 1

Files and folders with names starting with a period are hidden by default in Finder, so files like .bashrc or .gitignore can’t be found.  Running the following commands from the terminal fixes that issue.

defaults write com.apple.finder AppleShowAllFiles TRUE

killall Finder

The second command restarts Finder.


Running Asp.Net MVC controller actions on STA threads

Recently, while we were converting a legacy ASP application (yes, they still exist) to ASP.NET MVC, we had to work with a set of third-party, business-critical components.

These components were legacy COM components.  In load testing we found that controller actions that call these components were crashing the w3wp process almost every minute.  A little bit of research around this problem turned up the following article.

Running ASMX Web Services on STA Threads

In summary, MVC action methods run on COM multithreaded apartment (MTA) threads.  Calls to the legacy components created from MTA threads were being serialized and processed by a single STA thread.  On top of that, these components load tons of data into memory, and under load this was causing memory corruption.

So the solution is to make the MVC action method run on an STA thread, thus allowing COM to place the object instances on the creator’s thread.

Here is an MSDN thread that ties up the STA with asp.net mvc

AspCompat=true does not work with MVC

The above solution doesn’t support asp.net sessions and also was written for asp.net mvc 1.0

Here is the asp.net mvc 3.0 adjusted solution.

First, the route handler.  It derives from MvcRouteHandler and instantiates an IHttpHandler-derived class.

public class STARouteHandler : MvcRouteHandler
{
  protected override IHttpHandler GetHttpHandler(RequestContext requestContext)
  {
   return new STARequestHandler(requestContext);
  }
}
Next, the STARequestHandler.  The key to making this work is in the BeginProcessRequest and EndProcessRequest methods.  These two methods create an ASP compatibility (AspCompat) wrapper around the actual execution of the action method.  This handler also implements the IRequiresSessionState marker interface to support sessions.
public class STARequestHandler : Page, IHttpAsyncHandler, IRequiresSessionState
{
  public STARequestHandler(RequestContext requestContext)
  {
    if (requestContext == null)
      throw new ArgumentNullException("requestContext");
    this.RequestContext = requestContext;
  }

  private ControllerBuilder _controllerBuilder;

  internal  ControllerBuilder ControllerBuilder
  {
    get { return this._controllerBuilder ?? (this._controllerBuilder = ControllerBuilder.Current);}
  }

  public RequestContext RequestContext { get; set; }

  protected override void OnInit(EventArgs e)
  {
    string requiredString = this.RequestContext.RouteData.GetRequiredString("controller");
    var controllerFactory = this.ControllerBuilder.GetControllerFactory();
    var controller = controllerFactory.CreateController(this.RequestContext, requiredString);
    if (controller == null)
      throw new InvalidOperationException("Could not find controller: " + requiredString);
    try
    {
      controller.Execute(this.RequestContext);
    }
    finally
    {
      controllerFactory.ReleaseController(controller);
    }
    this.Context.ApplicationInstance.CompleteRequest();
  }

  public override void ProcessRequest(HttpContext httpContext)
  {
    throw new NotSupportedException("This should not get called for an STA");
  }

  public IAsyncResult BeginProcessRequest(HttpContext context, AsyncCallback cb, object extraData)
  {
    return this.AspCompatBeginProcessRequest(context, cb, extraData);
  }

  public void EndProcessRequest(IAsyncResult result)
  {
    this.AspCompatEndProcessRequest(result);
  }

  void IHttpHandler.ProcessRequest(HttpContext httpContext)
  {
    this.ProcessRequest(httpContext);
  }
}
Lastly, the usage of this handler.
While creating a route for the action method, simply attach this handler to the route definition.
context.MapRoute("STARoute", "{controller}/{action}",
        new { controller = "Home", action = "Index" })
            .RouteHandler = new STARouteHandler();
You can test the apartment state by calling the following line of code in the action method.
Thread.CurrentThread.GetApartmentState().ToString();

MongoDB mapreduce for stackoverflow.com recent tags

I was playing with MongoDB’s mapreduce and wanted to write a query that simulates the ‘Recent Tags’ list on the stackoverflow home page.

I am using mongo-csharp-driver for this experiment.

Here I am taking a guess at stackoverflow’s domain model. Here is the Question model with enough properties to demonstrate the mapreduce query.
Every question is associated with a list of Tags and has a CreatedOn property.

public class Question
{
    public ObjectId Id { get; set; }
    public DateTime CreatedOn { get; set; }
    public ICollection<string> Tags { get; set; }
}

Create a database with some name and add ‘Questions’ collection to the mongodb.

We are going to make the following call to find the recent tags.

    var recentTags = questions.MapReduce(map, reduce).GetResultsAs<RecentTagResult>();

RecentTagResult holds the results of the mapreduce query and is defined as

public class RecentTagResult 
{
    public string Id;
    [BsonElement("value")]
    public RecentTag Value;
}

MongoDb’s mapreduce call outputs a result set with two properties _id and value. So here I am defining a mapper class with a property ‘Id’ and property ‘Value’ of type RecentTag. BsonElement attribute in the above code simply maps lower case property ‘value’ from the result set to title case property ‘Value’. _id property from result is automatically mapped to Id by ‘GetResultsAs’ call.

RecentTag is defined as follows. Here I am expecting that the query results are going to contain Tag, count of tags, last time when a question was created with this tag. As you can guess our map/reduce functions must emit the values that match the following class definition.

public class RecentTag 
{
    public string Tag { get; set; }
    public int Count { get; set; }
    public DateTime LastSeenOn { get; set; }
}

Now coming to the meat of the problem. A question can have more than one tag. We are looking for a list of tags used by questions asked in the last month. Here is the definition of the map function (which is completely written in javascript) that goes over entire collection of questions and emits a result each time a tag is found if that question is asked in the last month.

private string map =
    @"function() {
        if(this.CreatedOn >= new Date('Oct 1, 2011') && this.CreatedOn < new Date('Nov 1, 2011')) { // upper bound assumed to be one month later
            var lastseen = this.CreatedOn;
            this.Tags.forEach(function(tag) {emit(tag, {Tag: tag, LastSeenOn:lastseen, Count: 1});});
    }}";

It is very important that the emit function’s value should match our RecentTag type definition.

emit(tag, {Tag: tag, LastSeenOn:lastseen, Count: 1})

Now coming to reduce: we have a bunch of emitted results from the map function, and we simply add them up to find the total count of each tag found in the last month.
A tag might appear more than once in the last month.  If a tag appears only once, the reduce function is never called for it.

    private string reduce = 
        @"function (key, arr_values) {
        var dates = [];
        // property names are case sensitive and must match what map emits
        arr_values.forEach(function(val) {dates.push(val.LastSeenOn)});
        var result = {Tag: key, LastSeenOn: new Date(Math.max.apply(Math, dates )), Count:0};
        for(var i in arr_values)
        {
            var temp = arr_values[i];
            result.Count += temp.Count;
        }
        return result;
    }";

Here arr_values contains all emits for a single tag.  Again, it is important that the returned value matches the RecentTag definition, just as in the map function.
We start with that definition first

var result = {Tag: key, LastSeenOn: new Date(Math.max.apply(Math, dates )), Count:0};

And then iterate through arr_values and simply increment the count to get the final result.

Filling the LastSeenOn property is a little tricky.  Here we are trying to find the maximum of the CreatedOn property over all emitted values of a single tag.

var dates = [];
arr_values.forEach(function(val) {dates.push(val.LastSeenOn)});

Here we are gathering all the dates from the LastSeenOn property into an array.  Then, while defining the result, we apply JavaScript’s Math.max function to find the last seen date.

LastSeenOn: new Date(Math.max.apply(Math, dates ))

That is all there is to it.
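If you want to check the map and reduce logic without a MongoDB server, the whole flow can be simulated in plain JavaScript.  This is only a sketch: the sample questions, the date window, and the mapReduce helper that groups emits by key are all made up here, and the helper copies just one server behavior, namely that reduce is skipped for keys with a single emit.

```javascript
// Hypothetical sample data shaped like the Question documents.
const questions = [
  { CreatedOn: new Date("2011-10-03T00:00:00Z"), Tags: ["c#", "mongodb"] },
  { CreatedOn: new Date("2011-10-20T00:00:00Z"), Tags: ["mongodb"] },
  { CreatedOn: new Date("2011-09-01T00:00:00Z"), Tags: ["c#"] }, // outside the window
];

// Assumed one-month window for the demonstration.
const from = new Date("2011-10-01T00:00:00Z");
const to = new Date("2011-11-01T00:00:00Z");

function map(question, emit) {
  if (question.CreatedOn >= from && question.CreatedOn < to) {
    question.Tags.forEach((tag) =>
      emit(tag, { Tag: tag, LastSeenOn: question.CreatedOn, Count: 1 }));
  }
}

// Returns the same shape that map emits, so it can safely be applied again.
function reduce(key, values) {
  const result = { Tag: key, LastSeenOn: new Date(0), Count: 0 };
  values.forEach((val) => {
    if (val.LastSeenOn > result.LastSeenOn) result.LastSeenOn = val.LastSeenOn;
    result.Count += val.Count;
  });
  return result;
}

// Tiny stand-in for the server: group emits by key, reduce only keys with several values.
function mapReduce(docs) {
  const groups = new Map();
  docs.forEach((doc) => map(doc, (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }));
  return [...groups].map(([key, values]) =>
    ({ _id: key, value: values.length > 1 ? reduce(key, values) : values[0] }));
}

const recentTags = mapReduce(questions);
console.log(recentTags);
```

Each result has the same _id and value shape that RecentTagResult expects, which is why the emitted value and the reduced value must share one structure.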

Open Source Dependencies

Total Cost of Ownership is a critical metric which I would like to pay attention to in software development. A project with lots of open source dependencies can become very difficult to maintain.
With the advent of modern package systems like gems, Nuget and CDNs, it has never been this easy to use open source in software projects.  As of this writing there are thousands of JQuery plugins, hundreds of ruby gems, and hundreds of Nuget packages.  I have seen developers arguing about their favorite ORM tool.  I have not seen enough arguments about which JQuery light box plug-in should be used for the user interface.  If you try to take stock of all the dependencies (both commercial and open source) in your code, you might be surprised to see the list.

I surveyed a few .Net web applications.  A single web application can have the following list of components, without counting major dependencies like an object-relational mapper:

  1. PDF
  2. Charting
  3. Ajax
  4. Half a dozen JQuery plugins/other JavaScript alternatives
  5. Spread sheet
  6. Dashboard
  7. Social networking
  8. Payment gateway
  9. Reporting
  10. Other value added services like support, feedback, live help
  11. External web services
  12. Scheduling
  13. Email
  14. JSON
  15. Mocking
  16. Unit Testing

.. and the list goes on.

Now imagine using an open source component for each one of these dependencies.  That is a lot of code to maintain!

All non-trivial abstractions, to some degree, are leaky

At one point or another, your team will need to know the internals of every open source library used in your project.  Some of these libraries have hard dependencies that can prevent future upgrades.

Never underestimate the testing burden.  Take the example of JQuery plugins.  These plugins usually depend on JQuery, so when you upgrade JQuery, you are forced to upgrade the dependent plugins.  Depending on the quality of the plugin, you often end up spending hours debugging why a web page fails to find a version of the plugin that works.  Cross-browser testing is also a time killer.  All this can become very complex when all you are trying to do is upgrade JQuery to its latest version, which in itself is a trivial task.

You might not see this kind of risk with popular open source libraries, as they operate much like commercial offerings.  But not all open source projects are popular.

Some tips to control the proliferation of these dependencies in your projects:

  1. Always maintain a list of open source dependencies. Make it available to QA team and developers.
  2. Create a test suite to test all these dependencies. And run the tests with every deployment.
  3. Always check-in the source code for the open source project in to source control.
  4. Try to keep these dependencies to minimum. Remember that your team needs to re-learn these dependencies on every upgrade or major functionality rewrite.
  5. If possible try to stick to a known set of controls from a single vendor.
  6. While choosing the libraries weigh in your team’s skill set, team’s composition and future direction. If you don’t have dedicated resources to focus on Ajax work of the website, it is better to stick to a commercial solution than using a laundry list of multiple open source offerings.

ASP to ASP.Net upgrade Part -1

Recently I worked on a project that involved upgrading a 12-year-old legacy web site written in ASP to ASP.NET.  Yes, ASP is still being used on a large number of websites.

This blog series is an account of how we did this upgrade and what I learned along the way.

Goal

The goal is to upgrade the multi-tenant ASP site to ASP.NET 4.0.  This should be done without altering the site’s functionality in any major way in the first iteration, and with minimum business disruption.  The idea is to modify parts of the application iteratively in the future.

If you understand the feeling of changing the wheels of a moving train, that is exactly how I felt at this requirement.

The code base contained about 5000 pages, each averaging 1000 lines that mixed ASP code, data access code, business logic, COM component calls, HTML, and JavaScript.  There was no source control, and most of the current changes were being made directly on the production site.  It would take at least a week to create another copy of the production version of the site in a test environment.  The code itself is not complicated, but the multi-tenant nature of the site forced the previous developers to copy and paste similar code for various customization scenarios.

The Team

  • A developer with a strong scripting background, preferably Perl – writes code to perform the string manipulation magic wherever Visual Studio comes up short.
  • A couple of developers from the legacy team – they act as subject matter experts for the new code base and are responsible for finding all defects before the new code hits the QA cycle.
  • A senior ASP.NET developer – knows the ins and outs of ASP.NET, can handle any awkward porting scenario, and comes up with the proper guidance.
  • An ADO.NET developer – there will be tons of ADO code that needs to be converted to ADO.NET.
  • A couple of mid-level ASP.NET developers – they work on most of the conversion tasks.
  • A business analyst – the glue between project team members and stakeholders.
  • A couple of QA testers – responsible for testing the code continuously and helping the dev team stabilize the builds.

The Practice of Programming

I am a software programmer.  I love programming, and I like to practice it every day.  Malcolm Gladwell, in his famous book “Outliers”, talks about 10,000 hours of practice as the difference between the best and the mediocre.

Here are a few tips that help with the practice of programming.

Daily Practice

Allocate one hour every day for deliberate practice.  Start your work day an hour earlier than normal.  Don’t be surprised if, after a while, it feels like watching your favorite soap on TV.

Align your practice with your work

Pick a cookbook or recipe book for your favorite programming language.  These books contain a number of everyday problems you face in the area you are working in.  Usually, solving these problems won’t take much time.  It is very important not to look at the solution until you have solved the problem.  The very pain you go through while trying to solve a problem is an essential element of the practice.

Write unit tests while solving those problems.  It is another nice way to improve the quality of your tests in an educational setting.  After solving a number of these problems, you should see an improvement in the quality of your work at your workplace.

Read, Read, Read

Reading is essential to your practice.  Spend your lunch time reading a book on software development or blogs.  One advantage of reading is that you get to learn from others’ mistakes.  Why is a specific tool good or bad?  Why do you need to care about HTML 5?  You will get to see different perspectives on contemporary technologies.

Learning on weekends

Subscribe to one of the video learning sites.  Two to three hours spent learning from these sites is enough to pick up the essential skills needed for a given technology.  There are a number of websites that help you learn: Pluralsight, TekPub, Peepcode, nettuts.

The Upgrade Dilemma

If you are working on a project that uses 3rd party products, you will face this dilemma regarding upgrades regularly.  It could be a core system upgrade such as a .Net framework version or a SQL Server database version, or a significant 3rd party component upgrade.  Usually these upgrade proposals come from tech teams.

Assuming that the cost of the software being upgraded is not itself prohibitive, most of the arguments center on the work the team needs to do to stabilize the product on the upgraded version.

Arguments

1. Arguments in favor of the upgrade

a. If we don’t upgrade now, when would we upgrade? After we acquired another 1000 customers?
b. It would be an incentive for the developers. It will help with both hiring and retaining good developers, since they tend to want to work on the latest and greatest.
c. We could get rid of some of the nagging bugs in the product, as the new upgrade solves them
d. New upgrade contains significant and important changes and we should take advantage of them
e. If we are up to date with the latest version, our product won’t become obsolete

2. Arguments against the upgrade

a. There aren’t many useful features in the new version
b. Everything works now so why should we unnecessarily disturb the code base?
c. What is in it for customers, since they didn’t ask for the upgrade?
d. If we upgrade component A, component B will not work
e. We will need to test the whole product, which would be very expensive. We might miss some parts which can put the product at risk
f. The whole process could be exhaustive and we may burn out the team
Most of the arguments against the upgrade are centered on changing the code base, and the fact that changing code could be a risky affair. However, I usually favor staying as current as possible, because the speed with which new upgrades are being released to the market is extremely fast, and if we fail to stay current, the product will end up becoming stale and will almost certainly require a big rewrite in the future.

Process – What Not To Do

Upgrade by committee

It is very tempting to gather all parties involved, conduct formal meetings, get everyone’s buy-in, and create and manage a project plan.  This means a lot of meetings, follow-ups, and emails, and it inevitably creates heroes who end up taking on the bulk of the work to meet some arbitrary deadline.  It may sound logical, and it works most of the time.  But this is a very expensive process.  These meetings are the breeding grounds for “Fear, Uncertainty, Doubt” (“FUD”).  Any one person in these meetings can derail the attempt or significantly delay the process.  After one such upgrade experience you will lose the appetite for any future attempts.

Process – What To Do

If you are really looking for ways to convince your organization or your boss to approve an upgrade, here is a summary of what worked for me.

1. Build confidence

Your organization must have faith in your team. They should believe that your team can deliver. They should know that you can fix issues fast and that you care about the success of both the business and the product. This won’t come in a single day. You need to build that confidence every day. One trick that works is to constantly solve simple problems. These kinds of problems won’t take much time. They don’t need big project plans. They come from all parts of the organization. Solving them proves that you are for continuous improvement.
Try to tackle troubled areas of the products. These are the parts of the software that attract frequent bugs, frequent code churns. Try to take initiative and make that area trouble free. Once you successfully solve such problems, you are guaranteed to gain the support in any of your future upgrade proposals.

2. Build safety net

If you don’t have them already, you should start writing tests for important areas of your application. These tests when written properly help you in two ways. They help you identify the problem quickly and also help serve as a repository of business knowledge. You must run these tests with every code check-in.

3. QA is your friend

The most important help you can get from your QA team comes in the form of exploratory tests and the testing of tedious, non-automated parts of the application.  A tester who intends to solve the problem is a better resource than a tester who intends just to report it.  Have them involved in every non-trivial decision your team makes.

4. Research

If the upgrade is going to have a big impact on your product, wait for about 6 months after the release date of the upgrade.  Go through the release notes and breaking changes, and write small prototypes to test the areas of concern.
Most importantly go through the product support forums to understand the quality of the version you are upgrading. Even if you decide not to upgrade, this research helps you decide the future direction of your product.

5. Build excitement within the team

Try to lay out the broad plan for the upgrade.  Get your team’s consensus.  Understand the possible real problems.  Tell them why the upgrade is important.  A self-organizing, motivated team can make the whole upgrade process a trivial affair.  As a team, you should try to think through and anticipate every major problem, and come up with possible options.  Consider other major projects your team is dealing with to make sure that they are not being unnecessarily burdened.  Come up with a possible time line.

6. Sell the proposal

Get the major stake holders in a room and sell your proposal. If you did your homework properly this should go well.

7. Appoint a facilitator

Find someone in the team (a PM, BA, team lead etc.) who can act as glue with all the team members. Their job is to continuously update all stake holders (management, team, end users etc.) about the team’s decisions, choices and find answers.

8. Short but to the point standup meetings

Meet twice a day to talk about the most important issues that your team is facing that day.  This helps the team share the work, gather questions, seek answers, and inform stakeholders about problems.
Too often I hear developers complaining that their organization is not good at keeping on top of the latest technology trends.  I believe that this is mostly due to a lack of effort from the software teams themselves.