Monday 18 July 2011

Lazy loading and the select n+1 problem

With Nhibernate Lazy loading comes switched on by default (in fact it has since version 1.2). However there are some things that you need to understand, lazy loading only works as long as the session is open and one of the side effects is that you may hit the select n+1 problem.

To see this problem in action then let's say you have a collection of Movies and each Movie has a collection of Actors. In other words, Movie to Actor is a one-to-many relationship.

Now, let's say we get the first 5 Movies and then for each iteration we need to show all the actors.

The following SQL would be sent to the server:-
SELECT TOP 5 * FROM Movies;
SELECT * FROM Actor WHERE MovieId = 1;
SELECT * FROM Actor WHERE MovieId = 2;
SELECT * FROM Actor WHERE MovieId = 3;
SELECT * FROM Actor WHERE MovieId = 4;
SELECT * FROM Actor WHERE MovieId = 5;
In other words, you have one select for the Movies, and then N additional selects for actors. As you can see you would be forced to hit the database 6 separate times, this is in essence is the select n+1 problem.

NHibernate overcomes this problem by allowing you to specify the batch-size in your mapping files. To see this in actual we are going to use the following domain:-
public class Movie : Entity
{
  public virtual string Director { get; set; }
  public virtual IList<actor> ActorList { get; set; }
}

public class Actor : Entity
{
  public virtual string Name { get; set; }
  public virtual string Role { get; set; }
}
All you need to do is to place the batch-size attribute to the bag definition.

  <key column="MovieId" not-null="true" />
  <one-to-many class="Actor" />

This now will produce the following SQL that only hits the database twice:-
SELECT TOP 5 * FROM Movies;
SELECT * FROM Actor WHERE MovieId IN (1,2,3,4,5);
If we set a batch-size of 4 we would would hit the database three times:-
SELECT TOP 5 * FROM Movies;
SELECT * FROM Actor WHERE MovieId IN (1,2,3,4);
SELECT * FROM Actor WHERE MovieId=5;
This is a real sweet spot that NHibernate provides for you out the box. I think is a real testament to some of the power NHibernate gives you allowing you to fine tune your code with minimal fuss. Of course for a simple model you could use eager loading but for models with lots of properties (database columns) then this might not be desirable. I for one prefer choice and this reinforces the sweetness that comes with NHibernate.

Entity framework (as of 4.2) cannot solve this problem out of the box and one approach is to use .Include("Actors"), however this eager loads all of your entities and will produce a cartesian product that can send a lot of duplicated data down the wire. I suspect though the Microsoft team will tidy up lazy loading some time in the future.

2 comments:

  1. Hey :)

    Great posts, just found your blog and will keep following it! I have just implemented NH3.2 myself. btw, there's a typo: you say you get 5 movies but the code says select top 10.

    Regards, Ted

    ReplyDelete
  2. Glad you enjoy the blogs Ted. I have fixed the typo, good spot.

    ReplyDelete