Friday 15 July 2011

Using eager loading can produce unexpected results

When using NHibernate for the first time you sometimes decide that you want to utilise Eager loading for some of the following valid reasons.
  1. You want to make only one trip to the database
  2. Your result set is only small, contains 10 or so rows and only a small number of columns
Therefore you have decided to embark on the Eager loading journey however one of the side effects is that the generated SQL always returns a cartesian product.

According to WIKI A Cartesian product (or product set) is the direct product of two sets.

The SQL snippet below shows us that duplicate movies will be returned for every actor in the table. This can be quite clearly seen:-
SELECT * FROM
 Product p
left outer join
 ActorRole a on p.Id = a.MovieId

So now we are armed with this knowledge how does NHibernate handle the cartesian product? I have the following model, I have a one-to-many join on actors. The domain model looks like this:-
public class Movie : Entity
{
  public virtual string Director { get; set; }
  public virtual IList<Actor> ActorList { get; set; }
}

public class Actor : Entity
{
  public virtual string Name { get; set; }
  public virtual string Role { get; set; }
}
When using NHibernate and QueryOver we would specify eager loading by using:-
return Session
  .QueryOver<Movie>()
  .OrderBy(w => w.Name).Asc
  .Fetch(f => f.ActorList).Eager
  .List();
However this produces the following result:-
As you can clearly see NHibernate hydrates all movies (root entities) regardless whether they are duplicated or not. So how do we stop this happening? One way is to tell Nhibernate that we are only interested in hydrating distinct movies, e.g.
return Session
 .QueryOver<Movie>()
 .OrderBy(w => w.Name).Asc
 .Fetch(f => f.ActorList).Eager
 .TransformUsing(CriteriaSpecification.DistinctRootEntity)
 .List();
Notice that we are specifying a result transformation which tells NHibernate to only hydrate the root enities once per actor. Now we get the desired result.

It should be noted that the duplicates are removed (well probably not added in the first place) in client code, but you should remember that you are returning duplicate movies from your SQL.

Also it should be noted that this only works for single joins. If you are joining two tables or more then you should read this article by Ayende.

1 comment: