Entity Framework – Storing a complex entity as JSON in a single DB column

jsonDuring the development of TicketDesk 2.5, I came across an unusual case. I wanted to store a large chunk of my entity model as a simple JSON string, and put it into a single column in the database.

Here’s the setup:

I have an entity that encapsulates all of a user’s display preferences and similar settings. One of those settings is a complex set of objects that represents the user’s custom settings for list view UI pages. There can be many lists, each with separate settings. Some of the settings for a list contain collections of other objects, resulting in a hierarchy of settings that goes three levels deep

I didn’t want to represent these settings as a relational data model in the database though. Using EF’s standard persistence mapping conventions, this collection of settings ends up being spread across six  tables. The TSQL queries to access that data would be rather slow and cumbersome, and the relational model doesn’t add any value at all.

Instead, I just wanted to serialize out the entire collection of settings as a single JSON string, and store it in one column in the user settings table. At the same time though, I wanted the code behave as if this were just a natural part of my regular EF entity model.

The solution:

The solution was to use a complex type, with some fluent model binding magic, to flatten the hierarchy into a single column. The heirarchy itself is represented as a custom collection, with a bit of manual JSON serialization/deserialization built-in.

I got a pointer the right general direction from this SO post, which saved me a bunch of time when approaching this more advanced scenario.

First, let’s take a look at the root entity here:

public class UserSetting
 {
 	[Key]
    public string UserId { get; set; }

    public virtual UserTicketListSettingsCollection ListSettings { get; set; }
}

This is the only entity which will map to its own table in the DB. The ListSettings collection is the property I want persisted as JSON in a single column.

Here is the custom collection that will be stored:

public class UserTicketListSettingsCollection: Collection<UserTicketListSetting>
{
    public void Add(ICollection<UserTicketListSetting> settings)
    {
        foreach (var listSetting in settings)
        {
            this.Add(listSetting);
        }
    }

    [JsonIgnore]
    public string Serialized
    {
        get { return Newtonsoft.Json.JsonConvert.SerializeObject(this); }
        set
        {
            if (string.IsNullOrEmpty(value))
            {
                return;
            }

            var jData = Newtonsoft.Json.JsonConvert.DeserializeObject<List<UserTicketListSetting>>(value);
            this.Items.Clear();
            this.Add(jData);
            
        }
    }
}

This is a collection type, and is inheriting generic Collection<T>. In this case, T is the UserTicketListSetting type — which is a standard POCO wrapping up all of the settings for all of the various list views in one place.

Some of the properties inside UserTicketListSetting contain collections of other POCOs. The specific details of what’s going inside those classes doesn’t matter to this discussion, just understand that it results in a hierarchy of related objects. None of the properties in that hierarchy are marked up with EF attributes or anything.

The only magic here is that we have a Serialized property, which manually handles converting from/to JSON. This is the only property that we want persisted to the database.

To make that persistence happen, we will make UserTicketListSettingsCollection an EF complex type, though not by using the [ComplexType] attribute. Instead, we’ll manually register this complex type via the fluent model builder API.

In the DB Context this looks like this:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
	modelBuilder.ComplexType<UserTicketListSettingsCollection>()
        .Property(p => p.Serialized)
        .HasColumnName("ListSettingsJson");
}

This just tells EF that UserTicketListSettingsCollection is a complex type, and the only property we care about is the Serialized property. If there were other properties in UserTicketListSettingsCollection, you would need to exclude them with something like:

modelBuilder.ComplexType<UserTicketListSettingsCollection>().Ignore(p => p.PropertyToIgnore);

And that’s all I needed to get EF to store this entire hierarchy as a single JSON column.

Using this model in code is just like using any other EF entity. I can query it with LINQ expressions, and SaveChanges on the DbContext updates the JSON data in DB just like any other entity. Even the generation of code-based migrations works as expected.

It took a LOT of experimentation and digging to figure out how to make this work, but the implementation is rather simple once you know how to approach the problem.

This also reflects the amazing power and flexibility of Entity Framework. EF can be extended to fit very advanced scenarios even when the designers didn’t anticipate them directly.

You can see the full implementation of this in TicketDesk 2.5. Currently, TD 2.5 is in alpha, so look to the develop branch in source control on CodePlex. You will find this example, as well as several variations in the TicketDesk.Domain assembly.

9 Replies to “Entity Framework – Storing a complex entity as JSON in a single DB column”

  1. This looks like a perfect solution but I have few questions.
    1) could you put different JSON (in the same column) and either leverage TPH so that different (but similar) objects could live in the same table and use the discriminator field to choose the proper serialization?

    2) JSON is a great solution but would it be possible to use an XML column in SQL Server, and serialize to XML. I was think then you could maybe tap into Xml-Indexes to assist in querying on the server… Not worth it?

    3) Or put different JSON in the column and then expose different sub-objects depending on the JSON content…. I guess you would have to store something in the column to tell it what the JSON payload was.
    The reason I ask is I though I might be able to use the mechanism to extend objects… older objects might just store a subset of values.. For instance you want to keep the same DB structure as you add new user settings and permissions. The new objects would be able to consume the old data and just add the new values on the first save, and even old objects might be able to continue working (EF would probably complain about the context being out of date) but would just not pick up the new values… hmm that probably wouldn’t work.. as it would delete the new values on save… still possible? or not eve worht looking at?

    1. 1 & 3) yes, you should be able to store whatever you want in that column, and you can do amazingly complex stuff with it on (de)serialization. You should’t have any trouble at all mapping the JSON into and out of several locations within your model… though the properties you map to/from should be flagged as unmapped so that EF’s model mapper ignored them.

      Honestly though, it sounds a lot like you should consider a JavaScript based document DB instead of a relational database. I’ve had good luck with RavenDB. If you have a lot of JSON data, especially complex JSON documents, you can’t beat the covienience of a javascript based document DB.

      2) You should easily be able use XML serialization/deserialization this same way too. Again, the trick is that the one mapped property handle the serialization/deserialization correctly.

      As for persisting the data into a real XML column in SQL… I can’t say. I have never found the actual XML datatype in SQL to be worth the pain… but I’ll be honest, I also haven’t look at it since about 2008.

      Last time I needed to store anything non-trivial as an XML column, even the best index magic I could devise still performed so slowly that I’d have been better off just having the application send me an email, then wait on me to manually type a result in reply. I also absolutely HATE working with the XML query syntax in SQL server.

      Instead of storing XML in XML columns, I either store it as a text column, and just let the application decipher the contents at runtime, or I use a mapper to convert the contents to and from a relational model.

      1. Haha, Ok thanks for the response, I haven’t used the XML data type in SQL server ever, I did use the Oracle variant way back in 2004 (Yikes 10 years ago). I really appreciate the to the point and no nonsense answer. I don’t think I want to go to a document database yet, its just we have a fairly common situation where we have a base object for example a “Service Requirement” … Initially we modeled and built out a object model around this, only to determine later that there other variants of “Service Requirements”. Now we did refactor a base class out with the common data elements, but I foresee this being an ongoing process as the application evolves and grows. Which got me to thinking, would it be practical to build on top of EF (or extend) such that I could store the additional fields that are different in a single column … The reason being most of our queries and UI logic treat different Service Requirements the same, we want a list of Service requirements whether they are type 1, 2, 3… just thought this might provide a clean way to achieve that. The alternative is a bit of pain, we have to union the different entities (tables) and project out a common view…. its just you loose a lot of the objected oriented cleanness once you project out. It would be nice to work with a base “Service Class” for the parts that are only concerned about displaying the base data, and then casting to the actual object when you needed to get the specifics out… and be able to keep everything in one table… I’ve looked at TPH but I don’t like those wide sparse tables and how they create queries is convoluted … I really wish EF would just create views for the descended types… it would make the generated equerries so much cleaner and easier to understand. Anyhow, I’m going off on a tangent, Thanks for the article and the responses to my questions.

  2. In your example I guess you forgot to use the attribute [JsonIgnore] on the property Serialized. Without this attribute you got an StackOverflowException.

  3. Make your Serialized property private (at least the setter). That will allow EF to map it while not exposing it as a part of your API.

    You’ll have to do a bit of expression magic in order to get the fluent interface working properly


    public class UserTicketListSettingsCollection: Collection
    {
    ....
    public static readonly Expression<Func> Serializer = (x) => x.Serialzed;
    }
    ....

    modelBuilder.ComplexType()
    .Property(UserTicketListSettingsCollection.Serializer)
    .HasColumnName("ListSettingsJson");

    but that will solve your leaky abstraction.

Leave a Reply

Your email address will not be published. Required fields are marked *