Self-joined HABTM with Cache Updates

You have a self-joined HABTM (has_and_belongs_to_many) model. It’s probably called User too. All is well, until you realize that when relationships change, your site does not display those changes. The reason is that the related objects (users) are not being updated when a relationship is created or removed, or when one of the objects is updated or destroyed. As a result, the cache keys that Rails generates for the parent object still reference old data.

Here is how to fix it.

Self-Joined Users

Pretty simple really. There is a table in the database called user_relationships with two fields: adult_id and child_id. You can create it with this migration:
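The migration itself is not reproduced here, so the following is only a sketch of what it might look like, assuming Rails 5 or later (on older versions, drop the [5.0] suffix). The class name and the indexes are assumptions; only the table name and the two columns come from the description above.

```ruby
# Sketch only: the class name and the indexes are assumptions.
class CreateUserRelationships < ActiveRecord::Migration[5.0]
  def change
    # A HABTM join table has no primary key of its own.
    create_table :user_relationships, id: false do |t|
      # Use t.bigint instead if users.id is a bigint column.
      t.integer :adult_id, null: false
      t.integer :child_id, null: false
    end

    # Lookups run in both directions, so cover both columns.
    add_index :user_relationships, [:adult_id, :child_id], unique: true
    add_index :user_relationships, :child_id
  end
end
```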

Now every User can have .adults and .children. And those adults and children can have adults and children. And so on. And so on.
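The .adults and .children associations could be declared roughly as follows. This is a sketch under the assumption that the join table is the user_relationships table described above; the exact options in your app may differ.

```ruby
class User < ActiveRecord::Base # ApplicationRecord on Rails 5+
  # Users who list this user as one of their children.
  has_and_belongs_to_many :adults,
                          class_name: "User",
                          join_table: "user_relationships",
                          foreign_key: "child_id",
                          association_foreign_key: "adult_id"

  # Users this user lists as children.
  has_and_belongs_to_many :children,
                          class_name: "User",
                          join_table: "user_relationships",
                          foreign_key: "adult_id",
                          association_foreign_key: "child_id"
end
```

Here foreign_key names the join-table column that points at the current record, and association_foreign_key names the column that points at the record on the other side.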

User Views

You are using Rails fragment caching, aren’t you? If not, stop right now and read up on how to implement caching in your Rails app, because even if you don’t need it now, you will most likely need it later. Your users will love you for it!

Watch more: Gregg Pollack has an excellent video on Dalli & Cache Digests. He does a much better job explaining caching, cache keys, and fragment caching than I do.

Rails uses the concept of a cache_key for each object. A cache key is simply the pluralized model name, the record id, and the datetime of the updated_at attribute for the ActiveRecord object. Check it out:
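To make the format concrete without needing a Rails app, here is a plain-Ruby sketch that builds a key of the same shape. The helper name cache_key_for is hypothetical, and the timestamp precision is an assumption: it has varied between Rails versions, and recent versions can keep the timestamp in a separate cache_version rather than in the key itself. In a real app you would simply call user.cache_key.

```ruby
# Hypothetical helper that mimics the shape of a Rails cache key:
# "<plural model name>/<id>-<updated_at timestamp>".
def cache_key_for(collection_name, id, updated_at)
  # Microsecond precision is an assumption; Rails versions differ here.
  stamp = updated_at.utc.strftime("%Y%m%d%H%M%S%6N")
  "#{collection_name}/#{id}-#{stamp}"
end

key_before = cache_key_for("users", 1, Time.utc(2013, 6, 14, 18, 34, 56))
key_after  = cache_key_for("users", 1, Time.utc(2013, 6, 14, 18, 35, 10))

puts key_before # users/1-20130614183456000000
puts key_after  # users/1-20130614183510000000
```

The two keys differ only because updated_at differs, which is exactly what the rest of this post relies on.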

Cache keys come into play in views. When a view generates its HTML, the cache method asks your caching system (most likely memcached) whether the key exists. If it does, the fragment is pulled from the cache without the expense of generating the HTML. If the key doesn’t exist, the view renders the HTML and stores the result in the cache. Here is a very simple example.
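A view along these lines would do it. This is a sketch rather than the original template: the file path and the markup are assumptions, and it presumes the controller sets @user.

```erb
<%# app/views/users/show.html.erb (path is an assumption) %>
<% cache @user do %>
  <h1><%= @user.name %></h1>
  <ul>
    <% @user.children.each do |child| %>
      <li><%= child.name %></li>
    <% end %>
  </ul>
<% end %>
```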

When rendered, this will display the user’s name, and the names of all their children.

Changing Relationship Doesn’t Change the Rendered View

All is going well, until you start getting reports of problems. A user deletes one of her children, but when she views her show page (see above view), she still sees the deleted child’s name. The reason is that a change to a HABTM relationship only adds or removes a row in the user_relationships table. It doesn’t update the updated_at datetime of the parent record. Thus, the cache key for the parent record does not change, and the old (incorrect) information is read from the cache.
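The failure is easy to reproduce outside Rails. The sketch below is a plain-Ruby simulation, not Rails code: the SimUser struct, the Hash standing in for memcached, and the render helper are all made up for illustration.

```ruby
# Plain-Ruby simulation of the stale fragment. Nothing here is Rails API.
SimUser = Struct.new(:id, :name, :updated_at, :children) do
  def cache_key
    "users/#{id}-#{updated_at}"
  end
end

CACHE = {} # stands in for memcached

# Return the cached fragment if the key exists, otherwise render and store it.
def render(user)
  CACHE[user.cache_key] ||= "#{user.name} (#{user.children.map(&:name).join(', ')})"
end

kid = SimUser.new(2, "Kid", 100, [])
mom = SimUser.new(1, "Mom", 100, [kid])

first = render(mom)      # rendered and cached
mom.children.delete(kid) # relationship removed, updated_at untouched
stale = render(mom)      # same key, so the old fragment comes back

mom.updated_at = 101     # "touch" the parent
fresh = render(mom)      # new key, so the fragment is rebuilt

puts first # Mom (Kid)
puts stale # Mom (Kid)
puts fresh # Mom ()
```

The second render still shows the removed child because nothing changed the key. Only after updated_at moves does the view pick up the new state.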

How To Fix

I wasn’t able to find a definitive example of how to fix this completely and correctly for all cases. So here is my implementation.

What you need to do is “touch” all related objects in a relationship anytime that relationship changes. And those changes could be:

  • Parent (adult) object is updated.
  • Any child object is updated.
  • Relationships are created.
  • Relationships are deleted.

To fix the first two cases, we just need a couple of callbacks.
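A minimal sketch of those callbacks, assuming the adults and children associations already exist on User. The method name touch_relations is a placeholder, and before_destroy is used on the assumption that the join rows are already gone by the time after_destroy would run.

```ruby
class User < ActiveRecord::Base # ApplicationRecord on Rails 5+
  # ... adults / children HABTM associations go here ...

  after_save     :touch_relations
  # Assumption: run before the destroy so the join rows still exist.
  before_destroy :touch_relations

  private

  # Bump updated_at on every related user, one UPDATE per direction.
  # update_all skips callbacks, so nothing cascades any further.
  def touch_relations
    now = Time.current
    adults.update_all(updated_at: now)
    children.update_all(updated_at: now)
  end
end
```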

Astute coders will notice that we used update_all instead of touch here. While we could have used touch, it opens up the possibility of a circular update, i.e. an infinite loop. The touch method causes every touched object to also execute any touches that are defined on its own relationships.

In our particular case, we are not concerned about touching any additional relationships, because we are only updating the updated_at attribute to trigger a cache invalidation. We aren’t actually changing the related objects’ data, like a name or birthday.

Now, we need to fix the last two cases when relationships are created or deleted.
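One way this could look is sketched below. The association options are assumptions based on the user_relationships table described earlier, and the persisted? guards are an addition of this sketch: update_column raises on new or destroyed records.

```ruby
class User < ActiveRecord::Base # ApplicationRecord on Rails 5+
  has_and_belongs_to_many :adults,
                          class_name: "User",
                          join_table: "user_relationships",
                          foreign_key: "child_id",
                          association_foreign_key: "adult_id",
                          after_add: :touch_updated_at,
                          after_remove: :touch_updated_at

  has_and_belongs_to_many :children,
                          class_name: "User",
                          join_table: "user_relationships",
                          foreign_key: "adult_id",
                          association_foreign_key: "child_id",
                          after_add: :touch_updated_at,
                          after_remove: :touch_updated_at

  private

  # Rails passes in the user that was just added or removed.
  # Both sides of the relationship get a fresh updated_at.
  def touch_updated_at(other_user)
    now = Time.current
    update_column(:updated_at, now) if persisted?
    other_user.update_column(:updated_at, now) if other_user.persisted?
  end
end
```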

Note that we added after_add: :touch_updated_at, after_remove: :touch_updated_at to the HABTM definition. These are callbacks that are triggered when a relationship is added or removed. They both call the touch_updated_at method which, like the callbacks above, updates the updated_at attribute. But in this case, the HABTM passes in a reference to the user object that was added or removed.

Here we use the update_column method because we do not want to trigger any callbacks, validations, or cascading touches.

Fixed!

Now, any time a user adds a child, deletes a child, or any data on a related child is changed, the updated_at attribute is set to the current datetime. This changes the calculated cache key for the user objects, and voilà, the view regenerates the HTML and users see the modifications.
