Render arrays and #cache in Drupal 7

Drupal 7 includes many new features, among them render arrays. If you aren't familiar with render arrays, the drupal.org documentation does a great job of explaining what they are, why they were created, and how to use them.

One feature of render arrays that caught my attention when they were first presented is the #cache property. It lets you cache individual elements of a render array using the Drupal cache system. If, for example, building an element requires some particularly expensive code, Drupal can load the cached version of that element on subsequent page loads instead of executing all the code required to render it.

An Example

So what does this look like? Here is an example from render_example.module in the Examples module, stripped down for clarity:

function render_example_arrays() {
  $interval = 60;
 
  $page_array = array(
    t('cache demonstration') => array(
      '#markup' => t('The current time was %time when this was cached. Updated every %interval seconds', array('%time' => date('r'), '%interval' => $interval)),
      '#cache' => array(
        'keys' => array('render_example', 'cache', 'demonstration'),
        'bin' => 'cache',
        'expire' => time() + $interval,
        'granularity' => DRUPAL_CACHE_PER_PAGE | DRUPAL_CACHE_PER_ROLE,
      ),
    ),
  );
 
  return $page_array;
}

Note that we are returning the array here unrendered. This function is a menu callback in the Render Example module, so we can expect Drupal to run the array through drupal_render() after it has given other modules and themes a chance to alter it. In other situations you may need to invoke drupal_render() yourself, in which case the last line would instead appear as:

  return drupal_render($page_array);

If we were to do that in our menu callback function it would defeat the purpose of using a render array, as there would then be no possibility of altering it.
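To make the point about alterability concrete, here is a minimal sketch of another module changing the element before it is rendered. The module name mymodule is hypothetical. The path $page['content']['system_main'] assumes the callback's output ends up in the main content block of the 'content' region, and the 'cache demonstration' key assumes an untranslated site, since the example builds that key with t(). Either assumption may not hold on your site.

```php
<?php
/**
 * Implements hook_page_alter().
 *
 * Hypothetical module "mymodule": opts the demonstration element out of
 * render caching. Assumes the menu callback output is found under
 * $page['content']['system_main'].
 */
function mymodule_page_alter(&$page) {
  if (isset($page['content']['system_main']['cache demonstration'])) {
    // Without #cache, drupal_render() builds the element on every request.
    unset($page['content']['system_main']['cache demonstration']['#cache']);
  }
}
```

Had the callback returned the output of drupal_render() instead, this hook would receive finished markup with no #cache settings left to change.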

This render example is pretty simple. The #markup property is the content of the array that we would like to cache. The #cache property tells Drupal how we want to cache it.

Keys is an array of strings that together should uniquely identify this content. They are concatenated, separated by colons, and used as the start of the cache ID; the parts derived from the granularity setting are appended after them.
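As a rough illustration of the concatenation, the following plain-PHP sketch joins the keys from the example. The function name example_cid_from_keys() is made up for this post and is not part of Drupal, where the cache ID is built internally during rendering.

```php
<?php
// Hypothetical helper, not part of Drupal: joins cache keys with colons.
function example_cid_from_keys(array $keys) {
  return implode(':', $keys);
}

echo example_cid_from_keys(array('render_example', 'cache', 'demonstration')) . "\n";
// Prints: render_example:cache:demonstration
```

Because the keys become part of the stored ID, two elements that share the same keys and the same bin would overwrite each other's cached output, which is why prefixing the keys with your module name is a sensible habit.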

Bin is the cache table where we want to store our content. Here it is using the generic cache table, but you could also use your own custom cache table. Adding a custom cache table is easy. In your module install file (mymodule.install for this example) add:

/**
 * Implements hook_schema().
 */
function mymodule_schema() {
  $schema['cache_mymodule'] = drupal_get_schema_unprocessed('system', 'cache');
  $schema['cache_mymodule']['description'] = 'Cache table for my module to store rendered content.';
 
  return $schema;
}

Update 'mymodule' to reflect your module name where appropriate. With this example you could then set the bin property to 'cache_mymodule'.
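One related step is worth considering. As far as I know, Drupal 7 only clears the cache tables it has been told about, so a custom bin is usually registered through hook_flush_caches(). A minimal sketch, again assuming a module named mymodule, placed in mymodule.module:

```php
<?php
/**
 * Implements hook_flush_caches().
 *
 * Registers the custom bin so that it is emptied when all caches are
 * flushed and so that cron can remove its expired entries.
 */
function mymodule_flush_caches() {
  return array('cache_mymodule');
}
```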

Expire is a timestamp that tells Drupal when the cached data is no longer valid. Note that an expired cache entry is only cleared when cron runs. There is no check in drupal_render() itself to determine whether the cache is stale. A bug has been filed to address this: http://drupal.org/node/1354718.
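If stale output matters before the next cron run, one possible workaround is to compare the entry's expiry with the current time yourself. The sketch below is plain PHP with a hypothetical helper name, example_cache_entry_is_stale(). It assumes the entry is an object with an expire property, like the objects cache_get() returns, and that 0 and -1 are the values of CACHE_PERMANENT and CACHE_TEMPORARY.

```php
<?php
// Drupal 7 defines these constants itself; they are declared here only so
// the sketch runs outside Drupal.
if (!defined('CACHE_PERMANENT')) {
  define('CACHE_PERMANENT', 0);
}
if (!defined('CACHE_TEMPORARY')) {
  define('CACHE_TEMPORARY', -1);
}

/**
 * Hypothetical helper: returns TRUE if a cache entry's timestamp has passed.
 */
function example_cache_entry_is_stale($entry, $now) {
  // Permanent and temporary entries carry no timestamp to compare against.
  if ($entry->expire == CACHE_PERMANENT || $entry->expire == CACHE_TEMPORARY) {
    return FALSE;
  }
  return $entry->expire < $now;
}

$entry = (object) array('expire' => 1000);
var_dump(example_cache_entry_is_stale($entry, 1060));
// Prints: bool(true)
```

Inside Drupal you might pass in the result of cache_get() together with REQUEST_TIME and, when the helper returns TRUE, remove the entry with cache_clear_all() before building the render array. Treat that as an untested suggestion rather than a recommended fix.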

Granularity indicates whether and in what situations different versions of this content should be cached. The content may be different depending on the role of the user viewing it, the page it appears on, etc. For example, we wouldn't want anonymous users seeing content that has been cached for an administrator. This allows us to cache separate versions per role to avoid that situation.
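The granularity value is a bitmask, which is why the flags are combined with the | operator. The following plain-PHP sketch shows roughly how such a bitmask can be turned into extra cache ID parts. It is a simplified model, not Drupal's code: example_cid_parts() and its parameters are hypothetical, the URL is a placeholder, the constants are declared only so the sketch runs standalone, and the 'r.' and 'u.' prefixes follow my reading of drupal_render_cid_parts().

```php
<?php
// Declared only for standalone use; Drupal 7 provides these constants.
if (!defined('DRUPAL_CACHE_PER_ROLE')) {
  define('DRUPAL_CACHE_PER_ROLE', 0x0001);
  define('DRUPAL_CACHE_PER_USER', 0x0002);
  define('DRUPAL_CACHE_PER_PAGE', 0x0004);
}

/**
 * Hypothetical helper: maps a granularity bitmask to cache ID parts.
 */
function example_cid_parts($granularity, array $role_ids, $uid, $url) {
  $parts = array();
  // Per-role and per-user are treated as mutually exclusive; role wins.
  if ($granularity & DRUPAL_CACHE_PER_ROLE) {
    $parts[] = 'r.' . implode(',', $role_ids);
  }
  elseif ($granularity & DRUPAL_CACHE_PER_USER) {
    $parts[] = 'u.' . $uid;
  }
  if ($granularity & DRUPAL_CACHE_PER_PAGE) {
    $parts[] = $url;
  }
  return $parts;
}

$parts = example_cid_parts(DRUPAL_CACHE_PER_PAGE | DRUPAL_CACHE_PER_ROLE, array(2, 3), 7, 'http://example.com/some/page');
echo implode(':', $parts) . "\n";
// Prints: r.2,3:http://example.com/some/page
```

Each distinct combination of parts produces a separate cache entry, so finer granularity means more entries and fewer cache hits.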

A Problem

So that seems easy. We were able to add caching to our render array with just a few lines. But there is a problem. Have you spotted it? Maybe it is easier to see if we rewrite our example.

function render_example_arrays() {
  $interval = 60;
 
  $page_array = array(
    t('cache demonstration') => array(
      '#markup' => expensive_stuff($interval),
      '#cache' => array(
        'keys' => array('render_example', 'cache', 'demonstration'),
        'bin' => 'cache',
        'expire' => time() + $interval,
        'granularity' => DRUPAL_CACHE_PER_PAGE | DRUPAL_CACHE_PER_ROLE,
      ),
    ),
  );
 
  return $page_array;
}
 
function expensive_stuff($interval) {
  $time = t('The current time was %time when this was cached. Updated every %interval seconds', array('%time' => date('r'), '%interval' => $interval));
 
  drupal_set_message($time);
 
  return $time;
}

We moved our content into a separate function and added a drupal_set_message() call. Why? Reload the example a few times and it should become apparent. The render array is caching our content: the time displayed by the rendered element stays the same as on the first load, while the message shows the current time on every page load. But wait, why does a message appear at all on subsequent page loads? The point of caching the output of expensive_stuff() is to avoid executing it on every page load. If it still runs every time, the cache is not doing anything useful; we end up with stale data and still pay the performance price of running expensive_stuff() on every page load. Clearly a little more work is required to get something useful out of the #cache property.

This behavior should be obvious in retrospect. Render arrays are just normal arrays, formatted in a particular way and passed to drupal_render(). PHP calls expensive_stuff() while the array is being built, and drupal_render() has no magical ability to reach back in time and prevent that call when it finds a cached version. So how do we make this work?
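The same effect can be reproduced without Drupal. In this plain-PHP sketch, with hypothetical function names, the array is built but never rendered, yet the expensive function has already run:

```php
<?php
// Hypothetical counter: optionally increments, always returns the total.
function example_call_count($increment = FALSE) {
  static $count = 0;
  if ($increment) {
    $count++;
  }
  return $count;
}

// Stand-in for an expensive function; records each time it runs.
function example_expensive() {
  example_call_count(TRUE);
  return 'expensive result';
}

// PHP evaluates the function call while the array is being built, long
// before anything like drupal_render() could consult a cache.
$element = array(
  '#markup' => example_expensive(),
  '#cache' => array('keys' => array('example')),
);

echo example_call_count() . "\n";
// Prints: 1
```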

A Solution

The solution is the #pre_render property. #pre_render lets us pass in an array of function names that will be called just before the element is rendered. The key here is that drupal_render() calls these functions after it has checked whether a cached version of the element exists. If it finds a cached version, those functions are never called. Here is our example reworked to use #pre_render, so that #cache is doing something useful:

function render_example_arrays() {
  $interval = 60;
 
  $page_array = array(
    t('cache demonstration') => array(
      '#markup' => '',
      '#pre_render' => array('render_example_cache_pre_render'),
      '#cache' => array(
        'keys' => array('render_example', 'cache', 'demonstration'),
        'bin' => 'cache',
        'expire' => time() + $interval,
        'granularity' => DRUPAL_CACHE_PER_PAGE | DRUPAL_CACHE_PER_ROLE,
      ),
    ),
  );
 
  return $page_array;
}
 
function render_example_cache_pre_render($element) {
  $element['#markup'] = expensive_stuff();
 
  // The following line is due to the bug described in
  // http://drupal.org/node/914792. A #markup element's #pre_render must set
  // #children because it replaces the default #markup pre_render, which is
  // drupal_pre_render_markup().
  $element['#children'] = $element['#markup'];
  return $element;
}
 
function expensive_stuff() {
  $interval = 60;
  $time = t('The current time was %time when this was cached. Updated every %interval seconds', array('%time' => date('r'), '%interval' => $interval));
 
  drupal_set_message($time);
 
  return $time;
}

Note that the #markup property is now an empty string in the render array. Technically it does not need to be there at all. Instead, the #pre_render function sets the actual #markup content by calling our expensive function. Loading this new example multiple times should confirm that expensive_stuff() is executed only on the initial page load, and then again only after the cache has expired.
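To see why the ordering matters, here is a deliberately simplified plain-PHP model of the sequence described above: look in the cache first, and run the #pre_render callbacks only on a miss. It is not Drupal's implementation. The names example_render(), example_pre_render() and example_build_count() and the in-memory $cache array are all hypothetical, and expiry, granularity and child elements are ignored.

```php
<?php
// Hypothetical model of "check the cache, then pre-render on a miss".
function example_render(array $element, array &$cache) {
  $cid = implode(':', $element['#cache']['keys']);
  // Cache hit: return early, so no #pre_render callback runs.
  if (isset($cache[$cid])) {
    return $cache[$cid];
  }
  // Cache miss: let each callback build or modify the element.
  foreach ($element['#pre_render'] as $callback) {
    $element = call_user_func($callback, $element);
  }
  $cache[$cid] = $element['#markup'];
  return $cache[$cid];
}

// Counts how many times the expensive build step has run.
function example_build_count($increment = FALSE) {
  static $count = 0;
  if ($increment) {
    $count++;
  }
  return $count;
}

// Stand-in for the expensive work done in a #pre_render callback.
function example_pre_render(array $element) {
  $element['#markup'] = 'built ' . example_build_count(TRUE) . ' time(s)';
  return $element;
}

$cache = array();
$element = array(
  '#pre_render' => array('example_pre_render'),
  '#cache' => array('keys' => array('render_example', 'cache', 'demonstration')),
);
echo example_render($element, $cache) . "\n";
echo example_render($element, $cache) . "\n";
// Both lines print: built 1 time(s)
```

The second call never reaches example_pre_render(), which mirrors what the drupal_set_message() call in the real example lets you observe.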

That's all there is to it. I have created and submitted a patch for the Examples module, so that the example in render_example.module is more thorough in its treatment of #cache usage. Please let me know if you have any questions, comments, or corrections.