Liquid Memory Leaks in Shopify

Michael on Sep 21

Shopify's liquid engine is pretty awesome. You get a functional templating language, without having to worry about performance or security. Unless you get this:

Liquid error: Memory limits exceeded

This was a first for us, and led to discovering some ways to better optimise liquid within the constraints of its limited logic.

The problem

While liquid looks and behaves like other templating tools found in web-focussed languages like Ruby and PHP, it is a little simplistic when it comes to what you can do with different data structures. You can loop through items in an array, and can perform some searches within arrays, but you can't perform any kind of set theory based operations (like returning the intersection or union of two arrays). This might leave you writing lots of nested loops iterating over arrays to query their contents, which can lead to the above error.

For example, if you want to see if a product has a certain tag, you might be tempted to write something like

{% assign tag_found = false %}
{% assign query_string = "the tag we're looking for" %}
{% assign product_tags = product.tags | split: ', ' %}
{% for tag in product_tags %}
  {% if tag == query_string %}
    {% assign tag_found = true %}
    {% break %}
  {% endif %}
{% endfor %}

This leaves us with a tag_found boolean that is true if the product's tags include query_string. It is a bit verbose if you are used to something like array.include?(tag) in Ruby, or array.indexOf(tag) > -1 in JavaScript, but it works.

How does this lead us to the Memory limits exceeded error? When the liquid engine receives a request, it needs to build the response by processing all the liquid that is needed for that page, from the top-level theme.liquid right down to each snippet used in the given template. To do this, it will need to load into memory every object accessed by the liquid, leading to a potential situation where multiple loops in various templates and snippets could exceed the total memory allocated by Shopify to a single request.

Let's say the above example lives in a product_thumbnail.liquid snippet, that is used in the collection.liquid page, and this store uses a lot of tags (each product has 1000 tags).

{% for product in collection.products %} <!-- 50 products -->
  {% include 'product_thumbnail' %} <!-- 1000 tags on each product -->
{% endfor %}

In the worst case scenario, this would lead to 50,000 iterations, and could then push the rendering memory load past the thresholds Shopify have set.

When looking at how these for loops will perform, we have to assume the worst-case scenario, given that we often can't control the actual product data that will exist on the store. In the above example, the worst case is that for each product, the tag being searched for isn't found, leading to a full 1000 iterations for each product.

The solution

To resolve the memory error, we need to try and reduce the number of unnecessary loop iterations, and avoid using for loops when there isn't a good reason to.

There are various ways to achieve this, and the right one depends on what you're looping through.

An array of strings

If we have an array of strings (e.g. the tags property), we can use the contains operator.

<!-- product_thumbnail.liquid -->
{% assign tag_found = false %}
{% assign query_string = "the tag we're looking for" %}
{% if product.tags contains query_string %}
  {% assign tag_found = true %}
{% endif %}

An array of collections/products/articles

If we have an array of objects, we can sometimes select the relevant object using its handle, with the array[handle] syntax.

Instead of a for loop:

{% assign found_collection = nil %}
{% assign query_handle = "Collection Title" | handleize %}
{% for collection in collections %}
  {% if collection.handle == query_handle %}
    {% assign found_collection = collection %}
    {% break %}
  {% endif %}
{% endfor %}

We could directly retrieve the collection:

{% assign found_collection = nil %}
{% assign query_handle = "Collection Title" | handleize %}
{% assign found_collection = collections[query_handle] %}

Something more complex

While the above are simple examples to showcase how to improve your code, there are still some scenarios where these approaches won't work. For example, if a merchant needs to store more structured data on a product, they may want to do so using tags with a key:value structure, which makes it easy to identify what the tags represent.

In this scenario, a store has some products that have a country of origin, which needs to be shown as a flag on the product in the collection and product templates. Any product with a country of origin has a tag country:COUNTRY_CODE, where COUNTRY_CODE would be something like US or FR.

Our initial version of the liquid might look like this:

<!-- product_thumbnail.liquid -->
{% assign country_code = false %}
{% assign product_tags = product.tags | split: ', ' %}
{% for tag in product_tags %}
  {% assign split_tag = tag | split: ':' %}
  {% if split_tag[0] == 'country' %}
    {% assign country_code = split_tag[1] %}
    {% break %}
  {% endif %}
{% endfor %}

This would loop through all tags for a product, and assign COUNTRY_CODE to the liquid variable country_code, allowing us to then use that later. If planning for the worst case scenario though, there may be no products on the page that have the country:COUNTRY_CODE tag, and so the loop would have to iterate through every tag on each product. Ideally, we'd only enter the for loop if we knew the product had a country:COUNTRY_CODE tag, and just skip the loop entirely if it didn't (reducing the memory load to just those loops that were necessary).
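As a rough sketch of how country_code might be used once it has been assigned, the snippet below renders a flag image. It assumes the theme ships assets named flag-us.png, flag-fr.png and so on; that naming scheme is hypothetical and not taken from the store described here.

```liquid
{% comment %}
  country_code starts out as false, so this block is skipped
  for products that have no country:COUNTRY_CODE tag.
  The flag-xx.png asset naming is an assumption for illustration.
{% endcomment %}
{% if country_code %}
  {% assign flag_file = country_code | downcase | prepend: 'flag-' | append: '.png' %}
  <img src="{{ flag_file | asset_url }}" alt="{{ country_code }}">
{% endif %}
```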

Unfortunately, contains doesn't help here, as it matches against the full value of each string in the array; {% if product.tags contains 'country:' %} would return false. It might seem like we're stuck, but we can leverage the fact that product.tags (and most other array-like properties in Shopify) behaves like a comma-separated string when passed through string filters, which is why we can turn it into an array of strings ourselves using the split filter.
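To make the limitation concrete, here is a small sketch using a literal array in place of product.tags. The tag values are made up for illustration.

```liquid
{% comment %} Stand-in for a product's tags; the values are invented. {% endcomment %}
{% assign tags = "sale, country:FR, new" | split: ", " %}

{% comment %} A whole element matches. {% endcomment %}
{% if tags contains "country:FR" %}whole tag: found{% endif %}

{% comment %} A prefix of an element does not match, so the else branch renders. {% endcomment %}
{% if tags contains "country:" %}prefix only: found{% else %}prefix only: not found{% endif %}
```

On an array, contains only succeeds when an entire element equals the value being searched for, which is why a key:value tag cannot be detected by its key alone.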

As a reminder:

The split filter takes on a substring as a parameter. The substring is used as a delimiter to divide a string into an array.

One additional behaviour of the split filter is that if the substring isn't found in the string, it returns an array with a single element, the original string. Asking for the element at index 1 therefore gives nil.

This means we can use split to accurately identify if a string does not hold a substring, or to put it another way, if an array does not have a particular element.
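A minimal sketch of that behaviour, using a literal string as a stand-in for a product's tags (the values and the colour: key are invented for illustration):

```liquid
{% assign tag_string = "sale, country:FR, new" %}

{% assign hit = tag_string | split: "country:" %}
{% assign miss = tag_string | split: "colour:" %}

{% comment %} Delimiter present: the string is cut in two, so index 1 holds "FR, new". {% endcomment %}
{% unless hit[1] == nil %}Found a country: tag{% endunless %}

{% comment %} Delimiter absent: nothing was cut, so index 1 is nil. {% endcomment %}
{% if miss[1] == nil %}No colour: tag here{% endif %}
```

Both messages render, since only the first split actually finds its delimiter.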

In context of our original loop, this looks like:

<!-- product_thumbnail.liquid -->
{% assign country_code = false %}
{% assign initial_query_string = 'country:' %}
{% assign initial_check = product.tags | append: ',' | split: initial_query_string %}

{% unless initial_check[1] == nil %}
  {% assign product_tags = product.tags | split: ', ' %}
  {% for tag in product_tags %}
    {% assign split_tag = tag | split: ':' %}
    {% if split_tag[0] == 'country' %}
      {% assign country_code = split_tag[1] %}
      {% break %}
    {% endif %}
  {% endfor %}
{% endunless %}

We know that if initial_check[1] is nil, product.tags does not include initial_query_string, so we can skip everything inside the unless block (the tag split and the for loop) entirely, reducing the risk of exceeding the memory limits.

Why do we still need the for loop after the initial_check?

While this allows us to determine if the product's tags include what we're looking for, we can't know what the value (i.e. COUNTRY_CODE) of the key:value pair is without looping through the tags, and doing a split on each tag.

Why append: ',' before the split filter?

This protects against a false negative in the case where initial_query_string is at the very end of the string being searched.

For example, if the string is test-1, test-2, test-3, splitting on test-3 will return ["test-1, test-2, "], because split discards the empty string that would otherwise follow the match. The element at index 1 is then nil, which would incorrectly suggest that test-3 was not found in the original string. By appending a final , before calling split, the result of the split will instead be ["test-1, test-2, ", ","], which ensures that the for loop still executes.
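The same comparison can be sketched in Liquid, with a literal string standing in for product.tags:

```liquid
{% assign tag_string = "test-1, test-2, test-3" %}

{% assign without_comma = tag_string | split: "test-3" %}
{% assign with_comma = tag_string | append: "," | split: "test-3" %}

{% comment %} Nothing follows the match, so index 1 is nil: a false negative. {% endcomment %}
{% if without_comma[1] == nil %}without append: looks like test-3 is missing{% endif %}

{% comment %} The appended comma follows the match, so index 1 is ",". {% endcomment %}
{% unless with_comma[1] == nil %}with append: test-3 detected{% endunless %}
```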

What about metafields?

If you've used metafields before, you may be thinking that this would be a much easier solution than looping through an object's tags - if you're looking for a stored value, you can directly access it without needing to loop through an array. Metafields could be a great alternative solution to the above, but the choice of whether to use them can often be driven by both technical and non-technical requirements.
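For comparison, a metafield lookup might look like the sketch below. The custom namespace and country_code key are hypothetical, and depending on how the metafield was defined, the value may be readable directly without the trailing .value.

```liquid
{% comment %}
  Hypothetical metafield: namespace "custom", key "country_code".
  Neither name comes from the store described here; substitute your own.
{% endcomment %}
{% assign country_code = product.metafields.custom.country_code.value %}

{% if country_code != blank %}
  <span class="product-flag">{{ country_code }}</span>
{% endif %}
```

No loop is involved, so the cost of the lookup does not grow with the number of tags on the product.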

It's relatively easy for a merchant to bulk modify an object's tags from the store admin, and when uploading products via CSV, tags can easily be included in the upload, making them great for cases where you want to apply the same value to multiple objects. By comparison, metafields are really flexible for cases where each object requires a different value, but they're a bit harder to bulk modify, and they can't be included in the main product CSV upload. Metafields are also not easily accessible via code in some situations; for example, they are currently unavailable via the Storefront GraphQL implementation.


Summary

It can be easy to skip worrying about the performance of your liquid code, and in most situations, it isn't a concern for theme development. However, while the use case of 1000 tags per product may be a little extreme, the build-up of multiple smaller inefficient loops can lead to unexpected memory limit errors, even though each individual loop might be within bounds. Using simple tricks like splitting the string and checking the result before converting to an array can help reduce the memory requirements of your liquid.