How to urql, Part 3: The Normalized Cache

August 11, 2020

In our first blog post we talked about exchanges and how by default we're using a document-based cache. This is the cache that comes out of the box with urql and solves a lot of common cases, but what if this cache isn't sufficient for you?

There is a more advanced cache in the urql ecosystem called Graphcache: a normalized cache. This cache brings possibilities such as:

  • Reducing network traffic: this cache will enable you to update entities from a mutation response or a subscription trigger.
  • Reducing memory usage: the cache "reuses" entities across queries, since all data is normalized.

Just like the document-based cache, the normalized cache can be added as an exchange to the urql Client. Every time the Client receives a response, it deeply traverses that response to flatten the data, as if it came from a RESTful API. Imagine we receive the following response:

{ "__typename": "Query", "todo": { "__typename": "Todo", "id": 1, "title": "implement graphcache", "author": { "__typename": "Author", "id": 1, "name": "urql-team" } } }

In the document-based cache we would store this result under the unique key of the operation and extract all typenames, so the cache knows which queries to invalidate when a mutation touches one of those types.
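
For contrast, here's a conceptual sketch of the document cache's bookkeeping. The shapes and keys are purely illustrative, not urql's actual internals:

// Purely illustrative; not urql's actual internal data structures.
const documentCache = {
  // operation key -> the full, untouched response
  results: {
    'query-1234': { todo: { id: 1, title: 'implement graphcache' /* ... */ } },
  },
  // typename -> operations to invalidate when a mutation returns that type
  typenames: {
    Todo: ['query-1234'],
    Author: ['query-1234'],
  },
};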

In a GraphQL result we will see objects and arrays with typenames. An object with a typename is what we'd call an "entity", as it has a concrete type in the API's GraphQL schema. We'd call fields on such a type that hold values "records", and fields that refer to one or more other entities "links".

In a normalized cache we will need to traverse this response and transform it. We'll use the __typename and id fields to generate unique keys for each entity we encounter. We'll see two types during this traversal, one being a record, which is a property of the current entity, and the other being a link that describes how this entity links to another entity.

In the above example we see a link to an author — our Todo has a relation to an entity called Author.

Now we can start listing records for that Todo. We see a __typename and an id field so we can make the unique key for this entity Todo:1. A first record would be Todo:1.title = 'implement graphcache'. While traversing we notice another set of records for the Author entity. We save these as well and define that our Todo links to Author:1.
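
Putting this together, the cache now roughly holds the following. This is a simplified sketch of the normalized data, glossing over Graphcache's actual internal structures:

{
  'Todo:1': {
    __typename: 'Todo',
    id: 1,
    title: 'implement graphcache', // record
    author: 'Author:1', // link to another entity
  },
  'Author:1': {
    __typename: 'Author',
    id: 1,
    name: 'urql-team', // record
  },
}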

Essentially we make a list of authors and a list of todos, and describe which of these relate to each other through links. This concept isn't new and can be found in Redux, for instance, where we'd have to do this manually. In GraphQL the query helps us structure this normalization.

You may wonder why we'd implement this complex logic when we have a key for each operation that we could use to uniquely store a response. That's a great question, so let's look at why normalizing is better not only for memory but also for network traffic.

With the document cache, when we receive a response to a mutation, we have to refetch all affected typenames. This causes all queries with those typenames to be invalidated and refetched, which can trigger a large number of network requests: up to one for every query that's currently on the page. On top of that, each response that could have shared entities is stored separately and takes up more memory than needed.

With a normalized cache we share entities because we can identify them by id and __typename. This allows us not only to reduce the network payload but also to automatically update an entity when a mutation responds with a payload like the following:

{
  __typename: 'Todo',
  id: 1,
  text: 'Convert to @urql/exchange-graphcache'
}

We can safely do a lookup inside our cache, find Todo:1, and update its text property to the new value instead of refetching all of those queries. Each entity is stored separately, along with how these entities link to each other. This allows us to treat responses as descriptions of how to update these entities and relations.
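
For this to work, the mutation has to select the fields the cache needs to identify the entity. As a sketch (updateTodo is a hypothetical field, not necessarily part of your API):

// Hypothetical mutation document. Selecting id (urql adds __typename
// to selection sets automatically) lets Graphcache find Todo:1 in the
// cache and update its text in place.
const UpdateTodoMutation = `
  mutation ($id: ID!, $text: String!) {
    updateTodo(id: $id, text: $text) {
      id
      text
    }
  }
`;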

This is made possible with GraphQL because we already have instructions on how to traverse the result. This is the GraphQL Document that we send to the API as a query in the first place. A normalized cache can use __typename together with this document to automatically build stores of different types.

Caching logic

Graphcache can assume a lot automatically, but just like a real database it needs some explanation and logic to work more effectively. We've centralized this configuration since we believe it should be reusable at the entity level.

Identifying entities

When using Graphcache we prioritize developer ergonomics. This is why you'll see a warning whenever the cache encounters a __typename but is missing an identifier for it.

Let's say our todo is a bit out of the ordinary and uses a cid field to identify the entity. Graphcache allows us to specify this behavior with the keys config:

import { cacheExchange } from '@urql/exchange-graphcache';

const cache = cacheExchange({
  keys: {
    // The key is the Todo __typename; the function returns its unique identifier
    Todo: (data) => data.cid,
  },
});

Now our cache knows that cid is the identifier for every Todo.

Some entities may not be uniquely identifiable, like an object that just contains geolocation coordinates. In that case, this config can also be used to tell the cache that a certain entity has no key, by returning () => null. This results in the keyless object being embedded into its parent.
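
As a sketch, assuming a hypothetical Coordinates type in our schema:

const cache = cacheExchange({
  keys: {
    // Coordinates can't be identified on their own, so they're
    // embedded into whichever parent entity contains them.
    Coordinates: () => null,
  },
});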

Resolving data

This pattern is comparable to resolvers on a GraphQL backend. We'll specify a function that can override or resolve a certain piece of data. We'll see two use cases for this:

  • Connecting an entity from a list
  • Converting a piece of data to another format

Let's start with converting a piece of data. Say our entity has a field called createdAt, which we need as a regular JS Date object, but at the moment the server returns it as a string:

const cache = cacheExchange({
  resolvers: {
    // Our __typename
    Todo: {
      // Our field
      createdAt: (parent) => new Date(parent.createdAt),
    },
  },
});

Now every time we query our Todo, the createdAt field will be converted from the server's string to a JS Date.

The other use case is connecting an item from a list. Imagine we have queried a list of our entity and we want to click one of the items to see its details. Our cache can't assume that a field called todo is a specific item from the queried todos, so we'll need to help it. We can do this very similarly to the above: we know that in a normalized cache we need a __typename and id to resolve our entity. When we query a specific item we know which entity we're asking for, and the id will most likely be part of the variables.

const cache = cacheExchange({
  resolvers: {
    // Our typename here is the root Query type
    Query: {
      // The field is one single todo
      todo: (parent, args) => ({ __typename: 'Todo', id: args.id }),
    },
  },
});

Now the item queried in the list will be used for our details view.
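
For instance, a details query along these lines would now be served from the entity the list already put in the cache (the exact query shape is an assumption about the app):

// Hypothetical details query; the resolver above maps Query.todo
// to the normalized Todo entity with the matching id.
const TodoQuery = `
  query ($id: ID!) {
    todo(id: $id) {
      id
      text
    }
  }
`;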

There's one caveat here: when the details view asks for a field that the list didn't fetch (for instance, the list only asks for the id and text, but in the details we also ask for the creator, ...), then we still have to do a network fetch. This means the cache won't show the data immediately, since all partial data is considered a cache miss, unless Graphcache is aware of the shape of your server-side schema (more about this later).

Updating data

The updates configuration allows you to define behavior that has to be executed when a subscription or mutation response comes in. Graphcache will try its best to update entities automatically, but when an entity isn't present in the cache yet (or has to be removed) it can't really assume how this should be done, so it needs our help. Let's consider a scenario where we add a todo to our list:

const cache = cacheExchange({
  updates: {
    // We tell Graphcache that this field lives on a Mutation; the same works for Subscription
    Mutation: {
      // The name of the mutation field
      addTodo: (result, args, cache) => {
        cache.updateQuery({ query: TodosQuery }, (data) => {
          return { ...data, todos: [...data.todos, result.addTodo] };
        });
      },
    },
  },
});

Now we've told Graphcache that when it sees a response to addTodo it has to append the result to the existing list of todos.
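
Removal works similarly. As a sketch, assuming a hypothetical removeTodo mutation, we can invalidate the deleted entity so that queries referencing it update:

const cache = cacheExchange({
  updates: {
    Mutation: {
      // removeTodo is a hypothetical field. cache.invalidate removes
      // the entity, so queries that referenced it update or refetch.
      removeTodo: (result, args, cache) => {
        cache.invalidate({ __typename: 'Todo', id: args.id });
      },
    },
  },
});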

Server-side schema

In the resolvers section we mentioned that partial data won't be shown unless Graphcache is aware of your server-side schema. Schema awareness is how we show Graphcache which of our fields are optional and which are mandatory: when we provide the schema option, you'll be able to return partial data for your entities. Not only that, but schema awareness also brings a set of developer warnings related to fragment matching. In short, the cache now knows what your data should look like.

Adding a schema can be done by passing introspection data to the schema option.
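
A minimal sketch, assuming the result of an introspection query against our API has been saved to a local schema.json (a hypothetical path):

import { cacheExchange } from '@urql/exchange-graphcache';
// Hypothetical path to the introspected schema data
import schema from './schema.json';

const cache = cacheExchange({ schema });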

Putting it into practice

Remember the verbose piece of code in the last post that was only used to update our list of todos when a subscription triggered? With Graphcache we can now fix this without defining custom logic in our React components.

You can follow along with this template.

If the API isn't working, manually visit https://k1ths.sse.codesandbox.io/ once to start it up.

Let's start off by adding our new cache to our dependencies.

npm i --save @urql/exchange-graphcache
## OR
yarn add @urql/exchange-graphcache

We're all set to add this to our client exchanges now. We go to our App.js, create the cache from the factory function exported by @urql/exchange-graphcache, and add it to our exchanges.

import { cacheExchange } from '@urql/exchange-graphcache';

const cache = cacheExchange();

const client = createClient({
  ...
  // Note that we removed the original cacheExchange.
  exchanges: [dedupExchange, cache, fetchExchange, subscriptions],
  ...
});

Now that we're using Graphcache, we can remove a lot of code from the Todos component, since the custom logic to track subscriptions is now redundant.

export const Todos = () => {
  const [todosResult] = useQuery({ query: TodosQuery });
  useSubscription({ query: TodoSubscription });

  if (todosResult.fetching) return <p>Loading...</p>;
  if (todosResult.error) return <p>Oh no... {todosResult.error.message}</p>;

  return (
    <ul>
      {todosResult.data.todos.map(({ id, text, complete, updatedBy }) => (
        <Todo
          key={id}
          text={text}
          id={id}
          complete={complete}
          disabled={todosResult.fetching}
          updatedBy={updatedBy}
        />
      ))}
    </ul>
  );
};

This is all we need to listen for updated entities and react to them.

Conclusion

In this blog post we've touched on what normalization means, how to identify entities, how to resolve entity data and links, and how to update the data.

There's more to Graphcache, up to and including full offline functionality. We'll tackle this in subsequent posts.

All of this is also documented in a dedicated chapter about this cache.
