
🔀 Cache Levels: L1 and L2

⚡ TL;DR (quick version)
To ease cold starts and/or help with horizontal scalability (multiple nodes, each with its own local memory cache), it's possible to set up a 2nd level, known as L2. At setup time, simply pass any implementation of IDistributedCache and a serializer: the existing code does not need to change, it all just works.
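For example, with the DI approach this is all it takes (here with Redis as the L2 and the Newtonsoft Json.NET serializer, but any IDistributedCache implementation and any IFusionCacheSerializer will do):

```csharp
// register FusionCache with an L2: the Redis implementation of
// IDistributedCache + a JSON serializer (one possible combination)
services.AddFusionCache()
    .WithSerializer(
        new FusionCacheNewtonsoftJsonSerializer()
    )
    .WithDistributedCache(
        new RedisCache(new RedisCacheOptions { Configuration = "CONNECTION STRING" })
    );
```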

When our apps restart and we are using only L1, the 1st level (memory), the cache will need to be repopulated from scratch, since the cached values are stored only in the memory space of the apps themselves.

This problem is known as cold start and it can generate a lot of requests to our database.

Cold Start

When our services need to handle more and more requests we can scale vertically, meaning we can make our servers bigger. This approach though can only go so far, and after a while what we need is to scale horizontally, meaning we'll add more nodes to split the traffic among them.

Cold Start

But, when scaling horizontally and using only the memory level (L1), each node's memory cache needs to be populated independently, by requesting the same data from the database, again generating more requests to the database.

As we can see, both of these issues generate more database pressure: this is something we need to handle accordingly.

Luckily, FusionCache can help us.

🔀 Second Level

FusionCache allows us to have 2 caching levels, transparently handled for us:

  • 1️⃣ L1 (memory): a memory cache, used for very fast access to data in memory, with high data locality. You can give FusionCache any implementation of IMemoryCache or let FusionCache create one for you
  • 2️⃣ L2 (distributed): an optional distributed cache, serving the purpose of easing a cold start or sharing data with other nodes

Everything required to have the 2 levels communicate between them is handled transparently for us.

Since L2 is distributed, and we know from the fallacies of distributed computing that things can go bad, all the issues that may happen there can be automatically handled by FusionCache so they don't impact the overall application, all while (optionally) tracking every detail for further investigation (via Logging and OpenTelemetry).

Any implementation of the standard IDistributedCache interface will work (see below for a list of the available ones), so we can pick Redis, Memcached or any technology we like.

Because a distributed cache talks in binary data (meaning byte[]) we also need to specify a serializer: since .NET does not have a generic interface representing a binary serializer, FusionCache defines one, named IFusionCacheSerializer. We simply provide an implementation of it, either by picking one of the existing ones, which natively support formats like Json, Protobuf, MessagePack and more (see below), or by creating our own.

In the end this boils down to 2 possible ways:

  • L1: FusionCache will act as a normal memory cache
  • L1+L2: if we also setup an L2, FusionCache will automatically coordinate the 2 levels, while gracefully handling all edge cases to get a smooth experience

Of course in both cases you will also have at your disposal the added ability to enable extra features, like fail-safe, advanced timeouts and so on.

Finally we can even execute the distributed operations in the background, to make things even faster: we can read more on the related docs page.
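A minimal sketch of enabling that by default, assuming the AllowBackgroundDistributedCacheOperations entry option covered on that page:

```csharp
// a sketch: make distributed (L2) operations non-blocking by default,
// so calls can return as soon as L1 has been updated
var cache = new FusionCache(new FusionCacheOptions
{
    DefaultEntryOptions = new FusionCacheEntryOptions
    {
        AllowBackgroundDistributedCacheOperations = true
    }
});
```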

⚙️ Level-specific options

If needed, we can use different options for each level (L1 & L2), like:

  • MemoryCacheDuration
  • DistributedCacheDuration

For example, by specifying a lower MemoryCacheDuration we can refresh a value in L1 from L2 more frequently (although, if we use a backplane, that would not be necessary).
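A sketch of how that may look on a per-entry basis, using the option names listed above (and assuming a FusionCache instance named cache, already set up with both levels):

```csharp
// a sketch: shorter duration in L1, longer duration in L2
var options = new FusionCacheEntryOptions
{
    // L1: expires sooner, so it gets refreshed from L2 more frequently
    MemoryCacheDuration = TimeSpan.FromMinutes(1),
    // L2: shared between nodes, kept for longer
    DistributedCacheDuration = TimeSpan.FromMinutes(10)
};

cache.Set("foo", 123, options);
```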

Another example is when we want to skip read/write operations on each level via:

  • SkipMemoryCacheWrite
  • SkipMemoryCacheRead
  • SkipDistributedCacheWrite
  • SkipDistributedCacheRead

In general, if there's a WhateverOption and it makes sense to be able to specify a different value for each level, we can expect to find a MemoryCacheWhateverOption and a DistributedCacheWhateverOption and that, if a level-specific value is not provided, the normal one will be used.
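As an example, a sketch of skipping L1 for a single read, using the skip options listed above (again assuming a FusionCache instance named cache, with an L2 configured):

```csharp
// a sketch: bypass L1 entirely for this call, going straight to L2
var options = new FusionCacheEntryOptions
{
    SkipMemoryCacheRead = true,
    SkipMemoryCacheWrite = true
};

var value = cache.GetOrDefault<int>("foo", options: options);
```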

📢 Backplane (more)

When using multiple nodes for horizontal scalability we can use an L2 as a shared cache for all the nodes to use.

But each L1 cache in each node may become out of sync with the other nodes after a change on a specific node: to solve this, it is suggested to also use a backplane.

All the existing code will remain the same: it's just a 1-line change at setup time.

Read here for more.

🧬 Diagrams (more)

Good, good, so FusionCache takes care of coordinating everything between L1, L2 and maybe the backplane, if enabled.

But... it can still be complex to grasp all of that at once, right? Wouldn't it be nice to be able to visualize what we just said? Yes, it would, so: diagrams!

FusionCache flow diagrams

Read here for more.

✉️ Wire Format Envelope

Something that may surprise at first is that what ends up in L2 is not only the values we want to cache: there's more.

This is because, to allow FusionCache to do all the things that it does, some extra bits of information are needed, like the timestamp at which an entry has been created, or the entry's tags to support Tagging, and more.

When we think about it, we need to put that data somewhere, so it makes total sense, right?

So how is it stored? Well, the value + the metadata are put into a structure we can call the envelope, which really is the cache entry itself.

And this, in turn, means that if we do this:

cache.Set("foo", 123, tags: ["tag-1", "tag-2"])

and we are using, for example, a JSON serializer and Redis as L2, what ends up inside our beloved Redis instance for the cache key "foo" will NOT be this:

123

but more something like this:

{
  "value": 123,
  "timestamp": 123456789,
  "tags": ["tag-1", "tag-2"],
  // MORE METADATA HERE...
}

As said, this all makes sense when we think about it, but it may surprise people looking into their Redis instance for the first time and finding "something more".

🗃 Wire Format Versioning

When working with the memory cache, everything is easier: at every run of our apps or services everything starts clean, from scratch, so even if there's a change in the structure of the cache entries used by FusionCache there's no problem.

The distributed cache, instead, is a different beast: when saving a cache entry in there, that data is shared between different instances of the same application, between different applications altogether, and maybe even with applications using a different version of FusionCache.

As seen above, what gets saved into our L2 is not just the value itself, but a structure which contains the value + some metadata: so what happens when the structure of the cache entries (the envelope) needs to change, to let FusionCache evolve?

See, the problem is that a distributed cache is kind of like a database in this regard: we can save data there, stop the app, change something and restart it, and now the app will try to deserialize an old version of our data structures into a new one, and this can create problems.

So, how does FusionCache manage this?

Easy: by using an additional cache key modifier for the distributed cache, so that if and when the version of the cache entry structure changes, there will be no issues serializing or deserializing different versions of the data.

In practice this means that when doing something like this:

cache.Set("foo", 123, tags: ["tag-1", "tag-2"])

the actual cache key that will be used inside of our Redis instance will NOT be just:

foo

but in reality something like:

v2:foo

This has been planned from the beginning, and it is how changes in the wire format (envelope) used in the distributed cache are managed between updates: it has been designed specifically so that FusionCache can be updated safely and transparently, without interruptions or problems, even when used by multiple app instances with different versions of FusionCache.

So what happens when there are 2 versions of FusionCache running on the same distributed cache instance, for example when two different apps share the same distributed cache and one is updated and the other is not?

Since the old version will write to the distributed cache with a different cache key than the new version, this will not create conflicts during the update, and it means that we don't need to stop all the apps and services that work with it and wipe all the distributed cache data just to do the upgrade (not very pragmatic).

At the same time though, if we have different apps and services sharing the same distributed cache, we need to understand that updating only some of them means that the updated ones will read/write using the new distributed cache keys, while the non-updated ones will keep reading/writing using the old ones.

Again, nothing catastrophic, but something to consider.

Since we are talking about L2 cache keys, something else to keep in mind is that if we are using a CacheKeyPrefix (see here) that is also combined to form the final cache key used inside our distributed cache.

This means that, if we specified a CacheKeyPrefix like "MyPrefix:", when we do this:

cache.Set("foo", 123, tags: ["tag-1", "tag-2"])

the cache key used inside our L1 (memory cache) will be this:

MyPrefix:foo

while the cache key used inside our L2 (distributed cache) will be:

v2:MyPrefix:foo

As one last note, to recap: why doesn't the cache key inside L1 need the extra "v2:" prefix? Because it's a memory cache, meaning that at every restart of our app the memory cache will be empty and everything starts from scratch, so there can be no issues of old vs new structures.
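Putting it together, a minimal sketch (the "v2:" part is internal and managed by FusionCache, so we only ever specify the prefix):

```csharp
// a sketch: set a prefix that gets combined with every cache key
var cache = new FusionCache(new FusionCacheOptions
{
    CacheKeyPrefix = "MyPrefix:"
});

// stored as "MyPrefix:foo" in L1 and as "v2:MyPrefix:foo" in L2
cache.Set("foo", 123);
```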

💾 Disk Cache (more)

In certain situations we may like to have some of the benefits of a 2nd level, like better cold starts (when the memory cache is initially empty), but at the same time we don't want to have a separate actual distributed cache to handle, or we simply cannot have one: a good example may be a mobile app, where everything should be self-contained.

In those situations we may want a distributed cache that is "not really distributed", something like an implementation of IDistributedCache that reads and writes directly to one or more local files.

Is this possible?

Yes, totally, and there's a dedicated page to learn more.

↩️ Auto-Recovery (more)

Since the distributed cache is a distributed component (just like the backplane), most of the transient errors that may occur on it are also covered by the Auto-Recovery feature.

We can read more on the related docs page.

📦 Packages

There are a variety of already existing IDistributedCache implementations available, just pick one:

Package Name License Version
Microsoft.Extensions.Caching.StackExchangeRedis
The official Microsoft implementation for Redis
MIT NuGet
Microsoft.Extensions.Caching.SqlServer
The official Microsoft implementation for SqlServer
MIT NuGet
Microsoft.Extensions.Caching.Cosmos
The official Microsoft implementation for Cosmos DB
MIT NuGet
MongoDbCache
An implementation for MongoDB
MIT NuGet
Community.Microsoft.Extensions.Caching.PostgreSql
An implementation for PostgreSQL
MIT NuGet
MarkCBB.Extensions.Caching.MongoDB
Another implementation for MongoDB
Apache v2 NuGet
EnyimMemcachedCore
An implementation for Memcached
Apache v2 NuGet
NeoSmart.Caching.Sqlite
An implementation for SQLite
MIT NuGet
AWS.AspNetCore.DistributedCacheProvider
An implementation for AWS DynamoDB
Apache v2 NuGet
Aerospike.Extensions.Caching
An implementation for Aerospike
Apache v2 NuGet
Microsoft.Extensions.Caching.Memory
An in-memory implementation
MIT NuGet

As for an implementation of IFusionCacheSerializer, pick one of these:

Package Name License Version
ZiggyCreatures.FusionCache.Serialization.NewtonsoftJson
A serializer, based on Newtonsoft Json.NET
MIT NuGet
ZiggyCreatures.FusionCache.Serialization.SystemTextJson
A serializer, based on the new System.Text.Json
MIT NuGet
ZiggyCreatures.FusionCache.Serialization.NeueccMessagePack
A MessagePack serializer, based on the most used MessagePack serializer on .NET
MIT NuGet
ZiggyCreatures.FusionCache.Serialization.ProtoBufNet
A Protobuf serializer, based on protobuf-net, one of the most used Protobuf serializers on .NET
MIT NuGet
ZiggyCreatures.FusionCache.Serialization.CysharpMemoryPack
A serializer based on the uber fast new serializer by Neuecc, MemoryPack
MIT NuGet
ZiggyCreatures.FusionCache.Serialization.ServiceStackJson
A serializer based on the ServiceStack JSON serializer
MIT NuGet

👩‍💻 Example

As an example let's use FusionCache with Redis as a distributed cache and Newtonsoft Json.NET as the serializer:

PM> Install-Package ZiggyCreatures.FusionCache
PM> Install-Package ZiggyCreatures.FusionCache.Serialization.NewtonsoftJson
PM> Install-Package Microsoft.Extensions.Caching.StackExchangeRedis

Then, to create and setup the cache manually, we can do this:

// INSTANTIATE A REDIS DISTRIBUTED CACHE
var redis = new RedisCache(new RedisCacheOptions() { Configuration = "CONNECTION STRING" });

// INSTANTIATE THE FUSION CACHE SERIALIZER
var serializer = new FusionCacheNewtonsoftJsonSerializer();

// INSTANTIATE FUSION CACHE
var cache = new FusionCache(new FusionCacheOptions());

// SETUP THE DISTRIBUTED 2ND LEVEL
cache.SetupDistributedCache(redis, serializer);

If instead we prefer a DI (Dependency Injection) approach, we should simply do this:

services.AddFusionCache()
    .WithSerializer(
        new FusionCacheNewtonsoftJsonSerializer()
    )
    .WithDistributedCache(
        new RedisCache(new RedisCacheOptions { Configuration = "CONNECTION STRING" })
    )
;

Easy peasy.

⚠️ Catch Serialization Issues Early On

Due to how serialization to L2 works, we should make sure our serialization configuration is correct (e.g. for JSON serializers, use the correct TypeNameHandling setting): for instance, abstract types or interfaces cannot be deserialized on their own, because they can't be instantiated (e.g. a concrete type is needed).

To catch (de)serialization issues early, during development, we can configure a locally available distributed cache, like an in-memory one or a SQLite-based one. Here's an example:

builder.Services.AddFusionCache()
  .WithDefaultEntryOptions(new FusionCacheEntryOptions {
    SkipMemoryCacheRead = true,
  })
  .WithDistributedCache(
    new MemoryDistributedCache(Options.Create(new MemoryDistributedCacheOptions()))
  )
  .WithSerializer(
    new FusionCacheNewtonsoftJsonSerializer() // OR ANOTHER ONE OF YOUR CHOOSING
  );

Other ways to catch these issues:

  1. use ♊ Auto-Clone
  2. write proper unit tests to check serialization and deserialization of all your types