Cache miss attack

Caching is awesome but it doesn't come without a cost, just like many things in life.

One of the issues is Cache Miss Attack. Please correct me if this is not the right term. It refers to the scenario where the data to fetch doesn't exist in the database and isn't cached either. Every request for such data therefore ends up hitting the database, defeating the purpose of using a cache. If a malicious user initiates lots of queries with such keys, the database can easily be overloaded.
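
To make the problem concrete, here is a minimal sketch in Python. It assumes a plain cache-aside read path; `CountingDatabase` and `read_through` are hypothetical names made up for this illustration, and an in-memory dict stands in for the real cache.

```python
class CountingDatabase:
    """Hypothetical stand-in for a real database; counts every lookup."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.queries = 0

    def get(self, key):
        self.queries += 1
        return self.rows.get(key)


def read_through(cache, db, key):
    """Plain cache-aside read: only values that exist get cached."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value


db = CountingDatabase({"user:1": "Alice"})
cache = {}

# An existing key costs one database query, then is served from the cache.
for _ in range(100):
    read_through(cache, db, "user:1")
queries_for_existing_key = db.queries

# Missing keys are never cached, so every one of these requests
# reaches the database.
for i in range(100):
    read_through(cache, db, f"user:missing-{i}")
queries_for_missing_keys = db.queries - queries_for_existing_key
```

In this sketch the 100 reads of the existing key cost a single database query, while the 100 reads of missing keys cost 100 queries.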

The diagram below illustrates the process.

Two approaches are commonly used to solve this problem:

🔹 Cache keys with a null value. Set a short TTL (Time to Live) on these entries, so that a key created later is not masked by a stale null for long.

🔹 Use a Bloom filter. A Bloom filter is a data structure that can rapidly tell us whether an element is definitely not in a set or possibly in it. If the filter reports that the key may exist, the request first goes to the cache and then queries the database if needed. If the filter reports that the key doesn't exist, the key is in neither the cache nor the database. In this case, the query will not hit the cache or database layer.
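
Both approaches can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production implementation: `NullCachingReader`, `BloomFilter`, `guarded_get`, the 30-second TTL and the filter size are all hypothetical choices, and a dict stands in for the real cache. A real deployment would more likely use the cache server's own expiry and an existing Bloom filter library.

```python
import hashlib
import time

_MISSING = object()  # sentinel cached in place of a value the database lacks


class NullCachingReader:
    """Approach 1: cache "not found" results, with a short TTL."""

    def __init__(self, db_get, null_ttl_seconds=30.0, clock=time.monotonic):
        self.db_get = db_get
        self.null_ttl = null_ttl_seconds
        self.clock = clock
        self.cache = {}  # key -> (value, expires_at or None)

    def get(self, key):
        entry = self.cache.get(key)
        if entry is not None:
            value, expires_at = entry
            if expires_at is None or self.clock() < expires_at:
                return None if value is _MISSING else value
            del self.cache[key]  # the cached null has expired
        value = self.db_get(key)
        if value is None:
            # Remember the miss, but only for a short time.
            self.cache[key] = (_MISSING, self.clock() + self.null_ttl)
        else:
            self.cache[key] = (value, None)
        return value


class BloomFilter:
    """Approach 2: a tiny Bloom filter (one byte per bit, for clarity)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


rows = {"user:1": "Alice"}  # hypothetical database contents
db_calls = []


def db_get(key):
    db_calls.append(key)
    return rows.get(key)


reader = NullCachingReader(db_get)
bloom = BloomFilter()
for existing_key in rows:
    bloom.add(existing_key)


def guarded_get(key):
    # Keys the filter rules out never reach the cache or the database.
    if not bloom.might_contain(key):
        return None
    return reader.get(key)
```

The two approaches combine naturally: the filter rejects most non-existent keys up front, and null caching absorbs the occasional false positive that slips through. Note that a Bloom filter has to be updated whenever new keys are written to the database, and a basic one does not support deletions.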

