DEV Community

Chetan


Learn Caching with Code

Caching is a technique used to store the result of a computation so that future requests for that result can be served by returning the stored result instead of computing it again.


Adab runs an e-commerce store. Whenever a customer requests a product page, Adab's Django server performs the following steps to compute the HTML to be sent back to the customer.

  1. Get product details, product seller details, and product reviews from the PostgreSQL database.
  2. Get the products that are bought together with the product by querying the neo4j graph database.
  3. Create the HTML using the data and the product template with the help of the Django template engine.

Adab's store is receiving thousands of requests per second during peak season.

Adab noticed that several requests time out or take longer to process because of the computation needed to create the HTML for the product. Adab's responsibility is to serve his customers to the best of his ability, as he is receiving payment from them.

Adab wants to reduce latency. What should he do?


Caching is the solution to Adab's problem. Whenever a customer requests a product page, we can compute the HTML for the product, store it in the cache, and return the HTML.
On subsequent requests for the product, we can return the result from the cache.

When the product details, product reviews, seller details, or bought-together products change (this can be detected by running a periodic job using Celery in Python), the data for the product page changes. To avoid serving stale data, we can simply delete the product's HTML from the cache. This is called cache invalidation.
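The invalidation step described above can be sketched as follows. This is a minimal illustration, assuming a plain dictionary as the cache and a dictionary `db` as the database; `render_product_page`, `get_page`, and `update_product` are hypothetical names chosen for this example, not part of Adab's code.

```python
# In-process stand-in for the cache; a shared cache server would play this role in production.
cache = {}


def product_key(product_id):
    return f"product#{product_id}"


def render_product_page(product):
    # Hypothetical renderer standing in for the Django template step.
    return f"<h1>{product['name']}</h1><p>Price: {product['price']}</p>"


def get_page(product_id, db):
    key = product_key(product_id)
    if key not in cache:
        # Cache miss: build the page from the database and remember it.
        cache[key] = render_product_page(db[product_id])
    return cache[key]


def update_product(product_id, db, **changes):
    # Write to the source of truth first, then drop the stale page.
    db[product_id].update(changes)
    cache.pop(product_key(product_id), None)


db = {42: {"name": "Guava", "price": 10}}
first = get_page(42, db)          # rendered and cached
update_product(42, db, price=12)  # invalidates the cached page
second = get_page(42, db)         # rendered again from fresh data
print(first)
print(second)
```

Deleting the entry, rather than rewriting it in place, keeps the update path simple: the next request rebuilds the page from current data.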

Caching is an effective technique to reduce the latency of our backend. When you use caching, avoid the fallacy of caching most of your API's responses.
Web applications usually follow the Pareto principle, meaning 20 percent of the API endpoints receive 80 percent of the requests.
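One way to find the endpoints worth caching is to count requests per endpoint and keep only the busiest ones. The sketch below assumes a made-up request log; the endpoint names, counts, and the `hot_endpoints` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical request log: one entry per request, naming the endpoint hit.
request_log = (
    ["/product"] * 60 + ["/search"] * 20 + ["/cart"] * 8
    + ["/orders"] * 5 + ["/seller"] * 4 + ["/profile"] * 2 + ["/help"] * 1
)


def hot_endpoints(log, traffic_percent=80):
    """Return the busiest endpoints that together cover traffic_percent of requests."""
    counts = Counter(log)
    total = sum(counts.values())
    covered = 0
    hot = []
    for endpoint, hits in counts.most_common():
        # Integer comparison avoids floating-point rounding at the threshold.
        if covered * 100 >= traffic_percent * total:
            break
        hot.append(endpoint)
        covered += hits
    return hot


print(hot_endpoints(request_log))
```

In this made-up log, two of the seven endpoints account for 80 of the 100 requests, so they would be the first candidates for caching.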

Now Adab may decide to store the seller's detail page and the user's order page in the cache because he believes this will reduce latency for these endpoints.

It will surely reduce latency for these endpoints, but Adab should also know that he will need to invalidate the user's cache whenever a new order is placed, and the seller's cache whenever a change occurs in the seller's model or the product's model. He will need to write code to keep the cache and the database in sync with each other.
We should always strive for simplicity, both in life and in code. The user's orders page and the seller's page can be created when the customer asks for them. This simplifies Adab's architecture, as he will not need to write code to keep the database and cache in sync with each other.

A few examples where caching is a good solution:

  • Product page of an e-commerce site.
  • Question page of a Q/A site (eg: StackOverflow).
  • Course page of a course selling site.
  • Reddit Thread.

Most applications are read-heavy, so caching is useful in many of them.

Let us see how we can implement caching on a single server.
On a single server, a cache can be implemented using a Python dictionary.

The cache needs to support get, set, and delete operations.
Implementation in Python

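class Cache:
    def __init__(self):
        # Per-instance storage; a class-level dict would be shared by every Cache.
        self.data = {}

    def get(self, key):
        # Return the cached value, or None on a cache miss.
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        # Deleting a missing key is a no-op.
        self.data.pop(key, None)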
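A short usage sketch of the three operations. A minimal dictionary-backed class is repeated here only so the snippet runs on its own.

```python
class Cache:
    """Minimal dictionary-backed cache, repeated so this snippet is self-contained."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


cache = Cache()
cache.set("fruit", "guava")
hit = cache.get("fruit")    # value stored a moment ago
cache.delete("fruit")
miss = cache.get("fruit")   # None, because the key was deleted
print(hit, miss)
```

Because `get` returns None on a miss, a stored value of None cannot be told apart from a missing key. That is fine for cached HTML pages, which are never None.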

Caching in a multi-server environment.
In a multi-server environment, we need a central place to store the cache. That central place is called a cache web server.
Two cache web servers are Redis and Memcached.

Redis also has persistence built in, meaning it can store the cache on disk as well as in memory. In the event of a power outage in a data center, when Redis boots again it reloads the cache into memory from the disk.

Redis is an interesting technology because it also supports pub-sub (acting as an event broker) and counters with increment and decrement operations. I want you to read the Redis documentation and learn about it, as it will be helpful to you. The docs also contain a tutorial in which Twitter is built using only Redis.

In case you are reading this in the browser, bookmark the Redis documentation now.

The docker-compose file to install and run Redis in a Docker container is as follows.

version: "3.8"
services:
  redis:
    image: redis:6.2.6-alpine
    ports:
      - 6379:6379

Implementation in JavaScript

import { createClient } from "redis"

async function connect() {
  const client = createClient({
    url: "redis://localhost",
  })

  client.on("error", (err) => console.log("Redis Client Error", err))

  await client.connect()
  return client
}

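async function main() {
  const client = await connect()

  await client.set("fruit", "guava")
  console.log(await client.get("fruit")) // prints "guava"
  await client.del("fruit")

  // Close the connection so the process can exit.
  await client.quit()
}

main().catch(console.error)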

Whenever you connect to Redis, you connect to it over the network. Assuming your application servers and cache server are in the same data center, which is the usual case, you will usually see a latency of 0 to 9 ms.
The latency of a request from Delhi to Mumbai is 200 to 400 ms. Adding the cache round trip gives 209 ms, which is approximately equal to 200 ms.

Central caching with Redis is the solution engineers usually use. For the sake of simplicity, avoid the fallacy of optimizing away the 9 ms with a complex self-engineered algorithm.
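The arithmetic behind that advice, using only the figures quoted above (real latencies vary with network and workload):

```python
# Figures taken from the text; they are rough estimates, not measurements.
cache_latency_ms = 9       # worst case quoted for a same-data-center Redis call
request_latency_ms = 200   # lower bound quoted for a Delhi-to-Mumbai request

total_ms = request_latency_ms + cache_latency_ms
cache_share_percent = 100 * cache_latency_ms / total_ms
print(f"total: {total_ms} ms, cache share: {cache_share_percent:.1f}%")
```

Even in the worst case, the cache round trip is under 5 percent of the total, so shaving it off would barely be noticeable to the customer.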

Now, coming to the final solution to Adab's problem:
When a request for a product page is received, we first try to get the page from the cache. If the page does not exist in the cache, we create it and store it in the Redis cache. Then we return the response.

Python Code

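# Assumes `cache` is a cache client with get/set and `create_page` renders the product HTML.
def get_product_page(product_id):
    key = f'product#{product_id}'
    result = cache.get(key)
    if result is None:
        # Cache miss: build the page and store it for subsequent requests.
        result = create_page(product_id)
        cache.set(key, result)
    return result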
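To try the flow without a running Redis server, the same logic can be exercised against an in-memory stand-in. This is only a sketch: `DictCache`, the `render_calls` list, and this version of `create_page` are hypothetical helpers added for illustration, and the cache is passed in as a parameter so it can be swapped.

```python
class DictCache:
    """In-memory stand-in exposing the get/set calls the function relies on."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value


render_calls = []  # records every time a page is actually rendered


def create_page(product_id):
    # Hypothetical renderer; the real one would query PostgreSQL and neo4j.
    render_calls.append(product_id)
    return f"<html>product {product_id}</html>"


def get_product_page(cache, product_id):
    key = f"product#{product_id}"
    page = cache.get(key)
    if page is None:
        # Cache miss: build the page once and store it.
        page = create_page(product_id)
        cache.set(key, page)
    return page


cache = DictCache()
first = get_product_page(cache, 7)   # miss: renders and stores
second = get_product_page(cache, 7)  # hit: served from the cache
print(first == second, len(render_calls))
```

The second call returns the same page without rendering again. A redis-py client also exposes `get` and `set`, so it could likely be passed in place of `DictCache`, keeping in mind that it returns bytes unless it is configured to decode responses.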

In the above code, we have namespaced the product ID by prepending 'product#' to it.
This technique of namespacing is commonly used in NoSQL databases. With it, multiple models can be stored in the same cache.
For example, if Adab decides to store the JSON response for the popular products of each category, he can use namespacing to store it. The key will have the format

'category#{category_name}#popular'
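A small sketch of how such key builders might look. The helper names and the `parse_namespace` function are hypothetical, and a plain dictionary stands in for the cache.

```python
def product_key(product_id):
    return f"product#{product_id}"


def popular_in_category_key(category_name):
    return f"category#{category_name}#popular"


def parse_namespace(key):
    """Return the leading namespace of a key, such as 'product' or 'category'."""
    return key.split("#", 1)[0]


# A dictionary stands in for the cache; both models share one key space.
cache = {}
cache[product_key(15)] = "<html>...</html>"
cache[popular_in_category_key("books")] = '[{"id": 15}]'

print(sorted(cache))
print(parse_namespace(product_key(15)))
```

Building keys through one helper per model keeps the format in a single place, so the code that reads, writes, and invalidates an entry always agrees on the key.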

Questions:
Q) Write one benefit and one downside of caching.
A) Benefit:
Reduces Latency
Downside:
Additional code to keep the database and cache in sync with each other

Q) Are Twitter, YouTube, StackOverflow, and Reddit read-heavy or write-heavy?
A) Read heavy

Q) Name two cache web servers.
A) Redis and Memcached

Q) Are most web applications read-heavy or write-heavy?
A) Read heavy.

Q) Which technique is used in NoSQL databases to store multiple models?
A) Namespacing

Q) Write three use cases of caching.
A)

  • The product page of an eCommerce Store
  • Course page of an online education website
  • Question page of a Q/A site.

Q) Which of the following are bad candidates for caching?
a. User's orders in JSON format for an e-commerce store.
b. The stock price of a stock
c. Reddit thread
A) a, b

Q) Write the docker-compose file to run Redis.
A)

version: "3.8"
services:
  redis:
    image: redis:6.2.6-alpine
    ports:
      - 6379:6379

Q) Write code to connect to Redis using Node.js.
Set a key-value pair with key 'fruit' and value 'guava'.
Then get the value for the 'fruit' key.
A)

import { createClient } from "redis"

async function connect() {
  const client = createClient({
    url: "redis://localhost",
  })

  client.on("error", (err) => console.log("Redis Client Error", err))

  await client.connect()
  return client
}

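async function main() {
  const client = await connect()

  await client.set("fruit", "guava")
  console.log(await client.get("fruit")) // prints "guava"

  // Close the connection so the process can exit.
  await client.quit()
}

main().catch(console.error)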
