Boosting MySQL database performance

with Amazon ElastiCache (Redis OSS)

Module 4: Caching and Best Practices

Connect to Redis and learn two techniques for storing and retrieving data.

What You Will Learn

  • Two techniques for storing and retreiving data (When choosing which method to use in your application, select the one that simplifies your architecture based on your data access patterns.)

 Time to Complete

60 minutes

Implementation

  • Back at the ElastiCache Dashboard:

    4.1 — Select “Redis” on the left pane.

    Select Redis on the left pane

    4.2 — Select the Redis Cluster you created for this tutorial.

    Select the Redis Cluster

    4.3 — Copy the Primary Endpoint.

    In the examples, each time an endpoint is mentioned you should use the hostname of your Configuration Endpoint.

    Copy the Primary Endpoint

    4.4 — From your EC2 instance, enter the Python interactive interpreter:

    syntax: shell

    $ python
    

    4.5 — Now run these commands to test the connection to your Redis node.

    If the commands hang, please see the note following the example.

    syntax: python

    syntax: python
    >>> import redis
    >>> client = redis.Redis.from_url('redis://endpoint:6379')
    >>> client.ping()
    
    True

    Note: If it hangs, it means you are being blocked by the Security Group settings. Verify that your EC2 instance has access to the security group assigned to your ElastiCache instance. For example, let’s say your EC2 instance was assigned to the default security group. You can now modify the security group of your Amazon ElastiCache instance and add a Custom TCP rule allowing connections on port 6379 from any instance in the default security group:

    In Source, you can start typing the name of the security group and you’ll be able to click on the Security Group ID. If you need to learn more about Security Groups, you can check the documentation or the Security Group Rules Reference.

    Edit inbound rules
  • In the repository you will find some Python code that you can run in your EC2 instance. But first you need to configure some environment variables:

    syntax: shell

    $ export REDIS_URL=redis://your_redis_endpoint:6379/
    $ export DB_HOST=your_mysql_endpoint
    $ export DB_USER=admin
    $ export DB_PASS=your_admin_password
    $ export DB_NAME=tutorial

    Note that the values for mysql_endpoint, redis_endpoint, and password are those that you saved in the previous steps.

  • The first of the two methods implemented in the code sample works by caching a serialized representation of the SQL query result. The following Python snippet illustrates the logic:

    syntax: python

    def fetch(sql):
    
      result = cache.get(sql)
      if result:
        return deserialize(result)
      else:
        result = db.query(sql)
        cache.setex(sql, ttl, serialize(result))
        return result

    First, the SQL statement is used as a key in Redis, and the cache is examined to see if a value is present. If a value is not present, the SQL statement is used to query the database. The result of the database query is stored in Redis. The ttl variable must be set to a sensible value, dependent on the nature of your application. When the ttl expires, Redis evicts the key and frees the associated memory. This code is available in the tutorial repository and you can run it as is, but feel free to add print statements here and there if you want to see the value of a variable at a certain point in time.

    In terms of strategy, the drawback of this approach is that when data is modified in the database, the changes won’t be reflected automatically to the user if a previous result was cached and its ttl has not elapsed yet.

    An example of how you would use the fetch function:

    syntax: python

    print(fetch("SELECT * FROM planet"))

    The result would be:

    syntax: python

    [{'id': 10, 'name': 'Mercury'},
     {'id': 11, 'name': 'Venus'},
     {'id': 12, 'name': 'Earth'},
     {'id': 13, 'name': 'Mars'},
     {'id': 14, 'name': 'Jupiter'},
     {'id': 15, 'name': 'Saturn'},
     {'id': 16, 'name': 'Uranus'},
     {'id': 17, 'name': 'Neptune'}]

    Of course, this is a very basic example, but your application can benefit a great deal by implementing this caching pattern where there’s no difference between a result coming from the cache and a result coming straight from the database.

  • The second example you will implement maps a database record to a Redis hash:

    syntax: python

    def planet(id):
    
      key = "planet:" + str(id)
      result = cache.hgetall(key)
    
      if result:
          return result
    
      else:
          sql = "SELECT `id`, `name` FROM `planet` WHERE `id`=%s"
          result = db_record(sql, (id,))
    
          if result:
              cache.hmset(key, result)
              cache.expire(key, ttl)
          return result

    The keyspace in Redis is flat, but there’s a convention for simulating structure by using colon separated strings. In the example, the key for the record with ID 1 will be with “planet:1”. While this snippet is good enough for exhibiting a common pattern, more abstraction is possible: one module could be in charge of generating the keys, another could take care of building the SQL string, etc. Furthermore, chances are there are tools built for that purpose in the programming language you use.

    The example retrieves records either from the cache or from the database, and similarly there could be a function in charge of persisting an object to the database.

  • In the two examples you used a Time To Live or ttl, after which Redis evicts the key. While this is good enough in most cases, you may want to remove stale data from the cache as soon as possible. If that’s your use case, make sure you check other options like the write-through caching strategy. Links for more information are provided at the end of this tutorial. Worth mentioning in case you are curious: while the examples use the EXPIRE command, Redis also provides the EXPIREAT, which lets you specify a precise date and time when a key should be evicted. It takes as a parameter an absolute Unix timestamp (i.e., seconds elapsed since January 1, 1970).

  • When the amount of data exceeds the configured maxmemory setting, Redis has different ways of responding depending on the selected eviction policy. By default, ElastiCache (Redis OSS) is configured to remove from memory the least recently used keys with a ttl set. The eviction policy parameter is called maxmemory-policy, and the default value in ElastiCache is volatile-lru. Another interesting option for this use case is the volatile-ttl policy, which instructs Redis to reclaim memory by removing those keys with the shortest ttl.

  • Once you have implemented this strategy, make sure to test your application to find the best value for the ttl and the best eviction strategy. Check the performance of your application with an empty cache and with a full cache.

Was this module helpful?

Clean Up and Next Steps