Learn about new open-source Redis 5 features such as streams, sorted set pop operations, the cluster manager inside redis-cli, and an improved HyperLogLog algorithm.
Congratulations to the community on the release of Redis 5, the latest GA version of open-source Redis. Since its initial release in 2009, open-source Redis has evolved beyond a caching technology into an easy-to-use, fast, in-memory data store that provides versatile data structures and sub-millisecond responses for high-performance applications. Now, with the introduction of streams, Redis 5 moves beyond data structures and offers some extremely useful algorithms. This blog post discusses the new features in Redis 5.
The new stream data type is the highlight of this release. It's a rich feature that deserves its own blog post, and in time it will get one.
A stream in Redis is conceptually like a list, where you can push and pop values. Unlike with lists, you can look up stream elements by ID. Something special about Redis streams is that they support consumer groups, a feature that allows a group of clients to cooperate when consuming elements from a stream. Redis streams are implemented with Rax, a radix-tree library that Salvatore Sanfilippo created and released independently. It's a memory-efficient implementation optimized for fast lookups and range queries. This new data type is a natural building block for message brokers, message queues, unified logs, and chat systems. Surely we will see many creative uses in the near future.
There are many implementations of message queues on top of Redis 4.x and earlier versions, but they lack some features that are possible using streams with consumer groups. For example, in Redis streams, consumers in a group have to acknowledge the processing of an item with the XACK command. They can also claim ownership of a pending message with the XCLAIM command. For more information, read the blog describing how to use Redis streams to implement some common patterns.
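To make the acknowledge-and-claim flow concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Redis's implementation and not a client library: the class ToyStream and its method names are hypothetical and only mirror the semantics of XADD, XREADGROUP, XACK, and XCLAIM. Entry IDs are simplified to integers, and details such as idle-time checks on XCLAIM are left out.

```python
from collections import OrderedDict


class ToyStream:
    """Toy model of one stream with a single consumer group (illustrative only)."""

    def __init__(self):
        self.entries = OrderedDict()  # entry ID -> payload, in insertion order
        self.next_id = 1              # simplified integer IDs
        self.last_delivered = 0       # group cursor: highest ID handed out so far
        self.pending = {}             # entry ID -> consumer that currently owns it

    def xadd(self, payload):
        """Append an entry and return its ID."""
        entry_id = self.next_id
        self.entries[entry_id] = payload
        self.next_id += 1
        return entry_id

    def xreadgroup(self, consumer, count=1):
        """Deliver up to `count` never-delivered entries and mark them pending."""
        fresh = [(i, p) for i, p in self.entries.items() if i > self.last_delivered]
        batch = fresh[:count]
        for entry_id, _ in batch:
            self.pending[entry_id] = consumer
            self.last_delivered = entry_id
        return batch

    def xack(self, entry_id):
        """Acknowledge processing; returns 1 if the entry was pending, else 0."""
        return 1 if self.pending.pop(entry_id, None) is not None else 0

    def xclaim(self, consumer, entry_id):
        """Transfer ownership of a pending entry to another consumer."""
        if entry_id not in self.pending:
            return False
        self.pending[entry_id] = consumer
        return True


stream = ToyStream()
for job in ("resize-image", "send-email"):
    stream.xadd(job)

first = stream.xreadgroup("worker-a")    # worker-a receives entry 1
second = stream.xreadgroup("worker-b")   # worker-b receives entry 2
stream.xack(first[0][0])                 # worker-a finishes and acknowledges
stream.xclaim("worker-a", second[0][0])  # worker-b stalled, so worker-a takes over
```

The point of the pending map is that an entry stays accounted for until someone acknowledges it, so a crashed consumer does not silently lose work.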
Redis 5 adds commands for popping elements from either end of a sorted set, so that in some use cases you can treat it like a list. One such use case is managing bids for an online auction in which the lowest bid is eliminated after each round. ZPOPMAX returns and removes the elements with the highest scores, and ZPOPMIN returns and removes the elements with the lowest scores. In both cases, if you don't pass a COUNT argument, a default value of 1 is used. Both commands have a blocking variant: BZPOPMAX and BZPOPMIN.
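The auction scenario can be sketched in a few lines of Python. This is a toy model of the pop semantics only; the class ToySortedSet is hypothetical, and it re-sorts on every pop for simplicity, whereas a real sorted set keeps its elements ordered.

```python
class ToySortedSet:
    """Toy sorted set: member -> score, ordered by (score, member)."""

    def __init__(self):
        self.scores = {}

    def zadd(self, member, score):
        self.scores[member] = score

    def _pop(self, count, highest):
        # Order by score, breaking ties by member name.
        ordered = sorted(
            self.scores.items(),
            key=lambda kv: (kv[1], kv[0]),
            reverse=highest,
        )
        popped = ordered[:count]
        for member, _ in popped:
            del self.scores[member]
        return popped

    def zpopmin(self, count=1):
        """Remove and return up to `count` members with the lowest scores."""
        return self._pop(count, highest=False)

    def zpopmax(self, count=1):
        """Remove and return up to `count` members with the highest scores."""
        return self._pop(count, highest=True)


bids = ToySortedSet()
for bidder, amount in (("alice", 120), ("bob", 95), ("carol", 150)):
    bids.zadd(bidder, amount)

eliminated = bids.zpopmin()  # the lowest bid leaves the round
leader = bids.zpopmax()      # the highest remaining bid
```

As in the real commands, the count defaults to 1, and popping from an empty set simply returns nothing.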
The modules API incorporates a cluster message bus abstraction. This is extremely powerful, as it provides the infrastructure for building distributed systems: a node can send a message to a particular node, or to all nodes. Another addition to the modules API is the timer. A module can now create high precision timers, where it can configure a duration in milliseconds and a callback.
The need for these new API calls comes from the idea of implementing Disque as a Redis module. A Disque comeback should be welcomed because even if it never got enough traction, it proved extremely useful and well-thought-out.
Note: Amazon ElastiCache for Redis doesn't provide support for Redis modules.
Under the hood
Improved active memory defragmentation
One of the highlights of the previous release was that Redis gained the ability to defragment memory while online. The way it works is very clever: Redis scans the keyspace and, for each pointer, asks the allocator whether moving it to a new address would help reduce fragmentation. This release ships with what can be called active defrag 2: it's faster, smarter, and has lower latency. This feature is especially useful for workloads where the allocator cannot keep fragmentation low enough, so the strategy is for Redis and the allocator to cooperate. For this to work, the jemalloc allocator has to be used. Luckily, it's the default allocator on Linux.
A better and faster algorithm for HyperLogLogs
Redis got a nice contribution by Otmar Ertl, a mathematician from Austria. At the International Mathematical Olympiad, he won bronze in 1999 and silver in 2000. Later, he discovered a way to improve the cardinality estimation of data sets in HyperLogLog sketches and published a paper about it in 2017. Not only that, but he also wrote the patch for Redis and sent a pull request, which was merged soon after. It means that we now have a faster and better algorithm for HyperLogLogs. You can check the paper here.
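For readers who have not met HyperLogLog before, here is a minimal sketch of the idea in Python. To be clear, this is the classic HyperLogLog estimator with a linear-counting correction for small cardinalities, not Otmar Ertl's improved estimator and not the Redis implementation. The class name, the choice of SHA-1 as the hash, and the register count are all assumptions made for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Classic HyperLogLog sketch (illustrative; not Redis's estimator)."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m

    def add(self, item):
        # Take 64 bits of a SHA-1 digest as the hash value.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)                 # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)         # linear counting for small sets
        return raw


hll = HyperLogLog()
for i in range(50_000):
    hll.add(f"user-{i}")
estimate = hll.count()  # close to 50,000, using only 4,096 small registers
```

Adding the same item twice never changes a register, which is why the structure counts distinct elements rather than total insertions.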
Inline help for redis-cli
In Redis 4, you could type MEMORY HELP in redis-cli and read about the available MEMORY subcommands. Now you can find that same kind of help for many other commands. If all you need is a quick reference, you can skip a trip to redis.io and send HELP as the command's only argument. For example, when exploring the new streams, you can type XINFO HELP to see the options at your disposal.
Based on discussions in the Redis community, the Redis contributors have decided to change the replication terminology from master/slave to master/replica. Because Redis always tries to be backward compatible, every command and configuration directive with the word "slave" now has a counterpart with the word "replica." There are some exceptions, like Sentinel log output, where changing the event name from "slave" to "replica" would render it incompatible with previous versions. If you want to rename your configuration to use the new terminology, you can run CONFIG REWRITE. Running this translates all the problematic directives to their new names.
The work is not finished. The idea is to eventually change the terminology at the protocol level, but a complete renaming might take years to complete.
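The directive renaming described above can be illustrated with a short Python sketch. CONFIG REWRITE remains the authoritative way to update a configuration file; the function modernize_config is hypothetical, and the mapping below is a partial list of renamed directives rather than a complete one.

```python
# Old directive name -> Redis 5 counterpart (partial, illustrative list).
RENAMED_DIRECTIVES = {
    "slaveof": "replicaof",
    "slave-read-only": "replica-read-only",
    "slave-serve-stale-data": "replica-serve-stale-data",
    "slave-priority": "replica-priority",
    "min-slaves-to-write": "min-replicas-to-write",
    "min-slaves-max-lag": "min-replicas-max-lag",
}


def modernize_config(text):
    """Rename known directives; leave comments, blanks, and arguments alone."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            parts = stripped.split(None, 1)   # directive name, then its arguments
            name = RENAMED_DIRECTIVES.get(parts[0].lower(), parts[0])
            rest = parts[1] if len(parts) > 1 else ""
            line = f"{name} {rest}".rstrip()
        out.append(line)
    return "\n".join(out)


old_config = "\n".join([
    "port 6379",
    "slaveof 10.0.0.5 6379",
    "# slave-priority is left at its default",
    "slave-read-only yes",
])
new_config = modernize_config(old_config)
```

Only the directive name at the start of a line is touched, so comments and argument values that happen to contain the old word are left as they are.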
Note: Amazon ElastiCache parameter groups now refer to the updated terminology, so you should take that into account when upgrading.
Programming as art
Georg Nees was a pioneer in computer art and generative graphics. In 1965, he enlightened the world by presenting the first computer-designed work of art at the Computer Graphik exhibition. Three years later, he created Schotter, arguably his most famous artistic piece. As of today, it is also part of Redis. Starting with Redis 5, there is a new command called LOLWUT that outputs Georg Nees' Schotter. The output of the LOLWUT command changes with each version, with the only purpose of being amusing. It's a good way to remind us that programs are a medium and the end can be art. It's also a good way to commemorate the 50-year anniversary of Schotter.
Future major versions of Redis will be released every year to year and a half. So, instead of bumping the minor version number (for example, going from 4.0 to 4.2), the version number will go from Redis 5 to Redis 6, then to Redis 7, and so on.
Although Redis 5 is mostly backward-compatible, there are a few details to keep in mind if you are migrating from 4.x to 5.0.
All cluster tools that were part of redis-trib.rb are now part of redis-cli. If you want more information about Redis cluster-related commands, you can run redis-cli --cluster help. Although the feature set is the same, it's nice that it doesn't depend on Ruby anymore. As a side effect of porting it to C, it also feels snappier.
Note: Amazon ElastiCache does not support the redis-trib and redis-cli resharding tools. The resharding APIs provided by the service work out of the box without any changes.
The RDB file format now includes metadata necessary for expiring keys according to the LFU/LRU algorithms. In the past, after a recovery from a snapshot, that information was lost. Now, after a restart, the evictions proceed as expected (not applicable to Amazon ElastiCache). Although Redis 5 is capable of reading old RDB files, previous Redis versions won't be able to read the new format.
If you parse Redis logs, make sure that you check everything is correct for your use case, because some formatting and messages have changed. For example, the logs now list the year. If you parse dates from the log, this change might break your scripts.
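If your scripts parse timestamps, one defensive approach is to accept both layouts. The following Python sketch assumes log lines shaped like the two samples in the usage lines (process ID, role letter, day, month, optional year, time, level character, message); check it against your own logs before relying on it. The function name and the default_year parameter are hypothetical.

```python
import re
from datetime import datetime

# The year group is optional so that lines with and without a year both match.
LOG_LINE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[A-Z]) "
    r"(?P<day>\d{1,2}) (?P<month>[A-Za-z]{3}) "
    r"(?:(?P<year>\d{4}) )?"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[.\-*#]) (?P<message>.*)$"
)


def parse_log_line(line, default_year):
    """Return (timestamp, level, message), or None if the line does not match.

    `default_year` is used only when the line carries no year.
    """
    match = LOG_LINE.match(line)
    if match is None:
        return None
    year = int(match.group("year")) if match.group("year") else default_year
    stamp = f"{match.group('day')} {match.group('month')} {year} {match.group('time')}"
    timestamp = datetime.strptime(stamp, "%d %b %Y %H:%M:%S.%f")
    return timestamp, match.group("level"), match.group("message")


# Sample lines written for this example, not copied from a real server.
without_year = "1234:M 21 Oct 14:23:11.123 * Ready to accept connections"
with_year = "1234:M 21 Oct 2018 14:23:11.123 * Ready to accept connections"

parsed_old = parse_log_line(without_year, default_year=2018)
parsed_new = parse_log_line(with_year, default_year=1970)
```

Note that month abbreviations are parsed with the current locale, so this sketch assumes an English or C locale.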
The maxmemory setting is now ignored by replicas. It goes into effect only when a replica is promoted to master, either manually or as a result of a failover. If you want your replicas to have an enforceable maxmemory limit, you have to configure it manually. The reason for the change is that writes could become inconsistent when they fit in the master but not in the replica. The new behavior fixes that. However, you now have to keep an eye on the memory used by replicas if you don't want any surprises. The eviction of keys is now handled only by the master: when the master evicts a key, it sends a DEL command to its replicas. If your setup has a writable replica, or if for some reason you want to revert to the old behavior, you can change it with the configuration directive replica-ignore-maxmemory no.
Note: This isn't an upgrade consideration for Amazon ElastiCache, as consistency was enforced in versions prior to 5.0. Also, Amazon ElastiCache doesn't support changing the replica-ignore-maxmemory setting.
Lua scripts are no longer replicated directly. Instead, their effects are sent to the replicas and to the AOF log. If you really need the scripts themselves to be sent with EVAL/EVALSHA, you can turn the old behavior back on by setting the configuration directive lua-replicate-commands to no. This setting is not documented in open-source Redis (Amazon ElastiCache documents it in the parameter group), and it exists only to provide a transition period because the old behavior will be removed in Redis 6.
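The difference between replicating a script and replicating its effects can be shown with a toy Python model. Nothing here is Redis code: the databases are plain dictionaries, and stamp_script is a hypothetical stand-in for a Lua script whose write depends on when it runs.

```python
def stamp_script(db, now):
    """Stand-in for a script whose write depends on when it runs."""
    db["last_run"] = now
    return [("SET", "last_run", now)]  # the effects: writes the script performed


def apply_effects(db, effects):
    """Replay already-computed writes without re-running any script logic."""
    for command, key, value in effects:
        if command == "SET":
            db[key] = value


master, replica_script, replica_effects = {}, {}, {}

effects = stamp_script(master, now=100)   # master runs the script at t=100
stamp_script(replica_script, now=105)     # old model: replica re-runs it later
apply_effects(replica_effects, effects)   # new model: replica replays the writes
```

In this sketch the replica that re-runs the script ends up with a different value than the master, while the replica that replays the effects matches it exactly. That is one reason whole-script replication works well only for deterministic scripts.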
This blog post covers key open-source Redis 5 features such as streams, sorted set pop operations, improved memory management, the cluster manager inside redis-cli, and inline help in redis-cli. It also discusses HyperLogLog improvements, naming and versioning changes, and upgrade considerations. In the future, we plan to put out more blog posts about Redis 5 features such as streams, so stay tuned!
Fully managed Redis on AWS
Amazon offers a fully managed Redis service, Amazon ElastiCache for Redis, available for trial at no cost with the AWS Free Tier. Amazon ElastiCache for Redis makes it easy to set up, operate, and scale Redis deployments in the cloud. With Amazon ElastiCache, you can deploy internet-scale Redis deployments in minutes, with cost-efficient and resizable hardware capacity.
Get started with free Amazon ElastiCache for Redis in three easy steps: