
Redis: Understanding the Open Source Data Store’s Primary Uses and Challenges

By Bassam Chahine

Redis, which stands for “REmote Dictionary Server,” is a speed-optimized in-memory data store most often used as a cache. It supports a versatile range of data structures, from strings, lists, dictionaries, and sets to structures for approximate counting, geolocation, and stream processing. While Redis is configured as a cache by default, it can persist data to disk and can readily serve as a database, message broker, or queue.

From a capability perspective, open source Redis offers particularly high performance, high availability, and the schema-less flexibility to support a rather wide range of data types and use cases. Redis is currently deployed by more than 20 percent of professional developers according to the 2020 Stack Overflow Developer Survey and has held strong as developers’ “most loved” database technology for the past four years in that same survey.

How Redis Works and Key Advantages

Redis keeps its data in memory and can also persist it to disk in two different formats. A binary snapshot format (RDB) captures the in-memory data at a point in time and can be reloaded following a restart. The “append-only file” (AOF) format logs the write commands sent to the database and replays them to bring a restarted Redis instance back to its previous state. By serving data from memory, Redis avoids the latency of accessing disk storage, providing sub-millisecond read/write operations and performing millions of operations per second.
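The difference between the two persistence styles can be sketched with a toy store. This is a minimal illustration, not how Redis is implemented: the `TinyStore` class and its methods are hypothetical names, and JSON stands in for the compact binary snapshot format.

```python
import json


class TinyStore:
    """Toy key-value store with two persistence styles (illustrative only)."""

    def __init__(self):
        self.data = {}
        self.command_log = []  # stands in for an append-only file

    def set(self, key, value):
        self.command_log.append(("SET", key, value))
        self.data[key] = value

    def delete(self, key):
        self.command_log.append(("DEL", key, None))
        self.data.pop(key, None)

    def snapshot(self):
        """Serialize the full in-memory state at one point in time."""
        return json.dumps(self.data)

    @classmethod
    def from_snapshot(cls, blob):
        """Reload state directly from a snapshot."""
        store = cls()
        store.data = json.loads(blob)
        return store

    @classmethod
    def from_log(cls, log):
        """Rebuild state by replaying every logged command in order."""
        store = cls()
        for op, key, value in log:
            if op == "SET":
                store.set(key, value)
            elif op == "DEL":
                store.delete(key)
        return store


store = TinyStore()
store.set("user:1", "ada")
store.set("user:2", "bob")
store.delete("user:2")

restored_from_snapshot = TinyStore.from_snapshot(store.snapshot())
restored_from_log = TinyStore.from_log(store.command_log)
```

Both restored stores end up holding the same data as the original. The snapshot is smaller and faster to load, while the log records every change since the store was created.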

Redis’ disk persistence enables “warm restarts” that allow a Redis cache to take position in front of slower infrastructure and pass traffic through at a pace that infrastructure can handle. Disk persistence also allows developers to use Redis as a primary database for applications where the data size and risk profile are well understood and compatible. Additionally, Redis provides clustering, either through horizontal clustering that enables asynchronous data replication across servers, or a primary/replica server setup. Horizontal clustering is most appropriate when using Redis as a cache, maximizing storage, and relying on warm restarts to mitigate node failures. In database use cases, a primary/replica architecture enables the Redis cluster to match demand with dynamic scalability, while delivering uninterrupted availability. Parallel processing across replica servers increases read performance as well.

Importantly, Redis enables developers to leverage data with minimal code by utilizing native data structures and the more than 100 open source clients available. Redis supports Java, C, C++, C#, Python, PHP, Ruby, JavaScript, Node.js, Go, R, and more, allowing developers to code in their preferred language.

Redis is also 100 percent open source, guaranteeing total freedom from vendor or technical lock-in, support for open data formats, and the backing of a robust and active community.

Redis Use Cases and Features

Intelligent Caching, Data Expiration, and “Eviction Policies”: In its primary use case as a cache, Redis enables applications to deliver rapid responses to users by keeping frequently accessed data in-memory. Redis developers can mark data structures with a Time To Live (TTL), controlling the number of seconds until that data is removed. Developers can also configure intelligent “eviction policies” to remove data with the shortest TTL before other data. Alternatively, removal can be based on the least recently used (LRU) or least frequently used (LFU) metrics. These intelligent caching patterns enable optimized user experiences and productivity.
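How TTL expiry and LRU eviction interact can be sketched in a few lines of plain Python. The `TtlLruCache` class below is a hypothetical illustration under simplifying assumptions (exact LRU ordering, expiry checked only on access); Redis's own eviction is configured on the server and is not implemented this way.

```python
from collections import OrderedDict


class TtlLruCache:
    """Toy cache combining per-key TTLs with LRU eviction (illustrative only)."""

    def __init__(self, max_items, clock):
        self.max_items = max_items
        self.clock = clock          # callable returning the current time in seconds
        self.items = OrderedDict()  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self.clock() + ttl
        self.items[key] = (value, expires_at)
        self.items.move_to_end(key)  # a fresh write counts as a recent use
        while len(self.items) > self.max_items:
            self.items.popitem(last=False)  # evict the least recently used key

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self.clock() >= expires_at:
            del self.items[key]  # expire lazily when the key is accessed
            return None
        self.items.move_to_end(key)  # a read also counts as a recent use
        return value
```

Passing the clock in as a function keeps the sketch testable: a caller can supply `time.monotonic` in real use or a fake clock to step through expiry deterministically.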

Streams and Stream Processing: Redis 5.0 introduced streams and stream processing, inspired by Apache Kafka. Much like consumer groups on Kafka topics, Redis consumer groups divide a stream of work among several consumers. If a consumer fails to acknowledge that an entry has been processed, the entry stays pending and another consumer can claim it. This Kafka-like behavior in-memory supports responsive experiences with non-blocking user interfaces.
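The deliver, acknowledge, and take-over cycle of a consumer group can be modeled in plain Python. This is a conceptual sketch only: `ToyConsumerGroup` and its method names are hypothetical, and timestamps are passed in explicitly rather than read from a clock.

```python
class ToyConsumerGroup:
    """Toy model of consumer-group delivery with pending entries (illustrative only)."""

    def __init__(self, entries):
        self.entries = list(entries)  # (entry_id, payload) pairs in stream order
        self.next_index = 0           # index of the first undelivered entry
        self.pending = {}             # entry_id -> (consumer, delivered_at)

    def read(self, consumer, now):
        """Deliver the next undelivered entry to `consumer`, or None if drained."""
        if self.next_index >= len(self.entries):
            return None
        entry = self.entries[self.next_index]
        self.next_index += 1
        self.pending[entry[0]] = (consumer, now)
        return entry

    def ack(self, entry_id):
        """Mark an entry as processed; returns True if it was pending."""
        return self.pending.pop(entry_id, None) is not None

    def claim_stale(self, consumer, now, min_idle):
        """Reassign entries whose owner has held them for at least `min_idle` seconds."""
        claimed = []
        for entry_id, (owner, delivered_at) in list(self.pending.items()):
            if owner != consumer and now - delivered_at >= min_idle:
                self.pending[entry_id] = (consumer, now)
                claimed.append(entry_id)
        return claimed
```

In this model an unacknowledged entry is never lost, but it is also not redelivered on its own: a healthy consumer has to ask for stale entries explicitly.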

Publication and Subscription Messaging (Pub/Sub): With pub/sub messaging, messages are passed to all subscribers currently listening. This enables awareness of load across infrastructure and applications, supporting use cases such as notifications and gaming scoreboards.
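The "currently listening" part is the key property: a message published while nobody is subscribed is simply dropped. A minimal in-process sketch, with the hypothetical `ToyPubSub` class standing in for the server:

```python
from collections import defaultdict


class ToyPubSub:
    """Toy fire-and-forget message bus (illustrative only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def unsubscribe(self, channel, callback):
        self.subscribers[channel].remove(callback)

    def publish(self, channel, message):
        """Deliver to the current listeners and return how many received it."""
        listeners = list(self.subscribers.get(channel, []))
        for callback in listeners:
            callback(message)
        return len(listeners)
```

Nothing is queued for later, so a subscriber that joins after a publish never sees that message. Workloads that need replay or delivery guarantees are a better fit for streams.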

Lua Scripting: Redis is able to execute scripts in the Lua language. Developers can create custom scripts to add their own features to Redis.

Geolocation Features: Redis offers geolocation data structures and commands — especially useful for applications like ride-sharing apps that require location data. This data stores sets of latitude and longitude coordinates. Redis enables queries to determine the distance between objects and other useful solutions.
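A distance query of this kind can be approximated with the haversine formula. The sketch below is an assumption-laden illustration, not Redis's implementation: the function names are hypothetical, the Earth is treated as a sphere with a mean radius of 6,371 km, and coordinates are passed longitude first, matching the argument order of the Redis geo commands.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (an assumption of this sketch)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(points, lon, lat, radius_m):
    """Return the names of points within radius_m of (lon, lat), nearest first."""
    hits = []
    for name, (p_lon, p_lat) in points.items():
        distance = haversine_m(lon, lat, p_lon, p_lat)
        if distance <= radius_m:
            hits.append((distance, name))
    return [name for _, name in sorted(hits)]
```

A ride-sharing app could call `within_radius(drivers, rider_lon, rider_lat, 1000)` to list drivers within one kilometer. This linear scan checks every point, which is fine for a sketch but would not scale the way an indexed geo structure does.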

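HyperLogLog: Redis includes the HyperLogLog data structure, which estimates the number of distinct elements in a set. Because it keeps a small fixed-size summary instead of the elements themselves (at most about 12 KB per counter, with a standard error under 1 percent), it uses far less memory than storing the complete set.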
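The idea behind the estimate can be shown with a simplified HyperLogLog. This sketch makes several assumptions that are not from Redis: it hashes with SHA-1, uses 1,024 registers by default, and applies only the basic small-range correction. The `ToyHyperLogLog` name is hypothetical.

```python
import hashlib
import math


class ToyHyperLogLog:
    """Simplified HyperLogLog cardinality estimator (illustrative only)."""

    def __init__(self, precision=10):
        self.p = precision
        self.m = 1 << precision        # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")    # treat the first 8 bytes as a 64-bit hash
        index = h >> (64 - self.p)               # the top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)    # the remaining 64 - p bits
        # rank is the 1-based position of the leftmost 1-bit in `rest`
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)    # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # small-range correction: fall back to linear counting
            return self.m * math.log(self.m / zeros)
        return raw
```

Memory use depends only on the number of registers, never on how many items are added, and adding the same item twice leaves the registers unchanged. The trade-off is that `count()` returns an estimate rather than an exact value.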

Longest Common Subsequence (LCS): Redis 6.0 added an LCS algorithm that compares two strings and extracts their longest common subsequence. Unlike a substring, a subsequence does not need to be contiguous, so comparing “pineapple” and “red apple” yields “eapple” rather than just “apple.”
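Redis's LCS feature computes the longest common subsequence, in which matched characters keep their order but need not be adjacent. The classic dynamic-programming approach to that problem can be sketched as follows. The function name is hypothetical, and this is a textbook version rather than Redis's own code.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # lengths[i][j] holds the LCS length of a[:i] and b[:j]
    lengths = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                lengths[i][j] = lengths[i - 1][j - 1] + 1
            else:
                lengths[i][j] = max(lengths[i - 1][j], lengths[i][j - 1])

    # Walk back from the bottom-right corner to recover the characters.
    out = []
    i, j = len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif lengths[i - 1][j] >= lengths[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

For “pineapple” and “red apple” this returns "eapple": the "e" in “red” matches the "e" before "apple" in “pineapple,” even though a space and a "d" sit between the matched characters in the second string. The table costs time and memory proportional to the product of the two string lengths, so it suits short strings.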

Redis Challenges and Inappropriate Use Cases

Configuring Redis as a Primary Database: It’s certainly possible to implement Redis as an effective primary database, but doing so requires careful and rather challenging configuration work. To function as a database, Redis must be configured for high availability and cannot restart empty. Take particular care in changing these and other default options intended for cache use cases.

Using the KEYS Command: Because the Redis KEYS command walks every key in a database in a single blocking call, it can stall other clients when the dataset is large. Best advice: avoid KEYS in production code and iterate incrementally with the cursor-based SCAN command instead.
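The safer alternative to a single full pass is cursor-based iteration, where each call examines a bounded batch of keys and returns a cursor for the next call. The sketch below models that pattern over a plain Python list. The function names are hypothetical and the cursor here is just a list index, which is an assumption of this sketch and not how Redis encodes its cursors.

```python
import fnmatch


def scan_step(keys, cursor, match="*", count=10):
    """Examine at most `count` keys from `cursor`; return (next_cursor, matches)."""
    window = keys[cursor:cursor + count]
    matches = [k for k in window if fnmatch.fnmatchcase(k, match)]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0  # a cursor of 0 signals that the pass is complete
    return next_cursor, matches


def scan_all(keys, match="*", count=10):
    """Yield every matching key, one bounded batch at a time."""
    cursor = 0
    while True:
        cursor, batch = scan_step(keys, cursor, match, count)
        yield from batch
        if cursor == 0:
            break
```

Each step does a fixed amount of work, so other requests can be served between steps. A batch may contain fewer matches than `count`, or none at all, because the limit applies to keys examined rather than keys returned.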

Using REST Connections: Developers familiar with other databases often complete requests by opening and closing connections to handle each command in a REST style. Redis is designed differently: Use continuous open connections, or performance will suffer.

Caching Static Assets: Redis isn’t appropriate for caching static assets, such as images or videos. Instead, deliver those assets with a web server or content delivery network (CDN).

Storing Crucial Data: Redis isn’t an ideal canonical data store in scenarios where data size exceeds the cluster’s memory capacity. A database with a high replication factor to multiple data centers, such as open source Apache Cassandra, is a better choice for crucial data.

Finally, Redis is ideal for low-latency use cases. Where low latency is not the priority, databases designed to store data primarily on disk (again, like Cassandra) may be better options.

Conclusion

Redis is a particularly impressive and deservedly popular open source data technology that lends itself to many, many use cases. That said, getting the most out of Redis means ensuring that each implementation matches what the data store technology can deliver and that Redis is properly configured to do what it can do best.
