In-Memory Databases: An Overview

In-memory databases are generally faster than databases that store data on disk. Their internal optimization algorithms are simpler and execute fewer CPU instructions than those of a disk-based system, and accessing data held in memory eliminates the seek time incurred when reading from disk. As a consequence, several data warehouse vendors are switching to in-memory technology to speed up data processing. The cloud also presents an opportunity for using in-memory databases.

Traditionally, data has been stored on disk drives, with RAM used for short-term memory while the computer is in use. In-memory database (IMDB) architecture instead relies on an in-memory database management system (IMDBMS) designed to use the computer’s main memory (RAM), rather than a disk drive, as the primary location to store and access data.
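As a small illustration of a database that lives in main memory rather than on disk, the sketch below uses Python’s standard sqlite3 module, which keeps the whole database in RAM when given the special name ":memory:". The reservations table and its rows are hypothetical, chosen only for the example; SQLite is a convenient stand-in here, not a full IMDBMS.

```python
import sqlite3

# ":memory:" asks SQLite to keep the entire database in RAM; no file is created on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, used only for illustration.
conn.execute(
    "CREATE TABLE reservations (id INTEGER PRIMARY KEY, guest TEXT, seats INTEGER)"
)
conn.executemany(
    "INSERT INTO reservations (guest, seats) VALUES (?, ?)",
    [("Ada", 2), ("Grace", 4), ("Linus", 1)],
)
conn.commit()

# Queries are answered entirely from memory.
total_seats = conn.execute("SELECT SUM(seats) FROM reservations").fetchone()[0]
print(total_seats)

# Closing the connection discards the data, which mirrors the volatility of RAM.
conn.close()
```

Once the connection is closed, the data is gone, which is the durability concern that in-memory systems have to address.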

Though in-memory database systems do have broad uses, they are used primarily for real-time applications requiring high-performance technology. Use cases include applications that need real-time responses in the finance, defense, telecom, and intelligence industries. Applications requiring real-time data access, such as streaming apps, call center apps, reservations apps, and travel apps, also work well with an IMDBMS.

The two primary reasons in-memory databases have not historically been popular are cost and a lack of ACID (atomicity, consistency, isolation, and durability) compliance. The lack of “durability” refers to an IMDB’s loss of its data should the electricity be cut. Also, RAM has historically been fairly expensive, and this has stunted the growth and evolution of in-memory databases. Recently, the cost of RAM has begun to drop, making IMDBs more affordable.

Memory vs. Storage

Storage holds data that is not currently being used but has been recorded on a hard disk, where it can be saved indefinitely and recalled as needed. Data stored on a disk is permanent unless erased. Hard drive storage is generally used for long-term storage purposes. Traditionally, hard drives were designed to save much larger amounts of data than RAM. That situation is changing.

RAM is a physical component, not a software program. It uses computer chips (integrated circuits) that are soldered to the main logic board, or, as is true of many personal computers, uses a plug-in system for the easy upgrading of memory modules (also known as DRAM modules). Using an IMDB instead of a disk drive system provides the following benefits:

RAM can be increased to improve performance with relative ease.

Additional RAM allows a computer to do more at once (but does not actually make it faster).

Additional RAM improves the switching between different applications and allows multiple applications to be open without causing the system to become sluggish.

It uses less power than disk drives.

There are two basic types of RAM: DRAM (Dynamic Random Access Memory) and SRAM (Static Random Access Memory). RAM has been used as a form of short-term memory for computer use. The word used to describe RAM’s loss of memory when the electricity is cut off is “volatile.”

DRAM: The term “dynamic” indicates that the memory cells must constantly be refreshed. DRAM is generally used as the main memory in computers. DRAM must be refreshed thousands of times each second.

SRAM: Typically used as a system cache (a smaller, faster memory that is closer to a processor core). It stores copies of regularly used data from main memory and is described as “static” because it does not need to be refreshed. However, SRAM is also volatile and loses its contents when the power is cut.

Scaling

Currently, in-memory data grids (IMDGs) provide a simple, cost-effective way to provide scalability. An IMDG allows scaling simply by adding more RAM. Adding memory to an existing machine is described as “vertical scaling” and increases the capacity of a system, allowing it to handle more transactions. This is the simplest, fastest way of increasing capacity without significantly changing the system architecture. Also, databases that can scale out across additional machines, while still offering a single view of the data, can make working with containers significantly easier.

NVRAM

RAM comes with a significant and obvious problem. It loses data during a power outage (or if it becomes unplugged), causing great frustration for its human users. Non-volatile random-access memory (NVRAM) describes a computer memory capable of holding data even after power to the memory has been cut.

At present, the most popular form of NVRAM is flash memory, a non-volatile memory chip that can be electronically erased and reprogrammed. It is used for storing data and transferring it from one digital device to another, and it can be found in digital cameras, MP3 players, USB flash drives, and solid-state drives.

A significant advance in NVRAM technology was the floating-gate transistor, which made erasable programmable read-only memory (EPROM) possible. In a floating-gate transistor, a gate surrounded by high-quality insulation traps electrical charge, and the trapped charge acts as a switch; EPROM chips are built from a grid of these transistors. The EPROM could be erased and reset by applying ultraviolet light. This technology was later replaced by EEPROM, which uses electricity to reset the memory. New concepts for NVRAM include:

Ferroelectric RAM (F-RAM): A random-access memory very similar to DRAM, but built with a thin ferroelectric film whose atoms change polarity, resulting in a switch. Memory is retained when the power is cut off.

Magnetoresistive RAM (MRAM): Uses magnetic elements and operates similarly to magnetic-core memory.

Phase change RAM (PRAM): Uses the same phase-change approach as rewritable CDs, but readings are based on changes in electrical resistance instead of optical properties.

Millipede memory: Basically a punched card created by nanotechnology.

Nano RAM: Based on carbon nanotube technology.

In-Memory Database Management System (IMDBMS)

A thorough understanding of an organization’s needs and priorities is crucial in determining the best choice of database architecture. IMDBMSs (sometimes called “main memory database systems”) use a variety of approaches and techniques to provide in-memory database processing.

Modern IMDBMSs not only store data in memory, but also perform operations within memory. All the data might be stored in memory, but it may be held in a compressed format, optimizing access and data storage. The DBMS can also be designed to offer hybrid capabilities, combining disk drive and in-memory technologies to maximize performance and minimize costs.
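To make the idea of holding data in a compressed format concrete, here is a minimal sketch using Python’s standard zlib module. The column of repeated status values is a hypothetical example, and real IMDBMSs typically use their own specialized encodings rather than general-purpose compression.

```python
import zlib

# Hypothetical column of highly repetitive values, as often found in database columns.
column = ",".join(["confirmed"] * 1000).encode("utf-8")

# Keep only the compressed bytes in memory.
compressed = zlib.compress(column)

# Decompress on access to recover the original values.
restored = zlib.decompress(compressed)

print(len(column), len(compressed))
```

The trade-off is extra CPU work on each access in exchange for fitting more data into the same amount of RAM.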

To assure the durability of data within an IMDBMS, the data must periodically be transferred from volatile memory to a more persistent, long-term form of storage. One method is “transaction logging,” which records changes to non-volatile storage as they are made; it is often combined with timed snapshots of the in-memory data, which are also sent to non-volatile storage. Should the system fail (and be rebooted), the database can be restored, with most of the current data still available.
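The recovery approach described above can be sketched roughly as follows. This is a simplified illustration, not how any particular IMDBMS implements it: the DurableKVStore class, its file names, and the JSON format are all assumptions made for the example.

```python
import json
import os
import tempfile


class DurableKVStore:
    """Hypothetical in-memory key-value store backed by a snapshot and an append-only log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "transactions.log")
        self.data = {}  # the working copy lives in RAM
        self._recover()

    def _recover(self):
        # Load the most recent snapshot, then replay writes logged after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    entry = json.loads(line)
                    self.data[entry["key"]] = entry["value"]

    def put(self, key, value):
        # Record the change on non-volatile storage before updating memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value

    def snapshot(self):
        # Write the full in-memory state to a temporary file, swap it in, then clear the log.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)
        open(self.log_path, "w").close()


workdir = tempfile.mkdtemp()
store = DurableKVStore(workdir)
store.put("flight", "LH400")
store.snapshot()
store.put("seat", "12A")  # written after the snapshot, so it exists only in the log

# Simulate a restart: a new instance rebuilds its memory from snapshot plus log.
recovered = DurableKVStore(workdir)
print(recovered.data)
```

Snapshots keep recovery fast because only the changes logged since the last snapshot need to be replayed.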

The Cloud and IMDB

The cloud provides an excellent environment for getting the most out of in-memory computing. A cloud environment offers organizations the ability to access large amounts of RAM at will. This approach can help organizations avoid the expense of an on-premises in-memory computer.

The cloud can also provide an environment that makes in-memory storage more reliable through the use of redundant hosts and virtual machines with automatic failover. With these measures, the loss of one host’s RAM will not lead to data loss. These protective measures are more difficult to develop in an on-premises computer system. Combining the cloud and in-memory computing provides an excellent way to maximize the benefits of an in-memory system.
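The redundancy idea can be illustrated with a toy sketch in which every write is copied to each live host, so reads can fail over when one host loses its memory. The Node and ReplicatedStore classes and the host names are hypothetical; real cloud services handle failure detection, consistency, and networking in far more sophisticated ways.

```python
class Node:
    """Hypothetical host that keeps its share of the data only in RAM."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    """Toy store that copies every write to all live nodes."""

    def __init__(self, nodes):
        self.nodes = nodes

    def put(self, key, value):
        live = [node for node in self.nodes if node.alive]
        if not live:
            raise RuntimeError("no live nodes available")
        for node in live:
            node.data[key] = value

    def get(self, key):
        # Fail over automatically: read from the first node that is still alive.
        for node in self.nodes:
            if node.alive:
                return node.data[key]
        raise RuntimeError("no live nodes available")


primary, replica = Node("host-a"), Node("host-b")
store = ReplicatedStore([primary, replica])
store.put("session:42", "active")

# Simulate the primary losing power: it goes offline and its RAM contents are gone.
primary.alive = False
primary.data.clear()

value_after_failover = store.get("session:42")
print(value_after_failover)
```

Because the replica still holds a copy in its own memory, the value survives the loss of the primary.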
