The Data Logjam Facing DevOps and Digital Transformation

Click to learn more about author Paul Stanton.

Cloud Computing and DevOps are dominant themes in computing today. AWS, Azure, Google, and others compete to provide greater agility while reducing the enterprise cost of building, running, and protecting information services. Progress is being made in DevOps strategies, with new leadership and organizational culture combined with tools and technologies. Surveys show growing investment in DevOps and other transformational strategies.

Progress for many, however, is blocked by inadequate access to relational data. The Gartner Group underscores the role of relational data in the “Operational Database Magic Quadrant 2017” (available courtesy of Microsoft).

Gartner states:

“Through 2020, relational technology will continue to be used for at least 70% of new applications and projects.”

The problem is that tooling for creating, managing, and delivering database environments has progressed little in the past decade.   Industry surveys highlight the gap between modern Continuous Integration processes and support for relational data environments:

  • Puppet reports in the State of DevOps 2017 that “high performers measure feature branch life and integration in hours” (page 41).
  • Puppet reports in the same survey that “high performers include systems of record (i.e., systems with relational database back-ends)” (page 45).
  • Dell reports, in research conducted by Unisphere in State and Adoption of DevOps, that “over 80% of enterprises refresh database environments for Dev/Test 2x monthly or less” (page 19).
  • RightScale reports in the State of the Cloud 2017, that Docker containers have risen to the #1 enterprise tool chain for DevOps and software delivery strategies (page 25).

The contrast between Docker containers, which provision Java and .NET workloads in seconds, and database environments, which take days or weeks to deliver, highlights the chasm facing many DevOps initiatives. Without dramatic changes in the delivery of relational database environments, many DevOps and Digital Transformation initiatives will simply fail.

Let’s take a look at the most promising strategies for alleviating the data logjam.

Flash-Based Storage

Clouds automate delivery of Virtual Machines on demand, and Docker containers can be provisioned with Java and .NET workloads in seconds.   Database environments average hundreds of GBs or more, and high performance flash based SANs are commonly used to improve data delivery.

An all-flash SAN with a fast network can deliver 2 GB/second, enabling delivery of a VM with 500 GB of data in 10 minutes or less. This is a huge step forward for most organizations, but the buy-in cost for flash storage begins at $100,000 or more. Use is also complicated by the lack of data masking, which becomes another step in the workflow.
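The arithmetic behind that estimate is easy to check. The following is a minimal sketch that assumes the 2 GB/second rate is sustained for the whole copy and ignores VM provisioning overhead; the function name is illustrative and not part of any product.

```python
def transfer_minutes(data_gb: float, throughput_gb_per_s: float) -> float:
    """Minutes needed to copy data_gb at a sustained throughput."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_gb / throughput_gb_per_s / 60


# Figures from the text: 500 GB of data over a 2 GB/second path.
print(f"{transfer_minutes(500, 2):.1f} minutes")  # about 4.2 minutes
```

Under these assumptions the raw copy takes a little over four minutes, which leaves headroom inside the 10-minute figure for provisioning the VM itself.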

Database Clones and Snapshots

Storage Area Networks (SANs) have supported fast writable snapshots for the past two decades. Snapshots are writable, are provisioned in seconds, consume minimal storage, and are excellent for Dev/Test use. But these capabilities are largely unused because of the complex scripting required to provision snapshots, storage LUNs, and mount points.
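To see why a writable snapshot can appear in seconds and consume so little storage, consider one common approach: each clone shares the unchanged blocks of a read-only parent image and stores only the blocks it modifies. The sketch below illustrates that general idea with hypothetical class names; it is not any vendor's implementation.

```python
class BaseImage:
    """Read-only parent image: block number -> contents."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, n):
        return self._blocks[n]

    def __len__(self):
        return len(self._blocks)


class WritableClone:
    """Writable snapshot that stores only blocks changed since creation."""

    def __init__(self, base):
        self._base = base
        self._delta = {}  # modified blocks only

    def read(self, n):
        # Prefer the clone's own copy; fall back to the shared parent.
        return self._delta[n] if n in self._delta else self._base.read(n)

    def write(self, n, data):
        # Writes never touch the parent, so other clones are unaffected.
        self._delta[n] = data

    def blocks_stored(self):
        return len(self._delta)


base = BaseImage({n: f"block-{n}" for n in range(1000)})
dev = WritableClone(base)   # creating a clone copies no data
test = WritableClone(base)
dev.write(7, "changed by dev")

print(dev.read(7))   # changed by dev
print(test.read(7))  # block-7
print(dev.blocks_stored(), "of", len(base), "blocks stored by the dev clone")
```

Creating a clone copies nothing, and each clone's footprint grows only with the blocks it changes, which is what makes this style of snapshot attractive for Dev/Test.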

Fortunately, a new generation of storage vendors, including Cohesity, Rubrik, and others, are delivering storage systems with data access through RESTful APIs. These systems also provide incremental snapshots, with the goal of eliminating full backups as much as possible. Customer feedback is positive, and these systems are a big step forward for secondary storage access. Cost of ownership continues to be a challenge for organizations with budget constraints, and data masking remains an extra step for sensitive data.

The primary drawback to storage centered strategies for many is their reliance on a UNIX operating system.   Most storage systems run on Solaris UNIX, with the ZFS file system for fast snapshot support.  Dedicated UNIX storage administrators come as part of the package.

Windows-Based Database Cloning and Containers

In the past year Windocks has pioneered software-based database cloning. Windocks runs on standard Windows Servers, allowing SQL Server DBAs to create and manage complex database images. Windows database clones use the same designs as Storage Area Networks, and deliver writable database environments in seconds, with minimal storage. The combination has allowed Windocks to grow rapidly, and to provide data delivery support for a fraction of the cost of storage systems.

In addition to database cloning, Windocks is also an independent port of Docker’s source to Windows, supporting .NET, and all editions of SQL Server 2008 onward with containers.   The combination of database clones with containers is compelling, as terabyte class environments are delivered in seconds, with multi-tier application environments.   Teams work on a shared server with isolated containers, and simplify operations with an average reduction in VMs used of 5:1 or more.

Interestingly, Windocks also simplifies the use of SAN-hosted snapshots, with a SAN-ready container that automates the provisioning and use of SAN-based snapshots. This approach allows customers to extend the useful life of their NetApp, EqualLogic, and other SANs, and has been a boon for organizations with a mix of varied storage systems due to growth through mergers and acquisitions.

Data Delivery with Data Governance and Regulatory Compliance?

In the midst of operational challenges in provisioning relational data, organizations are also contending with Data Governance and Regulatory Compliance. Windocks delivers fast access to relational data with data images that are versioned and reside in an auditable image repository.

Windocks images support a single database, or can include fifty or more databases in a single image.   Data masking is implemented during the image build, delivering a usable environment that complies with privacy and other policies.  On completion, data images are versioned, and stored in an auditable image repository that uniquely addresses Regulatory Compliance concerns.   This is a unique design that offers significant value to organizations that are struggling with audit support for data usage.
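The general pattern described here, masking during the build and then recording a versioned, auditable entry, can be sketched in a few lines. This is a hypothetical illustration only: the function names, the masking rule, and the in-memory list standing in for a repository are all assumptions, and none of it reflects Windocks' actual interface or a vetted masking scheme.

```python
import hashlib
import json


def mask_email(value: str) -> str:
    """Swap an email for a deterministic placeholder (illustration only)."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    return f"user_{digest}@example.invalid"


def build_image(rows, version, repository):
    """Mask sensitive fields, then record the image in the repository."""
    # Build new rows so the source data is left untouched.
    masked = [{**row, "email": mask_email(row["email"])} for row in rows]
    checksum = hashlib.sha256(
        json.dumps(masked, sort_keys=True).encode("utf-8")
    ).hexdigest()
    entry = {"version": version, "rows": len(masked), "sha256": checksum}
    repository.append(entry)  # append-only list stands in for the repository
    return masked, entry


repository = []
source = [{"id": 1, "email": "alice@example.com"}]
image, entry = build_image(source, "v1", repository)

print(image[0]["email"])
print(entry["version"], entry["rows"])
```

Because masking happens inside the build step, every environment provisioned from the image starts from already-masked data, and the checksum in each entry gives an auditor a way to confirm which image version was delivered.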

A Wake Up Call for CIOs and Cloud Architects

We are in the early days of DevOps and Digital Transformation, and it’s time for CIOs and Cloud Architects to address the relational data logjam. The “go to” solution for many organizations will mirror current investments and organizational structure. For many, the solution will “double down” on storage systems and their ongoing costs of administration. For organizations that want a new cloud-native software solution on Windows that enhances Data Governance and Compliance, Windocks presents a new approach.
