Docker Containers and Database Cloning for DBAs, Data Governance, and IT Decision Makers: An Introduction

By Paul Stanton  /  February 5, 2018

The industry is buzzing about Docker containers and their rapid adoption. It’s not widely known, but MySQL, Postgres, and NoSQL databases are among Docker’s most popular images. Microsoft has joined the fray with Docker support as a feature of Windows Server 2016 and Windows 10, and with SQL Server 2017 Linux containers.

This is the first in a series of articles on Docker and its use for Development and Test, Reporting, and Data Governance. The series should interest Developers, DBAs, Data Governance teams, Architects, and IT decision makers.

  • Part 1: Introduction to Docker Containers
  • Part 2: Advanced use of Docker Containers and Database Cloning
  • Part 3: Case Studies on Docker Container Use
  • Part 4: Docker Containers and Data Governance

Dockerfiles, Containers, and Images

Containers provide process and user isolation and originated on BSD UNIX around 2000.   The concept surged in popularity with the Docker open source project in 2013 and has become one of the most popular open source projects ever.   Docker now enjoys industry-wide support from Red Hat, Google, AWS, Microsoft, and hundreds of other firms.

Docker is an elegant design for application run times, as well as packaging, distribution, and management. The workflow begins with a Dockerfile, a plain text configuration file that defines an image. An image is immutable and portable, providing guaranteed compatibility when shared. That immutability also contributes to Data Governance, which we’ll cover later in this series.

The Docker design includes a supervisory “daemon” (or Windows Service), Dockerfiles, the Docker Hub image registry, locally stored images, and containers.

An image is built by referencing a Dockerfile (step 1), which pulls the details of the source image from a local cache (step 2); on completion the result is saved locally as “new_image” (step 3). A container is then created by running the docker run command on “new_image” (step 4).
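The four steps map onto two Docker CLI invocations. The Python below is only a sketch: it assembles the argument lists and prints them rather than running anything. The tag new_image comes from the text; the build directory (".") and the helper names are assumptions made for illustration.

```python
import shlex


def build_command(tag, context_dir="."):
    """Steps 1-3: build an image from the Dockerfile in context_dir and tag it."""
    return ["docker", "build", "-t", tag, context_dir]


def run_command(image, detached=True):
    """Step 4: create and start a container from the image."""
    cmd = ["docker", "run"]
    if detached:
        cmd.append("-d")  # run the container in the background
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in (build_command("new_image"), run_command("new_image")):
        print(shlex.join(cmd))
```

Passing either list to subprocess.run would execute it on a host where Docker is installed.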

This workflow has become incredibly popular for .NET and Java development, and is growing in popularity for provisioning database environments.

A Dockerfile can add databases or scripts, yielding a modified image that is used for testing. A sample Dockerfile begins with a reference to an image (in this case mssql-2012), then mounts a network-attached database, and finishes by running a data masking script on that database.

Docker is known for speed, as containers are delivered in seconds (sometimes less); for assured compatibility when images are shared; and for increased system utilization. Where private data centers may reach 30% utilization with virtualization, containers can double that figure or more.

Challenges of Relational Databases

It’s common to hear debates over the suitability of Docker for stateful apps. Docker was designed for stateless applications and horizontal scalability, with containers deleted and replaced as needed. Databases are not suited to this approach, but Docker is evolving to support the needs of stateful enterprise apps.

The relatively short lives of database environments in development and test, and in many reporting/BI uses, are a good fit for database containers. There are also expanding options for support of the data environments, including in-container, mounted, and cloned. Containers are being put into production use on Linux, and as they mature on Windows we can expect increased production usage, particularly as production workflow support is further developed.

Each container has a private file system, and databases run in the local file system are “in container.”   This works well for smaller databases (each container with a copy of the data), but changes in the database are lost when the container is deleted.   A more popular approach is to use external mount points either on the local host or over a network.     Changes in data are persisted, and databases can be mounted to a fresh container as needed.
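The difference between the two approaches comes down to one option on docker run. The sketch below builds both command lines without executing them; -v is Docker's standard volume flag, while the image name and the host and container paths in the usage are placeholders assumed for illustration, not values from this article.

```python
def run_in_container(image):
    """Data lives in the container's private file system and is lost on removal."""
    return ["docker", "run", "-d", image]


def run_with_mount(image, host_dir, container_dir):
    """Data lives in host_dir (local or a mounted network share) and outlives the container."""
    return ["docker", "run", "-d", "-v", f"{host_dir}:{container_dir}", image]


if __name__ == "__main__":
    # Hypothetical image name and paths, for illustration only.
    print(" ".join(run_in_container("example_db_image")))
    print(" ".join(run_with_mount("example_db_image", "/srv/dbdata", "/data")))
```

Because the mounted directory belongs to the host, a fresh container started with the same -v argument sees the data left by the previous one.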

A third option combines mount points with Virtual Hard Drives (VHDs) that are exposed as clones to the container. A full byte copy image is built, using either snapshots or backups, and in turn is used to deliver clones. Clones provide full read/write support, with changes captured using a Copy on Write design, and are delivered in seconds irrespective of size. A terabyte-sized database is delivered in roughly 30 seconds and requires less than 40 MB of storage. The benefits of database cloning include greater speed, as well as storage and network bandwidth savings.
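To make the Copy on Write idea concrete, here is a toy model in Python. It is a sketch of the general technique, not of how VHD-based cloning is implemented: the class names, the block granularity, and the sample data are all assumptions.

```python
class BaseImage:
    """A read-only, full copy of the database, stored as a sequence of blocks."""

    def __init__(self, blocks):
        self._blocks = tuple(blocks)  # tuple: never modified after creation

    def read(self, index):
        return self._blocks[index]

    def __len__(self):
        return len(self._blocks)


class Clone:
    """A writable view of a BaseImage that stores only the blocks it changes."""

    def __init__(self, base):
        self._base = base
        self._changed = {}  # block index -> new contents

    def read(self, index):
        # Prefer the clone's private copy; fall back to the shared base.
        if index in self._changed:
            return self._changed[index]
        return self._base.read(index)

    def write(self, index, data):
        if not 0 <= index < len(self._base):
            raise IndexError("block index out of range")
        self._changed[index] = data  # the base image is never touched

    def blocks_stored(self):
        return len(self._changed)


if __name__ == "__main__":
    base = BaseImage([b"customers", b"orders", b"invoices"])
    dev, test = Clone(base), Clone(base)
    dev.write(1, b"orders-masked")
    print(dev.read(1), test.read(1), dev.blocks_stored(), test.blocks_stored())
```

Creating a clone copies nothing, which is why delivery time does not depend on database size, and each clone consumes storage only for the blocks it has changed. On the figures quoted above, 40 MB against a 1 TB image is roughly 0.004% of the full size.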

Working with SQL Server Containers

Docker is used via a command line interface, with commands, options, and parameters.  In the example below a container is built using a dockerfile located in the \windocks\samples directory.   The context of the dockerfile is displayed, and the details of the resulting container are returned on completion.

Each container has a unique container ID and a port at which it is accessed, and in the case of SQL Server containers can include an optional sa password. The final step is to start the container by referencing the leading characters of its container ID (the first 2-3 are generally sufficient for a unique match).
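Matching a container by a short piece of its ID works as long as only one container fits. A minimal sketch of that lookup, assuming the match is made on the leading characters of the ID; the function name and the sample IDs are hypothetical:

```python
def resolve_container(prefix, container_ids):
    """Return the single container ID that starts with prefix."""
    matches = [cid for cid in container_ids if cid.startswith(prefix)]
    if not matches:
        raise LookupError(f"no container ID starts with {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"prefix {prefix!r} is ambiguous: {len(matches)} matches")
    return matches[0]


if __name__ == "__main__":
    # Hypothetical container IDs, for illustration only.
    ids = ["a1b2c3d4e5f6", "a1f09e77c210", "7c3d9012ab44"]
    print(resolve_container("7c", ids))   # two characters are enough here
    print(resolve_container("a1b", ids))  # "a1" alone would match two containers
```

With only a handful of containers on a host, two or three characters are usually enough to avoid a collision.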

Once started, the container is accessed via SQL Server Management Studio, as a named SQL Server instance.   In the case of local access on the host, reference the local loopback address and port (127.0.0.1,10001, using a comma separator).  For access from remote clients, reference the container host IP address.
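The same host,port address works outside Management Studio. The sketch below formats the address and an ODBC-style connection string; the loopback address and port 10001 come from the text, while the driver name, the remote IP, and the password are placeholders assumed for illustration.

```python
def server_address(host, port):
    """SQL Server separates host and port with a comma, not a colon."""
    return f"{host},{port}"


def odbc_connection_string(host, port, password,
                           driver="ODBC Driver 17 for SQL Server"):
    """Build a connection string for the sa login (driver name is an assumption)."""
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={server_address(host, port)};"
        "UID=sa;"
        f"PWD={password}"
    )


if __name__ == "__main__":
    # Local access on the container host.
    print(server_address("127.0.0.1", 10001))
    # Remote access: substitute the container host's IP (placeholder shown).
    print(odbc_connection_string("192.0.2.10", 10001, "example-password"))
```

The only change between local and remote access is the host portion; the port is the one reported when the container was created.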

Results and Resources

Containers are a dramatic step forward for support of development and test, and the ability to run SQL Server in containers should be a boon for Windows-based software development. SQL Server environments can be provided daily, with the latest production data, using simple automated processes.

To explore Microsoft’s Docker Windows containers on Windows Server 2016 or Pro and Enterprise editions of Windows 10, visit Microsoft at:  https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-docker/configure-docker-daemon

To explore use of Docker on Windows 8, 8.1, or 10, or on Windows Server 2012 or 2016, download the free Windocks Community Edition at:  https://www.windocks.com/community-docker-windows

 


 

About the author

Paul Stanton is a former Director at Microsoft, with over 30 years of experience in the technology industry, and focused on solutions for enterprise data management and delivery. Windocks delivers virtualized database environments for organizations around the globe, enabling organizations to access, manage, and protect data faster, more efficiently, and simply than existing systems. Database cloning combined with Docker based containers enables modern software development and delivery, and reporting. Windocks is the first open, modern platform for enterprise data delivery.

  • GS

    There is a `MountDB` command in the Dockerfile shown in the example. What is that command supposed to do?
