30 August 2017

My intro to Docker - Part 4 of 5

In the previous posts about Docker, here, here and here, we’ve looked at what Docker is, how to set up a basic container and how to set up a stack of containers using docker-compose. One thing we haven’t talked about is the fact that most projects use some form of persistent data store, and the most common store, at least in my world, is a relational database of some sort. So this time, I want to cover something that might seem slightly odd…setting up an MS SQL Server…on Linux…in Docker.

Yes, you heard me right… I’m going to show you how to set up an MS SQL Server instance in a Linux-based Docker container. Something that wouldn’t have been possible, in any way, not too long ago, but Microsoft “recently” released a version of MS SQL Server that runs on Linux, which is really cool. And running it in Docker just makes sense!

Running MS SQL Server in a Docker container

Starting a SQL Server instance in a Docker isn’t that hard, but there are a couple of things that need to be set up for it to work.

of all, SQL Server can only run on a machine that has at least 3,25 Gb of memory. And when running Docker for Windows (or Mac), that isn’t the case by default. So the first step is to update the Docker VM host to have more memory. Luckily, this is a piece of cake to do. Just go down to the systray and right-click the little Docker icon (the whale), choose Settings and then Advanced settings. In here, you can just pull the slider to update the available memory to be at least 3,25 Gb, and then click Apply. This will cause the VM that runs in Hyper-V to restart, and when it comes back online, it will have more memory for us to play with.

Note: Don’t go in and update the VM in the Hyper-V manager…

Next, SQL Server requires a couple of environment variables to be set up, for it to start up without any user interaction. First of all, it requires a variable called ACCEPT_EULA, that should be set to Y, to confirm that we accept the EULA. And then we need a admin password to be set up using the SA_PASSWORD variable. So to run a new SQL Server instance, you can just run

docker run --name sql_server -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=MyVeryS3cr3tPassw0rd!' -p 1433:1433 –d microsoft/mssql-server-linux

This will start a new SQL Server instance inside a detached container, mapping port 1433 on the host, to 1433 on the container. This means that once the container is up and running, and SQL Server has started (which takes a few seconds) you can just open a SQL Server Management Studio instance and connect to localhost. Note: If you already have SQL Server installed on your machine, port 1433 is likely already in use, so you will have to map some other port on the host, and then use that new port when connecting SSMS, using the format <hostname>, <port number>.

And if you don’t want to use SSMS, or you are on a Mac, you can connect to the container interactively, and run queries, using this command

docker exec -it sql_server /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P MyVeryS3cr3tPassw0rd!

This will attach your command line to the SQL Server instance, allowing you to run commands like this

select * from sys.Tables
go

And when you are done, you can just run

exit

to detach.

However, it kind of sucks to have to attach like this, or using SSMS, and set up a database every time you start a container based on the SQL Server image. It would be much nicer to have the database set up for us when the container starts instead. And the easiest way to do this, is to create our own SQL Server Docker image, that contains the start up functionality we need.

Seeding the database

You’ll need to add a couple of files for this to work so I suggest creating a new directory to hold them. Inside that directory, we need a new dockerfile, in which we need to add the following

FROM microsoft/mssql-server-linux

ENV ACCEPT_EULA=Y
ENV MSSQL_SA_PASSWORD=MyPassw0rd!

RUN mkdir -p /app
WORKDIR /app COPY . /app
RUN chmod +x /app/import-data.sh

EXPOSE 1433 
ENTRYPOINT ["/bin/bash", "./entrypoint.sh"]

So what’s going on here? Well, first of all, it defines that the image should use the microsoft/mssql-server-linux image as the base image. Then it sets the required environment variables using the ENV keyword. And since I need a set up script and some other files, I go ahead and create a new directory called /app, making it the current working directory using the WORKDIR keyword. The files from the current directory is then added to that directory using the COPY keyword. However, since I’m going to use a shell script, I need to run chmod to set the permissions to allow it to be executed. Finally it exposes port 1433, which is used by SQL, before setting up an entrypoint telling the image to execute the entrypoint.sh script using bash.

That configures the image, but what’s inside the entrypoint.sh file? Well, not much…

/app/import-data.sh & /opt/mssql/bin/sqlservr

It just calls another shell script called import-data.sh, and then starts SQL Server by calling sqlservr. And yes, it needs to be in this order, even if it makes little sense trying to import data before SQL Server has started. But, the last thing being executed has to keep running to keep the container running, and the shell script will only run some commands and then terminate, while sqlservr will block the execution and keep running.

The import-data.sh file isn’t much more complicated

echo "Sleeping"
sleep 20s

echo "Setting up table"
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P MyPassw0rd! -d master -i setup.sql

echo "Importing data in Products table"
/opt/mssql-tools/bin/bcp DemoData.dbo.Products in "/app/Products.csv" -c -t',' -S localhost -U sa -P MyPassw0rd!

Note: Currently, there is no way to add start up scripts to SQL Server for Linux, but hopefully there will be in the future. That way we won’t have to do a stupid sleep to get things to work…

After the 20 seconds, it uses sqlcmd to attach to the local SQL Server instance, using the sa account, and then runs a script called setup.sql, which we’ll have a look at in a minute… And finally it uses bcp to import the contents of a file called Products.csv into a table called Products in a database called DemoData, which is created in the setup.sql script that looks like this

CREATE DATABASE DemoData;
GO

USE DemoData;
GO

CREATE TABLE Products (ID int, ProductName nvarchar(max));
GO

and just to be thorough, and show it all, the Products.csv file looks like this

1,Skateboard
2,Kite
3,Parachute
4,Computer

With the dockerfile in place, we can create a new image by running

docker build --tag sql_server .

naming the image sql_server. Finally, we and start up a new container based on that image, by executing

docker run –it –p 1433:1433 --rm sql_server

This should end up with a running SQL Server instance, with a database called DemoData, containing a table called Products with some data. And as long as you don’t stop the container, you should be able to connect to it as usual using SSMS (or whatever tool you normally use) and look at the database.

However…this set up kind of sucks! It will set up a new database, with the same hardcoded data, every time you spin up a new container based on this image. Not playing very well together with the idea that containers can be ephemeral, and be taken down and moved around at any given time… Unless, all you need is a database with a fixed dataset. But in most cases, you want to be able to use your database to store stuff, but adding, updating and removing things. So how do we go about solving this?

Adding persistence

To be able to persist data between when containers are moved around, updated etc, we need a storage location that isn’t tied to the container. Anything inside a container will be blown away when the container is deleted, which is what will happen if the container is moved for example. So in general, we don’t want to persist permanent things inside the container. Instead we want to persist permanent things external to the container.

There are 2 different ways to add external storage to a container. The first way is to bind something called a volume to the container, which basically means that we take a directory on the host, or somewhere persistent, and map it to a path inside the container.

Note: If you come from a Windows background, which I guess you probably do, you need to remember that Linux doesn’t use drive letters etc to access files. Instead they use mount points, which is basically just a path where some form of storage is located. So you could see a bound volume as a mapped network drive in Windows, but the mapped drive just gets a path…

Note 2: This doesn’t work for this special scenario for Docker for Mac. Binding volumes work, but apparently SQL has some issues with it. More information can be found here. So if you are on a Mac, you need to look at the “Alternate persistence” section below.

Note 3: Even if you are on a Windows machine, it might be a good idea to have a look at the “Alternate persistence” section below as well…

There are different drivers out there that allow you to use different storage mechanisms for your bound volumes, but for this demo, I’ll use a directory on the host as a volume.

To add a volume using the docker run command, you just need to add a –v parameter with the value set to <host directory>:<container path>. So running

docker run –v c:\DockerVolume\:/dockervolume myimage

Will start a new container using the myimage image, with the path /dockervolume/ inside the container, mapped to the c:\DockerVolume\ directory on the host. So anything inside the c:\DockerVolume\ directory will be available inside the container, and anything written to /dockervolume/ in the container, will be persisted to the c:\DockerVolume\ directory. And since that is a directory on the host, which hopefully won’t disappear, the data will be persisted between container starts.

Using this, we can go ahead and make the SQL storage a bit better. So, iInstead of creating a new database in every container, we can attach an existing database on start instead.

Step one, is to create the new database in some way. In my case, I just used SSMS and my local SQL Server instance to create a new database that replicates the one used in the previous example. Then I detached that database from my local SQL Server instance. This gave me a DemoData.mdf and DemoData_log.ldf file located somewhere similar to C:\Program Files\Microsoft SQL Server\MSSQL13.SQLEXPRESS\MSSQL\DATA (depends on what instance name you are using and so on…but you get the point). Then copied those 2 files, to a new directory called c:\MyDockerVolume\.

With the mdf and ldf files in place, we need to update the set up script to attach them to SQL Server, instead of creating a new database. To start out, we can update the setup.sh file to this

echo "Sleeping"
sleep 20s 

echo "Attaching Database"
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P MyPassw0rd! -d master -i setup.sql
echo "Done Attaching Database"

As you can see, I have removed the CSV-file import, and then we need to update the setup.sql to this

USE [master]
GO

CREATE DATABASE [DemoData] 
    ON ( FILENAME = N'/sql_storage/DemoData.mdf' ), ( FILENAME = N'/sql_storage/DemoData_log.ldf' )
    FOR ATTACH
GO

It uses CREATE DATABASE to create a new database called DemoData, attaching the DemoData.mdf and DemoData_log.ldf files from a directory called /sql_storage/. This means that when the container starts up, it will start SQL Server, and then run the setup.sql script, setting up a new database using the two SQL files. All we need to remember is to make sure that we add a volume that contains the files. So to try this out, we first need to create a new image since the scripts have changed

docker build --tag sql_mounted .

and then we can create a new container using that new image

docker run -it -p 1433:1433 -v c:\MyDockerVolume:/sql_storage --rm sql_mounted

Note: Make sure you map the correct folder. The above command maps c:\DockerVolume\ which is just what I use…

Tip: If you get problems opening your files in the container, make sure that all the file permissions are set up properly on the host…

As long as you container is up and running, you should be able to connect through it using SSMS or whatever tool you use to work with SQL Server. And if you add or remove any data, it will be persisted even if you remove the container and start it back up again. So you should be good to go.

Alternate persistence

The second way to persist files between container instances, is by using something called a data volume container. This is basically a mapped storage location called a volume, managed by Docker. And the way you set that up, is by first creating a new volume by calling

docker volume create sql_data

This will create a new empty volume called sql_data.

Then you need to add the files you want in the volume, by copying them from somewhere. In this case, the host. However, to do that, you need to create a container and attach the volume to that, and then copy the files to the volume through that container.

So, to do that, you can run

docker run -v sql_data:/sql_storage --name temp alpine

to create a new container called temp, with the sql_data volume attached to /sql_storage/.

Next, the files are copied using docker, by calling

docker cp ./Data/DemoData.mdf temp:/sql_storagedocker cp ./Data/DemoData_log.ldf temp:/sql_storage

which will copy the DemoData.mdf and DemoData_log.ldf files to the containers /sql_data/ directory, which then maps to the sql_data volume. And when that is done, you can go ahead and delete the temporary container

docker rm temp

Now, the volume should contain all of the required files, so running

docker run -it -p 1433:1433 –v sql_data:/sql_storage --rm sql_mounted

will start up a new SQL container, but this time with the /sql_storage/ path mapped to the volume we just created.

The volume is then persisted separately from the container. So when the container is removed, the volume is persisted, read for re-use.

Note: For a real scenario, you would just skip the -it and add -d to have the container running in the background instead.

To view your volumes on the host, just run

docker volume ls

and if you want to remove a volume, you just run

docker volume rm <volume name>

That’s pretty much it! Hopefully you now have an idea of how to store your data permanently, as well as how you can get SQL Server up and running inside a Docker container running Linux. Who would have thought A, that I would ever write about Linux and command lines, and B, about running Microsoft SQL Server on Linux!? It’s a brave new world!

The next post covers Docker Swarm and how we can deploy to a cluster of machines. Very cool stuff!

My intro to Docker - Part 4 of 5

Running MS SQL Server in a Docker container

Seeding the database

Adding persistence

Alternate persistence

My intro to Docker - Part 5 of 5

My intro to Docker - Part 3 of 5