Docker, Amazon EC2 and continuous builds using Jenkins

Published on February 10, 2016 by Jan Haderka



Magnolia is a fairly big project. This time, I wanted to check into this blog with something fresh, so I decided to take a stab at our build setup to see if I could squeeze a few extra minutes out of the build time, and to see how much effort that takes.

 

It is also a modular project, featuring modules for all kinds of functionality, so that:

  • customers can assemble a solution that suits their specific needs.
  • customers can replace any part that they want to behave differently with some certainty that they are not breaking the complete system.
  • they can estimate how much work such customization would be, by looking at the size of the module.
  • the development team and Magnolia (the company) can release in smaller chunks.
  • testing and maintenance is simplified.

One tool that is super important in the modular world (while always important, it is critical in this case) is continuous integration. It allows you to build all those different modules together, create the final assembly and test any change on it, to be sure you are not breaking the product with a new fix or feature. Plus, with as many modules as we have, and with all their dependencies, building them all in the right order takes a while. So every minute counts here.

 

One thing everyone will recommend when you mention Jenkins and Amazon is to look at the AWS plugin. I surely did, but decided against using it in the end. The plugin itself is good and might fit your needs, but it spawns a new instance from your image for each build and shuts it down afterwards. That saves the cost of running EC2 instances when you don’t build that often, at the expense of starting up the instance and waiting for it at the beginning of every build. In our case, where the build queue is quite full on a regular day, that doesn’t help, and a permanently available instance is more useful.

 

Another challenge that our builds impose on the infrastructure is the various other tooling that needs to be there. The extra tools we need are actually “just” two :D

  • Xvfb to run UI tests
  • cron to clean up stale Firefox instances that sometimes hang around after the above mentioned UI tests

 

And of course a bunch of other things that need to be there, like git (Maven and Java will be pulled in by Jenkins on its own), SSH access, the specific versions of Firefox we require to run all the tests, and so on.

 

So in theory, just like with our own internal VMware-based VMs that pose as Jenkins slaves, I could have just re-created the image on Amazon using pretty much the same script, opened the port and been done with it. But where would the fun be in that, right?

Actually, for a more prosaic reason, I opted for yet another approach here. The reason being the issues we occasionally run into with our current slave instances, where they fill up their disks, get corrupted or whatever, and need to be recreated. It is a piece of infrastructure that is crucial to our main business, but not the piece that gets anyone excited. Just like brake pads on your car: you want to have good ones, but when they're worn out, you want to be able to throw them away and replace them as quickly as possible … or swap them for even better ones without too much worry if you decide to spend a weekend on the racing track.

 

So how can you get infrastructure that is simpler and faster to spin up, or to replace when it fails (or when its configuration changes), than a VM (be that VMware, EC2 or any other private/public cloud)? The answer is obvious, isn’t it? Don’t use whole VMs, just containers.

 

Hence the final setup: a VM in EC2 running a Docker container that is our Jenkins slave.

 

One extra, originally unexpected benefit: to get all the extra performance for the build, I was more generous when choosing the VM to use, and only after running a couple of builds did I realise that on that particular VM I can deploy multiple containers and emulate multiple slave instances running builds without sweating too much. Sweet.

 

Enough talking, here's what it actually took.

 

First, local setup:

- you need to install docker (plenty of guides on the net to do that, choose one for your platform)

- you need to install Amazon CLI (again this is something fairly well documented so I won’t go into that)

- set up a couple of env variables if you haven’t done so as part of the above steps (there is an example of exporting them right after this list), namely

$AWS_ACCESS_KEY should be set to your access key

$AWS_SECRET_KEY should be set to your secret key

$AWS_VPC_ID should be set to your VPC id. You will find one (or can create one) in your AWS account, under VPC. Or you can create one with the CLI you installed in the previous step. Or maybe you are lucky and Docker will be able to create one for you when creating the VM, though that didn’t work for me.

$AWS_ZONE should be set to your availability zone, typically just a single letter such as a or b.

$AWS_REGION should be set to your region, typically something like us-west-1. Make sure NOT to include the availability zone in the name (the letter at the end). If you do, you will likely end up with a cryptic error from the docker/amazonec2 driver.

$AWS_INSTANCE_TYPE should be set to your instance type, say m4.large. The micro instances are most likely going to be too tiny for your needs if you are building anything of size and complexity similar to Magnolia ;)
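
If you prefer to set them all in one go, a minimal sketch of exporting them in your shell looks like this; every value below is a placeholder you need to substitute with your own:

# placeholders only, substitute your own values
export AWS_ACCESS_KEY="<your access key>"
export AWS_SECRET_KEY="<your secret key>"
export AWS_VPC_ID="vpc-xxxxxxxx"
export AWS_ZONE="a"
export AWS_REGION="us-west-1"
export AWS_INSTANCE_TYPE="m4.large"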

 

Second, let’s create the VM:

docker-machine -D create \
    --driver amazonec2 \
    --amazonec2-access-key $AWS_ACCESS_KEY \
    --amazonec2-secret-key $AWS_SECRET_KEY \
    --amazonec2-vpc-id $AWS_VPC_ID \
    --amazonec2-zone $AWS_ZONE \
    --amazonec2-region $AWS_REGION \
    --amazonec2-instance-type $AWS_INSTANCE_TYPE \
    --amazonec2-root-size 10 \
    jenkins

 

The command above is fairly trivial if you know how to read it. First you tell docker-machine to create a machine for you using the Amazon EC2 driver. Then you provide a number of params for the driver, the second to last one being the root disk size (in GB) and the last one being the name under which you will refer to that machine in Docker.
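
Once it finishes, you can sanity check the machine and point your local Docker client at it (that is one way to make the later docker build and docker run commands talk to that VM):

docker-machine ls
docker-machine ip jenkins
eval "$(docker-machine env jenkins)"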

 

Third, what did I do third?
 

Oh yes, let's create our Docker container. We need to write a “descriptor” of some sort for that. One that Docker likes to call a Dockerfile.

I’m sure you’ve seen a fair amount of those before coming here, so don’t expect any surprises 😃

First some stuff to describe the image: what it is based on (Ubuntu in our case) and who maintains it (that would be me):

# This Dockerfile is used to build an image containing basic stuff to be used as a Jenkins slave build node.
FROM ubuntu:latest

MAINTAINER rah003

Next, we make sure any installation process later will not try to ask our script any questions and that the base image we got is up to date:

# General config
ENV DEBIAN_FRONTEND noninteractive

# Make sure the package repository is up to date.
RUN apt-get -q -y update
RUN apt-get -q -y upgrade

Yeah, this is where we live, and the server should think so as well:

# Set timezone

RUN ln -sf  /usr/share/zoneinfo/Europe/Zurich /etc/localtime

And this is the language we speak. This stuff might not be strictly necessary, but I have run into some weird issues with file encoding earlier, so you might want to keep it in if you rely on UTF-8 just like we do:

# Set the locale
RUN apt-get install -y locales
RUN locale-gen en_US.UTF-8
RUN /usr/sbin/update-locale LANG=en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
RUN apt-get remove -y locales

We will need an SSH server later so Jenkins can connect to the slave:

# Install a basic SSH server
RUN apt-get install -y openssh-server
RUN sed -i 's|session    required     pam_loginuid.so|session    optional     pam_loginuid.so|g' /etc/pam.d/sshd
RUN mkdir -p /var/run/sshd

Jenkins will typically pull in the JVM it needs to run, but we install one anyway:

# Install JDK 8 (latest edition)

RUN apt-get install -q -y software-properties-common
RUN add-apt-repository ppa:webupd8team/java
RUN apt-get update

RUN echo debconf shared/accepted-oracle-license-v1-1 select true | sudo debconf-set-selections

RUN apt-get install -y -q oracle-java8-installer

This was some creepy weird stuff that kept happening during builds and was causing failures. You might not need it, but if you experience the error mentioned in the comment, this bit will save you 😃

# To prevent: java.lang.UnsatisfiedLinkError: /usr/local/java/jre1.8.0_XX/lib/i386/xawt/libmawt.so: libXtst.so.6: cannot open shared object file: No such file or directory

RUN apt-get install -y libxtst6
# In case you run your java in 32 bit mode (does anyone still do that?)
# RUN apt-get install -y libxtst6:i386

As previously stated, we need Xvfb

# Install xvfb

RUN sudo apt-get -q -y install xvfb

Screen size and display settings related to Xvfb. As you will see later, I don’t actually need this since I’m setting it another way, but it’s here in case it helps anyone. And yes, I was using display number 10 to make sure it doesn’t conflict with any other displays anyone might want to have installed. Call me paranoid if you like!

# Run xvfb w/ specific display number
#ENV XVFBARGS ":10 -screen 0 1024x768x24 -ac +extension GLX +render -noreset"
ENV XVFBARGS ":10 -screen 0 1280x1024x16 -ac +extension GLX +render -noreset"

ENV DISPLAY :10

We install Firefox. And we install it despite the fact that we override it with an older version in the very next step. Why? Because it pulls in other libraries that FF needs and that would otherwise not be installed (or you would have to dig them out and install them manually).

# Install Firefox

RUN apt-get install -y firefox

Now we add our own old Firefox. As you can see, I have manually downloaded FF and installed it from a local folder to cut build time (yeah, I had to redo the image like 100 times while trying things out). You might as well download it as part of the build, if you want to and if you trust that the link will always be there (the internet never forgets, right? 😀 )

# Or add manually forced version for example one downloaded from here:
# http://releases.mozilla.org/pub/firefox/releases/31.8.0esr/linux-x86_64/en-US/firefox-31.8.0esr.tar.bz2
ADD firefox-31.8.0esr.tar.bz2 /opt/
# ADD above has already also unzipped the archive for us
RUN mv /opt/firefox /opt/firefox-31
RUN mv /usr/bin/firefox /usr/bin/firefox-latest

RUN ln -s /opt/firefox-31/firefox /usr/bin/firefox

We install git:

# Install git
RUN apt-get -q -y install git

Now we create a special user for Jenkins to use and make sure it doesn’t run out of file handles, since that could happen with the default Derby DB used as the backend for JackRabbit by the tests in Magnolia’s build:

# Add user jenkins to the image
RUN groupadd jenkins
RUN useradd -m -g jenkins -s /bin/bash jenkins

RUN echo "jenkins   hard     nofile          5000" >> /etc/security/limits.conf

Let’s also create a kinda-longish password for this user (and immediately forget it, as we never want to use password-based login):

# Generate secure password for jenkins user

RUN apt-get -q -y install makepasswd

RUN echo jenkins:`makepasswd --chars 1112` | chpasswd

RUN apt-get remove -y makepasswd

We also need the host keys and the rest of the sshd config:

# Create the ssh host keys needed for sshd
RUN ssh-keygen -A

# Fix sshd's configuration for use within the container. See VW-10576 for details.
RUN sed -i -e 's/^UsePAM .*/UsePAM no/' /etc/ssh/sshd_config

RUN sed -i -e 's/^PasswordAuthentication .*/PasswordAuthentication yes/' /etc/ssh/sshd_config

# Standard SSH port

EXPOSE 22

Let’s also copy over (== burn into the image, so don’t even think about putting anything sensitive here) some settings and scripts we use:

# add maven/jenkins config
ADD magnolia_maven_settings.xml /home/jenkins/magnolia_maven_settings.xml
ADD check_procs /home/jenkins/check_procs

Set up the cron jobs:

# Set crontab job for cleanup of stale FF instances
RUN apt-get install -q -y cron
ADD crontab.file /home/jenkins/crontab.file
RUN crontab /home/jenkins/crontab.file
RUN rm /home/jenkins/crontab.file
RUN touch /var/log/cron.log
RUN chmod 666 /var/log/cron.log

RUN echo "* * * * * root /home/jenkins/check_procs" >> /etc/crontab
RUN echo "* * * * * root echo hiya >> /var/log/cron.log 2>&1" >> /etc/crontab
RUN echo "#" >> /etc/crontab

RUN chmod 666 /etc/crontab
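
I won’t paste our exact crontab.file and check_procs here, but a minimal sketch of the idea could look like this (the schedule and the 60 minute threshold are just examples). crontab.file can be as simple as a single line:

*/10 * * * * /home/jenkins/check_procs >> /var/log/cron.log 2>&1

And check_procs only needs to find and kill Firefox processes that have been running for too long:

#!/bin/bash
# kill Firefox processes that have been running for more than an hour
for pid in $(pgrep firefox); do
  age=$(ps -o etimes= -p "$pid" | tr -d ' ')
  if [ -n "$age" ] && [ "$age" -gt 3600 ]; then
    kill -9 "$pid"
    echo "$(date): killed stale firefox pid $pid" >> /var/log/cron.log
  fi
done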

Give all the files we have created to our jenkins user to use and change at whim:

# Make sure jenkins user owns all his stuff

RUN chown -R jenkins:jenkins /home/jenkins

And now for the last bit. Docker wants your container to do only one thing. Our one thing is being a Jenkins slave, but to be a good slave, this container needs three different “skills”, so to say: it needs to run cron, run Xvfb and run sshd. To do all that, we install a service called supervisor to ensure the slave does all three things at once:

# supervisor to run multiple services (cron, xvfb & sshd)
RUN apt-get -q -y install supervisor
RUN  mkdir -p /var/log/supervisor
RUN  mkdir -p /etc/supervisor/conf.d
ADD supervisor.conf /etc/supervisor.conf

ADD supervisor-conf/* /etc/supervisor/conf.d/
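
The supervisor config itself is not very exciting. Take this as a sketch of the idea rather than our verbatim files (ours is split between supervisor.conf and the per-service files in supervisor-conf/): it just keeps supervisord in the foreground and defines one program per service, each running in the foreground too:

[supervisord]
nodaemon=true
logfile=/var/log/supervisor/supervisord.log

[include]
files = /etc/supervisor/conf.d/*.conf

[program:sshd]
command=/usr/sbin/sshd -D

[program:xvfb]
; same args as XVFBARGS above
command=/usr/bin/Xvfb :10 -screen 0 1280x1024x16 -ac +extension GLX +render -noreset

[program:cron]
command=/usr/sbin/cron -f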

And as our very last command, let’s start supervisor to run the slave:

CMD ["supervisord", "-c", "/etc/supervisor.conf"]

 

That was it. Kinda long, but it does it all. To finish our third step, we need to build the image:

docker build -t rah003/jenkins-slave .

This command builds an image tagged rah003/jenkins-slave, which I then push to my public Docker repo. It’s public, so you are free to base your stuff on it if you want to, but there’s no guarantee it will always work the same (unless you tell me you depend on it). For your own stuff, you will need to create your own Docker account and your own repos and place your images in there.
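
Pushing it there is a separate step; assuming you tagged the image under your own account name (the yourname/jenkins-slave below is just a placeholder), it’s simply:

docker login
docker push yourname/jenkins-slave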

 

Now step four. The final piece of magic.

We deploy our container in the previously created VM. And we do it with the following command (when you tinker with this stuff, remember Docker can be very particular about param ordering, so you'd better keep the order. A special sort of trouble arises if you put anything after the image name at the end of the line, as it will be passed to the container rather than to Docker itself).

docker run --name mgnl-jenkins-slave-1 -p 1234:22 -d -t -v /permdisk:/home/jenkins/permdisk -v /keystore/.ssh:/home/jenkins/.ssh rah003/jenkins-slave

Yikes, what is all that?

We tell Docker to run the container, name it mgnl-jenkins-slave-1, and map its port 22 (remember, we exposed it from the container in the Dockerfile above) to port 1234 on the host (feel free to choose one that's harder to remember :D). With the -v params we also mount two volumes, one as the /home/jenkins/permdisk directory and one as the /home/jenkins/.ssh directory. The first one is fairly big and holds the local Maven repo, so every time I kill the container I don’t have to re-download all the artefacts. The second is fairly small and contains just the private key of the slave, the list of known hosts and the usual .ssh stuff. It is also an encrypted volume. Why is it here and not in the image itself? Well, the image is public and I like to keep the private key … well … private. As for the volumes, those are normal Amazon EBS (or EFS) volumes. Which one you pick depends on how many slaves you have and whether you want to share the repo between them (to save space and download time) or not (to make sure they don’t influence each other). Also, EFS volumes are a tad more pricey (3x more last time I checked).
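
To double check that the container came up and that the port mapping is in place (assuming your local Docker client still points at the jenkins machine via docker-machine env), a quick look is enough:

docker ps
docker logs mgnl-jenkins-slave-1
docker-machine ip jenkins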

 

But wait, how did those volumes end up attached to the container? Fair enough, there is another piece of magic.

Fifth, optional step

You can skip this step if you burn everything into your container, or if your build obtains its data by other means and you are happy to wipe the disk every time you redeploy the container. But if not, here is what you have to do:

- create a volume in the AWS console or via the CLI tools

- attach the volume to the VM created for you by docker-machine in the second step (called jenkins in our case), again via the AWS console or via the CLI:

ec2-attach-volume $AWS_VOL_ID -i $AWS_INSTANCE_ID -d /dev/sdf --region $AWS_REGION
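
The same thing with the newer aws CLI (assuming your volume and instance ids are in $AWS_VOL_ID and $AWS_INSTANCE_ID) would look roughly like this:

aws ec2 attach-volume --volume-id $AWS_VOL_ID --instance-id $AWS_INSTANCE_ID --device /dev/sdf --region $AWS_REGION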

- log in to the Docker-managed VM:

docker-machine ssh jenkins

While there, format your disk, mount it permanently, and give access to whoever needs to use it (that would be the jenkins user living in the container in our case):

sudo mkfs -t ext4 /dev/xvdf
sudo mkdir /keystore
sudo mount /dev/xvdf /keystore
echo "/dev/xvdf    /keystore      ext4    defaults,nofail      0 2"  | sudo tee --append /etc/fstab
sudo chmod 777 /keystore

And so we are done. Obviously, I did this step twice, once for each of the two volumes I was mounting.

One little nugget you might want to take away from this - and which you can also see above - is that instead of mounting the whole volume that lives at the /keystore mount point, I’m actually mounting a subfolder of it. Why? Because I need the owner of the .ssh directory to be my jenkins user (yeah, ssh is restrictive that way) and there’s no way for me to assign the mount point in the outer system to a jenkins user that lives only within the boundaries of the container. So now you know it too.
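
What does work is making the numeric ids line up. If the jenkins user inside the container is the first regular user the Dockerfile created, it typically gets uid/gid 1000 on Ubuntu (an assumption, verify it with docker exec mgnl-jenkins-slave-1 id jenkins), and you can chown the subfolder on the host to those ids:

# on the Docker host, not inside the container; 1000:1000 is assumed, verify first
sudo chown -R 1000:1000 /keystore/.ssh
sudo chmod 700 /keystore/.ssh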

 

tl;dr

What we did:

  • Create VM in Amazon EC2
  • Build a Docker container with everything a Jenkins slave needs
  • Attach volumes to the container with all the sensitive data that is not part of the container
     

Still need to do:

  • Expose port 1234 from the VPC in the AWS console (remember, we exposed port 22 of the container and mapped it to port 1234 of the Docker machine; now we are telling Amazon to let it through the firewall)
  • Create a new slave on our Jenkins master and give it the IP address of the Docker machine and the port we have just exposed in the VPC. You might want to use an Elastic IP for that to keep the IP fixed (especially since it’s free as long as your VM is running); see the CLI sketch after this list
  • Start the build on Jenkins and let our new powerful slave run it
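
Allocating and attaching an Elastic IP can be done in the AWS console, or with the aws CLI along these lines (the allocation id comes from the output of the first command, and $AWS_INSTANCE_ID is the id EC2 assigned to the jenkins VM):

aws ec2 allocate-address --domain vpc --region $AWS_REGION
aws ec2 associate-address --instance-id $AWS_INSTANCE_ID --allocation-id eipalloc-xxxxxxxx --region $AWS_REGION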

 

Pitfalls:

  • Some tests behaved differently in the Docker container than in a plain Ubuntu installation, but in all cases it was the tests not being flexible enough, so they would have failed on other systems, e.g. on Windows, as well.
  • The config is kind of big and takes some getting used to. Hunting for issues might feel overwhelming unless you are familiar with the infrastructure and did it all step by step.
     

Benefits:

  • Fast and powerful slaves to offload your build work to
  • Can run multiple slaves per VM if necessary, e.g. to expose another slave on the same VM:

docker run --name mgnl-jenkins-slave-2 -p 2234:22 -d -t -v /permdisk:/home/jenkins/permdisk -v /keystore/.ssh:/home/jenkins/.ssh rah003/jenkins-slave

All it took was changing the name of the slave and the port mapping; of course, you would also need to tell your VPC to expose the port and add it to Jenkins as a new slave.

  • Can quickly drop and remove corrupted slaves simply by running:

docker stop mgnl-jenkins-slave-2 && docker rm mgnl-jenkins-slave-2

… and recreate it by running the first command again.

  • Can replicate the setup on a different VM, in a different availability zone or a different data centre, fairly quickly, just by rerunning the command to create the VM and VPC and the command to deploy the container to that VM.
  • It's super fast. While creating the VM takes roughly 5 minutes, deploying a container on it takes less than 5 seconds.

 

All right, sounds great, how much build time did you save?

  • Some of the more complicated and computation-intensive jobs went down from 20 to 12 minutes, or from 16 to 11 minutes, so on average between 30-40% faster.
  • For UI-intensive jobs, we initially got down from 1h 35 minutes to 47 minutes, but with test failures. It turned out the failures were due to tests executing faster than the action they were expecting the server to perform (that was a new one 😀 ), so we had to introduce some waitUntil() hooks in the code and slow the build down; still, it runs consistently at 57-58 minutes now, so we are still about 38% faster.

Was it worth the effort?

Only time will tell. Money-wise it is about the same cost as our previous infrastructure; build-time-wise it looks good right now. I’ll surely share more info when I have it, as well as any experience and backfire I get for the choices made in setting this whole thing up.




About the author: Jan Haderka

Jan is Magnolia's Chief Technology Officer. He has been developing software since 1995. On this blog, he'll write about Magnolia's connectors, integrations and coding ...with the odd Magnolia usability and development tip thrown in for good measure. He lives in the Czech Republic with his wife and three children. Follow him on Twitter @rah003.

