Published on February 10, 2016 by Jan Haderka
Magnolia is a fairly big project. This time, I wanted to check into this blog with something fresh, so I decided to take a stab at our build setup to see if I could squeeze a few extra minutes out of the build time, and how much effort it would take.
It is also a modular project, featuring modules for all kinds of functionality.
One of the tools that is super important in the modular world (while always important, it is critical in this case) is continuous integration. It allows you to build all those different modules together, to create the final assembly and to test any change on it, in order to be sure that you are not breaking the product with a new fix or feature. Plus, with as many modules as we have, and with all their dependencies, building them all in the right order takes a while. So every minute counts here.
One thing everyone will recommend when you mention Jenkins and Amazon is to look at the AWS plugin. I surely did, but decided against using it in the end. The plugin itself is good and might fit your needs, but it would be spawning a new instance from the image you have for each build and shutting it down afterwards. In general, this saves the cost of running EC2 instances when you're not building that often, at the expense of starting up the instance and waiting for it at the beginning of each build. However, in our case, where the build queue is quite full on a regular day, it doesn't help, and a permanently available instance is more useful.
Another challenge that our builds impose on the infrastructure is the use of various other tooling that needs to be there. Those extra tools we need are actually "just" two: Xvfb and a specific version of Firefox :D
And of course a bunch of other things need to be there, like git (Maven and Java will be pulled in by Jenkins on its own), SSH access, the certain version of Firefox that we require to run all the tests, and so on.
So in theory, just like with our own internal VMware-based VMs that pose as Jenkins slaves, I could have just re-created the image on Amazon using pretty much the same script, opened the port and been done with it. But where would the fun be in that, right?
Actually, for a more prosaic reason, I opted for yet another approach here. The reason being the issues we occasionally run into with our current slave instances, where they just fill up with disk space, get corrupted or whatever, and need to be recreated. It is a piece of infrastructure that is crucial to our main business, but not the piece that gets anyone excited. Just like the brake pads on your car: you want to have good ones, but when they're worn out, you want to be able to throw them away and replace them as quickly as possible … or swap them for even better ones without too much worry if you decide to spend a weekend on the racing track.
So how can you get infrastructure that is simpler and faster to spin up or replace when it fails (or when changing its configuration) than a whole VM (be that vmware or EC2 or any other private/public cloud)? The answer is obvious, isn't it? Don't use whole VMs, just containers.
Hence the final setup of having a VM in EC2 running docker container that is our Jenkins slave.
One extra, originally unexpected benefit: in order to get all the extra performance for the build, I was more generous when choosing the VM to use, and only after running a couple of builds did I realise that on that particular VM I can deploy multiple containers and emulate multiple slave instances running the build w/o sweating too much. Sweet.
Enough talking, here's what it actually took.
- you need to install docker (plenty of guides on the net to do that, choose one for your platform)
- you need to install Amazon CLI (again this is something fairly well documented so I won’t go into that)
- set up a couple of env variables, if you haven't done so as part of the above steps, namely:
  - $AWS_ACCESS_KEY should be set to your access key
  - $AWS_SECRET_KEY should be set to your secret key
  - $AWS_VPC_ID should be set to your VPC id. You will find one (or can create one) in your AWS account, under VPC. Or you can create one w/ the CLI you installed in the previous step. Or maybe you are lucky and Docker will be able to create one for you when creating the VM, though that didn't work for me
  - $AWS_ZONE should be set to your availability zone, typically having a value of just a single letter such as a
  - $AWS_REGION should be set to your region, typically something like us-west-1. Make sure to NOT include the availability zone in the name (the letter at the end). If you do, you will likely end up w/ a cryptic error from the docker/amazonec2 driver.
  - $AWS_INSTANCE_TYPE should be set to your instance type, say m4.large. The micro instances are most likely going to be too tiny for your needs if you are building anything of size and complexity similar to Magnolia ;)
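In a shell, that boils down to a handful of exports. All the values below are placeholders of my own invention; substitute your real credentials and IDs:

```shell
# Placeholder values only -- substitute your own credentials and IDs.
export AWS_ACCESS_KEY="AKIA-placeholder"
export AWS_SECRET_KEY="secret-placeholder"
export AWS_VPC_ID="vpc-0abc123"      # found (or created) under VPC in the AWS console
export AWS_ZONE="a"                  # the availability zone letter only
export AWS_REGION="us-west-1"        # region WITHOUT the zone letter at the end
export AWS_INSTANCE_TYPE="m4.large"  # micro instances are too tiny for builds like ours
```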
docker-machine -D create \
--driver amazonec2 \
--amazonec2-access-key $AWS_ACCESS_KEY \
--amazonec2-secret-key $AWS_SECRET_KEY \
--amazonec2-vpc-id $AWS_VPC_ID \
--amazonec2-zone $AWS_ZONE \
--amazonec2-region $AWS_REGION \
--amazonec2-instance-type $AWS_INSTANCE_TYPE \
--amazonec2-root-size 10 \
jenkins
The command above is fairly trivial if you know how to read it. First you tell docker-machine to create a machine for you using the Amazon EC2 driver. Then you provide a number of params for the driver, the second to last being the root size of the disk and the last one being the name under which you will refer to that machine in Docker (jenkins in our case).
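If you want to sanity-check the parameters before spending money on an actual instance, a dry-run that just assembles and prints the command can help. This is a sketch of my own (placeholder values again); it echoes the command instead of executing it:

```shell
# Dry-run sketch: assemble the docker-machine command from the env
# variables and print it instead of executing it. Placeholder values.
AWS_ACCESS_KEY="AKIA-placeholder"
AWS_SECRET_KEY="secret-placeholder"
AWS_VPC_ID="vpc-0abc123"
AWS_ZONE="a"
AWS_REGION="us-west-1"
AWS_INSTANCE_TYPE="m4.large"

CMD="docker-machine -D create --driver amazonec2 \
--amazonec2-access-key $AWS_ACCESS_KEY \
--amazonec2-secret-key $AWS_SECRET_KEY \
--amazonec2-vpc-id $AWS_VPC_ID \
--amazonec2-zone $AWS_ZONE \
--amazonec2-region $AWS_REGION \
--amazonec2-instance-type $AWS_INSTANCE_TYPE \
--amazonec2-root-size 10 jenkins"

# Inspect before you commit to paying for an instance:
echo "$CMD"
```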
Oh yes, let's create our Docker container. We need to write a “descriptor” of some sort for that. One that Docker likes to call Dockerfile
I’m sure you saw a fair amount of those before coming here so don’t expect any surprises here 😃
First some stuff to describe the image: what it is based on (ubuntu in our case) and who maintains it (that would be me)
# This Dockerfile is used to build an image containing basic stuff to be used as a Jenkins slave build node.
FROM ubuntu
MAINTAINER rah003
Next, we make sure any installation process later will not try to ask our script any questions and that the base image we got is up to date:
# General config
ENV DEBIAN_FRONTEND noninteractive
# Make sure the package repository is up to date.
RUN apt-get -q -y update
RUN apt-get -q -y upgrade
Yeah this is where we live and the server should think so as well
# Set timezone
RUN ln -sf /usr/share/zoneinfo/Europe/Zurich /etc/localtime
And this is the language we speak. This stuff might not be strictly necessary, but I have run into some weird issues with file encoding before, so you might want to keep it in if you rely on UTF-8 just like we do
# Set the locale
RUN apt-get install -y locales
RUN locale-gen en_US.UTF-8
RUN /usr/sbin/update-locale LANG=en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
RUN apt-get remove -y locales
We will need an SSH server later so Jenkins can connect to the slave
# Install a basic SSH server
RUN apt-get install -y openssh-server
RUN sed -i 's|session required pam_loginuid.so|session optional pam_loginuid.so|g' /etc/pam.d/sshd
RUN mkdir -p /var/run/sshd
Jenkins will typically pull in the JVM it needs to run, but we install one anyway
# Install JDK 8 (latest edition)
RUN apt-get install -q -y software-properties-common
RUN add-apt-repository ppa:webupd8team/java
RUN apt-get update
RUN echo debconf shared/accepted-oracle-license-v1-1 select true | debconf-set-selections
RUN apt-get install -y -q oracle-java8-installer
This next bit addresses some creepy weird stuff that kept happening during builds and was causing failures. You might not need it, but if you experience the error mentioned in the comment, this bit will save you 😃
# To prevent: java.lang.UnsatisfiedLinkError: /usr/local/java/jre1.8.0_XX/lib/i386/xawt/libmawt.so: libXtst.so.6: cannot open shared object file: No such file or directory
RUN apt-get install -y libxtst6
# In case you run your java in 32 bit mode (does anyone still do that?)
# RUN apt-get install libxtst6:i386
As previously stated, we need Xvfb
# Install xvfb
RUN apt-get -q -y install xvfb
Screen size and display settings related to Xvfb. As you will see later, I don't actually need it since I'm setting it another way, but I'm leaving it in in case it helps anyone. And yes, I was using display number 10 to make sure it doesn't conflict with any other displays anyone might want to have running. Call me paranoid if you like!
# Run xvfb w/ specific display number
#ENV XVFBARGS ":10 -screen 0 1024x768x24 -ac +extension GLX +render -noreset"
ENV XVFBARGS ":10 -screen 0 1280x1024x16 -ac +extension GLX +render -noreset"
ENV DISPLAY :10
We install Firefox, despite the fact that we override it with an older version in the very next step. Why? Surprisingly, it pulls in other libraries that FF needs and that would otherwise not be installed (or you would have to dig them out and force-install them manually).
# Install Firefox
RUN apt-get install -y firefox
Now we add our own old Firefox. As you can see, I manually downloaded FF and install it from a local folder to cut build time (yeah, I had to redo the image like 100 times when trying things out). You might as well download it as part of the build, if you want to and if you trust that the link will always be there (the internet never forgets, right? 😀 )
# Or add manually forced version for example one downloaded from here:
ADD firefox-31.8.0esr.tar.bz2 /opt/
# ADD above has already also unzipped the archive for us
RUN mv /opt/firefox /opt/firefox-31
RUN mv /usr/bin/firefox /usr/bin/firefox-latest
RUN ln -s /opt/firefox-31/firefox /usr/bin/firefox
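The pin-an-old-version-via-symlink trick is generic, so here is the same dance rehearsed on throwaway files, just to show what ends up where. The paths and file contents are stand-ins of my own, not the real Firefox:

```shell
# Simulate the version-pinning symlink trick on throwaway files.
SANDBOX=$(mktemp -d)
mkdir -p "$SANDBOX/opt/firefox" "$SANDBOX/usr/bin"
echo "old-esr" > "$SANDBOX/opt/firefox/firefox"   # stands in for the unpacked FF 31 binary
echo "latest"  > "$SANDBOX/usr/bin/firefox"       # stands in for the distro-installed FF

# The same three steps as in the Dockerfile:
mv "$SANDBOX/opt/firefox" "$SANDBOX/opt/firefox-31"
mv "$SANDBOX/usr/bin/firefox" "$SANDBOX/usr/bin/firefox-latest"
ln -s "$SANDBOX/opt/firefox-31/firefox" "$SANDBOX/usr/bin/firefox"

# The "firefox" command now resolves to the pinned old version,
# while the latest one stays reachable as firefox-latest.
cat "$SANDBOX/usr/bin/firefox"
```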
We install git
# Install git
RUN apt-get -q -y install git
Now we create a special user for Jenkins to use, and make sure he doesn't run out of file handles, since that could happen w/ the default Derby DB used as the backend for JackRabbit by the tests in Magnolia's build
# Add user jenkins to the image
RUN groupadd jenkins
RUN useradd -m -g jenkins -s /bin/bash jenkins
RUN echo "jenkins hard nofile 5000" >> /etc/security/limits.conf
Let's also create a kinda-longish password for this user (and immediately forget it, as we don't ever want to use password-based login)
# Generate secure password for jenkins user
RUN apt-get -q -y install makepasswd
RUN echo jenkins:`makepasswd --chars 1112` | chpasswd
RUN apt-get remove -y makepasswd
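If makepasswd isn't available in your repos, the same throwaway-password idea can be had straight from /dev/urandom. This is an alternative sketch of mine, not what the Dockerfile above actually uses:

```shell
# Generate a 32-character hex password from the kernel's entropy pool.
# 16 random bytes -> 32 hex characters, with od's spacing stripped out.
PW=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
echo "password length: ${#PW}"
# In the Dockerfile you would then pipe it to chpasswd:
#   RUN echo "jenkins:$PW" | chpasswd
```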
We will also need the host keys and the rest of the sshd config
# Create the ssh host keys needed for sshd
RUN ssh-keygen -A
# Fix sshd's configuration for use within the container. See VW-10576 for details.
RUN sed -i -e 's/^UsePAM .*/UsePAM no/' /etc/ssh/sshd_config
RUN sed -i -e 's/^PasswordAuthentication .*/PasswordAuthentication yes/' /etc/ssh/sshd_config
# Standard SSH port
EXPOSE 22
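Those sed one-liners fail silently when the pattern doesn't match, so it's worth rehearsing them on a scratch file first. Here's a sketch against a fake, minimal sshd_config of my own making:

```shell
# Rehearse the sshd_config edits on a scratch file with fake minimal content.
CONF=$(mktemp)
printf '%s\n' 'UsePAM yes' 'PasswordAuthentication no' 'Port 22' > "$CONF"

# The same substitutions as in the Dockerfile:
sed -i -e 's/^UsePAM .*/UsePAM no/' "$CONF"
sed -i -e 's/^PasswordAuthentication .*/PasswordAuthentication yes/' "$CONF"

cat "$CONF"   # untouched lines (Port 22) survive as-is
```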
Let's also copy over (== burn into the image, so don't even think about putting anything sensitive here) some settings and scripts we use
# add maven/jenkins config
ADD magnolia_maven_settings.xml /home/jenkins/magnolia_maven_settings.xml
ADD check_procs /home/jenkins/check_procs
Set the cronjobs
# Set crontab job for cleanup of stale FF instances
RUN apt-get install -q -y cron
ADD crontab.file /home/jenkins/crontab.file
RUN crontab /home/jenkins/crontab.file
RUN rm /home/jenkins/crontab.file
RUN touch /var/log/cron.log
RUN chmod 666 /var/log/cron.log
RUN echo "* * * * * root /home/jenkins/check_procs" >> /etc/crontab
RUN echo "* * * * * root echo hiya >> /var/log/cron.log 2>&1" >> /etc/crontab
RUN echo "#" >> /etc/crontab
RUN chmod 666 /etc/crontab
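Note the subtle difference between the two crontab flavours used above: the per-user crontab loaded via crontab.file takes five time fields plus the command, while lines appended to /etc/crontab carry an extra user field (root, in our case). A quick sketch to illustrate the formats:

```shell
# /etc/crontab lines carry an extra "user" field that per-user crontabs lack.
SYSTEM_LINE='* * * * * root /home/jenkins/check_procs'
USER_LINE='* * * * * /home/jenkins/check_procs'

# Count whitespace-separated fields in each flavour:
echo "$SYSTEM_LINE" | awk '{print NF}'   # 7: five time fields, user, command
echo "$USER_LINE"   | awk '{print NF}'   # 6: five time fields, command
```

Mixing the two up is a classic way to get a cron job that never fires, or one that tries to run a command literally named after your username.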
Give all files we have created to our jenkins user to use and change at whim
# Make sure jenkins user owns all his stuff
RUN chown -R jenkins:jenkins /home/jenkins
And now for the last bit. Docker wants your container to do only one thing. Our one thing is being a Jenkins slave, but to be a good slave, this container needs three different "skills", so to say: it needs to run cron, run Xvfb and run sshd. To do all that, we install a service called supervisor to ensure the slave does all three things at once
# supervisor to run multiple services (cron, xvfb & sshd)
RUN apt-get -q -y install supervisor
RUN mkdir -p /var/log/supervisor
RUN mkdir -p /etc/supervisor/conf.d
ADD supervisor.conf /etc/supervisor.conf
ADD supervisor-conf/* /etc/supervisor/conf.d/
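The contents of supervisor.conf and the supervisor-conf/ snippets aren't shown here. Purely as an illustration, a minimal combined config for the three services could look roughly like this; this is my assumption based on standard supervisord syntax, not the author's actual files:

```ini
; Hypothetical minimal supervisord config covering the three services.
[supervisord]
nodaemon=true
logfile=/var/log/supervisor/supervisord.log

[program:sshd]
command=/usr/sbin/sshd -D

[program:xvfb]
command=/usr/bin/Xvfb :10 -screen 0 1280x1024x16 -ac +extension GLX +render -noreset

[program:cron]
command=/usr/sbin/cron -f
```

The important detail is that each command runs in the foreground (-D, -f); supervisord supervises foreground processes and would treat a self-daemonizing service as having exited.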
And let's start supervisor to run the slave as our very last command
CMD ["supervisord", "-c", "/etc/supervisor.conf"]
That was it. Kinda long, but it does it all. To finish our 3rd step, we need to build the container:
docker build -t rah003/jenkins-slave .
This command will build a container called jenkins-slave and place it in my docker repo namespace. It's public, so you are free to base your stuff on it if you want to, but there's no guarantee it will always work the same (unless you tell me you depend on it). For your own stuff, you will need to create your own docker account and your own repos and place your containers there.
We deploy our container on the previously created VM with the following command. (When you tinker with this stuff, remember that docker can be very particular about param ordering, so you'd better keep the order. A special sort of trouble arises if you put anything extra at the end of the line, as it will be passed to the container rather than to docker itself.)
docker run --name mgnl-jenkins-slave-1 -p 1234:22 -d -t -v /permdisk:/home/jenkins/permdisk -v /keystore/.ssh:/home/jenkins/.ssh rah003/jenkins-slave
Yikes, what is all that?
We tell Docker to run a container that we want named mgnl-jenkins-slave-1, to expose its port 22 (remember, we exposed it from the container in the Dockerfile above) and map it to port 1234 (feel free to choose one that's harder to remember :D). With the -v params we also mount two volumes, one as the /home/jenkins/permdisk directory and one as the /home/jenkins/.ssh directory. The first one is fairly big; we keep the local Maven repo there, so every time I kill the container I don't have to re-download all the artefacts. The second is fairly small and contains just the private key of the slave, the list of known hosts and the usual .ssh stuff. It is also an encrypted volume. Why is it here and not in the image itself? Well, the image is public and I like to keep the private key … well … private. As for the volumes, those are normal Amazon EBS (or EFS) volumes. Whether one or the other depends on how many slaves you have and whether you want to share the repo between them (to save space and download time) or not (to make sure they don't influence each other). Also, EFS volumes are a tad more pricey (3x more last time I checked).
But wait, how did those volumes end up attached to the container? Fair enough, there is another piece of magic.
You don't need this step if you burn everything into your container, or if your build obtains its information by other means and you want to wipe the disk every time you redeploy the container. But if you do need it, here is what you have to do:
- create volume in AWS console or via CLI tools
- attach the volume to the VM created for you by docker-machine in the 2nd step (called jenkins in our case), again via the AWS console or via CLI:
ec2-attach-volume $AWS_VOL_ID -i $AWS_INSTANCE_ID -d /dev/sdf --region $AWS_REGION
- log in to the docker-managed VM
docker-machine ssh jenkins
While there, format your disk, mount it permanently, and give access to whoever needs to use it (that would be the jenkins user living in the container, in our case):
sudo mkfs -t ext4 /dev/xvdf
sudo mkdir /keystore
sudo mount /dev/xvdf /keystore
echo "/dev/xvdf /keystore ext4 defaults,nofail 0 2" | sudo tee --append /etc/fstab
sudo chmod 777 /keystore
And so we are done. Obviously, I did this step twice, for the two volumes I was mounting.
One little nugget you might want to take away from this (and what you also see above): instead of mounting the whole volume that lives at the /keystore mount point, I'm actually mounting a subfolder of it. Why? Because I need the owner of the .ssh directory to be my jenkins user (yeah, ssh is restrictive that way), and there's no way for me to assign a mount point in the outer system to the jenkins user that lives only within the boundaries of the container. So now you know it too.
What we did so far gets us one slave. Say you want a second one:
docker run --name mgnl-jenkins-slave-2 -p 2234:22 -d -t -v /permdisk:/home/jenkins/permdisk -v /keystore/.ssh:/home/jenkins/.ssh rah003/jenkins-slave
All it took was changing the name of the slave and the port mapping. Of course, you would also need to tell your VPC to expose the new port, and add it to Jenkins as a new slave.
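With the name and port being the only differences, spinning up N slaves is scriptable. Here's a dry-run sketch of mine that prints the commands instead of executing them, so it's safe to try anywhere:

```shell
# Dry-run: print the docker run command for each slave instead of executing it.
# Port scheme taken from the examples above: slave i gets host port i*1000 + 234
# (so 1234, 2234, ...).
OUT=""
for i in 1 2; do
  PORT=$((i * 1000 + 234))
  OUT="$OUT
docker run --name mgnl-jenkins-slave-$i -p $PORT:22 -d -t -v /permdisk:/home/jenkins/permdisk -v /keystore/.ssh:/home/jenkins/.ssh rah003/jenkins-slave"
done
echo "$OUT"
```

Drop the echo-into-a-variable indirection and it will actually run the containers.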
docker stop mgnl-jenkins-slave-2 && docker rm mgnl-jenkins-slave-2
… and recreate it by running the first docker run command again.
All right, sounds great, how much build time did you save?
Was it worth the effort?
Only time will tell. Money-wise it is about the same cost as our previous infrastructure; build-time-wise it looks good right now. I'll surely share more info when I have it, as well as the experience, and the backfire I will get for the choices made in setting this whole thing up.
Jan is Magnolia's Chief Technology Officer. He has been developing software since 1995. On this blog, he'll write about Magnolia's connectors, integrations and coding ...with the odd Magnolia usability and development tip thrown in for good measure. He lives in the Czech Republic with his wife and three children. Follow him on Twitter @rah003.