What does the new Docker Swarm announcement mean for Kubernetes?

This field of container orchestration is moving incredibly fast, even by the normal standards of software development. There has been a Cambrian explosion of container startups, and competition is heating up in a really big way. This is good for innovation, but it makes choosing a technology difficult. As such, I’m keeping my eye on both Docker Swarm and Kubernetes.

 

My goal was to choose an orchestration technology and to commit to one that is innovative, stable, and will be maintained for a while. I decided that working in a healthy community was critical to fulfilling all three objectives. I chose Kubernetes after a long technical, community, and business evaluation of different container orchestration solutions (with Kismatic, and previously Mesosphere). However, as other container cluster management options become available, it’s important to recognize what capabilities they provide and compare them to the strengths of Kubernetes.

 

So let’s take a moment and look at the most recent release of Docker (version 1.12), which now competes directly with Kubernetes: SwarmKit (based on Swarm) is now part of the core of Docker, providing the ability to instantiate a Swarm cluster directly from the console.
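Spinning up a cluster really is a one-liner now. A quick sketch, assuming Docker 1.12 or later (the address below is just a placeholder):

$ docker swarm init --advertise-addr 192.168.99.100

The command prints a join token that other nodes can use to join the new cluster.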

 

Worth noting is that once you create a new Swarm cluster, it also creates the Swarm manager, which in turn creates a certificate authority (if an external one isn’t provided), so transparent security is now built in directly.

 

The command line also has the ability to join a node to an existing Swarm cluster as either a manager or a worker; a worker can be seamlessly promoted to a manager, and a manager demoted back to the role of worker as needed, providing much-needed flexibility. The Swarm manager uses the Raft protocol to elect a leader and determine consensus, which is very similar to how Kubernetes works today with its internal use of etcd. Also worth pointing out is how the Swarm workers use a gossip protocol to communicate their respective state amongst themselves, so Docker users no longer require external entities or key-value stores to keep track of the cluster topology.
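A rough sketch of joining a node and changing its role (the token, address, and node name are placeholders):

$ docker swarm join --token SWMTKN-1-<token> 192.168.99.100:2377   # join as a worker
$ docker node promote node-2                                       # worker becomes a manager
$ docker node demote node-2                                        # and back to worker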

Also new to this most recent Docker release is the concept of a logical service, consisting of one to many container instances; with the introduction of this logical view, management of services becomes much easier overall. You can now create, update, and scale a service, which results in containers being deployed, updated, or ultimately destroyed when no longer required.
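The service lifecycle looks roughly like this (a sketch; the service name and image are just examples):

$ docker service create --name web --replicas 3 nginx   # deploy a service backed by 3 containers
$ docker service scale web=5                            # scale it out
$ docker service update --image nginx:1.11 web          # roll out a new image
$ docker service rm web                                 # tear it down when no longer required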


Yet one weakness in the Docker 1.12 release, in my opinion, is its service discovery, which works quite elegantly in Kubernetes. It’s important to note that the notion of a “Service” proxy for containers has existed in Kubernetes since the beginning of the project: you simply connect to the service name in your cluster, and Kubernetes makes sure you reach the correct pod (one or more containers) behind the service. Kubernetes is also designed to be modular and extensible so that its components are easily swappable, which allows for some interesting opportunities to tailor its use to your needs.
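For comparison, here’s a minimal sketch of that flow in Kubernetes (the names are illustrative):

$ kubectl run my-nginx --image=nginx --replicas=2   # start two pods
$ kubectl expose deployment my-nginx --port=80      # put a Service in front of them

Any pod in the cluster can now reach those containers at http://my-nginx, and Kubernetes routes the traffic to a healthy pod behind the service.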

 

This new release from Docker will definitely face competition from Kubernetes, which is intended to help automate the deployment, scaling, and operation of containers across clusters of hosts. Many companies are already using Kubernetes because of its ultra-strong community. Kube, as the community calls it, is also gaining widespread acceptance from enterprise customers that are looking to build containerized applications using the new cloud native paradigms.

 

Kubernetes describes itself as a way to “manage a cluster of containers as a single system to accelerate development and simplify operations”. Kubernetes is open source, but also community developed and stewarded by the CNCF. This is fundamentally different from Docker/Swarm, which is ultimately controlled by a single startup and is not governed by an open source community. Kubernetes is awesome because it brings Google’s decade-plus experience of running containers at scale, Red Hat’s years of experience deploying and managing open source in the enterprise, the nimble development experience of CoreOS, as well as advantages from many, many other organizations and community members.

 

Because of its powerful and diverse community, Kubernetes is as flexible as a Swiss Army chainsaw. You can run Kubernetes on bare metal or on just about any cloud provider out there. Another amazing feature of Kubernetes is that it supports both Docker and rkt (formerly Rocket) containers, and it provides the ability to address additional container runtimes moving forward.

 

The wonderful experience and drive of the community cements our dedication to our choice and its place in the overall container orchestration space. Just the sheer velocity of the project is amazing, and the community is extremely vibrant.

 

So, in the end, I’m choosing to rally behind Kubernetes. It was the most robust solution we tried, and we’re confident that it’ll support us as we grow in the future. Red Hat, along with others, is looking forward to providing Windows support for Kubernetes, including the ability to run Windows containers directly. But it’s important to keep in mind that the other cluster orchestration services aren’t necessarily bad; as I stated earlier, this field is moving quite fast, and we want to ensure that we’re working with the most active, stable, and mature project available to us. We’ve been extremely happy with Kubernetes and have been using it in production for a while now, in fact ever since the 1.0 release.

 

We’re excited about the 1.3 release of Kubernetes and the new PetSet feature (formerly nominal services), which provides new stateful primitives for running pods that need strong identity and storage capabilities. I’m looking forward to everything to come with the addition of cluster federation (a.k.a. “Ubernetes”) in Kubernetes 1.3 as well. I for one am very grateful to the entire Kubernetes community for everything they’ve done on this project thus far and everything that they continue to do. It’s truly an amazing piece of technology and a great building block for my needs.

Excerpts on the new Swarm capabilities from https://lostechies.com/gabrielschenker/2016/06/21/dockercon-2016-day-2-presentations/ by Gabriel Schenker, used with express permission.

Docker and Vagrant Development on OS X Yosemite

Vagrant

Vagrant is an amazing tool for managing virtual machines via a simple-to-use command-line interface.

Install

By default, Vagrant uses VirtualBox to manage its virtual machines. (You can download VirtualBox directly and install it, or use Homebrew.) But I like using VMware Fusion 7 Professional with the Vagrant VMware provider.

I’m assuming you know how to download and install VMware Fusion the typical way.

brew install caskroom/cask/brew-cask                      # install Homebrew Cask
brew cask install vagrant                                 # Vagrant itself
brew cask install vagrant-manager                         # menu bar tool for managing boxes
vagrant plugin install vagrant-vmware-fusion              # the VMware Fusion provider
vagrant plugin license vagrant-vmware-fusion license.lic  # install the provider license
vagrant box add precise64_vmware http://files.vagrantup.com/precise64_vmware.box
vagrant init precise64_vmware                             # generate a Vagrantfile for the box
vagrant up                                                # boot the VM
vagrant ssh                                               # shell into it

SSHFS

Installation

An easy-to-use installer package for the latest version of SSHFS can be downloaded from the SSHFS repository’s download section. The package installs a self-contained (as in “does not depend on external libraries”) version of SSHFS. It supports Mac OS X 10.5 (Intel, PowerPC) and later.

Note: This build of SSHFS is based on the “FUSE for OS X” software, which is not contained in the installer package and has to be installed separately. The latest release of “FUSE for OS X” can be downloaded from http://osxfuse.github.com.

Macfusion

To use Macfusion with the newer “FUSE for OS X”-based version of SSHFS, put Macfusion in your Applications folder and run the following commands in Terminal. See item 3 under “Frequently Asked Questions” for more information as to why you might want to use Macfusion.

cd /Applications/Macfusion.app/Contents/PlugIns/sshfs.mfplugin/Contents/Resources
mv sshfs-static sshfs-static.orig        # keep the bundled binary as a backup
ln -s /usr/local/bin/sshfs sshfs-static  # point Macfusion at the newer SSHFS

I ran into a problem, though. I mount some of my servers via SSH, and even though the SSH account has write access to some files, OS X wouldn’t let me open them, failing with the standard “permission denied” error. This is because the user on the server has a different UID than the local user on my Mac. To get around this issue, I entered the following line into the Extra Options (Advanced) field of Macfusion:

-o idmap=user -o uid=501 -o gid=501

This maps the remote UIDs to match those of the local system. If you’re on a Mac, your user ID will most likely be 501. If not, make sure you enter the right ID.
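The same options work with plain sshfs from the command line, if you’d rather skip Macfusion; a sketch, with illustrative host and paths:

sshfs user@server:/home/user /Volumes/server -o idmap=user -o uid=501 -o gid=501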

A few more customizations:

cd ~/vagrant/
vagrant ssh
sudo useradd -d /home/preilly -m preilly -s /bin/bash -c "Patrick Reilly"
sudo vim /etc/hostname                  # change the hostname to vagrant
sudo vim /etc/hosts                     # change it here too
ifconfig | grep "inet addr"             # take note of the non-loopback address
exit
sudo vim /etc/hosts                     # on the Mac: add a vagrant entry with the IP from ifconfig
ssh vagrant
mkdir -p /home/preilly/.ssh
vim /home/preilly/.ssh/authorized_keys  # paste in your public key
chmod 700 /home/preilly/.ssh/
chmod 640 /home/preilly/.ssh/authorized_keys
exit
ssh -A vagrant

 

Git

Git uses your username to associate commits with an identity. The git config command can be used to change your Git configuration, including your username.
sudo apt-get install git
git config --global user.name "Patrick Reilly"
git config --global user.email "patrick@kismatic.io"

So now I can use the Macfusion menu item to mount my Vagrant image as a local volume:

cd /Volumes/vagrant/

and use the editor of my choice to work with my home directory in Vagrant.

Docker

Prerequisites

Docker requires a 64-bit installation regardless of your Ubuntu version. Additionally, your kernel must be 3.10 at minimum. The latest 3.10 minor version or a newer maintained version is also acceptable.

Kernels older than 3.10 lack some of the features required to run Docker containers. These older versions are known to have bugs which cause data loss and frequently panic under certain conditions.

To check your current kernel version, open a terminal and run uname -r:

$ ssh -A vagrant
$ uname -r
3.2.0-29-virtual
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install linux-image-generic-lts-trusty
$ sudo reboot

Get the latest Docker package

$ sudo apt-get install apparmor

$ wget -qO- https://get.docker.com/ | sh
$ sudo usermod -aG docker preilly   # let preilly run docker without sudo

Verify that Docker is installed correctly (you may need to log out and back in first so the new group membership takes effect):

$ sudo docker run hello-world

Installing Go

$ sudo apt-get install curl git mercurial make binutils bison gcc build-essential
$ bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)
$ sudo apt-get install bison
$ gvm install go1.4
$ gvm use go1.4 --default
# I got bit by this issue: https://github.com/moovweb/gvm/issues/124
$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:git-core/ppa
$ sudo apt-get update
$ sudo apt-get install git
$ gvm use go1.4 --default
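As a quick sanity check that the toolchain is active:

$ go version   # should report go1.4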

This is my current development environment on my MacBook. I’d really like to get others’ feedback and suggestions as well.

The Datacenter is the Computer

Using containers, I can easily ship applications between machines and start to think of my cluster as a single computer. Each machine contributes additional CPU cores with the ability to execute my applications, and runs an operating system, but the goal is not to interact with the locally installed OS directly. Instead, we want to treat the local OS as firmware for the underlying hardware resources.

Now we just need a good scheduler.

The Linux kernel does a wonderful job of scheduling applications on a single host system. Chances are if we run multiple applications on a single system the kernel will attempt to use as many CPU cores as possible to ensure that our various applications run in parallel.

When it comes to a cluster of machines, the job of scheduling applications becomes an exercise for the operations team. Today, for many organizations, scheduling is handled by the fine folks on that team. Unfortunately, a human scheduler requires humans to keep track of where applications are running. Sometimes this means using complicated, error-prone spreadsheets, or a configuration management tool such as Puppet. Either way, these tools don’t really offer the robust scheduling necessary to react to real-time events. This is where Kubernetes fits in.
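A sketch of what it looks like to hand scheduling over to Kubernetes (the names and image are illustrative):

$ kubectl run payments --image=example/payments:1.0 --replicas=4   # ask for 4 copies
$ kubectl get pods -o wide                                         # the scheduler has placed them across nodes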

If you think of the datacenter in this way, then Kubernetes would be its datacenter operating system.

Kubernetes on Mesos: Try It Now

The inspiration for this post came from Kelsey Hightower (@kelseyhightower).

Myriad is a framework for scaling YARN clusters on Mesos

Myriad is a Mesos framework designed for scaling YARN clusters on Mesos. It can expand or shrink one or more YARN clusters in response to events, according to configured rules and policies.

The name “myriad” means a countless or extremely great number. In the context of the project, it allows one to expand the overall resources managed by Mesos, even when the cluster under Mesos management runs other cluster managers like YARN.

Myriad allows Mesos and YARN to co-exist and share resources, with Mesos as the resource manager for the datacenter. Sharing resources between these two resource-allocation systems improves overall cluster utilization and avoids statically partitioning resources between two separate clusters/resource managers.

Roadmap

Myriad is a work in progress.

  • Support multiple clusters
  • Custom Executor for managing NodeManager
  • Support multi-tenancy for node-managers
  • Support unique constraint to let only one node-manager run on a slave
  • Configuration store for storing rules and policies for clusters managed by Myriad
  • NodeManager Profiles for each cluster
  • High Availability mode for framework
  • Framework checkpointing
  • Framework reconciliation

https://github.com/mesos/myriad

Open-Source Service Discovery

The problem seems simple at first: how do clients determine the IP and port for a service that exists on multiple hosts?

When developing and running resource-efficient distributed systems like Apache Mesos, a cluster manager that simplifies the complexity of running applications on a shared pool of servers, this is a very important decision to make.

Jason Wilder has looked at a number of general purpose, strongly consistent registries (Zookeeper, Doozer, Etcd) as well as many custom built, eventually consistent ones (SmartStack, Eureka, NSQ, Serf, Spotify’s DNS, SkyDNS).

Many use embedded client libraries (Eureka, NSQ, etc.) and some use separate sidekick processes (SmartStack, Serf).
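For instance, DNS-based registries like SkyDNS let a client discover both the IP and the port of a service with an ordinary SRV query; a sketch, with the exact name layout depending on how the registry is configured:

$ dig @127.0.0.1 SRV web.skydns.local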

Interestingly, of the dedicated solutions, all of them have adopted a design that prefers availability over consistency.

Please read this really nice writeup by Jason Wilder to learn more.

http://jasonwilder.com/blog/2014/02/04/service-discovery-in-the-cloud/

PHP Next Generation

The PHP Group has put up a post about the future of PHP. They say, ‘Over the last year, some research into the possibility of introducing JIT compilation capabilities to PHP has been conducted. During this research, the realization was made that in order to achieve optimal performance from PHP, some internal API’s should be changed. This necessitated the birth of the phpng branch, initially authored by Dmitry Stogov, Xinchen Hui, and Nikita Popov. This branch does not include JIT capabilities, but rather seeks to solve those problems that prohibit the current, and any future implementation of a JIT capable executor achieving optimal performance by improving memory usage and cleaning up some core API’s. By making these improvements, the phpng branch gives us a considerable performance gain in real world applications, for example a 20% increase in throughput for WordPress. The door may well now be open for a JIT capable compiler that can perform as we expect, but it’s necessary to say that these changes stand strong on their own, without requiring a JIT capable compiler in the future to validate them.’

Keybase.io

I’ve been trying out keybase.io and you can find me at keybase.io/preillyme. I think it might be pointing a useful way forward on private-by-default communication and, for what it does, it gets a lot of things right.

The problem · We’d like to be confident that the messages we send across the net — email, chat, SMS, whatever — are secure. When we say “secure” we mean some combination of “nobody can read them but the person who’s supposed to” and “the person reading them can be sure who sent them.” ¶

In principle, this should be easy because of public-key cryptography, which has been around for a while, is reliable enough to power basically 100% of the financial transactions that cross the internet, and for which there’s excellent open-source software that anyone can use for free.

Getting crypto in place for mail and other messages has been tough, for a few reasons. First, how do you find someone else’s key reliably, where by “reliably” I mean not just find it, but believe that it’s really theirs?

Second, most messages these days live in the cloud (Gmail, Facebook, Twitter, etc.) and the cloud owners like to have them unencrypted to help them advertise better. So, they’re probably not really all that motivated to help make messages secure.

Now, I know that secure email is possible, and that https connections to Facebook and Google and Hotmail are helpful, but right now, today, most messaging isn’t very secure.

Keybase · Keybase.io does a few simple things: ¶

  • Keeps a directory of keys that you can look up by a simple name. Since I’m an early adopter I got “preillyme”, but in practice your email address would work fine.
  • Lets you prove that the owner of a key also owns a particular Twitter handle and GitHub account. In practice, since I tend to believe that the people I know are associated with certain Twitter/GitHub accounts, I’m inclined to believe that the keys really belong to them.
  • Lets you encrypt messages so they can only be read by one particular person, lets you sign them to prove that they could only have come from you, and the inverse: decrypt and signature-check.
  • Does all this in a simple web page that’s easy to use, or in a geek-friendly command-line interface.

So, the idea is that if there’s a message you want to send, and you want it to be a secret, you visit keybase.io, paste your text in, encrypt it for the person you’re sending it to, sign it, and then copy/paste it into an email or chat or Facebook message or whatever. The person at the other end copy/pastes it into keybase.io and reverses the process, and would you look at that, you’ve just practiced secure communication!

Yeah, it would be better if this were already built into every messaging program that everyone uses, and you got it by pressing a button; or better still, if everything were always encrypted.

But in the interim, while this may be a little klunky, it’s awfully simple and easy to understand; and it works with anything that can be used to send a chunk of text from anywhere to anywhere. So I’m actually pretty impressed.

In greater depth · Here are a few more technical reasons why I like what I see at Keybase. ¶

  • There’s the ability to “track” another user, which does all the crypto checking and signs the result, so in the future you can do a quick check whether anything’s changed. This speeds things up and removes a few threat models.
  • There’s also a command-line client, which should be very comforting for the paranoid (see the sketch after this list). Perhaps the most worrying threat model is that someone shows up at Keybase’s office and, using either malicious technology or a National Security Agency letter, arranges to compromise their software; the first time you type your passphrase into that compromised software, your security is gone. But if you use the command-line client, the adversary has to compromise your own computer to get at you.
  • The actual cryptography software is all GPG and scrypt; what Keybase offers is pipefitting and a directory and some utilities. So the crypto part ought to be believably secure.
  • It’s all open-source and there on GitHub. Very comforting.
  • There’s also a REST API, which at first glance looks very sensible to me.
  • In principle, once the API is locked down, anyone could implement a Keybase-style directory — for example, to serve a particular community of trust — and messaging tools could be taught how to work with any old instance.
  • The people who built this are the ones who built OkCupid, which suggests that their technical chops may well be up to the task.
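Here’s roughly what tracking and messaging look like with that command-line client (a sketch based on the early client; the exact flags may have changed, so treat these as illustrative):

$ keybase track preillyme                  # check their proofs and sign the result
$ keybase encrypt preillyme -m "hi there"  # output only preillyme can decrypt
$ keybase decrypt                          # paste in a message encrypted for you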

A worry · You can also store your private key, encrypted with your passphrase, in the Keybase directory. This makes certain things easier and quicker, but it makes that one particular threat model, where a bad person compromises the software, even scarier, because they have your private key the first time you type your passphrase into the compromised software.

Trade-offs · If you delete your stored private key, it means you have to use the command-line client rather than the web interface. Which is way less civilian-friendly. This is a very, very interesting trade-off. I’m thinking Keybase is going to have to publish something about their legal and political defensive measures. ¶

If you’re using the command-line keybase tool on OS X, you can store your passphrase in the Mac keychain, so any commands that need your passphrase Just Work. So for people who are handy with the command line, it’s actually more convenient than the web form, which requires you to type in the passphrase, or paste it from your password manager.