Improving Docker with Unikernels: Introducing HyperKit, VPNKit and DataKit

We’ve been working hard to build native Docker for Mac and Windows apps to ensure that your Docker experience is as seamless as possible on the most popular developer operating systems. Docker for Mac and Windows include everything required to spin up a Linux Docker container, and they efficiently bridge storage and networking from the host into the Docker containers. They work transparently on both OS X and Windows, and require no other third-party software.

Docker has always been built on open-source foundations: Solomon Hykes is presenting a keynote today at OSCON 2016 about the incremental revolution that the firehose of collaborative open source development has enabled throughout Docker’s history. Today, we are adding to our existing open source contributions by open sourcing the core technology that powers the Docker for Mac and Windows desktop applications!

Building Docker for Mac and Windows has required integrating hardware virtualization, embedded operating systems and unikernel technology, all without exposing this magic to the end user. Let’s take a look under the hood of our applications to understand what some of this source code does, and give you a better idea of how to contribute to it or use it in your own projects.

When you run Docker for Mac, it spins up a lightweight hypervisor that exists solely to run a single, embedded Linux instance that includes the latest stable release of Docker Engine. Unlike most hypervisors, this requires no special admin privileges because it uses the Hypervisor framework included with the operating system (available since OS X 10.10). The Docker application also bundles libraries that supply the Docker VM with host networking and storage capabilities that map intelligently between Linux and OS X/Windows semantics.

Today, we are excited to announce the open-sourcing of these discrete components, the same source code we use in the release builds of Docker for Mac and Windows. The new components are:

  • HyperKit™: A lightweight virtualization toolkit for OS X
  • DataKit™: A modern pipeline framework for distributed components
  • VPNKit™: A library toolkit for embedding virtual networking

Each of these kits can be used independently or together to form a complete product such as Docker for Mac or Windows. This is just the beginning: we will open more components in the future as they mature (e.g. the filesystem framework). Each kit has a set of curated Pioneer Projects for beginners to take on: HyperKit™, DataKit™, and VPNKit™.

HyperKit

HyperKit takes a lightweight approach to virtualization, made possible by the Hypervisor framework supplied with OS X 10.10 onwards. HyperKit applications can take advantage of hardware virtualization to run VMs without requiring elevated privileges or complex management tool stacks.

HyperKit is built on the xhyve and bhyve projects, with additional functionality to make it easier to interface with other components such as VPNKit or DataKit. Since HyperKit is broadly structured as a library, linking it against unikernel libraries is straightforward. For example, we added persistent block device support that uses the MirageOS QCow libraries written in OCaml.

How can you contribute?

There are three great areas for contribution:

  • Support for booting more guest operating systems. Linux is the only “first class” operating system supported at the moment. FreeBSD does boot, but requires running the installer and so isn’t as seamless. Patches exist to add more BIOS support to boot Windows, OpenBSD, or NetBSD, but require more testing.
  • Support for more high-level language bindings. Because HyperKit is structured as a library, it can be interfaced with high-level languages using their normal foreign function interfaces.
  • Hypervisor features. Several traditional hypervisor features such as suspend/resume, live relocation and support for hardware performance counters are not supported. These need to be added in the same library style as the rest of the codebase, in order to ensure that HyperKit remains lightweight and easy to embed.
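
To make the language-bindings point concrete, here is a minimal Python sketch of the foreign function interface pattern. It does not call HyperKit: it binds strlen from the C library already loaded in the process, purely as a stand-in, and c_strlen is a hypothetical wrapper name. Binding HyperKit would presumably follow the same shape, assuming it is built as a shared library whose exported symbols and signatures you know.

```python
import ctypes

# Assumption: a POSIX host (OS X or Linux). CDLL(None) is dlopen(NULL), which
# exposes symbols already loaded into this process, including the C library.
# This is a stand-in library, not HyperKit.
libc = ctypes.CDLL(None)

# Declare the C signature explicitly: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Hypothetical wrapper: call the C strlen() through the FFI."""
    return libc.strlen(text)
```

Declaring argtypes and restype up front is the important step: without them, ctypes assumes int-sized arguments and return values, which breaks quietly on 64-bit pointers and sizes.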

We will ensure that any contributions are structured such that they can be submitted to their respective upstream projects.

How else can you use it?

Any applications that need to spin up specialised or short-lived virtual machines can benefit from linking against HyperKit. These could be conventional operating systems such as Linux, or some of the unikernel projects once they have been ported to HyperKit.

DataKit

DataKit is a toolkit to coordinate processes with a Git-compatible filesystem interface. It revisits the UNIX pipeline concept and the Plan 9 9P protocol, but with a modern twist: streams of tree-structured data instead of raw text. DataKit lets you define complex workflows between loosely coupled processes using something as simple as shell scripts interacting with a version-controlled filesystem.

DataKit is a rethinking of application architecture around data flows, bringing back the wisdom of Plan 9’s “everything is a file”, in the git era where “everything is a versioned file”. Since we are making use of DataKit and 9P heavily in Docker for Mac and Windows, we are also open sourcing go-p9p, a modern, performant 9P library for Go.
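
The "everything is a versioned file" idea can be illustrated with a toy model. The sketch below is not DataKit's API and does not speak 9P or Git: it is an in-memory stand-in in which every commit is a snapshot of a path-to-content tree, and a consumer reacts to the paths that changed between two commits. The class name VersionedTree and the pr/42/... paths are hypothetical.

```python
import hashlib
import json


class VersionedTree:
    """In-memory toy: every commit stores a full snapshot of path -> content."""

    def __init__(self):
        self.commits = {}  # commit id -> (parent id, snapshot dict)
        self.head = None

    def commit(self, changes):
        """Apply {path: content, or None to delete} on top of head."""
        snapshot = dict(self.commits[self.head][1]) if self.head else {}
        for path, content in changes.items():
            if content is None:
                snapshot.pop(path, None)
            else:
                snapshot[path] = content
        # The id is derived from the parent id and the snapshot contents.
        payload = json.dumps([self.head, snapshot], sort_keys=True).encode()
        commit_id = hashlib.sha1(payload).hexdigest()
        self.commits[commit_id] = (self.head, snapshot)
        self.head = commit_id
        return commit_id

    def read(self, commit_id, path):
        """Return the content at path in the given commit, or None."""
        return self.commits[commit_id][1].get(path)

    def diff(self, old_id, new_id):
        """Return the sorted paths whose content differs between two commits."""
        old = self.commits[old_id][1] if old_id else {}
        new = self.commits[new_id][1]
        return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


# Hypothetical workflow: one process records a build request, another the result.
tree = VersionedTree()
c1 = tree.commit({"pr/42/head": "abc123", "pr/42/status": "pending"})
c2 = tree.commit({"pr/42/status": "success"})
changed = tree.diff(c1, c2)  # the paths a watching process would react to
```

Because each process only reads and writes paths in the tree, the producer and the consumer never need to know about each other, and the full history of every intermediate state remains available for inspection.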

How else can you use it?

There is a sample project using DataKit to create a Continuous Integration system in 50 lines of shell scripts in this repository: github.com/docker/datakit/tree/master/ci

The README also covers DataKit integration with GitHub. DataKit can be used in any situation where you need to coordinate processes around data, and it shines when that data is versioned.

How can you contribute?

GitHub PR support in DataKit is still quite basic, so this is an area that could use additional contributions. DataKit could be used for a very broad set of use cases: share how you use it in your projects.

VPNKit

VPNKit is a networking library that translates between raw Ethernet network traffic and the equivalent socket calls in OS X or Windows. It is based on the MirageOS TCP/IP unikernel stack, and is a library written in OCaml. VPNKit is useful when you need fine-grained control over networking protocols in user-space, with the additional convenience of being extensible in a high-level language.
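
As a rough illustration of the first step in that translation, the Python sketch below parses a raw Ethernet frame carrying IPv4 and TCP and extracts the destination address and port that a user-space proxy would then pass to an ordinary socket connect call. It is not VPNKit code (VPNKit is written in OCaml on the MirageOS stack); the function names, MAC addresses and source IP are illustrative assumptions, and checksums are not validated.

```python
import socket
import struct

ETHERTYPE_IPV4 = 0x0800
PROTO_TCP = 6


def tcp_destination(frame: bytes):
    """Return (dst_ip, dst_port) for an IPv4/TCP Ethernet frame, else None."""
    if len(frame) < 14:  # Ethernet header: dst MAC, src MAC, EtherType
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = frame[14:]
    if len(ip) < 20 or ip[0] >> 4 != 4:
        return None
    ihl = (ip[0] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or len(ip) < ihl + 4 or ip[9] != PROTO_TCP:
        return None
    dst_ip = socket.inet_ntoa(ip[16:20])
    # The TCP destination port is bytes 2-3 of the TCP header.
    (dst_port,) = struct.unpack("!H", ip[ihl + 2:ihl + 4])
    return dst_ip, dst_port


def build_demo_frame(dst_ip: str, dst_port: int) -> bytes:
    """Build a minimal TCP SYN frame for the demo (checksums left as zero)."""
    eth = (b"\x02\x00\x00\x00\x00\x01" + b"\x02\x00\x00\x00\x00\x02"
           + struct.pack("!H", ETHERTYPE_IPV4))
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, PROTO_TCP, 0,
                     socket.inet_aton("192.168.65.2"),  # arbitrary source IP
                     socket.inet_aton(dst_ip))
    tcp = struct.pack("!HHIIBBHHH", 49152, dst_port, 0, 0, 5 << 4, 0x02,
                      65535, 0, 0)
    return eth + ip + tcp
```

With the destination in hand, a proxy in this style could open a host-side connection with socket.create_connection((dst_ip, dst_port)) and relay payload bytes in both directions, so the traffic appears to the host as ordinary application sockets.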

How can you contribute?

VPNKit provides an interception point for all container traffic going through Docker for Mac or Windows. It could be extended with support for packet capture and inspection, protocol proxying to filter for particular traffic patterns, or even HTTP protocol visualisation for debugging web applications.

How else can you use it?

If VPNKit had support for more endpoint types, it could also be used to test network traffic without the overhead of actually generating and transmitting it. It could also be used to build lightweight overlay networks between application components.

Next Steps

While VPNKit and DataKit started life as quite specialised components in Docker for Mac and Windows, we are excited by the possibilities enabled by open sourcing them. The ideas here are by no means exhaustive, and we are looking forward to hearing about your own projects. Please file issues in the respective bug trackers as you come across them, or if you wish to discuss a particular idea.

And if you are at OSCON, please come meet and collaborate with the maintainers of these projects in our OSCON Contribute session on Thursday from 3 to 6 PM in Meeting Room 6. You can find more details about the internals of Docker for Mac and Windows in the slides for the talk I gave yesterday at OSCON.

If you haven’t already, please sign up for the Docker for Mac and Windows beta (https://beta.docker.com) and send us feedback to make it better as we head towards general availability. Finally, we would once again like to thank all of the open source efforts that made this release possible. The Docker for Mac and Windows acknowledgements list the hundreds of contributions that we use directly in our product, and we hope that you will also be able to check out and benefit from today’s releases in your own creations.
