Boxen Overview

Powering Workstations With Git & Puppet

Created by Jim Moore / @jdiggerj

Why Boxen?

"Infrastructure as code" is accepted for our servers, but why don't we apply it to our workstations?

Boxen is a tool that enables not just automated installation of software, but also configuration of our desktops

Boxen puts the developer in control of their desktop (as if they weren't anyway) but also scales across companies, teams, and even home vs work computers

Consistency & Repeatability

No more "follow the steps on the wiki at ..."

Consistently configure your workstation environment

Remove manual steps

Computers Are Faster Than Humans

Small tax on initial customizations

Graceful iterative customizations

Pays back the FIRST time it is re-used

Install apps AND configure them
including a lot of the settings that normally require manual changes

Geek Cool

A tool meant for developers/power-users

Reinforce DevOps mantra "infrastructure is code"


Everything is versionable (thus rollbacks) and diff-able


Differences between environments can easily be handled

Apply server configuration technology to your workstation

Desktops are "Special"

How do you automate installation/config on a Mac?
(or any "desktop" OS)?

Lack of exposed software versioning

Installation assumes single user

Installation assumes a GUI


  • Mac OS X 10.8+ only
  • Xcode command-line tools (slightly tricky pre-Mavericks)

Getting Started: The Bootstrap

sudo mkdir -p /opt/boxen
sudo chown ${USER}:staff /opt/boxen
git clone [your_boxen_repo] /opt/boxen/repo
cd /opt/boxen/repo
./script/boxen [--no-fde]

By default the script expects Full Disk Encryption to be enabled; the optional --no-fde flag skips that check

Alternate: Boxen Web

Primary Components


  • GitHub: version control and user management
  • Puppet: system configuration
  • Puppet Librarian: module management for Puppet
  • Homebrew: provides "most" of the software
  • Ruby (rbenv): most of the scripting

Software Package Providers

There are many "providers" of software. Some examples:

  • homebrew: the default for Boxen and one of the most capable; primarily limited to open-source software because it wants to compile the code for your machine
  • brewcask: the preferred way to install binary programs
  • appdmg: if the *.dmg contains an *.app to drag into /Applications
  • appdmg_eula: if the *.dmg asks for a license agreement when it is opened and contains an *.app to drag into /Applications
  • pkgdmg: if the *.dmg contains a *.pkg installer
  • compressed_app: if it is a *.zip to unzip and place into /Applications

Other sources include gem, npm, pip, ...

Provider Features

The OS X packaging providers are, in general, incredibly "limited"

By definition, providers install software. Some can do more:


Providers (other than homebrew & macports) track package installation at /var/db/.puppet_{provider}_installed_{package_name}


Therefore, to re-install, you must delete the file above before running boxen again

Boxen Limitations

The primary limitation is that it's not (yet?) able to install software from the Apple App Store


Otherwise install and configure your entire system automatically without manual steps

The time cost is primarily download time

Customize & Iterate

  • Edit
    • modules
    • packages
    • people
    • ...
  • re-run /opt/boxen/script/boxen

Puppet Terms

Basic Structure

(Highly, highly simplified)

  • Module: Effectively Puppet's "library" unit
  • Manifest: The file containing the description/script
  • Resource: A unit of configuration (e.g., "file", "package")
  • Class: A singleton managing Resources
  • Defined Type: Can have more than one instance managing Resources

Puppet Glossary

In general, a "Module" has at least one "Manifest", which has at least one "Class"

Puppet Terms:

Every resource has:

  • Type (such as package, file, exec, ...)
  • Title
  • One or more attributes
    • e.g., an "ensure" attribute with a value of "present"
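As a minimal sketch of those three parts, the resource below has the type file, a title, and a few attributes. The path and content are hypothetical and chosen only for illustration.

```puppet
# Type: file. Title: '/tmp/boxen-demo.txt' (a made-up path for this sketch).
file { '/tmp/boxen-demo.txt':
  ensure  => 'present',              # attribute => value
  content => "managed by Puppet\n",  # literal file content
  mode    => '0644',
}
```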


Resources can (and often do) depend on other resources

Puppet will build a dependency graph to make sure everything is applied in the correct order

Each resource title must be unique for its type within the dependency graph (a DAG)
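A small sketch of an explicit dependency, assuming a Homebrew-provided git package and a hypothetical config file path: the require attribute adds an edge to the graph so the package is applied before the file.

```puppet
package { 'git':
  ensure => 'installed',
}

# '/tmp/boxen-demo-gitconfig' is a hypothetical path used only for this sketch.
file { '/tmp/boxen-demo-gitconfig':
  ensure  => 'present',
  content => "[color]\n  ui = auto\n",
  require => Package['git'],  # apply Package['git'] first
}
```

The reference Package['git'] works because the type plus title pair identifies exactly one resource.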

Puppet Terms:

  • Facts (Facter): Discrete information about the machine (FQDN, IP addr, OS, etc.) gathered by "facter"
  • Puppet Librarian: Used for managing Puppet Modules
  • Hiera: Hierarchical data: a flexible way of providing configuration data based on "facts"
  • Profiles & Roles: Not covered here, but a Puppet Enterprise convention for organizing Hiera data

Boxen Terms

  • User: Defined by the "fact": ::boxen_user
  • Project: A grouping meant for "make sure people on this team have at least this configuration"; you can have multiple "projects" applied to a machine


Packages are the primary way software is installed.

Simple example:

package { 'gradle':
    ensure   => "installed", # *
    provider => "homebrew",
}

package { "IntelliJ-IC-12.1.4":
    provider => 'appdmg_eula',
    source   => "",
}

* typically defaults to "present". Valid values vary by resource type & provider. May include installed, latest, absent, or a specific version/version range (e.g., ">= 1.12")

Other Resources


Puppet File Documentation

$home = "/Users/${::boxen_user}" # 1

file { "${home}/.zshrc":
  source => 'puppet:///modules/people/jdigger/zshrc', # 2
}

file { "${home}/.zshenv":
  content => template('people/jdigger/zshenv.erb'), # 3
}

  1. define a variable based on a "fact"
  2. a static file from a module
  3. set the content from a template (Ruby ERB by default)


GitHub Repository

$home = "/Users/${::boxen_user}"
$srcdir = "${home}/src"

repository { "${srcdir}/git-process" :
  source   => '',
  path     => "${srcdir}/git-process",
  provider => 'git',
}

Ruby Gem

$ruby_version = '1.9.3'

ruby::gem { "git-process for ${ruby_version}": # 1
    gem     => 'git-process',
    ruby    => $ruby_version, # 2
    version => '~> 2.0', # 3
}

  1. because there could be more than one installation of the gem (in the various rbenv versions), it's a good idea to put the Ruby version in the resource name so it is guaranteed to be unique
  2. the version of Ruby (using rbenv) to install the gem into
  3. of course gem versions can use the semantic versioning support of Ruby Gems


example of configuring Adium

property_list_key { 'Adium users':
  path       => "${home}/Library/Application Support/Adium 2.0/Users/Default/Accounts.plist",
  key        => 'Accounts',
  value      => [
    {
      'Service'  => 'GTalk',
      'UID'      => $gtalk_name,
      'Type'     => 'libpurple-jabber-gtalk',
      'ObjectID' => '1',
    },
    {
      'Service'  => 'Yahoo!',
      'UID'      => $yahoo_name,
      'Type'     => 'libpurple-Yahoo!',
      'ObjectID' => '2',
    },
    {
      'Service'  => 'AIM',
      'UID'      => $aim_name,
      'Type'     => 'libpurple-oscar-AIM',
      'ObjectID' => '3',
    },
  ],
  value_type => 'array',
}


boxen::osx_defaults { 'scrollbars always on':
  domain => 'NSGlobalDomain',
  key    => 'AppleShowScrollBars',
  value  => 'Always',
  user   => $::boxen_user,
}

osx::recovery_message { 'If this Mac is found, please call 555-555-5555': }

include osx::finder::unhide_library

See the defaults man page

An awesome list of available settings/tools can be found at OSXDefaults


Puppet Exec Documentation

$source_tgz =
    '' # 1
exec { 'install gjslint': # 2
    command => "easy_install ${source_tgz}",
    user    => 'root', # 3
    creates => '/usr/local/bin/gjslint', # 4
}

  1. should come in as part of a class definition...
  2. arbitrary resource name: if "command" is not given, this is used, but it's generally best to give it a "meaningful" name
  3. the user to sudo as when executing the command
  4. before running the command, the existence of this file is checked; if it's there, it's assumed this has already run

Instead of "creates", you can use "onlyif"/"unless" to run a command, such as

unless => 'grep root /usr/lib/cron/cron.allow 2>/dev/null'
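Put together, a full resource using "unless" might look like the sketch below. The command and the resource title are assumptions made for illustration; the shell provider is used so that the redirections are interpreted by /bin/sh.

```puppet
# Hypothetical example: add root to cron.allow only when it is not listed yet.
exec { 'allow root to use cron':
  command  => 'echo root >> /usr/lib/cron/cron.allow',
  user     => 'root',
  provider => 'shell',  # run through /bin/sh so >> and 2> work
  # The command above runs only when this check exits non-zero.
  unless   => 'grep -qx root /usr/lib/cron/cron.allow 2>/dev/null',
}
```

"onlyif" is the mirror image: the command runs only when the check exits zero.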

Defining Classes

The primary way of referencing groups of Resources is via Classes.

A Class may take parameters for configuration.

from puppet-intellij (slide-ware version)

class intellij($edition='community', $version='13.1.1') {
  case $edition {
    'community': { $edition_real = 'IC' }
    'ultimate': { $edition_real = 'IU' }
    default: { fail('Class[intellij]: parameter edition must be community or ultimate') }
  }

  package { "IntelliJ-IDEA-${edition_real}-${version}":
    provider => 'appdmg_eula',
    source   => "${edition_real}-${version}.dmg",
  }
}

Defining Classes (annotated)

class intellij($edition='community', $version='13.1.1') {
  case $edition { # 1
    'community': { $edition_real = 'IC' }
    'ultimate': { $edition_real = 'IU' }
    default: { fail('Class[intellij]: parameter edition must be community or ultimate') } # 2
  }

  package { "IntelliJ-IDEA-${edition_real}-${version}": # 3
    provider => 'appdmg_eula', # 4
    source   => "${edition_real}-${version}.dmg", # 5
  }
}

  1. Conditionals for setting a "private" variable
  2. Able to fail the configuration before anything is applied
  3. Variable substitution. (Resource names must be unique)
  4. Install from .dmg, auto-accepting the EULA
  5. Where to download the installation package from

Calling Classes

Explicitly calling with parameters

class { "intellij":
    edition => 'community',
    version => '13.1.1',
}

Calling with default parameters

class { "intellij": }

Including with default parameters

include "intellij"

The include form has the advantage over class {:} in that class {:} can only appear once in your entire graph. include will add the class resource if it's not yet declared, or do nothing if it already is.
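A short sketch of that difference, using Puppet's include function and the intellij class from the earlier slides:

```puppet
# Declaring a class with include is idempotent, so repeating it is harmless.
include intellij
include intellij

# Resource-like declarations are not. Uncommenting both lines below would
# fail catalog compilation with a duplicate declaration error.
# class { 'intellij': }
# class { 'intellij': }
```

This is why shared modules tend to use include: two unrelated manifests can both ask for the same class without coordinating.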

Calling Classes with Hiera

Want simple resource declarations AND centralized/powerful configuration data (like versions)? That's what Hiera is for... 

Including with parameter lookup

include "intellij"

Hiera YAML configuration

intellij::edition: 'ultimate'
intellij::version: '13.1.2'

Separates usage from configuration

Defined Types

Defined Types are similar to Classes, providing scoping, resource management, ability to pass in parameters, etc.

Unlike Classes, you can have multiple instances of Defined Types.

Because of that, you can't use include or tools like Hiera to configure them.

from the boxen/puppet-sublime_text_2 module

define sublime_text_2::package($source) {
  require sublime_text_2::config
  repository { "${sublime_text_2::config::packagedir}/${name}":
    source => $source
  }
}

If you're trying to decide between declaring something as a class or a define, err toward class
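When several instances really are needed, a defined type is declared once per instance, each with its own title. A minimal sketch using sublime_text_2::package; the titles are assumptions, and each becomes $name inside the define.

```puppet
# Two instances of the same defined type, distinguished by title.
sublime_text_2::package { 'BracketHighlighter':
  source => 'facelessuser/BracketHighlighter',
}

sublime_text_2::package { 'GitGutter':
  source => 'jisaacks/GitGutter',
}
```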

Resource Defaults

repo/manifests/site.pp (simplified)

Exec {
  group => 'staff',
  user  => $boxen_user,
  path  => [
    '/usr/bin', '/bin', '/usr/sbin', '/sbin',
  ],
}

File {
  group => 'staff',
  owner => $boxen_user,
}

Package {
  provider => homebrew,
}

Repository {
  provider => git,
}

Service {
  provider => ghlaunchd,
}
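A sketch of what those defaults mean in practice, assuming the Package default above is in scope. The package names are only illustrative.

```puppet
# No provider given, so the Package default (homebrew) applies.
package { 'gradle':
  ensure => 'installed',
}

# An explicit attribute overrides the default for this one resource.
package { 'bundler':
  ensure   => 'installed',
  provider => 'gem',
}
```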

Brief Structure

Boxen default installation at /opt/boxen

All user edits to the repo subdirectory (/opt/boxen/repo/)

Editing /opt/boxen/repo

Ordered roughly by likelihood that you'll modify something.

  • modules/: the Puppet "modules" to load; most customization happens here
  • modules/people: contains the module associated with a specific person
  • Puppetfile: control file for Puppet Librarian
  • hiera/: data-based configuration
  • modules/projects: contains the module associated with projects
  • manifests/site.pp: sets the defaults for the loaded modules
  • config/: contains some files for tweaking more advanced features

Parts to Avoid Editing in /opt/boxen/repo

Directories you should not directly interact with:

  • script/: contains the primary scripts, including boxen itself
  • shared/, vendor/: cache for Librarian modules
  • bin/: the shims for installed Ruby Gems, including Puppet
  • ../ (that is, /opt/boxen): where Boxen installs a lot of its other control files
  • *.lock: control files for version locking

The *.lock files are great when you understand them, and generally stay out of your way even when you don't

But they can be a bit of a pain when you merge in remote changes to the project...

See the FAQ

Boxen User Edits

Directory: /opt/boxen/repo/modules/people

By default {user} is based on your GitHub user ID

However, you can leverage parts and pieces from other users

  • manifests/{user}.pp: primary manifest for a user
  • manifests/{user}/: where your "subclasses" go
  • files/{user}/: static resources for the user
  • templates/{user}/: templatized resources for the user
  • spec/: unit tests for the manifests

Example User Manifest


class people::jdigger {
    include people::jdigger::dotfiles
    include people::jdigger::bin
    include people::jdigger::applications
    include people::jdigger::ruby
    include people::jdigger::git
    include people::jdigger::sublime_text_2
    include people::jdigger::osx
}

Just as it's good practice in OOD to delegate low-level details to subclasses, you should do the same in Puppet class design as well

Example "Sub Class"


class people::jdigger::sublime_text_2 {
  include 'sublime_text_2' # 1
  $home = "/Users/${::boxen_user}"
  file { "${home}/Library/Application Support/Sublime Text 2/Packages/User":
    ensure => 'directory',
    owner  => $::boxen_user,
    mode   => '0755',
  } # 2
  -> # 3
  file { "${home}/Library/Application Support/Sublime Text 2/Packages/User/Preferences.sublime-settings":
    source  => 'puppet:///modules/people/jdigger/sublime-settings',
  } # 4
}

  1. make use of the "general" sublime_text_2 module
  2. ensure that the user preferences directory exists
  3. shorthand to declare that the following resource needs the previous one to be applied first
  4. I want Sublime Text 2 to behave the same for me regardless of what machine I'm on, so any changes I make to the settings are done in the module

Puppet Librarian: Puppetfile

Puppet Librarian usage/syntax

some standard entries

github "dnsmasq",  "1.0.1"
github "gcc",      "2.0.100"
github "git",      "2.3.0"
github "homebrew", "1.6.2"

Where github is a function that translates

github "git", "2.3.0"

into

mod "git", "2.3.0", :github_tarball => "boxen/puppet-git"

Puppet Librarian: Finding Modules

Search for Module Names

Boxen Facter Extensions

snippet from shared/lib/facter/boxen.rb

dot_boxen   = "#{ENV['HOME']}/.boxen"
user_config = "#{dot_boxen}/config.json"

require "boxen/config"
config = Boxen::Config.load

facts["github_login"]  = config.login
facts["github_email"]  = config.email
facts["github_name"]   = config.name
facts["github_token"]  = config.token

facts["boxen_home"]     = config.homedir
facts["boxen_srcdir"]   = config.srcdir
facts["boxen_repodir"]  = config.repodir
facts["boxen_reponame"] = config.reponame
facts["boxen_user"]     = config.user

The "cached" values for those are in /opt/boxen/config/boxen/defaults.json

You can set your own personal/private custom Facts in ~/.boxen/config.json

Use Case: Private vs Personal

Typical personal configuration goes in modules/people/manifests/${::github_login}.pp, which is visible to everyone who can read the boxen repo

This is generally a very good thing

However, some facts do not belong in a public location (Passwords, SSH keys, OAuth tokens, etc.)

For configuration that can easily be turned into "data" -- especially if it would be used in multiple places when configuring the system (e.g., user names and passwords) -- the ~/.boxen/config.json file is perfect

For configuration that's more complex, like your ~/.ssh directory, create a private Bitbucket repository, an encrypted .zip in Dropbox, or ...

Keep it simple and secure: You want it to be easily accessible before the rest of your system is set up



  • Use the --noop option
  • Don't install/configure your machine except via Boxen (this is a culture shift, but worth it)
  • The --debug output has a wealth of information, but can be overwhelming
  • Don't forget it's all code and backed by git: use the forking and branching processes you normally would
  • Design your classes using parameters and facts, keeping Hiera in mind
  • Especially if you're creating modules "for realz" (or are just suitably paranoid), run in a VM: Creating an ISO

Infrastructure As Code

Don't forget it's all code and backed by git

Use the forking and branching processes you normally would

merge with the upstream (e.g., boxen/our-boxen) often

git remote add boxen
git fetch --all
git merge boxen/master

When you get merge conflicts on *.lock files, see the FAQ

Make use of other people's modules, and post your own!

Re-use by Module

Create modules/packages/ for general-purpose Puppet classes to share across users or projects

Publish useful packages

OSS benefits all, and the "packages" module and its like should be considered temporary/stop-gap

This also works well for "proprietary" software/configurations (though those may be published on an internal repo instead of a public one)

Avoid 'node default'

The default repo/manifests/site.pp contains a node default section that will get loaded up on every machine

There are two problems with it:

  • The "out of the box" list contains a bunch of stuff that people may or may not care about having loaded up on their machine (e.g. Node.js, nginx, 4 versions of Ruby)
  • node default doesn't allow changing configuration values, etc.

It's a legacy of Puppet 1.0, long before much more flexible mechanisms like Hiera

Use Hiera

A much better approach is to replace 'node default' in 'site.pp' with

if hiera_array('classes', undef) {
  # assumed body: include every class listed under the 'classes' key
  hiera_include('classes')
}

Then, if you want to make sure everyone's got Ruby set up, put the following in hiera/common.yaml

classes:
  - ruby::global
ruby::global::version: "2.1.1"
    ensure: v20140420
    source: sstephenson/ruby-build

Hiera Power

With a simple shim you can do things like...


    source: 'facelessuser/BracketHighlighter'
    source: 'SublimeCodeIntel/SublimeCodeIntel'
    source: 'kemayo/sublime-text-git'
    source: 'jisaacks/GitGutter'


    source: 'SublimeText/AsciiDoc'
    source: 'revolunet/sublimetext-markdown-preview'
    source: 'dzhibas/SublimePrettyJson'
    source: 'russCloak/SublimePuppet'
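One possible shape for such a shim, as a sketch rather than the author's actual code: read a hash from Hiera and turn each entry into a defined-type instance with create_resources. The key name sublime_text_2::packages is an assumption, as is the data layout described in the comments.

```puppet
# Assumed Hiera data: a hash keyed by package name, where each value is a
# hash of parameters such as { 'source' => 'jisaacks/GitGutter' }.
# 'sublime_text_2::packages' is a hypothetical key chosen for this sketch.
$sublime_packages = hiera_hash('sublime_text_2::packages', {})

# Declares one sublime_text_2::package instance per entry in the hash.
create_resources('sublime_text_2::package', $sublime_packages)
```

Because hiera_hash merges matching keys across the hierarchy, user-level and project-level package lists would be combined rather than one replacing the other.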

Hiera Structure

For example in repo/config/hiera.yaml

:merge_behavior: deeper
:backends:
  - yaml
:yaml:
  :datadir: "%{::boxen_home}/repo/hiera"
:hierarchy:
  - "users/%{::github_login}/nodes/%{::hostname}"
  - "users/%{::github_login}/nodes/common"
  - "users/%{::github_login}"
  - "projects/%{::boxen_project_01}"
  - "projects/%{::boxen_project_02}"
  - "projects/%{::boxen_project_03}"
  - "projects/%{::boxen_project_04}"
  - "projects/%{::boxen_project_05}"
  - "projects/%{::boxen_project_06}"
  - "projects/%{::boxen_project_07}"
  - "projects/%{::boxen_project_08}"
  - "projects/%{::boxen_project_09}"
  - "projects/%{::boxen_project_10}"
  - "projects/common"
  - "common"


More advanced, but it can be worth borrowing some techniques from Puppet Enterprise


class people::jdigger::applications ($system_roles = undef) {
  $_system_roles = hiera_array('people::jdigger::system_roles', [])
  $roles = $system_roles ? { undef => $_system_roles, default => $system_roles }
  include people::jdigger::applications::general
  if member($roles, 'work') {
    include 'people::jdigger::applications::work'
  }
  if member($roles, 'personal') {
    include 'people::jdigger::applications::personal'
  }
}


boxen::security::require_password: false
people::jdigger::system_roles:
  - personal


class people::jdigger::applications::personal {
  include 'calibre'
  include 'steam'
}



Easy to start

Easy to iterate

Fast return on investment

Installation + Configuration

All the advantages of source control

Q & A