
Self-Host Hashicorp Vault Secrets Server with Docker

Posted on May 16, 2024  •  16 minutes  •  3294 words

Recently, I have been evaluating Hashicorp’s Vault Server and set it up on several machines in a simple setup. Since the instructions on the Internet are somewhat scattered, I document my approach in the hope that it may help others.

Environment

I will not be using Kubernetes or Docker Swarm (in both, this can be solved quite elegantly with appropriate container placement, provided you master the storage challenge). We use separate Docker instances for historical reasons in this project. The advantage is that I can document the individual steps quite well 😇

Our setup consists of three nodes/servers, each running Docker and connected via a VLAN: app01 (192.168.100.111), app02 (192.168.100.112), and app03 (192.168.100.113).

The servers run Ubuntu 22.04. Those who prefer not to use Docker can also install Vault as a service - the process is not too difficult, you just need to adjust the commands below accordingly. The version of Vault used is 1.16.

Creating Certificates

First, we create the certificates for encrypted communication with Vault (and between the instances). To do this, we set up a Certificate Authority (CA) and certificates for each Vault node. On any server, we execute the following:

sudo mkdir /etc/vault
cd /etc/vault

sudo tee extfile.cnf << EOF
[ req ]
distinguished_name = req_distinguished_name
x509_extensions = v3_ca
prompt = no

[ req_distinguished_name ]
C = DE
ST = MyState
L = MyLocation
O = MyPlace
CN = RootCA

[ v3_ca ]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = critical,CA:true
keyUsage = critical, digitalSignature, cRLSign, keyCertSign
EOF

sudo tee extfile01.cnf << EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = req_ext
prompt = no

[ req_distinguished_name ]
C = DE
ST = MyState
L = MyLocation
O = MyPlace
CN = app01

[req_ext]
subjectAltName = @alt_names

[alt_names]
IP.1 = 192.168.100.111
IP.2 = 127.0.0.1
DNS.1 = app01
DNS.2 = vault
EOF

sudo tee extfile02.cnf << EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = req_ext
prompt = no

[ req_distinguished_name ]
C = DE
ST = MyState
L = MyLocation
O = MyPlace
CN = app02

[req_ext]
subjectAltName = @alt_names

[alt_names]
IP.1 = 192.168.100.112
IP.2 = 127.0.0.1
DNS.1 = app02
DNS.2 = vault
EOF

sudo tee extfile03.cnf << EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = req_ext
prompt = no

[ req_distinguished_name ]
C = DE
ST = MyState
L = MyLocation
O = MyPlace
CN = app03

[req_ext]
subjectAltName = @alt_names

[alt_names]
IP.1 = 192.168.100.113
IP.2 = 127.0.0.1
DNS.1 = app03
DNS.2 = vault
EOF

This creates the extension files for OpenSSL. They contain the extensions that Vault requires in its certificates - without them, Vault refuses to communicate with the other instances. It is important to specify all DNS names and IPs under which each Vault instance can be reached. If you want to access Vault via additional IPs or DNS names, the lists must be extended accordingly!

Now we can create the actual certificates for the servers:

cd /etc/vault

# CA
sudo openssl genrsa -out ca.key 4096
sudo openssl req -new -x509 -days 3650 -key ca.key -out ca.crt -config extfile.cnf

# Host 01
sudo openssl genrsa -out vault01.key 4096
sudo openssl req -new -key vault01.key -out vault01.csr -config extfile01.cnf
sudo openssl x509 -req -days 3650 -in vault01.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out vault01.crt -extensions req_ext -extfile extfile01.cnf

# Host 02
sudo openssl genrsa -out vault02.key 4096
sudo openssl req -new -key vault02.key -out vault02.csr -config extfile02.cnf
sudo openssl x509 -req -days 3650 -in vault02.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out vault02.crt -extensions req_ext -extfile extfile02.cnf

# Host 03
sudo openssl genrsa -out vault03.key 4096
sudo openssl req -new -key vault03.key -out vault03.csr -config extfile03.cnf
sudo openssl x509 -req -days 3650 -in vault03.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out vault03.crt -extensions req_ext -extfile extfile03.cnf

So, we first create the CA certificate and then a key and certificate for each instance. The certificates are valid for about 10 years. In a production environment, it is advisable to use certificates with a shorter validity period - or to set up a dedicated authority for issuing such certificates, such as step-ca.

Finally, we test whether the certificates also contain the necessary extensions:

openssl x509 -noout -text -in vault01.crt | grep -A 1 "Subject Alternative Name"
openssl x509 -noout -text -in vault02.crt | grep -A 1 "Subject Alternative Name"
openssl x509 -noout -text -in vault03.crt | grep -A 1 "Subject Alternative Name"

All the necessary IPs and DNS names should be listed in the output.
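
For app01, the relevant part of the output should look roughly like this (with the values from your extension files):

X509v3 Subject Alternative Name:
    IP Address:192.168.100.111, IP Address:127.0.0.1, DNS:app01, DNS:vault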

Important: When you are done, all created files must be copied to the other servers, e.g., via rsync or scp. From this point on, I assume the certificates are present on all servers in the /etc/vault folder.
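
For example with rsync (assuming root SSH access to the other nodes; adjust user and target as needed):

sudo rsync -av /etc/vault/ root@192.168.100.112:/etc/vault/
sudo rsync -av /etc/vault/ root@192.168.100.113:/etc/vault/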

Also important: The services must be able to communicate with each other. For this, the TCP ports 8200 and 8201 must be open in the VLAN. Here are example commands for UFW:

sudo ufw allow proto tcp from 192.168.100.0/24 to any port 8200
sudo ufw allow proto tcp from 192.168.100.0/24 to any port 8201
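
Once Vault is running (see below), you can verify reachability from another node, for example with netcat if it is installed:

nc -vz 192.168.100.111 8200
nc -vz 192.168.100.111 8201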

Auto-Unsealing with AWS KMS

If you want to run Vault in a production system, you want it to unseal automatically after restarts. There are several ways to implement this: Vault supports auto-unseal via cloud key management services (e.g. AWS KMS, Azure Key Vault, GCP Cloud KMS), via the Transit secrets engine of another Vault instance, or via an HSM (Enterprise only).

I have experimented a bit and ultimately ended up with AWS KMS. It is quite affordable at about 1 USD per month and easy to set up (if you know where to click in AWS 😱). The other solutions are feasible, but more involved.

Here’s a brief guide for AWS: create a symmetric KMS key, create an IAM user with programmatic access and generate an access key for it, and attach a policy to the user that allows it to use the key, for example:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "VaultKMSUnseal",
			"Effect": "Allow",
			"Action": [
				"kms:Decrypt",
				"kms:Encrypt",
				"kms:DescribeKey"
			],
			"Resource": "*"
		}
	]
}
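
If you prefer the AWS CLI over the console, the setup could look roughly like this (the user name vault-unseal is just an example, and vault-kms-policy.json contains the policy above):

aws kms create-key --description "vault auto-unseal"
aws iam create-user --user-name vault-unseal
aws iam put-user-policy --user-name vault-unseal --policy-name VaultKMSUnseal --policy-document file://vault-kms-policy.json
aws iam create-access-key --user-name vault-unseal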

This gives us four pieces of data: the access key ID, the secret access key, the ID of the KMS key, and the AWS region in which the key was created. All four go into the seal "awskms" block of the Vault configuration below.

Starting Vault on the individual Servers

app01

We start on app01 - the procedure is similar on the other machines. We need the key data from AWS and execute the following commands:

access_key=123
secret_key=secret
kms_key_id=key
kms_region=eu-central-1

cd /etc/vault

# Config
sudo tee vault.hcl << EOF
cluster_addr  = "https://192.168.100.111:8201"
api_addr      = "https://0.0.0.0:8200"

storage "raft" {
  path = "/vault/data"
  
  retry_join {
    leader_api_addr = "https://app02:8200"
    leader_ca_cert_file = "/vault/config/ca.crt"
    leader_client_cert_file = "/vault/config/vault02.crt"
    leader_client_key_file = "/vault/config/vault02.key"
  }
  retry_join {
    leader_api_addr = "https://app03:8200"
    leader_ca_cert_file = "/vault/config/ca.crt"
    leader_client_cert_file = "/vault/config/vault03.crt"
    leader_client_key_file = "/vault/config/vault03.key"
  }
}
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_cert_file = "/vault/config/vault01.crt"
  tls_key_file = "/vault/config/vault01.key"
}
seal "awskms" {
  region     = "${kms_region}"
  access_key = "${access_key}"
  secret_key = "${secret_key}"
  kms_key_id = "${kms_key_id}"
}
EOF

sudo chmod 600 vault.hcl

# Network
docker network create --driver=bridge --subnet=192.168.128.0/24 app

# Volumes
docker volume create vault
docker volume create vault_log

# Access
docker run --rm -v /etc/vault:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data
docker run --rm -v vault:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data
docker run --rm -v vault_log:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data

# Service
docker run -d --restart unless-stopped --network=app --name vault --cap-add IPC_LOCK -p 192.168.100.111:8200:8200 \
   -p 192.168.100.111:8201:8201 --add-host app01:192.168.100.111 --add-host app02:192.168.100.112 \
   --add-host app03:192.168.100.113 -v /etc/vault:/vault/config:ro -v vault:/vault/data -v vault_log:/vault/logs \
   docker.io/hashicorp/vault server

So, what is happening here? The container runs with the restart policy unless-stopped in our app network and gets the IPC_LOCK capability, which Vault uses to lock memory so that secrets are not swapped to disk. Ports 8200 (API) and 8201 (cluster communication) are published only on the VLAN IP of the host. The --add-host entries make the names app01, app02, and app03 resolvable inside the container, matching the certificates and the retry_join configuration. Finally, the configuration folder is mounted read-only, the two volumes hold the Raft data and the logs, and the container starts vault server.

This should get the server running. You can check the log to see if everything is okay:

docker logs vault

The log will currently contain a number of error messages - mostly about failing to join the other nodes, which are not running yet - this is okay and expected at this point. First, we set up the other nodes.
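
Beyond the logs, you can also query the health endpoint on app01 at any time; before initialization it should report the instance as not initialized:

curl --cacert /etc/vault/ca.crt https://192.168.100.111:8200/v1/sys/health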

app02

Commands only:

access_key=123
secret_key=secret
kms_key_id=key
kms_region=eu-central-1

cd /etc/vault

sudo tee vault.hcl << EOF
cluster_addr  = "https://192.168.100.112:8201"
api_addr      = "https://0.0.0.0:8200"

storage "raft" {
  path = "/vault/data"
  
  retry_join {
    leader_api_addr = "https://app01:8200"
    leader_ca_cert_file = "/vault/config/ca.crt"
    leader_client_cert_file = "/vault/config/vault01.crt"
    leader_client_key_file = "/vault/config/vault01.key"
  }
  retry_join {
    leader_api_addr = "https://app03:8200"
    leader_ca_cert_file = "/vault/config/ca.crt"
    leader_client_cert_file = "/vault/config/vault03.crt"
    leader_client_key_file = "/vault/config/vault03.key"
  }
}
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_cert_file = "/vault/config/vault02.crt"
  tls_key_file = "/vault/config/vault02.key"
}
seal "awskms" {
  region     = "${kms_region}"
  access_key = "${access_key}"
  secret_key = "${secret_key}"
  kms_key_id = "${kms_key_id}"
}
EOF

sudo chmod 600 vault.hcl

# Network
docker network create --driver=bridge --subnet=192.168.128.0/24 app

# Volumes
docker volume create vault
docker volume create vault_log

# Access
docker run --rm -v /etc/vault:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data
docker run --rm -v vault:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data
docker run --rm -v vault_log:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data

# Service
docker run -d --restart unless-stopped --network=app --name vault --cap-add IPC_LOCK -p 192.168.100.112:8200:8200 \
  -p 192.168.100.112:8201:8201 --add-host app01:192.168.100.111 --add-host app02:192.168.100.112 \
  --add-host app03:192.168.100.113 -v /etc/vault:/vault/config:ro -v vault:/vault/data -v vault_log:/vault/logs \
  docker.io/hashicorp/vault server

app03

Commands only:

access_key=123
secret_key=secret
kms_key_id=key
kms_region=eu-central-1

cd /etc/vault

sudo tee vault.hcl << EOF
cluster_addr  = "https://192.168.100.113:8201"
api_addr      = "https://0.0.0.0:8200"

storage "raft" {
  path = "/vault/data"
  
  retry_join {
    leader_api_addr = "https://app01:8200"
    leader_ca_cert_file = "/vault/config/ca.crt"
    leader_client_cert_file = "/vault/config/vault01.crt"
    leader_client_key_file = "/vault/config/vault01.key"
  }
  retry_join {
    leader_api_addr = "https://app02:8200"
    leader_ca_cert_file = "/vault/config/ca.crt"
    leader_client_cert_file = "/vault/config/vault02.crt"
    leader_client_key_file = "/vault/config/vault02.key"
  }
}
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_cert_file = "/vault/config/vault03.crt"
  tls_key_file = "/vault/config/vault03.key"
}
seal "awskms" {
  region     = "${kms_region}"
  access_key = "${access_key}"
  secret_key = "${secret_key}"
  kms_key_id = "${kms_key_id}"
}
EOF

sudo chmod 600 vault.hcl

# Network
docker network create --driver=bridge --subnet=192.168.128.0/24 app

# Volumes
docker volume create vault
docker volume create vault_log

# Access
docker run --rm -v /etc/vault:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data
docker run --rm -v vault:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data
docker run --rm -v vault_log:/data:rw docker.io/hashicorp/vault chown -R vault:vault /data

# Service
docker run -d --restart unless-stopped --network=app --name vault --cap-add IPC_LOCK -p 192.168.100.113:8200:8200 \
  -p 192.168.100.113:8201:8201 --add-host app01:192.168.100.111 --add-host app02:192.168.100.112 \
  --add-host app03:192.168.100.113 -v /etc/vault:/vault/config:ro -v vault:/vault/data -v vault_log:/vault/logs \
  docker.io/hashicorp/vault server

Initialize Vault

Now that Vault is running on all nodes, we can initialize the service. To do this, we run a temporary container on any of the servers - this has the advantage that nothing remains in the user’s shell history after the work is completed.

docker run --rm -ti --network=app -v /etc/vault:/vault/config:ro -e VAULT_ADDR=https://vault:8200 \
  -e VAULT_CACERT=/vault/config/ca.crt -P docker.io/hashicorp/vault ash

The container is temporary and interactive and runs in the same network as our Vault service. We also mount the configuration folder and set two environment variables for the vault commands below. The -P flag is important so that we do not collide with the port mappings of the running Vault container. The image uses the ash shell.

In the shell, we initialize the cluster:

vault operator init

Since we use auto-unseal, the output contains recovery keys instead of unseal keys. The recovery keys and the initial root token should be stored in a secure location!
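
By default, five recovery keys with a threshold of three are generated. If you want different values, the init command accepts flags for this, for example:

vault operator init -recovery-shares=5 -recovery-threshold=3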

With that, Vault is ready for use and should remain accessible even after restarting individual containers. To test this, you can check the status (in the temporary container):

vault status

You can restart Vault on a host (docker restart vault) and should see that it is unsealed again a short while later.
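
A quick way to check the seal status directly on a host is to run the status command inside the container (using the certificates mounted at /vault/config):

docker exec -e VAULT_ADDR=https://127.0.0.1:8200 -e VAULT_CACERT=/vault/config/ca.crt vault vault status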

Test: Access and Go Snippets

Here is a small dive into how Go could call your Vault servers.

Putting Data into Vault

We store some data in Vault for this purpose. Once again, we log in to any node and create a temporary container for Vault administration.

docker run --rm -ti --network=app -v /etc/vault:/vault/config:ro -e VAULT_ADDR=https://vault:8200 \
  -e VAULT_CACERT=/vault/config/ca.crt -P docker.io/hashicorp/vault ash

In the container, we need to set the root token that we obtained during initialization:

export VAULT_TOKEN=MyToken

To test, we first create a secret that our program should retrieve:

vault secrets enable -version=2 -path=app -description="Application secrets" kv

vault kv put -mount=app apiKey key=PaipCijvonEtdysilgEirlOwUbHahachdyazVopejEnerekBiOmukvauWigbimVi

The application requires access, which we create using Approle:

vault auth enable approle

echo 'path "app/data/apiKey" {
  capabilities = ["read"]
}' | vault policy write myapp -

vault write auth/approle/role/myapp token_ttl=1h token_max_ttl=8h secret_id_ttl=0 token_policies="myapp"
# read data
vault read auth/approle/role/myapp/role-id
# create secret id
vault write -force auth/approle/role/myapp/secret-id

We receive two UUIDs in return, namely the role ID and the secret ID. Both are required in the application. In a live environment, the role can be further restricted by binding it to IP ranges from which the application may access it (a sketch follows after the login test) - I have omitted this here to facilitate testing. Let’s try the login right away:

vault write auth/approle/login role_id="123" secret_id="456"

export VAULT_TOKEN=AppRoleToken

vault kv get -mount=app apiKey

We first log in with the role ID and the secret ID. We receive a token in return, which we then set as an environment variable (thus overriding the root token). Essentially, we assume the role of the service. In this role, we try to read the API key, which should hopefully work.
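
As an aside, the IP restriction mentioned earlier could look roughly like this, binding both the secret ID and the issued tokens to the VLAN range:

vault write auth/approle/role/myapp token_ttl=1h token_max_ttl=8h secret_id_ttl=0 token_policies="myapp" \
  secret_id_bound_cidrs="192.168.100.0/24" token_bound_cidrs="192.168.100.0/24"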

Go-Program

Here, I present a small code snippet you can use. Quite a part of it has been copied from the token renewal example and put into a small library.

You must set the following environment variables: VAULT_ADDR and VAULT_CACERT (which fall back to https://vault:8200 and /etc/vault/ca.crt if unset), as well as APPROLE_ROLE_ID and APPROLE_SECRET_ID.

The last two variables must not be empty!

The library looks like this:

// vault.go
package main

import (
	"cmp"
	"context"
	"fmt"
	vault "github.com/hashicorp/vault/api"
	"github.com/hashicorp/vault/api/auth/approle"
	"github.com/rs/zerolog/log"
	"os"
)

// VaultClient is the global vault client
var VaultClient *vault.Client

// InitVault initializes the vault client
func InitVault() {
	vaultAddress := cmp.Or(os.Getenv("VAULT_ADDR"), "https://vault:8200")
	vaultCAFile := cmp.Or(os.Getenv("VAULT_CACERT"), "/etc/vault/ca.crt")

	// define config
	config := vault.DefaultConfig() // modify for more granular configuration
	config.Address = vaultAddress
	if err := config.ConfigureTLS(&vault.TLSConfig{
		CACert: vaultCAFile, // a single PEM file; CAPath would expect a directory of certificates
	}); err != nil {
		log.Fatal().Str("VAULT_ADDR", vaultAddress).Str("VAULT_CACERT", vaultCAFile).Err(err).Msg("Failed to configure Vault TLS")
	}

	// create client
	client, err := vault.NewClient(config)
	if err != nil {
		log.Fatal().Str("VAULT_ADDR", vaultAddress).Str("VAULT_CACERT", vaultCAFile).Err(err).Msg("Failed to create Vault client")
	}

	ctx, cancelContextFunc := context.WithCancel(context.Background())
	defer cancelContextFunc()

	// copy to global variable
	VaultClient = client

	// initial login
	authInfo, err := vaultLogin(ctx)
	if err != nil {
		log.Fatal().Str("VAULT_ADDR", vaultAddress).Str("VAULT_CACERT", vaultCAFile).Err(err).Msg("Failed to login to Vault")
	}
	// start the lease-renewal goroutine in the background
	go vaultStartRenewLeases(authInfo)

	// everything ok, log success
	log.Info().Str("VAULT_ADDR", vaultAddress).Msg("Vault successfully connected and initial token created.")
}

func vaultLogin(ctx context.Context) (*vault.Secret, error) {
	// Get environment variables for Vault
	vaultAppRoleId := os.Getenv("APPROLE_ROLE_ID")

	if vaultAppRoleId == "" {
		log.Fatal().Msg("Error: Vault App Role not set.")
	}

	// initial login with AppRole
	appRoleAuth, err := approle.NewAppRoleAuth(vaultAppRoleId, &approle.SecretID{
		FromEnv: "APPROLE_SECRET_ID",
	})
	// TODO: we might want to create ResponseWrapping somehow
	// ref: https://www.vaultproject.io/docs/concepts/response-wrapping
	// ref: https://learn.hashicorp.com/tutorials/vault/secure-introduction?in=vault/app-integration#trusted-orchestrator
	// ref: https://learn.hashicorp.com/tutorials/vault/approle-best-practices?in=vault/auth-methods#secretid-delivery-best-practices
	// and example in: https://github.com/hashicorp/hello-vault-go/blob/main/sample-app/vault.go
	if err != nil {
		return nil, err
	}

	return VaultClient.Auth().Login(ctx, appRoleAuth)
}

func vaultStartRenewLeases(authToken *vault.Secret) {
	ctx, cancelContextFunc := context.WithCancel(context.Background())
	defer cancelContextFunc()

	log.Info().Msg("Starting lease renewal service.")
	defer log.Info().Msg("Stopping lease renewal service.")

	currentAuthToken := authToken

	for {
		renewed, err := renewLeases(ctx, currentAuthToken)
		if err != nil {
			log.Fatal().Err(err).Msg("Failed to renew leases")
		}

		if renewed&exitRequested != 0 {
			return
		}

		if renewed&expiringAuthToken != 0 {
			log.Printf("auth token: can no longer be renewed; will log in again")

			authToken, err := vaultLogin(ctx)
			if err != nil {
				log.Fatal().Err(err).Msg("Failed to login to Vault")
			}

			currentAuthToken = authToken
		}
	}
}

// renewResult is a bitmask which could contain one or more of the values below
type renewResult uint8

const (
	renewError renewResult = 1 << iota
	exitRequested
	expiringAuthToken                // will be revoked soon
)

func renewLeases(ctx context.Context, authToken *vault.Secret) (renewResult, error) {
	log.Info().Msg("Starting lease renewal.")

	// auth token
	authTokenWatcher, err := VaultClient.NewLifetimeWatcher(&vault.LifetimeWatcherInput{
		Secret: authToken,
	})
	if err != nil {
		return renewError, fmt.Errorf("unable to initialize auth token lifetime watcher: %w", err)
	}

	go authTokenWatcher.Start()
	defer authTokenWatcher.Stop()

	// monitor events from all watchers
	for {
		select {
		case <-ctx.Done():
			return exitRequested, nil

		// DoneCh will return if renewal fails, or if the remaining lease
		// duration is under a built-in threshold and either renewing is not
		// extending it or renewing is disabled.  In both cases, the caller
		// should attempt a re-read of the secret. Clients should check the
		// return value of the channel to see if renewal was successful.
		case err := <-authTokenWatcher.DoneCh():
			// Leases created by a token get revoked when the token is revoked.
			return expiringAuthToken, err

		// RenewCh is a channel that receives a message when a successful
		// renewal takes place and includes metadata about the renewal.
		case info := <-authTokenWatcher.RenewCh():
			log.Printf("auth token: successfully renewed; remaining duration: %ds", info.Secret.Auth.LeaseDuration)

			//case info := <-databaseCredentialsWatcher.RenewCh():
			//	log.Printf("database credentials: successfully renewed; remaining lease duration: %ds", info.Secret.LeaseDuration)
			//}
		}
	}
}

The library can be used in a long-running service and automatically renews the access token in the background (in a goroutine).

A small test program:

// main.go
package main

import (
  "context"
  "fmt"
)

func main() {
  InitVault()

  secret, err := VaultClient.KVv2("app").Get(context.Background(), "apiKey")
  if err != nil {
    panic("Failed to get token key from Vault")
  }

  fmt.Printf("API-Key: %s\n", secret.Data["key"])
}

In a service, you can always access the global variable VaultClient. It will always contain a valid session (a token).
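
To run the test program, the environment variables described above have to be set, for example like this (the UUIDs are placeholders for the role ID and secret ID from the AppRole step):

export VAULT_ADDR=https://192.168.100.111:8200
export VAULT_CACERT=/etc/vault/ca.crt
export APPROLE_ROLE_ID=00000000-0000-0000-0000-000000000000
export APPROLE_SECRET_ID=00000000-0000-0000-0000-000000000000
go run .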

Title Image: Immeuble du Crédit Lyonnais - Used under the conditions of CC BY-SA 3.0.
