Describe the bug
When using Raft as the storage backend for a Vault server running as a container inside minikube (Kubernetes), the following error message is thrown: Error initializing storage of type raft: failed to create fsm: invalid argument
To Reproduce
Steps to reproduce the behavior:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vault1
  labels:
    app: vault1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vault1
  template:
    metadata:
      labels:
        app: vault1
    spec:
      containers:
        - name: vault1
          command: ["vault", "server", "-config", "/vault/config/vault1.json"]
          image: "vault:1.4.2"
          imagePullPolicy: IfNotPresent
          env:
            - name: VAULT_API_ADDR
              value: https://vault:8200
            - name: VAULT_CLUSTER_ADDR
              value: https://vault1.default.svc.cluster.local:8201
            - name: VAULT_LOG_LEVEL
              value: Debug
            - name: VAULT_TOKEN
              valueFrom:
                secretKeyRef:
                  name: vault-token
                  key: token
          securityContext:
            capabilities:
              add:
                - IPC_LOCK
          volumeMounts:
            - name: config
              mountPath: /vault/config/vault1.json
              subPath: vault1.json
            - name: vault1-raft
              mountPath: /vault/data
            - name: certs
              mountPath: /vault/certs
      volumes:
        - name: config
          configMap:
            name: vault1-config
        - name: certs
          secret:
            secretName: vault1-certs
        - name: vault1-raft
          hostPath:
            path: /Users/rodanami/hashi-vault/ha/volumes/vault1/data
---
apiVersion: v1
kind: Service
metadata:
  name: vault1
  labels:
    app: vault1
spec:
  type: ClusterIP
  ports:
    - port: 8200
      targetPort: 8200
      protocol: TCP
      name: vault1
  selector:
    app: vault1
kubectl create secret generic vault-token \
--from-literal=username="root" \
--from-literal=token="s.XXXXXXXXXXXXXX"
kubectl create secret generic vault1-certs \
--from-file=certs/ca.crt \
--from-file=certs/vault1.crt \
--from-file=certs/vault1.key
kubectl create configmap vault1-config --from-file=configs/vault1.json
Apply the deployment
kubectl apply -f k8s-vault1-deploy.yaml
Get the pod log, which contains a single entry
kubectl logs vault1-546767687-w4sfm
Error initializing storage of type raft: failed to create fsm: invalid argument
Expected behavior
Vault server initializes using Raft backend storage.
Environment:
Vault server version (vault status): 1.4.2 (Default image)
Vault CLI version (vault version): Vault v1.4.2
Server operating system: Darwin Kernel Version 19.5.0
Vault server configuration file(s):
{
  "storage": {
    "raft": {
      "path": "/vault/data/",
      "node_id": "vault1"
    }
  },
  "listener": {
    "tcp": {
      "address": "0.0.0.0:8200",
      "cluster_address": "0.0.0.0:8201",
      "tls_disable": 0,
      "tls_cert_file": "/vault/certs/vault1.crt",
      "tls_key_file": "/vault/certs/vault1.key"
    }
  },
  "seal": {
    "transit": {
      "address": "https://vault-transit:8200",
      "disable_renewal": "false",
      "key_name": "autounseal",
      "mount_path": "transit/",
      "tls_ca_cert": "/vault/certs/ca.crt",
      "tls_client_cert": "/vault/certs/vault1.crt",
      "tls_client_key": "/vault/certs/vault1.key",
      "tls_server_name": "vault-transit",
      "tls_skip_verify": "false"
    }
  },
  "ui": true,
  "max_lease_ttl": "760h",
  "default_lease_ttl": "760h"
}
Additional context
Host Filesystem:
rodanami@rodanami:~/hashi-vault/ha/volumes/vault1$ find .
.
./data
./data/vault.db
Host Filesystem permissions:
rodanami@rodanami:~/hashi-vault/ha/volumes/vault1$ ls -laR
total 0
drwxr-xr-x 3 rodanami staff 96B Jul 13 20:18 ./
drwxr-xr-x 6 rodanami staff 192B Jul 13 16:04 ../
drwxr-xr-x 3 rodanami staff 96B Jul 13 19:31 data/
./data:
total 32
drwxr-xr-x 3 rodanami staff 96B Jul 13 19:31 ./
drwxr-xr-x 3 rodanami staff 96B Jul 13 20:18 ../
-rw-r--r-- 1 rodanami staff 16K Jul 13 19:31 vault.db
The raft directory, raft.db, and snapshots were NOT created here, for an unknown reason.
The underlying error is raised by bolt.Open, and it's likely due to the path specified for raft storage being a shared volume (similar to the issue reported here).
Can you try setting up a persistent volume to use as the path for Raft storage and see if that works?
Update: Alternatively, you can follow the instructions over here to use the Vault Helm chart to deploy an HA Vault cluster with Integrated Storage.
Thanks @calvn! That makes sense, since I'm using a local folder synchronized with a Box.com folder for the raft data. Let me try creating a minikube PV for raft that is not synchronized with Box or otherwise shared.
Unfortunately I'm still getting the same error message even though I'm using a PV now.
rodanami@rodanami:~/hashi-vault/ha$ kubectl logs vault1-7b76b7cc6d-7gnql
Error initializing storage of type raft: failed to create fsm: invalid argument
vault.json
{
  "storage": {
    "raft": {
      "path": "/vault/data",
      "node_id": "vault1"
    }
  },
vault1-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vault1
  labels:
    app: vault1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vault1
  template:
    metadata:
      labels:
        app: vault1
    spec:
      containers:
        - name: vault1
          command: ["vault", "server", "-config", "/vault/config/vault1.json"]
          image: registry.hub.docker.com/library/vault:1.4.2
          imagePullPolicy: IfNotPresent
          env:
            - name: VAULT_API_ADDR
              value: https://vault:8200
            - name: VAULT_CLUSTER_ADDR
              value: https://vault1.default.svc.cluster.local:8201
            - name: VAULT_LOG_LEVEL
              value: Debug
            - name: VAULT_TOKEN
              valueFrom:
                secretKeyRef:
                  name: vault-token
                  key: token
          securityContext:
            capabilities:
              add:
                - IPC_LOCK
          ports:
            - containerPort: 8200
              name: vault-port
              protocol: TCP
            - containerPort: 8201
              name: cluster-port
              protocol: TCP
          volumeMounts:
            - name: config
              mountPath: /vault/config/vault1.json
              subPath: vault1.json
            - name: certs
              mountPath: /vault/certs
            - name: vault1-storage
              mountPath: /vault
      volumes:
        - name: config
          configMap:
            name: vault1-config
        - name: certs
          secret:
            secretName: vault1-certs
        - name: vault1-storage
          persistentVolumeClaim:
            claimName: vault1-pv-claim
PVC
rodanami@rodanami:~/hashi-vault/ha$ kubectl describe pvc vault1-pv-claim
Name: vault1-pv-claim
Namespace: default
StorageClass:
Status: Bound
Volume: vault1-volume
Labels: <none>
Annotations: pv.kubernetes.io/bind-completed: yes
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 2Gi
Access Modes: RWO
VolumeMode: Filesystem
Mounted By: vault1-7b76b7cc6d-7gnql
Events: <none>
Local path:
rodanami@rodanami:/usr/local/mnt/vault-ha/vault1$ find .
.
./file
./config
./config/vault1.json
./logs
./certs
./data
./data/vault.db
If I don't create the data directory manually, it complains with Error initializing storage of type raft: failed to create fsm: open /vault/data/vault.db: no such file or directory.
I have updated minikube and kubectl to the latest versions, but nothing changed.
I tested vault:1.4.3 today and got the same issue.
Everything works fine with the file storage backend in the same directory. It's strange that the raft module creates data/vault.db and then throws this error. I didn't find any relevant errors from the minikube kubelet. Also, this new directory is not shared at the macOS level and is not synchronized with any external filesystem. I'll try shutting down Box Sync, CrashPlan, and any other software that may be reading files on my disk.
No luck @calvn, same error even after shutting down all possible software on my MacBook. I tried to reproduce the Go Bolt error that you referred to, in the same directory where I'm mounting my persistent volumes:
rodanami@rodanami:/usr/local/mnt/go$ docker run -i -t -v /usr/local/mnt/go:/gopath/src/test golang /bin/bash
root@67d8b50c6627:/gopath/src/test# cat main.go
package main

import (
	"log"

	"github.com/boltdb/bolt"
)

func main() {
	// Open (or create) a Bolt database file in the current directory.
	db, err := bolt.Open("data.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	log.Println("Successful!")
}
root@67d8b50c6627:/gopath/src/test# go version
go version go1.14.5 linux/amd64
root@67d8b50c6627:/gopath/src/test# go run main.go
2020/07/16 20:33:12 Successful!
I don't think this raft error is related to the same go Bolt issue from 2014.
Can you also output the actual persistent volumes (via kubectl describe pv) and share the YAML for the PVC?
If you're able to use helm, can you give the helm guide over here a try to see if you can get it running on your laptop?
Since we're using minikube, run this command instead to get a 3-node cluster with Raft set up on a single k8s node.
helm install vault hashicorp/vault --set='server.ha.enabled=true' --set='server.ha.raft.enabled=true' --set='server.affinity=null'
Sure, below is the persistent volume and related persistent volume claim
PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: vault1-volume
  labels:
    type: local
    vault: vault1
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/usr/local/mnt/vault-ha/vault1"
PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vault1-pv-claim
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
  volumeName: vault1-volume
Also, I'm mapping the minikube node filesystem to my MacBook filesystem:
minikube mount /usr/local/mnt/vault-ha:/usr/local/mnt/vault-ha &
I'll try helm over this weekend to see if it makes any difference. Thanks
Also, I'm mapping the minikube node filesystem to my MacBook filesystem:
The issue could be the mapping between the minikube node and MacOS. Can you try a path for raft that's not mapped to the host?
Indeed, if I don't use minikube mount, raft initializes as expected:
rodanami@rodanami:~/hashi-vault/ha$ minikube ssh
_ _
_ _ ( ) ( )
___ ___ (_) ___ (_)| |/') _ _ | |_ __
/' _ ` _ `\| |/' _ `\| || , < ( ) ( )| '_`\ /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )( ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)
$ cd /usr/local/mnt/vault-ha/vault1/
$ find .
.
./logs
./data
./data/raft
./data/raft/snapshots
./data/raft/raft.db
./data/vault.db
Thanks for the help!
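Since the fix was to keep the Raft path off the host-mapped mount, it can help to check which filesystem actually backs a path before pointing Raft at it. The sketch below parses text in `/proc/mounts` format and returns the filesystem type of the most specific mount containing a path. The function name `fsTypeFor` is hypothetical, and the assumption that a `minikube mount` directory shows up as a `9p` filesystem should be verified on your own node.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// fsTypeFor returns the filesystem type of the longest mount point in mounts
// (text in /proc/mounts format) that contains path, or "" if none matches.
// When two entries share a mount point, the later one wins.
func fsTypeFor(mounts, path string) string {
	best, fsType := "", ""
	sc := bufio.NewScanner(strings.NewReader(mounts))
	for sc.Scan() {
		// Each line is: device mountpoint fstype options dump pass
		fields := strings.Fields(sc.Text())
		if len(fields) < 3 {
			continue
		}
		mountPoint := fields[1]
		inside := path == mountPoint || mountPoint == "/" ||
			strings.HasPrefix(path, strings.TrimSuffix(mountPoint, "/")+"/")
		if inside && len(mountPoint) >= len(best) {
			best, fsType = mountPoint, fields[2]
		}
	}
	return fsType
}

func main() {
	path := "/vault/data" // Raft path from the configuration in this thread
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	raw, err := os.ReadFile("/proc/mounts")
	if err != nil {
		fmt.Println("cannot read /proc/mounts:", err)
		os.Exit(1)
	}
	fmt.Printf("%s is on filesystem type %q\n", path, fsTypeFor(string(raw), path))
}
```

If the reported type is a network or pass-through filesystem rather than a local one, moving the Raft path to a directory that lives only on the minikube node, as was done above, is the simpler option.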