This is my current thinking on supporting the ZFS file system in Nomad. This feature is going to introduce file system drivers in Nomad, and in the future we could decide to make the file system drivers pluggable.
ZFS support in Nomad will enable the following:
a. Enforcing allocation directory size restrictions.
b. Faster creation of allocation directories by creating new allocation dirs from snapshots and COW mechanisms.
c. Transferring ZFS snapshots between hosts when migrating allocation directories.
Each of these features will land in different phases. The first phase is simply going to allow the creation of allocation directories in a ZFS pool.
We will add the following configuration blocks:
Nomad client:
```hcl
file_system {
  driver = "zfs" // file system driver name for the client

  options {
    pool_name      = "allocdirs"              // the zpool Nomad uses to create the alloc dirs
    parent_dataset = "allocs"                 // all allocdir datasets are created under this
    attr_blacklist = ["dedup", "compression"] // attributes we don't want users to override
    ....                                      // options related to controlling attributes of the dataset
    ....                                      // such as dedup, compression type, etc.
  }
}
```
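As a sketch of how a driver might act on this configuration when placing an allocation, the helper below builds a `zfs create` invocation that creates a per-allocation dataset under `parent_dataset` with a quota enforcing the alloc dir size. The function name and argument layout are assumptions for illustration, not part of the proposal; `quota` is one ZFS property that can implement feature (a) above.

```go
package main

import "fmt"

// buildCreateArgs assembles the argument list for `zfs create` that a
// file system driver might run when a new allocation directory is
// requested. pool and parentDataset mirror the hypothetical options
// block above; sizeMB comes from the job's ephemeral_disk size.
func buildCreateArgs(pool, parentDataset, allocID string, sizeMB int) []string {
	dataset := fmt.Sprintf("%s/%s/%s", pool, parentDataset, allocID)
	return []string{
		"create",
		"-o", fmt.Sprintf("quota=%dM", sizeMB), // enforces the alloc dir size restriction
		dataset,
	}
}

func main() {
	args := buildCreateArgs("allocdirs", "allocs", "c3a3f1", 300)
	fmt.Println(args) // prints: [create -o quota=300M allocdirs/allocs/c3a3f1]
}
```

The driver would pass this slice to `exec.Command("zfs", args...)`; destroying the dataset on garbage collection would be the symmetric `zfs destroy` call.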
We are not going to change anything about the reserved configuration option of the client. Operators will still use the reserved configuration block to reserve disk space on the Nomad client agents for use by services not managed by Nomad.
Nomad Job File
We will modify the ephemeral_disk block to allow users to specify certain ZFS (or other file system) attributes, and also to specify the file system driver.
```hcl
ephemeral_disk {
  driver = "zfs"

  attributes {
    record_size = 16
    ... // more file system attributes
  }
}
```
The reason we may want to allow end users to specify certain file system attributes is that certain classes of applications, such as I/O-intensive ones, might need to tune things like record size, atime, etc.
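Since the job's attributes map and the client's attr_blacklist interact, here is a minimal sketch of how a driver might reconcile them: user-supplied attributes are passed through opaquely, except for the keys the operator has blacklisted. The function name and data shapes are assumptions for illustration.

```go
package main

import "fmt"

// filterAttributes drops any job-supplied file system attribute whose
// key appears in the operator's attr_blacklist; everything else is
// passed through to the file system driver as an opaque key/value map.
func filterAttributes(attrs map[string]string, blacklist []string) map[string]string {
	blocked := make(map[string]struct{}, len(blacklist))
	for _, k := range blacklist {
		blocked[k] = struct{}{}
	}
	out := make(map[string]string, len(attrs))
	for k, v := range attrs {
		if _, ok := blocked[k]; !ok {
			out[k] = v
		}
	}
	return out
}

func main() {
	attrs := map[string]string{"recordsize": "16K", "dedup": "on"}
	fmt.Println(filterAttributes(attrs, []string{"dedup", "compression"})) // prints: map[recordsize:16K]
}
```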
Fingerprinting
The client is going to fingerprint the file systems available and send that information to the server, which will use it to constrain jobs that want ZFS-based file systems.
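For the pool-size part of fingerprinting, one approach is to shell out to `zfs list -Hp -o name,avail` (scripted, tab-separated output with exact byte counts) and parse the result. The parser below is a hypothetical sketch operating on canned output; the actual fingerprint keys and command are not fixed by the proposal.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePoolAvail extracts the available bytes for the named pool from
// the output of `zfs list -Hp -o name,avail`. The fingerprinter would
// run that command and feed its stdout here.
func parsePoolAvail(out, pool string) (int64, bool) {
	for _, line := range strings.Split(strings.TrimSpace(out), "\n") {
		fields := strings.Split(line, "\t")
		if len(fields) == 2 && fields[0] == pool {
			n, err := strconv.ParseInt(fields[1], 10, 64)
			if err != nil {
				return 0, false
			}
			return n, true
		}
	}
	return 0, false
}

func main() {
	// Canned sample output: two pools, tab-separated name and avail bytes.
	sample := "allocdirs\t107374182400\ntank\t5368709120"
	avail, ok := parsePoolAvail(sample, "allocdirs")
	fmt.Println(avail, ok) // prints: 107374182400 true
}
```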
Talked to Diptanu on Gitter, but for posterity's sake:
1) We should avoid setting the driver in the ephemeral_disk block and defer to the file system driver configured by the operator. If the job submitter requires ZFS, they should use a constraint.
2) The attributes block should be a best-effort fs_pragma that is consumed by the file system driver.
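Under point (1), a job that requires ZFS would express that with a regular constraint rather than a driver field in ephemeral_disk. A sketch of what that could look like, assuming the fingerprinter exposes a node attribute for the file system driver (the attribute name below is a guess, not a defined key):

```hcl
job "io-heavy" {
  # Hypothetical fingerprint key; the real attribute name would be
  # whatever the file system fingerprinter publishes.
  constraint {
    attribute = "${attr.filesystem.driver}"
    value     = "zfs"
  }

  group "db" {
    ephemeral_disk {
      size = 10240 # MB, enforced via a ZFS quota on the alloc dataset

      # best-effort fs_pragma consumed by the file system driver
      attributes {
        record_size = 16
      }
    }
  }
}
```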
@diptanu I realize we actually do need the fingerprinter, for determining the size of the pool!
@dadgar The need for ZFS can be a constraint, however desired ZFS attributes should be specified in the job file and not by the operator. A generic fs_pragma is fine, but know that some ZFS attributes are well known and can/should be enforced, whereas others should be passed through as a completely opaque key/value map.
@dadgar Sounds good, hoping to get this out soon.
@diptanu Now with volume drivers being released, would this task be easier to accomplish?