Copilot-cli: Parse Healthcheck from Dockerfile Design

Created on 15 May 2020 · 6 Comments · Source: aws/copilot-cli

Goal

We want ecs-cli-v2 to be able to parse healthcheck options from the user's Dockerfile. There are a total of four options you can specify (any option left unset falls back to its default value), plus an optional command.

Overview

Options

  • interval
  • timeout
  • start-period
  • retries

Types of Healthcheck formats

  1. No command
HEALTHCHECK --interval=5m --timeout=3s
  2. No options
HEALTHCHECK CMD curl -f http://localhost/ || exit 1
  3. Options and command in the same line
HEALTHCHECK --interval=5m --timeout=3s CMD curl -f http://localhost/ || exit 1
  4. Options and command in different lines
HEALTHCHECK --interval=5m --timeout=3s \
  CMD curl -f http://localhost/ || exit 1
  5. No healthcheck
HEALTHCHECK NONE

Design Proposal

Building on top of the work to parse ports from a Dockerfile (#747), we can create a HealthCheck struct to group all of the options and the command, along with a separate method for parsing the HEALTHCHECK instruction.

Programming Model

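// HealthCheck groups the options and command of a HEALTHCHECK instruction.
type HealthCheck struct {
  interval    uint16 // --interval
  timeout     uint16 // --timeout
  startPeriod uint16 // --start-period
  retries     uint16 // --retries
  command     string // text following CMD
}

// parseHealthcheck extracts the health check settings found on a Dockerfile line.
func parseHealthcheck(line string) []HealthCheck {
  // Body intentionally left open in this proposal.
  return nil
}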

Parsing Method

Currently, ecs-cli-v2 reads the user's Dockerfile by scanning it line by line. We can use regex to parse the health check options, as they will always follow the same format.

Example
healthcheckRegexPattern := "--([^\\s]+)" // this will grab all strings that start with --

We can also use regex to parse out the command:

healthcheckCMDRegexPattern := "CMD.*" // this will grab CMD until the end of the line
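To make the regex idea concrete, here is a minimal sketch for a single-line HEALTHCHECK. The helper name parseWithRegex and both patterns are assumptions made for illustration, not the patterns proposed above: the option pattern is narrowed to --name=value pairs, and options are only searched for in the text before CMD so that flags belonging to the command itself are not picked up.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// optionPattern matches "--name=value" pairs such as --interval=5m.
	optionPattern = regexp.MustCompile(`--([a-z-]+)=(\S+)`)
	// cmdPattern matches CMD and everything after it on the line.
	cmdPattern = regexp.MustCompile(`\bCMD\s+(.*)$`)
)

// parseWithRegex is a hypothetical helper. It returns the options and the
// command text found on one HEALTHCHECK line.
func parseWithRegex(line string) (map[string]string, string) {
	head, cmd := line, ""
	if loc := cmdPattern.FindStringSubmatchIndex(line); loc != nil {
		head = line[:loc[0]]                         // text before CMD
		cmd = strings.TrimSpace(line[loc[2]:loc[3]]) // text after CMD
	}
	opts := make(map[string]string)
	for _, m := range optionPattern.FindAllStringSubmatch(head, -1) {
		opts[m[1]] = m[2]
	}
	return opts, cmd
}

func main() {
	opts, cmd := parseWithRegex("HEALTHCHECK --interval=5m --timeout=3s CMD curl -f http://localhost/ || exit 1")
	fmt.Println(opts["interval"], opts["timeout"]) // prints "5m 3s"
	fmt.Println(cmd)                               // prints "curl -f http://localhost/ || exit 1"
}
```

This only works when the whole instruction sits on one line, which is exactly the limitation described in the next section.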

Issue

An issue arises when CMD is put on a different line (No. 4 in Types of Healthcheck formats). Because the scanner reads one line at a time, it will not be able to pick up CMD on the second line. I propose a healthcheckCMD flag: if the command in the healthcheck struct is empty and a CMD line comes right after HEALTHCHECK, the parser inserts that CMD line into the healthcheck struct.


All 6 comments

Nice design doc! Have a couple of questions:

  1. It seems like Docker will complain for scenario one, since HEALTHCHECK requires at least one argument.
  2. Instead of having an additional flag, I was thinking maybe we can do some preprocessing to combine the Dockerfile into one string. For example, ["foo \", "bar"] will become "foo \/\n bar"; then we can easily locate where the healthcheck is and parse it to get each component.

Ideally we can have a parser that parses the Dockerfile and returns a Dockerfile struct like this:

// Dockerfile represents a parsed dockerfile.
type Dockerfile struct {
    ExposedPorts []portConfig
    HealthCheck  healthCheck
    parsed       bool
    path         string

    fs afero.Fs
}

type healthCheck struct {
    interval    int
    timeout     int
    startPeriod int
    retries     int
    command     []string
}

What do you think?

Looks good!

> We can use regex to parse the health check options as they will always follow the same format.

Leaving this here as a resource: https://github.com/moby/buildkit/blob/10889212c4635d4edbe8aa253be601c6fa2823a5/frontend/dockerfile/instructions/parse.go#L424 (how docker parses these instructions) but it's pretty difficult to follow.

The interesting thing is that they use their own flag library to parse the fields like --interval. Maybe we can do something similar to parse these fields with Go's flag package (https://www.digitalocean.com/community/tutorials/how-to-use-the-flag-package-in-go). I wonder if that would simplify anything for you.

Here is a mini sample that I wrote in the playground: https://play.golang.org/p/RegYfXEM8YZ

> Because the scanner reads one line at a time, it will not be able to pick up CMD on the second line. I propose a healthcheckCMD flag where it will insert the following CMD line into the healthcheck struct if command in the healthcheck struct is empty and if the CMD line is right after HEALTHCHECK.

I don't think I'm following this 😛.

I think the method signature would ideally be something like:

func (df *Dockerfile) ParseHealthCheck() (HealthCheck, error)

// which internally might call a private function:
func parseHealthCheck(lines []string) (HealthCheck, error)

So we would pass in all the lines needed to populate the HealthCheck struct.

> Leaving this here as a resource: https://github.com/moby/buildkit/blob/10889212c4635d4edbe8aa253be601c6fa2823a5/frontend/dockerfile/instructions/parse.go#L424 (how docker parses these instructions) but it's pretty difficult to follow.

This is super helpful! It is almost exactly what we want, though I don't know if we want to be that specific to do all these validations.

This looks great, Seong! I have a few thoughts on implementation and the previous discussion.

For the long-line problem, I think we can also do some preprocessing, since Docker requires a continuation \ for multiline directives. That way when we encounter a healthcheck (or really any Dockerfile directive we want to extract) we can check for continuation tokens until the next directive, then try to gather what information we can from that. That might require reading the whole dockerfile first, though, then "tokenizing" the lines into directives, and only then looping over directives to extract information.
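A rough sketch of that "tokenize first, then extract" flow is shown below. The directive type and the tokenize and newDirective helpers are hypothetical names introduced for illustration. The sketch reads the whole Dockerfile, folds continuation lines into one logical line per directive, and only then loops over the directives.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// directive is one Dockerfile instruction on a single logical line.
type directive struct {
	name string // e.g. "HEALTHCHECK"
	args string // everything after the instruction name
}

// newDirective splits a logical line into the instruction name and its arguments.
func newDirective(logical string) directive {
	parts := strings.SplitN(logical, " ", 2)
	d := directive{name: strings.ToUpper(parts[0])}
	if len(parts) == 2 {
		d.args = strings.TrimSpace(parts[1])
	}
	return d
}

// tokenize groups the physical lines of a Dockerfile into directives,
// following "\" continuations and skipping blank lines and comments.
func tokenize(dockerfile string) []directive {
	var out []directive
	var cur strings.Builder
	sc := bufio.NewScanner(strings.NewReader(dockerfile))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if strings.HasSuffix(line, `\`) {
			// Continuation: collect the text and keep reading.
			cur.WriteString(strings.TrimSpace(strings.TrimSuffix(line, `\`)))
			cur.WriteString(" ")
			continue
		}
		cur.WriteString(line)
		out = append(out, newDirective(cur.String()))
		cur.Reset()
	}
	if cur.Len() > 0 {
		// The file ended on a continuation; keep what was collected.
		out = append(out, newDirective(strings.TrimSpace(cur.String())))
	}
	return out
}

func main() {
	dockerfile := "FROM nginx\n" +
		"HEALTHCHECK --interval=5m --timeout=3s \\\n" +
		"  CMD curl -f http://localhost/ || exit 1\n" +
		"EXPOSE 80\n"
	for _, d := range tokenize(dockerfile) {
		if d.name == "HEALTHCHECK" {
			fmt.Println(d.args)
		}
	}
}
```

With directives available as a slice, extracting another instruction later means adding one more case to the loop rather than another line-scanning special case.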

Also +1 @efekarakus for the use of flag sets--I much prefer leveraging that parser to turning this module into a bunch of regexes (I say, fully aware that I set the precedent for this :P)

Thank you all for the helpful feedback!!

@iamhopaul123 Preprocessing the Dockerfile into one string will definitely solve the issue I had with the multi-line format. I like that idea!

@efekarakus Cool example! I will look into using the flag library as it will be easier to parse the flags.

@bvtujo Yes, I agree. This way, supporting the next directive will be as easy as creating another parsing method.

So from what I have gathered, we want to read the Dockerfile as one string (I don't think size will be an issue) and use Go's flag library for easy parsing. We would then need a set of directives to know when to stop.

Resolved with #971
