Terraform: http provider too strict with application/json content type

Created on 7 Jun 2017 · 8 Comments · Source: hashicorp/terraform

Terraform Version

0.9.7

Affected Resource(s)

"http" datasource

Terraform Configuration Files

data "http" "openstack" {
  url = "http://169.254.169.254/openstack/latest/meta_data.json"
}

Actual Behavior

Not working:

Error refreshing state: 1 error(s) occurred:

* data.http.main: 1 error(s) occurred:

* data.http.main: data.http.main: Content-Type is not a text type. Got: application/json; charset=UTF-8

In source code builtin/providers/http/data_source.go:

func isContentTypeAllowed(contentType string) bool {
    allowedContentTypes := []*regexp.Regexp{
        regexp.MustCompile("^text/.+"),
        regexp.MustCompile("^application/json$"), // "$" rejects any trailing "; charset=..."
    }
    // ... (rest of the function omitted)
}

The regexp for "application/json" is too strict: it is anchored with "$", so it rejects a Content-Type that also carries a parameter such as "; charset=UTF-8", which the web server may include.
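One possible way to relax the check, shown only as a sketch and not as the provider's actual fix, is to parse the header with the standard library's mime.ParseMediaType and compare only the bare media type. The function below reuses the name isContentTypeAllowed from the snippet above for illustration; the body is an assumption.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// isContentTypeAllowed is a hypothetical, more lenient variant of the check
// quoted above. It parses the header first, so parameters such as
// "charset=UTF-8" no longer affect the match.
func isContentTypeAllowed(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false // malformed or empty header
	}
	// ParseMediaType returns the media type lowercased and without parameters.
	return strings.HasPrefix(mediaType, "text/") || mediaType == "application/json"
}

func main() {
	for _, ct := range []string{
		"application/json",
		"application/json; charset=UTF-8",
		"text/plain; charset=iso-8859-1",
		"binary/octet-stream",
	} {
		fmt.Printf("%-35s allowed=%v\n", ct, isContentTypeAllowed(ct))
	}
}
```

With this approach the header reported in the error message would be accepted, while binary/octet-stream would still be rejected.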

bug, provider/http

All 8 comments

Thanks for reporting this, @sebastien-prudhomme!

The RFC for the application/json MIME type doesn't define any parameters for it, so technically your server here is misbehaving per the spec, but it seems reasonable for us to be pragmatic about it and trim off any erroneous parameters that are included.

Looking at this again here, I also see that we're not correctly handling the charset argument when it is _correctly_ used on a text/* type, so hopefully we can address both of these things together:

  • If a text/* type is used, check for a charset= argument and attempt to convert the result to UTF-8 so we produce a valid Terraform string.
  • If application/json is used, ignore all arguments and instead obey the JSON spec rules on character encodings, which will likely be to fail if the input isn't UTF-8 since UTF-16 and UTF-32 JSON is not widely used enough to justify that complexity.

Although we're not necessarily reading _HTML_ here, it seems like we could use golang.org/x/net/html/charset to get reasonable, pragmatic handling for the text/* types, using the charset= argument if it's present and content sniffing to distinguish between UTF-8 and windows-1252 if it isn't.
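The two rules above could look roughly like the following sketch. It is deliberately limited to the standard library, so it is not the golang.org/x/net/html/charset approach: it only converts ISO-8859-1, treats a missing charset as UTF-8 instead of sniffing, and rejects every other charset. The helper name decodeBody is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"mime"
	"strings"
	"unicode/utf8"
)

// decodeBody is a hypothetical helper, not code from the provider. It returns
// the response body as a UTF-8 string, or an error if it cannot do so safely.
func decodeBody(contentType string, body []byte) (string, error) {
	mediaType, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", fmt.Errorf("invalid Content-Type %q: %w", contentType, err)
	}

	switch {
	case mediaType == "application/json":
		// Ignore all parameters; accept the body only if it is valid UTF-8.
		if !utf8.Valid(body) {
			return "", errors.New("JSON body is not valid UTF-8")
		}
		return string(body), nil

	case strings.HasPrefix(mediaType, "text/"):
		switch strings.ToLower(params["charset"]) {
		case "", "utf-8", "us-ascii":
			// Simplification: no content sniffing when charset is absent.
			if !utf8.Valid(body) {
				return "", errors.New("text body is not valid UTF-8")
			}
			return string(body), nil
		case "iso-8859-1":
			// Each ISO-8859-1 byte maps to the code point of the same value.
			runes := make([]rune, len(body))
			for i, b := range body {
				runes[i] = rune(b)
			}
			return string(runes), nil
		default:
			return "", fmt.Errorf("unsupported charset %q", params["charset"])
		}
	}
	return "", fmt.Errorf("Content-Type %q is not a text type", mediaType)
}

func main() {
	s, err := decodeBody("text/plain; charset=ISO-8859-1", []byte{'c', 'a', 'f', 0xE9})
	fmt.Println(s, err)

	_, err = decodeBody("application/json; charset=UTF-8", []byte{0xFF, 0xFE})
	fmt.Println(err)
}
```

A fuller implementation would presumably hand the text/* branch to a charset library so that more encodings and the sniffing fallback are covered.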

Any reference for your RFC?

The RFC for HTTP/1.1 "Content-Type" says you can have a charset after the MIME type, so I don't think there is a problem with the server (which is the OpenStack metadata web service):

https://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7

The application/json type is registered in RFC 7159, section 11, where it says:

   Required parameters:  n/a
   Optional parameters:  n/a
   Encoding considerations:  binary

It also says:

    Note:  No "charset" parameter is defined for this registration.
    Adding one really has no effect on compliant recipients.

...which seems to excuse your server, I suppose!

But as I'd said above, we can be pragmatic about this and just ignore it, since Terraform doesn't employ a team of RFC lawyers. :grinning:

Is there a way to use http provider with S3? It gives binary/octet-stream for most files.

Is there a reason I cannot use zip files with the http data source?

Is there a way to use http provider with S3? It gives binary/octet-stream for most files.

I just ran into this issue trying to get the Azure JSON list of service tags and public IPs. The content-type is application/octet-stream.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
