Connexion: How to get the request data without the 4 additional lines at the top and the extra last line?

Created on 17 Oct 2018 · 7 Comments · Source: zalando/connexion

Description

When I use connexion.request.data to retrieve the body of a PUT request, I get the raw request data with some extra lines at the top. How do I get just the data that was sent in the PUT body?

Expected behaviour

Upload a file via PUT and write it out; the written file should be identical to the input file.

I'm using the following code in the handler:

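import connexion
from flask import Response  # assumption: Response is Flask's response class

def ot_dataset_put_handler(name, datasetname):
    # request.data is the raw request body as bytes
    uploaded_file = connexion.request.data
    with open("/tmp/test.pdf", "wb") as f:
        f.write(uploaded_file)
    return Response(status=200)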

Actual behaviour

The output file starts with the following lines:

--------------------------ee6039b16072c943
Content-Disposition: form-data; name="data"; filename="foo.pdf" 
Content-Type: application/octet-stream

FILE_SAME_STARTING_FROM_HERE

My goal is to get rid of these first 4 lines. In addition, the file ends with

--------------------------ee6039b16072c943--

And I don't want that either.

The curl I'm using is:

curl -v -X PUT localhost:10000/objecttypes/TEST_ONLY_DONT_USE/datasets/testdataset -H "Content-Type: application/octet-stream" -H "X-Remote-User: [redacted]" -F 'data=@/Users/tommy/Desktop/foo.pdf'

Steps to reproduce

Handle any PUT using the above code

Additional info:

Output of the commands:

  • python --version
    Python 3.7.0
  • pip show connexion | grep "^Version\:"
    Version: 1.5.3

All 7 comments

This may be more of a question than a bug report; perhaps connexion.request.data isn't the right way to do this.

Can you please post your json/yaml spec as well?
You should never need to access the request.data object when using connexion. In fact, a large part of what connexion provides is validation and deserialization of the request body.

The way this normally works is that you define the body in your spec, and then it is passed into the user-provided handler function by name.

Does that make sense?

In this case we have application/octet-stream and are acting as a file store (like S3), so the binary content is unknown. Thus, I cannot use connexion to validate or deserialize it. This particular endpoint's goal is to take an arbitrary binary file and store it. In the above case, I tried a PDF, and the bytes were the same apart from the first 4 lines and the last line shown above.

This is a company Swagger spec, so I cannot post it, but I could try (later) to replicate it with a minimal example. But essentially it's just a PUT with application/octet-stream.

Hmm, yeah, it would be nice to support the streaming use case better. Maybe I'll be inspired to look at that after the 2.0 release.
It's always helpful to have an SSCCE (short, self-contained, correct example).
In the meantime, I think you might be able to use the request.files property.

request.files also did not work with application/octet-stream; however, I was able to get it working once I changed the Content-Type to multipart/form-data. This is the current workaround, but it puts more burden on all clients because the curl (or equivalent) upload request is more cumbersome to write.

Agreed on the SSCCE, but this came up in a huge Swagger spec for my employer, so I'll have to get to that later.

You are using the -F argument to curl, which uploads form data (and automatically sets the Content-Type header appropriately for form data).

You probably intended to use curl --upload-file ...
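
For clients written in Python, a raw upload of that kind can be sketched with the standard library alone. This is only an illustration: `build_raw_put` is a made-up helper name, and the URL in the usage note is the one from the curl command in the question. The file bytes become the whole request body, with no form envelope around them.

```python
import urllib.request


def build_raw_put(url: str, payload: bytes) -> urllib.request.Request:
    """Build a PUT request whose body is exactly `payload` (no multipart envelope)."""
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )
```

Usage would look roughly like reading the file with `open(path, "rb").read()`, passing the bytes to `build_raw_put("http://localhost:10000/objecttypes/TEST_ONLY_DONT_USE/datasets/testdataset", data)`, and sending the result with `urllib.request.urlopen(...)`. On the server, request.data should then contain only the file bytes.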

Your curl command is uploading form data, but you're overwriting the Content-Type of the envelope to application/octet-stream. This breaks werkzeug/flask's ability to parse the form envelope, and so it returns the entire HTTP body. That's what those "extra lines" are.

Here's an example of what curl is trying to do:
Example Form Data request

POST /foo HTTP/1.1
Content-Length: 68137
Content-Type: multipart/form-data; boundary=---------------------------974767299852498929531610575

-----------------------------974767299852498929531610575
Content-Disposition: form-data; name="description" 

some text
-----------------------------974767299852498929531610575
Content-Disposition: form-data; name="myFile"; filename="foo.txt" 
Content-Type: text/plain   # I think this is the place where you want to set the content type

(content of the uploaded file foo.txt)
-----------------------------974767299852498929531610575--

Setting the Content-Type header explicitly (as your -H flag does) overrides the Content-Type of the envelope, so Flask can no longer recognize the body as form data and returns the entire HTTP body, which includes the form envelope.

POST /foo HTTP/1.1
Content-Length: 68137
Content-Type: application/octet-stream  # now the server can't understand how to parse the form data

-----------------------------974767299852498929531610575
Content-Disposition: form-data; name="description" 

some text
-----------------------------974767299852498929531610575
Content-Disposition: form-data; name="myFile"; filename="foo.txt" 
Content-Type: text/plain 

(content of the uploaded file foo.txt)
-----------------------------974767299852498929531610575--
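
The difference between the two requests above can be reproduced offline. The sketch below is not werkzeug's parser; Python's standard-library email parser stands in for it, and `BOUNDARY`, `build_body`, and `parse_as` are made-up names for illustration. The same bytes are parsed twice: once declared as multipart/form-data, once declared as application/octet-stream.

```python
from email.parser import BytesParser

# Hypothetical boundary, modelled on the one in the question's output.
BOUNDARY = "-" * 24 + "ee6039b16072c943"


def build_body(file_bytes: bytes) -> bytes:
    """Wrap file_bytes in a single-part form envelope, similar to what `curl -F` sends."""
    delimiter = b"--" + BOUNDARY.encode("ascii")
    return b"".join([
        delimiter + b"\r\n",
        b'Content-Disposition: form-data; name="data"; filename="foo.pdf"\r\n',
        b"Content-Type: application/octet-stream\r\n",
        b"\r\n",
        file_bytes + b"\r\n",
        delimiter + b"--\r\n",
    ])


def parse_as(content_type: str, body: bytes):
    """Parse body as if it had arrived with the given Content-Type header."""
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    return BytesParser().parsebytes(raw)


body = build_body(b"%PDF-1.4 hypothetical file bytes")

# Declared as form data: the parser finds the boundary and splits out the file part.
as_form = parse_as("multipart/form-data; boundary=" + BOUNDARY, body)
file_part = as_form.get_payload()[0]
clean_bytes = file_part.get_payload(decode=True)

# Declared as octet-stream: the whole envelope is treated as one opaque payload.
as_octets = parse_as("application/octet-stream", body)
raw_bytes = as_octets.get_payload(decode=True)
```

Here `clean_bytes` holds just the file content, while `raw_bytes` still starts with the boundary line and the part headers, which is the same symptom as in the written-out PDF.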

Wowzers! Thanks so much, I definitely would not have caught that, and this is a little beyond my knowledge of web forms. Big thanks to you for your investigation, and sorry for the trouble.
