Aria2: Non-ASCII character support in Content-Disposition header

Created on 13 Jun 2015 · 3 Comments · Source: aria2/aria2

When iterating over the Content-Disposition header, file names are processed byte by byte, so wide characters often (if not always) cause parsing to stop early.


All 3 comments

There are two variations of the Content-Disposition header field.


    1. Content-Disposition: Attachment; filename=example.html

In this case, the filename parameter can carry ISO-8859-1 characters only. No multibyte characters are allowed.


    2. Content-Disposition: attachment; filename*= UTF-8''%e2%82%ac%20rates

This form uses the RFC 5987 encoding, which lets the header specify a character encoding. In the above example, UTF-8 is used.
aria2 supports UTF-8 and ISO-8859-1, as described in RFC 6266.
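
To illustrate the second form, here is a minimal Python sketch of how an RFC 5987 `filename*` value could be decoded. It is not aria2's code; the function name `decode_ext_value` is hypothetical, and it assumes the value has already been split out of the header.

```python
from urllib.parse import unquote


def decode_ext_value(ext_value: str) -> str:
    """Decode an RFC 5987 ext-value of the form charset'language'pct-encoded.

    Hypothetical helper for illustration only. It accepts just the two
    charsets mentioned in this thread (UTF-8 and ISO-8859-1).
    """
    # Split into at most three parts: charset, optional language tag, value.
    parts = ext_value.strip().split("'", 2)
    if len(parts) != 3:
        raise ValueError("expected charset'language'value")
    charset, _language, encoded = parts

    if charset.lower() not in ("utf-8", "iso-8859-1"):
        raise ValueError(f"unsupported charset: {charset}")

    # Percent-decode the bytes and interpret them in the declared charset.
    return unquote(encoded, encoding=charset, errors="strict")


print(decode_ext_value("UTF-8''%e2%82%ac%20rates"))
```

With the example header above, this prints `€ rates`.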

I realize this probably violates the RFC, but look, for example, at this page.
If we check the Content-Disposition header of the first photo, we see:

$ curl -I http://cfile9.uf.tistory.com/original/142D493F506557F31182F6
HTTP/1.1 200 OK
Expires: Tue, 22 Mar 2016 10:44:02 GMT
Date: Sun, 21 Feb 2016 10:44:01 GMT
Server: Apache
Content-Disposition: inline; filename="%EC%82%AC%EB%B3%B8_-DSC02499.jpg"
Last-Modified: Fri, 28 Sep 2012 07:55:33 GMT
Content-Type: image/jpeg
Content-Length: 792416
Age: 198
Via: 1.1 Wcache(2.0)
Connection: keep-alive

>>> import urllib.parse
>>> urllib.parse.unquote('%EC%82%AC%EB%B3%B8_-DSC02499.jpg')
'靷掣_-DSC02499.jpg'

So it uses a UTF-8 encoded Content-Disposition filename but doesn't specify the encoding. I think such cases might be rather common on the real web. What about an additional option for this, e.g. --content-disposition-encoding=utf8?
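
As a rough sketch of what such a fallback might look like (in Python rather than aria2's C++, and not an existing aria2 feature), the unlabeled value could be percent-decoded to bytes, tried as UTF-8 first, and read as ISO-8859-1 only if that fails. The function name `guess_filename` and its `preferred` parameter are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def guess_filename(raw: str, preferred: str = "utf-8") -> str:
    """Decode a percent-encoded filename that carries no charset label.

    Hypothetical heuristic for illustration: try the preferred encoding
    strictly, then fall back to ISO-8859-1, which accepts any byte sequence.
    """
    data = unquote_to_bytes(raw)
    try:
        return data.decode(preferred)
    except UnicodeDecodeError:
        # Not valid in the preferred encoding; ISO-8859-1 never fails.
        return data.decode("iso-8859-1")


print(guess_filename("%EC%82%AC%EB%B3%B8_-DSC02499.jpg"))
```

The fallback order matters: every byte sequence is valid ISO-8859-1, so trying it first would never give UTF-8 a chance.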

I'm not sure whether it is common or not. We have implemented several features that violate the RFCs in order to support the "real" web, so this could be one of those things.
But I'd like to stay away from encoding issues for as long as I can, because they are nasty and hard to get right.
I have no plan to add this feature myself, but we are ready to accept a patch if someone is really interested in it.
