I have a file whose name may contain a question mark. I'm trying to fetch it with libcurl (using the multi interface and passing the full file path as a file:// URL in CURLOPT_URL), and I noticed that it truncates the name before the ?. I reproduced the problem with the curl command line:
$ echo test > '/tmp/test?file'
$ curl 'file:///tmp/test?file'
curl: (37) Couldn't open file /tmp/test
This appears to be present in current master, as file_connect only uses the path urlpart to resolve the file name: https://github.com/curl/curl/blob/8d8b5ec3444ae2ae7d20790bcf2e1a96b5819645/lib/file.c#L146
While investigating this problem, I also realized that PROTOPT_NOURLQUERY (which seemed like a possible culprit) is never actually checked anywhere in the code, though it is still assigned in the protocol handler definitions.
@lultimouomo
https://en.wikipedia.org/wiki/File_URI_scheme
Characters such as the hash (#) or question mark (?) which are part of the filename should be percent-encoded.
Characters which are not allowed in URIs, but which are allowed in filenames, must also be percent-encoded. For example, any of "{}`^ " and all control characters. In the example above, the space in the filename is encoded as %20.
Characters which are allowed in both URIs and filenames must NOT be percent-encoded.
I'm with @OldTimer571 here. I can't find any wording in RFC 1738, RFC 3986 or RFC 8089 that makes the question mark (?) so special for file: URLs that it would magically become part of the path.
It is a reserved separator, so it needs to be provided URL-encoded for it to be used as a path part.
This is how file:// URLs work.
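For anyone building file:// URLs from arbitrary paths in C, the encoding requirement described above could be sketched roughly as follows. This is a minimal illustration, not libcurl code: path_to_file_url is a hypothetical helper name, it assumes an absolute POSIX path, and it percent-encodes every byte except the RFC 3986 unreserved characters and the '/' separator. libcurl's own curl_easy_escape() does similar encoding but also escapes '/', so it would have to be applied to each path segment separately.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libcurl): build a file:// URL from an
 * absolute POSIX path. Every byte is percent-encoded except RFC 3986
 * unreserved characters (A-Z a-z 0-9 - . _ ~) and the '/' separator.
 * Returns a malloc'ed string the caller must free, or NULL on failure. */
char *path_to_file_url(const char *path)
{
    static const char hex[] = "0123456789ABCDEF";
    static const char prefix[] = "file://";
    size_t plen = sizeof(prefix) - 1;
    size_t len, i;
    char *url, *out;

    if (path == NULL || path[0] != '/')
        return NULL;                    /* only absolute paths are handled */

    len = strlen(path);
    url = malloc(plen + 3 * len + 1);   /* worst case: every byte -> %XX */
    if (url == NULL)
        return NULL;

    memcpy(url, prefix, plen);
    out = url + plen;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)path[i];
        int keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                   (c >= '0' && c <= '9') ||
                   c == '-' || c == '.' || c == '_' || c == '~' || c == '/';
        if (keep) {
            *out++ = (char)c;
        } else {
            *out++ = '%';
            *out++ = hex[c >> 4];
            *out++ = hex[c & 0x0F];
        }
    }
    *out = '\0';

    /* Sanity check: we never wrote past the worst-case allocation. */
    assert((size_t)(out - url) <= plen + 3 * len);
    return url;
}
```

With this sketch, path_to_file_url("/tmp/test?file") yields file:///tmp/test%3Ffile, which can then be passed as CURLOPT_URL (or on the command line as curl 'file:///tmp/test%3Ffile') so that the ? is treated as part of the file name instead of a query separator.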