Hello folks. It looks like this behavior can be improved:
iex(10)> %URI{scheme: "gid", path: "hello/123"} |> URI.to_string()
"gid:hello/123"
iex(11)> %URI{scheme: "gid", authority: "foo.com", path: "hello/123"} |> URI.to_string()
"gid://foo.comhello/123"
iex(12)> %URI{scheme: "gid", host: "foo.com", path: "hello/123"} |> URI.to_string()
"gid://foo.comhello/123"
So it produces a non-deterministic result in a sense that if it will be parsed, the URI structure will not be the same as the source one.
Would be nice of URI.to_string() either to prepend / to the path automatically when the path is relative and host is present
iex(10)> %URI{scheme: "gid", path: "hello/123"} |> URI.to_string()
"gid:hello/123"
iex(11)> %URI{scheme: "gid", authority: "foo.com", path: "hello/123"} |> URI.to_string()
"gid://foo.com/hello/123"
iex(12)> %URI{scheme: "gid", host: "foo.com", path: "hello/123"} |> URI.to_string()
"gid://foo.com/hello/123"
or to raise an exception on relative path passed like Ruby does, for example.
2.6.5 :001 > URI::Generic.build(scheme: 'http', path: 'foo/bar').to_s
Traceback (most recent call last):
1: from /Users/pyromaniac/.rvm/rubies/ruby-2.6.5/lib/ruby/2.6.0/uri/generic.rb:761:in `check_path'
URI::InvalidComponentError (bad component(expected absolute path component): foo/bar)
2.6.5 :002 > URI::Generic.build(scheme: 'http', host: 'baz', path: 'foo/bar').to_s
Traceback (most recent call last):
1: from /Users/pyromaniac/.rvm/rubies/ruby-2.6.5/lib/ruby/2.6.0/uri/generic.rb:761:in `check_path'
URI::InvalidComponentError (bad component(expected absolute path component): foo/bar)
I personally would prefer the first one.
I can even try to prepare a PR with the fix as soon as some decision will be made.
Thanks!
Hi @pyromaniac, thanks for the report.
Note though you are not supposed to build URI structs by hand. You should use functions URI.parse instead, that will always guarantee they are valid.
However, I would say it doesn't hurt a quick check on to_string that raises if path does not start with a forward slash. I don't want to support it because i don't think we should support invalid data and you shouldn't be creating the URI by hands anyway.
A PR is welcome, thanks!
Hello @josevalim,
Thanks for the prompt response.
Actually, Ruby behavior in this case in incorrect. According to rfc3986, mailto:[email protected] is a valid URI having path [email protected], and this path is relative. And
If a URI contains an authority component, then the path component must either be empty or begin with a slash ("/") character
So we should check if the path is absolute only if host or authority is present.
Also, I don't understand, why can't I build a URI manually? For example, I want to generate GIDs. What is the correct way to do it if I have all the data values like id or model name?
What I do now is:
%URI{
scheme: "gid",
host: app
path: "#{model}/#{id}"
} |> URI.to_string()
As of now, you should do URI.parse("gid://app/#{model}/#{id}")
iex(1)> model="MODEL"
"MODEL"
iex(2)> id="ID"
"ID"
iex(3)> URI.parse("gid://app/#{model}/#{id}")
%URI{
authority: "app",
fragment: nil,
host: "app",
path: "/MODEL/ID",
port: nil,
query: nil,
scheme: "gid",
userinfo: nil
}
iex(4)> URI.parse("gid://app/#{model}/#{id}") |> to_string()
"gid://app/MODEL/ID"
Looks like an overhead to me tbh. Can you explain to me what are the reasons not to use the structure directly please? Cause my understanding it that is the structure appear to be invalid, the functions that consume it can actually validate it before doing something.
Now it also looks like an overhead to me and I probably understand the implications. But in this case maybe there should be some kind of URI.build function that creates the struct and validates it? Then we can actually put this path check there. And it will be backward compatible and maybe more idiomatic. WDYT?
I was writting the message bellow as yours popped up.
But I'm wondering if a URI.build/1 would be a good addition, doing something like:
%{
scheme: "gid",
host: app
path: "#{model}/#{id}"
} |> URI.build() |> to_string()
The Elixir way is not to validate structs, cause that would be a helluva overhead. What you do is you validate at the time of building the structs, and only use functions to modify it, and then just trust them. By the way URI.new/1 would be a standard name
I guess I was wrong because it seems we use %URI directly in many places in our docs. URI.build could work, but then you lose the compile-time checking. I think for now the best is to raise an error when converting to string, we can always add an explicit URI.build later.
I guess I was wrong because it seems we use %URI directly in many places in our docs.
@josevalim where are we doing this?
In the docs of URI.to_string, for example.
Oh, yes, I see it is the only one place where we do that. Should they be updated?
If we can rewrite them to use URI.parse, yes!
Still looks fragile and an overheadish to me. I would gladly use URI.new.
This URI.new can optionally accept path as a list, which will be even more factorish! :)
As said above, there are no plans for a URI.new/build now. But we will be glad to do validation on URI.to_string for people building it by hand. Thanks.
I had a couple minutes available, so I went ahead and sent a PR here: #9588.
We can follow up there.
I had a couple minutes available, so I went ahead and sent a PR here: #9588.
I think you meant #9589
Thanks for dealing with it!
Most helpful comment
As said above, there are no plans for a URI.new/build now. But we will be glad to do validation on URI.to_string for people building it by hand. Thanks.