The following test case returns false. I think that it should return true, as these two URLs are logically and practically equal:
URL urlA = new URL("http://localhost?a=b&c=d");
URL urlB = new URL("http://localhost?c=d&a=b");
Assertions.assertThat(urlA).isEqualTo(urlB);
That's a good idea but not for isEqualTo which follows the equals semantics.
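To illustrate the equals semantics mentioned above: java.net.URL.equals compares the file component (path plus query) as a plain string, so two URLs whose parameters appear in a different order are not equal. A minimal sketch follows; the class and method names are invented for illustration and are not part of AssertJ.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class UrlEqualsDemo {

    // Hypothetical helper: parses both strings and compares them with URL.equals.
    static boolean sameByEquals(String a, String b) {
        try {
            URL urlA = URI.create(a).toURL();
            URL urlB = URI.create(b).toURL();
            return urlA.equals(urlB);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The query strings differ as text, so equals reports false.
        System.out.println(sameByEquals("http://localhost?a=b&c=d",
                                        "http://localhost?c=d&a=b")); // false
    }
}
```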
What would be a good name for the assertion?
Do you not have an already-established standard name for such an "equivalence" comparison with other objects?
Here are some brainstorming ideas:
isEquivalentTo
isSameAs
isSemanticallyEqualTo
isEffectivelyEqualTo
I kind of like isEquivalentTo.
I quite like isEquivalentTo too, @scordio any preference?
What if we refer to URI normalization?
Normalizations that change semantics
Applying the following normalizations results in a semantically different URI although it may refer to the same resource:
- Sorting the query parameters. Some web pages use more than one query parameter in the URI. A normalizer can sort the parameters into alphabetical order (with their values), and reassemble the URI. Example:
http://example.com/display?lang=en&article=fred → http://example.com/display?article=fred&lang=en
However, the order of parameters in a URI may be significant (this is not defined by the standard) and a web server may allow the same variable to appear multiple times.
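The query-sorting normalization quoted above could be sketched roughly as follows. This assumes parameters are separated by `&` and sorts the raw `name=value` pairs as plain strings; repeated parameter names are kept, not merged. `QuerySorter` and `sortedQuery` are hypothetical names used only for this example.

```java
import java.net.URI;
import java.util.Arrays;

final class QuerySorter {

    // Returns the raw query with its '&'-separated pairs in sorted order,
    // or null when the URI has no query component.
    static String sortedQuery(URI uri) {
        String query = uri.getRawQuery();
        if (query == null) {
            return null;
        }
        String[] pairs = query.split("&");
        Arrays.sort(pairs); // lexicographic order of the whole "name=value" pair
        return String.join("&", pairs);
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://example.com/display?lang=en&article=fred");
        System.out.println(sortedQuery(uri)); // article=fred&lang=en
    }
}
```

As the quoted text warns, a server may treat parameter order as significant, so this is only safe when the application under test does not.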
I'm still thinking about a good name to reflect this.
A different proposal. What if we don't add a dedicated comparison API but only a switch to enable normalization strategies? E.g.:
URL urlA = new URL("http://localhost?a=b&c=d");
URL urlB = new URL("http://localhost?c=d&a=b");
assertThat(urlA).withUriNormalization()
.isEqualTo(urlB);
// OR
assertThat(urlA).normalized()
.isEqualTo(urlB);
I tend to favor withUriNormalization() because the normalization should be applied to both the object under assertion and the parameter of isEqualTo().
I like the approach. I think it would be easier to discover the assertion if we could put the idea of normalization in a variant of isEqualTo so that it shows up in code completion close to isEqualTo.
Here are some options, please suggest some!
isEqualToAsNormalizedUris
isEqualToByComparingNormalizedUris
isEqualInNormalizedFormTo
isEqualToWithNormalization?
I would leave the URL/URI part out as it is implicit and the javadoc can give more details about what happens behind.
This Wikipedia page surprised me. I didn't know that the standard allows the order of the parameters to be significant.
I think you are on the right track with the normalisation idea. isEqualToWithNormalisation() sounds good. However, looking at that site, there are a lot of possible normalisations. Which ones are you referring to? Do you want to program them all in? Removing duplicate slashes sounds good. Technically you should also make http and https equivalent, according to one of the items in that list. I suspect that we don't want this. In any case, it all sounds like a lot of work.
Maybe we shouldn't use the term "normalisation" unless we are implementing all normalisations, or can at least make a logical statement about which ones.
Have a look at the list and see what you think. My initial thought is: You could do all the "normalisations that preserve semantics" but not the others. For the others, you might need to name them separately. Since the only one anyone is asking for at the moment is query parameter sorting, we should probably just focus on this for now.
Therefore, an isEqualToWithSortedQueryParameters() might do the trick. Or, since there are possibly more operations which could be performed in the future, maybe we should go back to the two-step idea: withSortedQueryParameters().isEqualTo().
Fair point @fletchgqc, I think we should either add isEqualToWithSortedQueryParameters to address the initial use case or start a fluent api to provide different normalization options:
assertThat(urlA).withSortedQueryParameters()
.ignoringWww()
.ignoringFragment()
.isEqualTo(urlB);
My gut feeling would be to go for isEqualToWithSortedQueryParameters as it will be easy to discover and later only introduce the fluent api if we have a request for more normalization.
thoughts?
I am happy with either way, not sure which is better.
isEqualToWithSortedQueryParameters :+1:
we have a winner!
I'm working on this issue.
I have implemented the function isEqualToWithSortedQueryParameters, but now I'm wondering which class this function should belong to. It seems that I should add it to class URLAssert. The class URLAssert currently has no methods but only a constructor. Should I put the method there?
@Sunt-ing the correct class would be AbstractUrlAssert.
I would imagine the method declaration like:
public SELF isEqualToWithSortedQueryParameters(URL expected)
When I try to solve the issue, it occurs to me that maybe isEqualToExceptQueryParameters will be useful. It's also required in isEqualToWithSortedQueryParameters I think. Should I implement that?
Not sure I understood how these two are related. Personally, I would introduce one assertion at a time. Would you like to raise a pull request so we can take a look at the code?
yes, let's do one assertion per PR, it makes it easier to review.
I mean isEqualToWithSortedQueryParameters depends on isEqualToExceptQueryParameters. To ensure isEqualToWithSortedQueryParameters, we first need to ensure isEqualToExceptQueryParameters. Since there is no such function, I need to implement it in isEqualToWithSortedQueryParameters. And I think the assertion isEqualToExceptQueryParameters itself is useful and maybe it will be better to extract it into a new assertion. Am I wrong?
If I'm right, I'd like to work on it. Considering isEqualToWithSortedQueryParameters depends on isEqualToExceptQueryParameters, I'd like to work on isEqualToExceptQueryParameters first. Maybe I can open a new issue and self-assign. What do you think?
Sorry @Sunt-ing this is still not clear to me why isEqualToWithSortedQueryParameters would depend on isEqualToExceptQueryParameters.
Can you raise a PR with your code changes so that we can have a look at it?
Thanks!
OK. I will raise a PR.
By the way, for example, "http://localhost?a=b&c=d" is not equal to "http://example.com?c=d&a=b" with sorted query parameters, because even if their parameters are not taken into account, the two are not the same.
Therefore, it seems that isEqualToExceptQueryParameters can be called in isEqualToWithSortedQueryParameters.
Anyway, thanks very much. I will raise a PR as soon as possible.
FYI The URI class has accessors which allow getting parts of the URI individually. https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/net/URI.html. However getting everything except query parameters requires quite a few accessors.
@fletchgqc Yes, and that is exactly what I use to solve this issue.
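The reasoning in this exchange (compare every component except the query, then compare the query parameters regardless of order) could be sketched with the java.net.URI accessors as below. This is an illustrative assumption, not the code from the pull request: the class and method names are hypothetical, components are compared as raw strings without case folding, and a java.net.URL would first have to be converted with toURI().

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Objects;

final class SortedQueryComparison {

    // Hypothetical helper: true when every component other than the query matches.
    static boolean equalExceptQuery(URI a, URI b) {
        return Objects.equals(a.getScheme(), b.getScheme())
            && Objects.equals(a.getRawUserInfo(), b.getRawUserInfo())
            && Objects.equals(a.getHost(), b.getHost())
            && a.getPort() == b.getPort()
            && Objects.equals(a.getRawPath(), b.getRawPath())
            && Objects.equals(a.getRawFragment(), b.getRawFragment());
    }

    // Splits the raw query on '&' and sorts the pairs; no query yields an empty array.
    static String[] sortedParameters(URI uri) {
        String query = uri.getRawQuery();
        if (query == null) {
            return new String[0];
        }
        String[] pairs = query.split("&");
        Arrays.sort(pairs);
        return pairs;
    }

    // Equal when the non-query components match and the sorted parameters match.
    static boolean equalWithSortedQuery(URI a, URI b) {
        return equalExceptQuery(a, b)
            && Arrays.equals(sortedParameters(a), sortedParameters(b));
    }

    public static void main(String[] args) {
        URI a = URI.create("http://localhost?a=b&c=d");
        System.out.println(equalWithSortedQuery(a, URI.create("http://localhost?c=d&a=b")));   // true
        System.out.println(equalWithSortedQuery(a, URI.create("http://example.com?c=d&a=b"))); // false
    }
}
```

Keeping `equalExceptQuery` as its own method mirrors the suggestion that the "except query parameters" check might be useful independently.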