Please answer these questions before submitting your issue. Thanks!
Go version (go version): 1.8.1
OS and architecture (go env): darwin/amd64
I used golang.org/x/net/publicsuffix to parse domain names with this program:
package main

import (
	"fmt"

	"golang.org/x/net/publicsuffix"
)

func main() {
	// name := "sub.example.com" // this works
	name := "sub.s3.amazonaws.com"
	tldPlusOne, err := publicsuffix.EffectiveTLDPlusOne(name)
	if err != nil {
		fmt.Println("err", err)
	} else {
		fmt.Println("OK", tldPlusOne)
	}
}
Running this over a batch of hostnames, I noticed that every "*.s3.amazonaws.com" hostname gives the following error:
publicsuffix: cannot derive eTLD+1 for domain "sub.s3.amazonaws.com"
All other hostnames seem to work fine.
Expected: publicsuffix.EffectiveTLDPlusOne() should successfully parse sub.s3.amazonaws.com into "amazonaws.com".
Actual: the error described above.
A probably related issue: the domain com.s3-website-us-east-1.amazonaws.com parses successfully, but the input is returned unchanged rather than an eTLD+1. I expected "amazonaws.com", but got the full com.s3-website-us-east-1.amazonaws.com.
Okay, so according to the comment @aviramst linked to
s3.amazonaws.com is itself a public suffix, so it's not "under" one.
Then I guess the original snippet shouldn't return "s3.amazonaws.com", but rather should return the input "sub.s3.amazonaws.com" without erroring.
Also, my second comment is incorrect then; that behavior is working as intended.
Is there any chance we could get an option to ignore privately-administered suffixes, or is that outside of the scope/purpose of this package?
Change https://golang.org/cl/153737 mentions this issue: net/publicsuffix: Permit determining whether public suffix was accepted under the "*" wildcard or not.
This problem cannot be reproduced with the latest version of golang.org/x/net/publicsuffix. The sample code outputs "OK sub.s3.amazonaws.com" which I think is correct given the rule
s3.amazonaws.com
in the current PSL https://github.com/publicsuffix/list/blob/master/public_suffix_list.dat#L10760
I think the problem was fixed in the underlying public suffix list in commit 37e30d13801e, but I'm not convinced that the expected output of publicsuffix.EffectiveTLDPlusOne("sub.s3.amazonaws.com") should be "amazonaws.com", as the bug report demands. With a rule like "s3.amazonaws.com" (or, before that commit, "*.s3.amazonaws.com"), the answer can never be "amazonaws.com".
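To illustrate why the two rule forms behave differently, here is a minimal sketch of public-suffix matching using only the standard library. It is a simplified toy, not the implementation in golang.org/x/net/publicsuffix: the names publicSuffix, matches, and eTLDPlusOne are made up for this example, the rule lists contain only the entries needed here, and exception rules ("!") are not handled.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// publicSuffix returns the longest suffix of domain matched by any rule.
// Toy version: supports plain labels and "*" wildcard labels only.
// If no rule matches, the implicit "*" rule applies (the last label alone).
func publicSuffix(rules []string, domain string) string {
	labels := strings.Split(domain, ".")
	best := 1 // implicit "*" rule
	for _, rule := range rules {
		ruleLabels := strings.Split(rule, ".")
		if len(ruleLabels) > len(labels) {
			continue
		}
		tail := labels[len(labels)-len(ruleLabels):]
		if matches(ruleLabels, tail) && len(ruleLabels) > best {
			best = len(ruleLabels)
		}
	}
	return strings.Join(labels[len(labels)-best:], ".")
}

// matches reports whether every rule label equals the corresponding
// domain label or is the wildcard "*". Both slices have equal length.
func matches(rule, tail []string) bool {
	for i := range rule {
		if rule[i] != "*" && rule[i] != tail[i] {
			return false
		}
	}
	return true
}

// eTLDPlusOne returns the public suffix plus one more label, or an
// error if the domain is itself a public suffix.
func eTLDPlusOne(rules []string, domain string) (string, error) {
	suffix := publicSuffix(rules, domain)
	if suffix == domain {
		return "", errors.New("cannot derive eTLD+1 for domain " + domain)
	}
	// Keep exactly one label to the left of the suffix.
	rest := strings.TrimSuffix(domain, "."+suffix)
	i := strings.LastIndexByte(rest, '.')
	return rest[i+1:] + "." + suffix, nil
}

func main() {
	domain := "sub.s3.amazonaws.com"
	for _, rules := range [][]string{
		{"com", "*.s3.amazonaws.com"}, // wildcard form of the rule
		{"com", "s3.amazonaws.com"},   // plain form of the rule
	} {
		got, err := eTLDPlusOne(rules, domain)
		fmt.Println(rules, "->", got, err)
	}
}
```

Under the wildcard rule the whole of sub.s3.amazonaws.com is the public suffix, so there is no label left over and the lookup fails. Under the plain rule the suffix is s3.amazonaws.com and the result is the input itself. In neither case does "amazonaws.com" come out.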
I have moved on to other projects, but thanks for the update on this. I guess it can be closed then!
@flowchartsman said:
Is there any chance we could get an option to ignore privately-administered suffixes
I think the existing API lets you distinguish ICANN vs Private vs the catch-all * rule, as per https://go-review.googlesource.com/c/net/+/153737/4#message-d558a752d9467d474ac28d0fc6efa4ee687e49a7
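For anyone who wants the "ignore private suffixes" behavior, here is a rough sketch of how it could be built on that distinction. It assumes a lookup function with the same shape as publicsuffix.PublicSuffix, which returns the suffix and an icann flag. The helper icannTLDPlusOne and the stand-in stubLookup are hypothetical names invented for this example; the stub only mimics the s3.amazonaws.com case from this issue so the sketch runs without the real list.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// lookupFunc has the same shape as publicsuffix.PublicSuffix: it returns
// the public suffix of domain and whether that suffix is ICANN-managed.
type lookupFunc func(domain string) (suffix string, icann bool)

// icannTLDPlusOne (hypothetical helper) skips privately managed suffixes
// and returns one label more than the first ICANN or unmanaged suffix.
func icannTLDPlusOne(domain string, lookup lookupFunc) (string, error) {
	suffix, icann := lookup(domain)
	// A non-ICANN suffix that contains a dot is a private rule. Drop its
	// leftmost label and look up the remainder until the rule is not private.
	for !icann && strings.IndexByte(suffix, '.') >= 0 {
		parent := suffix[strings.IndexByte(suffix, '.')+1:]
		suffix, icann = lookup(parent)
	}
	if len(domain) <= len(suffix) {
		return "", errors.New("cannot derive eTLD+1 for domain " + domain)
	}
	// i is the index of the dot that separates the suffix from the rest.
	i := len(domain) - len(suffix) - 1
	if domain[i] != '.' {
		return "", errors.New("suffix does not align with domain " + domain)
	}
	j := strings.LastIndexByte(domain[:i], '.')
	return domain[j+1:], nil
}

// stubLookup is a stand-in for publicsuffix.PublicSuffix (assumption:
// it knows one private rule, s3.amazonaws.com, and treats every last
// label as an ICANN suffix). It is not real list data.
func stubLookup(domain string) (string, bool) {
	const private = "s3.amazonaws.com"
	if domain == private || strings.HasSuffix(domain, "."+private) {
		return private, false
	}
	return domain[strings.LastIndexByte(domain, '.')+1:], true
}

func main() {
	for _, name := range []string{"sub.s3.amazonaws.com", "sub.example.com"} {
		got, err := icannTLDPlusOne(name, stubLookup)
		fmt.Println(name, "->", got, err)
	}
}
```

In real use, publicsuffix.PublicSuffix could be passed in place of stubLookup, since it has the matching signature. Whether collapsing private suffixes is appropriate depends on the use case; for cookie or isolation boundaries the private rules usually matter.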
As per the @vdobler and @martinlindhe conversation immediately above, I'm closing this issue.