I would like to suggest a feature to prevent typosquatting in programming language package managers.
I suggest that RubyGems should make it difficult to publish gems whose names are too similar to some pre-existing popular gem. "Too similar" may be defined as having a Levenshtein distance of 1. "Popular" may be defined simply as having more than some threshold of downloads in some period, e.g., more than 1k downloads in the last month.
RubyGems could automatically refuse such submissions, stating that, to prevent malicious gem typosquatting, a name with high lexical similarity to a pre-existing popular gem requires manual authorization by admins, which may be requested by email with proper justification.
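As a rough illustration of the proposal above, here is a minimal sketch of the check in plain Ruby. It is not RubyGems.org code: the names `levenshtein`, `needs_manual_review?`, and `POPULARITY_THRESHOLD`, and the `monthly_downloads` hash, are all hypothetical, and the threshold is just the example value suggested above.

```ruby
# Hypothetical sketch; names and threshold are illustrative only.
POPULARITY_THRESHOLD = 1_000 # downloads in the last month (example value)

# Classic dynamic-programming Levenshtein distance, keeping one row at a time.
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution
      ].min
    end
    previous = current
  end
  previous.last
end

# monthly_downloads: Hash of existing gem name => downloads in the last month.
# Returns true when the new name is exactly one edit away from a popular gem.
def needs_manual_review?(new_name, monthly_downloads)
  monthly_downloads.any? do |existing_name, downloads|
    downloads > POPULARITY_THRESHOLD &&
      levenshtein(new_name.downcase, existing_name.downcase) == 1
  end
end
```

For example, `needs_manual_review?("railz", { "rails" => 5_000 })` would flag the name, while the same call with only 10 monthly downloads would not. A real implementation would presumably query the database rather than scan a hash.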
This issue is related to:
gem
I will abide by the code of conduct.
I imagine step 1 of enforcing this would be disallowing case-sensitive gem names...
Maybe the problem is not case-sensitivity _per se_, but allowing two gems whose names differ only in case.
Something along the following lines could implement @djberg96's idea of step 1?
def disallow?(new_gem_name)
  # pre_existing_gem_names comes from the database; disallowed may be cached
  disallowed = pre_existing_gem_names.map(&:downcase).to_set
  disallowed.include?(new_gem_name.downcase)
end
A variant of the Levenshtein distance aspect of this issue was added in https://github.com/rubygems/rubygems.org/pull/2037 ("Invalidate gem name using levenshtein distance for gems with ten million downloads"), back in June of this year. As of that PR, the check applies to any name similar to an existing gem with more than 10 million total downloads.
There's also https://github.com/rubygems/rubygems.org/pull/1357, from June 2016, which made the name check case-insensitive. This applies to creation only, as there's at least one instance of a case-only difference from before that PR (saikuro vs Saikuro).
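To illustrate why a creation-only check leaves older pairs in place, here is a small sketch that lists names differing only in case. The helper `case_only_collisions` is hypothetical and is not part of either PR; it simply groups a list of names by their lowercase form.

```ruby
# Hypothetical helper: returns groups of names that differ only in case.
def case_only_collisions(names)
  names.group_by(&:downcase).values.select { |group| group.uniq.size > 1 }
end

case_only_collisions(%w[saikuro Saikuro rails])
# => [["saikuro", "Saikuro"]]
```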
@adrianomitre does my previous comment address everything, or do you have potential ideas for changes/tweaks?
@duckinator Your previous comment addressed everything. Thanks!