Special dollar sign variables like $~ or $1 aren't needed for Regex - captured groups can already be accessed through Regex::MatchData with brackets like md[1].
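A minimal sketch of the bracket access mentioned above, assuming a made-up input line (the variable names and the sample string are only for illustration):

```crystal
# Read capture groups from the MatchData returned by String#match,
# without touching $~ or $1.
line = "credentials=alice:secret"

if md = line.match(/credentials=(.*?):(.*)/)
  user = md[1]     # first capture group
  password = md[2] # second capture group
  puts user
  puts password
end
```

Because String#match returns nil when nothing matches, the if guard doubles as the nil check.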
I don't know any other place where the dollar sign is used in the language since the removal of global variables in https://github.com/crystal-lang/crystal/issues/4715.
There's also still a global variable used in system.
AFAIK, $~ is the only way to access the match data when using a Regex in a when statement. I wonder if a more general way to get at the results of the === comparison in a when block would be useful.
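As a rough illustration of that point, here is a sketch that reads the whole MatchData through $~ inside a when branch. The named group and the sample line are assumptions made up for the example:

```crystal
# Inside a `when` with a Regex, the match data is reachable only via $~
# (or the $1, $2, ... shortcuts built on top of it).
line = "host=example.org"
hostname = nil

case line
when /host=(?<host>.*)/
  md = $~               # the Regex::MatchData produced by the === comparison
  hostname = md["host"] # named capture group
end

puts hostname
```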
edit for clarification: this is the kind of code I want to preserve:
case line_to_parse
when /host=(.*)/
  hostname = $1
when /credentials=(.*?):(.*)/
  user = $1
  password = $2
end
Yes, the above example is the only reason worth having $1 in the language.
That said, I wouldn't mind removing them, and also removing $? (but for that we'll have to remove the backtick method too). You can write that with a series of if-else. It's uglier, more code to write, but it's semantically the same and it works:
if match = line_to_parse.match(/host=(.*)/)
  hostname = match[1]
elsif match = line_to_parse.match(/credentials=(.*?):(.*)/)
  # Use the local MatchData here as well, so no dollar variables are needed
  user = match[1]
  password = match[2]
end
We can use class variables to replace the global scope of dollar variables.
For example here is a rather naive and clumsy example:
class String
  @@match : Regex::MatchData? = nil
  def self.match : Regex::MatchData
    @@match.not_nil!
  end
  def m(regex : Regex)
    @@match = self.match regex
  end
end
line_to_parse = "credentials=host:test"
case line_to_parse
when .m /host=(.*)/
  hostname = String.match[1]
when .m /credentials=(.*?):(.*)/
  _, user, password = String.match
end
p user, password # => "host" "test"
There are currently no global variables used, but you are suggesting replacing this with global variables. That definitely won't do.
Invalid memory access for this example:
class String
  @match : Regex::MatchData? = nil
  def match : Regex::MatchData
    @match.not_nil!
  end
  def m(regex : Regex)
    @match = self.match regex
  end
end
# Invalid memory access (signal 11) at address 0x562dd12ba3f0
line_to_parse = "credentials=host:test"
case line_to_parse
when .m /host=(.*)/
  hostname = line_to_parse.match[1]
when .m /credentials=(.*?):(.*)/
  _, user, password = line_to_parse.match
end
p user, password
If Regex#=== would return the MatchData object, case could be changed to optionally store the results of the case comparison and we wouldn't need $1:
case line_to_parse and_keep_result_as |md|
when /host=(.*)/
  hostname = md[1]
when /credentials=(.*?):(.*)/
  user = md[1]
  password = md[2]
end
That might allow for some other interesting things to be done with === overloads.
@ezrast That looks way too complicated to be worth thinking about :D
Then it would be better to just use if with match. Less complexity in the language and only a little bit more code to write.
However, I don't see a real reason to remove these dollar accessors for regex groups. They might look a bit off, especially since there are no global variables. But in the end, I think I'd prefer to keep them because they're easy to use and I don't think there are any real issues with them.
Don't fix what's not broken they say :)
Well, it does make the implementation of the compiler more complex. And if you use them without a match you get an ugly exception, and this is not trivial to fix. So yeah, removing these will at least simplify the compiler implementation.
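A small sketch of that failure mode, assuming a sample string with no digits. The exact exception class and message may vary by Crystal version, so the code only reports that something was raised:

```crystal
line = "no digits here"

line =~ /(\d+)/ # no match, so there is no match data afterwards

# Reading $1 without a successful match raises at runtime.
message = begin
  $1
rescue ex
  "raised: #{ex.class}"
end
puts message

# The explicit form makes the no-match case visible in the code instead.
if md = line.match(/(\d+)/)
  puts md[1]
else
  puts "no match"
end
```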
Unless there are typing / safety issues or future issues with incremental/modular compilation I don't see a reason to remove them.
It's true that they are not essential, but they serve well for matching and extracting IMO.
@j8r note that $1, etc. are not global variables. Substituting them with class variables is not accurate.
They're not beautiful, but they're limited to being passed only a single call up the call stack and they're safe. There have been discussions about removing them before (started by me) and we arrived at the same conclusion as this thread. Closing.
If there are still bugs surrounding them and bad error messages, that should be their own issues.
What about this solution?
class Result
  property match_data : Regex::MatchData { raise "no match" }
  def initialize
  end
  forward_missing_to match_data
end
class String
  def match(regex : Regex, result : Result)
    if result && (match_data = match regex)
      result.match_data = match_data
    end
  end
end
result = Result.new
case "abcd0123"
when .match(/([0-9]+)/, result)
  puts result[1] #=> 0123
end
Removing the dollar signs will solve the cryptic "nil assertion failed" error (https://github.com/crystal-lang/crystal/issues/4776), and will bring a more consistent style.