import strutils
var
search = ""
replacement = "abc"
echo "string".multiReplace(search, replacement)
# EDIT(@timotheecour ) I think this should be: `echo "string".multiReplace(("", "abc"))`
Crash output
strutils.nim multiReplace
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
This works on 0.18.0 but fails on 0.19.0 and #head.
I am not sure if "" should be supported. If the search engine is non utf-8 multibyte characters aware, it can destroy characters. I would prefer if this would just raise an error for illegal argument.
I am not sure if "" should be supported. If the search engine is non utf-8 multibyte characters aware, it can destroy characters. I would prefer if this would just raise an error for illegal argument.
Excellent point that I never considered before. However, IMO an empty string simply doesn't "match" and so the original string is returned unmodified.
Search string could be "" at runtime, I would prefer the proc would handle it rather than having to check myself. It's not too much work but everyone else using the proc will also benefit.
That being said, even the replace() proc is broken from Araq's definition.
import strutils
echo "abc".replace("", "a")
# Prints aaabaca
However, IMO an empty string simply doesn't "match" and so the original string is returned unmodified.
import nre
doAssert replace("foo", re"", "-") == "-f-o-o-"
"-f-o-o")"likewise" with re pending https://github.com/nim-lang/Nim/issues/9437 ; currently returns "-foo"
see also relevant discussion here:
https://github.com/nim-lang/Nim/pull/9280#issuecomment-428843621
This was discussed and we choose this implemementation because Python/JS/whatever also does it this way.
This should be consistent with replace I guess.
This was discussed and we choose this implemementation because Python/JS/whatever also does it this way.
That is what I said before knowing it breaks UTF-8 and can only be fixed by even more special logic for empty matches which is bad.
@Araq
knowing that, would it be possible to enforce the same behavior across these for dealing with empty match string:
and could that behavior be to throw an error (either doAssert or ValueError)
seems like a saner behavior, that would catch some bugs
@krux02 there were a few options discussed in this issue; which option did you pick in your commit?
@timotheecour well you can look in the commit and see. But sure. In this PR strutils won't match empty strings on anything anymore. I didn't touch re or nre because those are wrappers for PCRE and I can't control what PCRE matches and what it doesn't match.
for clarity, here's new behavior:
echo multiReplace("abcd", ("", "123"))
abcd
also, @genotrance I think your original issue was wrong, since replacement has to be varargs[(string, string)] ; I'll edit it in top-post
Really? We've changed the way replace works again?!
Was it changed in the past? The only alternative to "empty string doesn't match on anything" is to make everything utf8, and state clearly that string is only for utf8 data.
the only alternative is throwing, which seems saner behavior to catch bugs (empty string is probably a bug in user code in that case)
We don't throw on API usage bugs, we could have used doAssert but that makes people nervous who value "continue" over "crash and burn".
/cc @Araq
We don't throw on API usage bugs, we could have used doAssert
huh? see https://github.com/nim-lang/Nim/pull/9280 (which added doAssert to count) and your comment (https://github.com/nim-lang/Nim/pull/9280#issuecomment-428840784) in pretty much the same context (for count on empty string):
This should instead use assert or doAssert, it's an API misuse.
As for the rest of your comment:
but that makes people nervous who value "continue" over "crash and burn".
which is why I suggested in https://github.com/nim-lang/Nim/pull/9280#issuecomment-428786497 (as well as here) to throw (by that I mean a recoverable exception) instead of doAssert
My suggestion is to simply throw CatchableError (or perhaps ValueError but I like the simplicity of using the root of catchable errors)
That's not crash and burn, that's normal exception handling mechanism ; it can peacefully coexist inside a production-server-that-can-never-die
That was a main point behind the exception hierarchy rewrite in https://github.com/nim-lang/Nim/issues/8237
But programming bugs are not recoverable/catchable errors, see the RFC I wrote.
Most helpful comment
Search string could be "" at runtime, I would prefer the proc would handle it rather than having to check myself. It's not too much work but everyone else using the proc will also benefit.
That being said, even the
replace()proc is broken from Araq's definition.