At the moment (nixpkgs 183eeb3c0fdac8de3146aedaa6028b474f96db6f), lib.splitString ":" ":a:" == ["" "a:"]. I would have expected the result to be either ["" "a" ""] or [":a:"], not different behavior for separators at the beginning and at the end of the string.
I would like to try to fix this, but the function is used quite widely (51 occurrences). What do you think is a possible way forward? Just go through every place in the code where the function is used and check that the proposed change doesn't break things? Or keep the current function and add another one that implements a "saner" version of it?
Let me suggest:
The Haskell "split" package does the first one:
$ nix-shell -p 'haskellPackages.ghcWithPackages (hs: [hs.split])' --run ghci
λ import Data.List.Split
λ splitOn ":" ":a:"
["","a",""]
So does Python:
$ python
>>> ':a:'.split(':')
['', 'a', '']
and JavaScript:
> ":a:".split(":")
Array [ "", "a", "" ]
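To make the behaviour shared by these three languages concrete, here is a minimal sketch in Python of a splitter that keeps the empty fields before a leading separator and after a trailing one. The name split_keep_empty is hypothetical and this is not the nixpkgs implementation; it only illustrates the semantics that would give ["" "a" ""].

```python
def split_keep_empty(sep: str, s: str) -> list:
    """Split s on every occurrence of sep, keeping empty leading/trailing fields.

    Hypothetical helper for illustration only; it mirrors what str.split(sep) does.
    """
    if not sep:
        raise ValueError("separator must not be empty")
    parts = []
    start = 0
    while True:
        idx = s.find(sep, start)
        if idx == -1:
            # No more separators: the rest of the string is the last field,
            # which is "" when the string ends with a separator.
            parts.append(s[start:])
            return parts
        # Field between the previous separator (or string start) and this one.
        parts.append(s[start:idx])
        start = idx + len(sep)


print(split_keep_empty(":", ":a:"))
```

With this rule, a string containing n separators always yields n + 1 fields, so the result can be joined back with the separator to recover the original string.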
As pointed out in the comments, this function should not be used because of its poor performance.
It is unfortunate, as the only alternative is a hacky use of builtins.match, e.g. builtins.match "(:)?(([^:]*)(:))+" ":a:", but regexp support is limited (no support for non-capturing groups, for example).
It would be really nice to have more builtins functions for string manipulation (related feature request: https://github.com/NixOS/nix/issues/1062).
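As a rough analogy for why missing non-capturing groups hurts, the sketch below uses Python's re module, which is not the regex engine Nix uses. builtins.match returns the captured groups, so every group needed only for grouping ends up in the result; Python's re.split shows a similar effect, where a capturing group leaks the separators into the output and a non-capturing one does not.

```python
import re

text = ":a:"

# A capturing group around the separator: the matched separators
# are returned alongside the fields.
with_capture = re.split(r"(:)", text)

# The same pattern with a non-capturing group: only the fields come back.
without_capture = re.split(r"(?::)", text)

print(with_capture)
print(without_capture)
```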
I have just pushed https://github.com/NixOS/nixpkgs/pull/23851, which fixes the function's implementation. However, I haven't checked that the callers are safe with the new behavior: if I understood correctly, Hydra will tell us where to look, so we don't have to blindly check everything.
This is fixed on master:
nix-repl> splitString ":" ":a:"
[ "" "a" "" ]