Julia: `one` for AbstractString

Created on 8 Dec 2016  Â·  16Comments  Â·  Source: JuliaLang/julia

I noticed you can do something like

julia> prod(["*" for k in 1:3])
"***"

but you might get an error if you try

julia> prod(["*" for k in 1:0])
ERROR: MethodError: no method matching one(::Type{String})

Why was multiplication decided to be the operator on strings which appends them together? Wouldn't addition make more sense?

What do you guys think of introducing

one(String) = ""
zero(String) = ""

since "zero" is often defined as a + 0 = a, and "one" is defined as a * 1 = 1. This zero then makes sense for the way multiplication is defined for Strings

help wanted strings

Most helpful comment

The discussion about using * or + has already been had. For future discussion to be productive, let's focus on the part:

What do you guys think of introducing

one(String) = ""
zero(String) = ""

Thanks.

All 16 comments

Multiplication is the string concatenation operator because concatenation is not commutative, whereas + usually denotes a commutative operator in mathematics. This has been discussed many times, e.g. in #11030.

Given that, defining one{T<:AbstractString}(::T) = T("") makes sense to me, since this is the identity element for string concatenation, and one is defined as the multiplicative identity.

Defining zero (the additive identity), however, makes no sense here since we don't define + for strings.

Isn't multiplication usually commutative also?

Often not.

whereas + usually denotes a commutative operator in mathematics.

Addition doesn't have to be commutative either. Multiplication can be commutative. This just seems confusing to me.

After reading through some of https://github.com/JuliaLang/julia/issues/11030 I can see arguments for not having any kind of operator that does string concatenation, but I don't see why * is chosen over +.

But if one of these is included in the language I think there should be an appropriate zero and one.

The discussion about using * or + has already been had. For future discussion to be productive, let's focus on the part:

What do you guys think of introducing

one(String) = ""
zero(String) = ""

Thanks.

our Strings should be the Free Monoid over all characters.
As a Free Monoid, they should have defined their identity element -- the empty string.
And since we define our monoid operation using the * symbol, we should term that identity the one.
Thus one(String) = "" is a good idea, and also one(::Type{String})

Should it further one{S<:AbstractString}(::Type{S}) = convert(S, "") be defined? (as @stevengj suggests)
Like for Numbers
I'm less sure on that.

Maybe it is more natural to have one(r::RepString) = RepString(r.string, 0)
and for Substring to be a zero length substring of that String.
I don't see why this would matter though.
So probably defining it for AbstractString would be fine


zero(String) on the other hand should be left undefined.
If we went and defined something using the + symbol,
then defining zero to be that identity would be expected.
As we are not defining such a operation, it is left free for some user (/package) to do so.
As such when they do so, they should also be defining zero(::String) (not Base)

Let's go ahead and do this since we're still using * for string concatenation. If we move away from that, we'll naturally also move away from this at the same time – this will hardly be the most painful part of that transition.

What is the appropriate place in Base for such a definition?

This seems like a pun and more often than not sending one a string in generic code will be a bug rather than appropriate to return an empty string as the desired behavior. Particularly because Strings occupy a strange place where they behave sometimes like collections and sometimes like scalars. one strikes me as a mostly numeric concept. If we were ever to deprecate * for string concatenation as has been floated, this would need to be deprecated alongside it.

If we were ever to deprecate * for string concatenation as has been floated, this would need to be deprecated alongside it.

Right, that's my point. This makes sense as long as we're using * and if we deprecate that, then deprecating one for String will hardly be a big deal.

I agree this makes it even clearer that we should move away from * for string concatenation. But for now better be consistent and implement one.

I actually had a use case just today for "one" being ""
I wanted to * reduce an array of strings, and was disappointed it would not work.

The PR has been merged into master so if you check out master branch it should work out for you. Otherwise you'll have to wait for next release.

Writing them one by one to an IOBuffer should be much more efficient I think (although perhaps less convenient).

You can just call join to efficiently concatenate an array of strings.

julia> a = [randstring(10) for i = 1:1000];

julia> using BenchmarkTools

julia> @benchmark join($a)
BenchmarkTools.Trial: 
  memory estimate:  13.13 kb
  allocs estimate:  16
  --------------
  minimum time:     54.150 μs (0.00% GC)
  median time:      56.809 μs (0.00% GC)
  mean time:        57.813 μs (0.28% GC)
  maximum time:     1.708 ms (94.77% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark prod($a)
BenchmarkTools.Trial: 
  memory estimate:  4.87 mb
  allocs estimate:  999
  --------------
  minimum time:     620.688 μs (0.00% GC)
  median time:      1.592 ms (0.00% GC)
  mean time:        1.855 ms (29.36% GC)
  maximum time:     6.105 ms (82.00% GC)
  --------------
  samples:          2684
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
Was this page helpful?
0 / 5 - 0 ratings

Related issues

helgee picture helgee  Â·  3Comments

musm picture musm  Â·  3Comments

StefanKarpinski picture StefanKarpinski  Â·  3Comments

manor picture manor  Â·  3Comments

TotalVerb picture TotalVerb  Â·  3Comments