Elixir: Formatter does not preserve Windows line endings

Created on 21 Nov 2020 · 14 Comments · Source: elixir-lang/elixir

Environment

  • Elixir & Erlang/OTP versions (elixir --version):
    Erlang/OTP 23 [erts-11.1.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]
    Elixir 1.11.2 (compiled with Erlang/OTP 23)
  • Operating system: all (relevant only on Windows)

Current behavior

iex(2)> Code.format_string!("defmodule A   do\r\n:ok\r\nend")
["defmodule", " ", "A", " do", "\n  ", ":ok", "\n", "end"]

The same goes for mix format.

Sorry if this has been raised before. I couldn't find any related issues, or any mention of this in the mix format or Code.format_string! docs.

Expected behavior

I don't know if CRLF should be preserved. If yes then it's a bug. If not then it should be a documented behaviour.

Elixir Discussion Windows

All 14 comments

I think CRLF is still the standard for Windows (please let me know if I am wrong) so the formatter should indeed preserve the line endings.

Agreed. I think the simplest way is to do a pass on the data after formatting that converts "\n" to "\r\n" if you are on Windows. The main question I have is: should we effectively preserve the input, or should we always use "\r\n" on Windows? I am more inclined towards the latter. In any case, it should be possible to disable the behaviour via an option.
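A minimal sketch of the post-formatting pass described above. The module name CRLFFormat and the :line_ending option are hypothetical and exist neither in Elixir nor in mix format; only Code.format_string!/2, IO.iodata_to_binary/1, Keyword.pop/3 and String.replace/3 are real. Note that the pass rewrites every newline in the output, including literal newlines inside multi-line strings.

```elixir
defmodule CRLFFormat do
  # Hypothetical helper: formats the source, then rewrites every line
  # ending in the result to the requested one (defaults to "\n").
  def format_string!(source, opts \\ []) do
    # :line_ending is an assumed option name, removed before the
    # remaining options are handed to the real formatter.
    {line_ending, formatter_opts} = Keyword.pop(opts, :line_ending, "\n")

    source
    |> Code.format_string!(formatter_opts)
    |> IO.iodata_to_binary()
    # Match an optional "\r" so an existing CRLF is never doubled up.
    |> String.replace(~r/\r?\n/, line_ending)
  end
end

crlf = CRLFFormat.format_string!("defmodule A   do\r\n:ok\r\nend", line_ending: "\r\n")
IO.inspect(crlf)
```

Keeping the conversion outside the formatter leaves the formatter itself unchanged and makes the behaviour easy to switch off, which is the option mentioned above.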

I'm more inclined to stick with \n. Please note that a line may end with \n inside a string, which should not become \r\n, so Windows editors have to deal with \n in such cases anyway.

"a
b" |> IO.inspect()
$ cat t.exs | unix2dos > t.dos.exs && elixir t.dos.exs && elixir t.exs
"a\r\nb"
"a\nb"

Some notes

The main question I have is: should we effectively preserve the input or if we should always use "\r\n" on Windows. I am more inclined towards the latter.

We should preserve what is in the source file. If we always switch to "\r\n" on Windows and to "\n" on Unix, you can't really have developers using both systems on the same project.

We would also need --check-formatted to ignore what line ending is used.
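A rough sketch of what a line-ending-agnostic check could look like. The module LineEndingAgnosticCheck is hypothetical, and this is not how mix format --check-formatted is implemented. It assumes the file ends with a single trailing newline, and it normalizes CRLF to LF before comparing, so CRLF sequences inside string literals are ignored too.

```elixir
defmodule LineEndingAgnosticCheck do
  # Hypothetical helper: returns true when the source is already
  # formatted, regardless of whether it uses LF or CRLF line endings.
  def formatted?(source) do
    # Treat CRLF and LF as equivalent for the purpose of the check.
    normalized = String.replace(source, "\r\n", "\n")

    formatted =
      normalized
      |> Code.format_string!()
      |> IO.iodata_to_binary()

    # Code.format_string!/1 emits no trailing newline, so add the one
    # that the file on disk is assumed to end with.
    formatted <> "\n" == normalized
  end
end

IO.inspect(LineEndingAgnosticCheck.formatted?("defmodule A do\r\n  :ok\r\nend\r\n"))
IO.inspect(LineEndingAgnosticCheck.formatted?("defmodule A   do\r\n:ok\r\nend\r\n"))
```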

We should preserve what is in the source file. If we always switch to "\r\n" and to "\n" on unix you can't really have developers using both systems on the same project.

That leads to mixed line endings in my experience.
How about reading git settings? It's good practice on Windows to set

git config --global core.autocrlf true

What was your initial use case that made you report the issue, @lukaszsamson?

Given Go enforces \n, I am more inclined to keep things as is. And yes, most Windows tools nowadays handle it transparently.

There are other tools to prevent mixed line endings. If you have git config --global core.autocrlf true set, there is no need for the formatter to also mess with the line endings, so reading this config seems unnecessary. I am also guessing that if you set core.autocrlf then you actually want CRLF, so the formatter converting to LF would be wrong.

EDIT: Or maybe that's what @lukaszsamson meant, sorry if I misunderstood you.

What was your initial use case that made you report the issue, @lukaszsamson?

I was validating if elixir-ls behaves correctly with regards to line endings.

Given Go enforces \n, I am more inclined to keep things as is. And yes, most Windows tools nowadays handle it transparently.

As I wrote in the issue report, I'm OK with keeping it as it is. I just found it strange that it's not documented.

Ok, I think the simplest and least obtrusive change for now is to keep and document the behaviour, so that's what we will do. We can add an option to enforce one or the other in the future.

@lukaszsamson Please note that git's core.autocrlf option may change the value of a string in Elixir code (as I showed above), since git does not know the context.

See also https://github.community/t/git-config-core-autocrlf-should-default-to-false/16140

By using the current Elixir formatter (which enforces \n), you will at least not have mixed line endings in Elixir source files (.ex, .exs).

However, I see your point for other files (e.g. template files!) - but I tend to use other tools to prevent that. credo has a line endings check, but I think it only covers Elixir code.

Please note that git's core.autocrlf option may change the value of a string in Elixir code (as I showed above), since git does not know the context.

I have to say this is surprising behaviour. It's not particularly well documented either. Moreover, it is not consistent with heredocs: in heredocs, \r\n gets changed to \n.

@lukaszsamson this is not related to Elixir. Git will change all newlines to CRLF if configured to, including the ones inside strings. There is nothing Git or Elixir can do.

@josevalim the surprising behavior mentioned by @lukaszsamson is that a heredoc (""") always changes \r\n into \n, while a multi-line string with a single quote (") preserves the line ending.

I guess it's related to https://github.com/elixir-lang/elixir/blob/v1.11.2/lib/elixir/src/elixir_tokenizer.erl#L995 - which always changes \r\n into \n

"a
b" |> IO.inspect(label: "single")

"""
a
b
""" |> IO.inspect(label: "triple")
$ cat t.exs | unix2dos > t.dos.exs && elixir t.dos.exs && elixir t.exs
single: "a\r\nb"
triple: "a\nb\n"
single: "a\nb"
triple: "a\nb\n"

As you can see, triple (heredoc) always uses \n, while the single-quoted string preserves the line ending as-is (which is correct, since it's actually a binary!)
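The same comparison can be sketched without unix2dos by handing CRLF source text directly to Code.string_to_quoted!/1, which returns the binary for a plain string literal. The variable names are illustrative. Expected output is left out on purpose: the heredoc result depends on the Elixir version, since this thread ends with the heredoc behaviour being treated as a bug.

```elixir
# Source text for a double-quoted string containing a CRLF line break.
single = Code.string_to_quoted!("\"a\r\nb\"")

# Source text for a heredoc whose lines all end with CRLF.
heredoc = Code.string_to_quoted!("\"\"\"\r\na\r\nb\r\n\"\"\"\r\n")

# Both values are binaries; inspect them to compare the line endings
# that survived tokenization.
IO.inspect(single, label: "single")
IO.inspect(heredoc, label: "triple")
```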

I think we just need to document the behavior of heredocs somewhere. I actually tried searching the Elixir docs for "heredoc", but https://hexdocs.pm/elixir/search.html?q=heredoc, for example, returns no reference. We may open a new issue since this is not related to the formatter :heart:

Oh, I see. This is definitely an Elixir bug then, thanks! No need for a new issue, I plan to tackle all of this today or latest this week.
