Pylint: PEP 597, require encoding kwarg in open call (and other calls that delegate to io.open)

Created on 10 Sep 2020 · 9 Comments · Source: PyCQA/pylint

Is your feature request related to a problem? Please describe

There's a proposal, PEP 597, to deprecate omitting the encoding argument to open(...). Quoting its Motivation section:
https://www.python.org/dev/peps/pep-0597/#motivation

Developers using macOS or Linux may forget that the default encoding is not always UTF-8.

For example, long_description = open("README.md").read() in setup.py is a common mistake. Many Windows users can not install the package if there is at least one non-ASCII character (e.g. emoji) in the README.md file which is encoded in UTF-8.

For example, 489 packages of the 4000 most downloaded packages from PyPI used non-ASCII characters in README. And 82 packages of them can not be installed from source package when locale encoding is ASCII. [1] They used the default encoding to read README or TOML file.

Another example is logging.basicConfig(filename="log.txt"). Some users expect UTF-8 is used by default, but locale encoding is used actually. [2]

Even Python experts assume that default encoding is UTF-8. It creates bugs that happen only on Windows. See [3] and [4].

Raising a warning when the encoding option is omitted will help to find such mistakes.
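As a minimal sketch of the fix for the two examples quoted above, the encoding can be passed explicitly when reading a README and when creating a log file handler. The helper name `read_text_utf8` and the temporary file names are hypothetical and used only for illustration; `logging.FileHandler` is used here because it accepts an `encoding` argument directly.

```python
import logging
import os
import tempfile


def read_text_utf8(path: str) -> str:
    """Read a text file as UTF-8 regardless of the locale encoding.

    Hypothetical helper, named here only for illustration.
    """
    with open(path, encoding="utf-8") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    # Write a README containing a non-ASCII character, as in the PEP example.
    readme = os.path.join(tmp, "README.md")
    with open(readme, "w", encoding="utf-8") as f:
        f.write("Hello \N{SNOWMAN}\n")

    # Reading it back with an explicit encoding works on any platform.
    long_description = read_text_utf8(readme)

    # The same idea for logging: state the encoding instead of relying
    # on the locale default.
    handler = logging.FileHandler(os.path.join(tmp, "log.txt"), encoding="utf-8")
    handler.close()
```

With the encoding stated, the result no longer depends on whether the machine's locale encoding happens to be UTF-8.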

Describe the solution you'd like

Raise a warning similar to subprocess-run-check.


Labels: contributor friendly, enhancement, help wanted

All 9 comments

# bad
with open(filename) as f:
    ...


# bad
with open(filename, encoding=None) as f:
    ...


# good
with open(filename, encoding="utf8", errors="surrogateescape") as f:
    ...

# good
import io
locale_encoding = getattr(io, "LOCALE_ENCODING", None)
with open(filename, encoding=locale_encoding) as f:
    ...

@graingert thanks!
It makes sense.

@hippo91 can you allocate a pylint feature id for me?

@graingert sorry for the delay. What do you mean by feature id?

@graingert I think a new message id is necessary for this case. Something along the lines of missing_open_encoding.
Maybe @Pierre-Sassoulas, @AWhetter or @PCManticore will have a better idea?
subprocess-run-check should indeed be a good starting point.

Note that it's not just open; os.fdopen, for example, is affected too.

I really like this one. I've had this problem multiple times with Windows/Mac users, so it would be really helpful. I'd also use an error code for that one. Regarding the message id, what about unspecified-encoding?

@Pierre-Sassoulas let's go for unspecified-encoding.
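The check discussed in this thread could be sketched roughly as follows. This is not pylint's implementation: it uses the standard-library `ast` module rather than pylint's checker machinery, and the names `CHECKED_CALLS` and `find_unspecified_encoding` are hypothetical. It assumes the checked functions share the positional layout of `open` (encoding is the fourth positional argument), skips calls whose mode is a literal containing "b", and treats `encoding=None` as unspecified, matching the "bad" examples earlier in the thread.

```python
import ast
from typing import List

# Assumption: an illustrative set of names. The thread mentions open and
# os.fdopen; io.open is the function they delegate to.
CHECKED_CALLS = {"open", "io.open", "os.fdopen"}


def _call_name(node: ast.Call) -> str:
    """Return 'open' or a dotted name like 'os.fdopen'; '' for anything else."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def _is_binary_mode(call: ast.Call) -> bool:
    """True if the mode argument is a string literal containing 'b'."""
    mode = call.args[1] if len(call.args) > 1 else None
    for kw in call.keywords:
        if kw.arg == "mode":
            mode = kw.value
    return (
        isinstance(mode, ast.Constant)
        and isinstance(mode.value, str)
        and "b" in mode.value
    )


def find_unspecified_encoding(source: str) -> List[int]:
    """Return sorted line numbers of checked calls with no usable encoding."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or _call_name(node) not in CHECKED_CALLS:
            continue
        if _is_binary_mode(node):
            continue  # binary files have no text encoding
        # encoding is the fourth positional parameter of open().
        encoding = node.args[3] if len(node.args) > 3 else None
        for kw in node.keywords:
            if kw.arg == "encoding":
                encoding = kw.value
        missing = encoding is None
        explicit_none = isinstance(encoding, ast.Constant) and encoding.value is None
        if missing or explicit_none:
            flagged.append(node.lineno)
    return sorted(flagged)
```

A real checker would also need to resolve aliases and imports (for example `from os import fdopen`), which this sketch deliberately ignores.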
