Consider the following codeblock
name = "Katie"
stmt1 = f"Hello {name}"
stmt2 = "Hello {name}"
While the block is correct syntax-wise, the content is almost certainly logically invalid, as seen below:
>>> print(stmt1)
Hello Katie
>>> print(stmt2)
Hello {name}
stmt2 is a valid string, but declares string interpolation formatting that is compatible with f-strings, but the string does not have the f prefix to invoke the interpolation.
PyLint should detect strings that use f-string-like syntax, but have not been declared as f-strings, in Python 3.6+
This could either be by:
{}, and if valid Python, return a warningEdge Case One: Having { and } characters are perfectly valid, but on the edge case that the content within them is a reference to an in-scope variable, or some other expression, it is likely the string was meant to be an f-string. This would be desirable to catch as a linting error.
Edge Case Two: f'...' is effectively shorthand for '...'.format(locals()); so this:
mystring = 'hello {name}'
print(mystring.format(name='Katie'))
is entirely legal, and isn't an error. There is a possibly design decision about whether this linting check should be
I understand PyLint already warns against f-strings in logging, but I can't locate the base functionality to detect f-strings in the first place. I'd be willing to help implement this feature, but would require some mentoring.
Hey @glasnt !
This makes sense, but will be tricky due to the string potentially being used as a formatting string later on down the line. But we can definitely try to look for this pattern as it might become more prevalent now with the adoption of f-strings.
Here's what we can do. Let me know if something of the following doesn't make sense.
f-strings are internally marked as astroid.JoinedStr (because we use typed_ast to parse the source code, there's currently a bug where some f-strings are marked as FormattedValue, but we can check just JoinedStr for now until that gets fixed)
being a new check regarding strings, this can go in pylint/checkers/strings.py, where you only have to implement a def visit_str(self, node) method to visit every string in the analyzed program. This does not include JoinedStr/f-strings, but it's exactly what we want here.
next we'll probably need to figure out that a given string was intended as an f-string in the first place. As you mentioned, we can take a hint from the local scope, which you can grab with node.scope(). If you have the variables names extracted from the string, you can grab the relevant nodes that they represent with node.scope().locals.get(name). .locals is just a dictionary from strings to nodes. Once we know that all the names are there (or in the global scope, which you can access with node.root(), there's only the matter of figuring out if the said string was formatted later on via .format(**locals()) or something akin to that construct. You might be able to check by looking for Attribute nodes in the same scope where the left hand side of the attribute is our string, something along the lines of:
# the skip_class is to make sure we don't go into inner scopes, but might not be needed per se
for attr in scope.nodes_of_class(astroid.Attribute, skip_class=(astroid.FunctionDef, )):
if isinstance(attr.expr, astroid.Name): # LHS of `name.format`
# Check if it's using `.format`
if attr.expr.name == 'the name you are looking for' and attr.expr.attrname == 'format':
# then the string was being used to be formatted after all so stop checking
as mentioned early, you'll need most likely to extract the variables that the said string contains. You can use pylint.checkers.strings.parse_format_method_string for that, which is already being used in the mentioned file.
another note is that node.parent and node._astroid_fields are your friends. .parent walks up the tree to the parent, so if you need to grab the name of the string that was supposed to be used as a f-string, just call .parent until you get to an Assign node. If it has a name, it might be possible though that it was intended for .format() to be called on it, but you can follow the above instructions to make sure that's the case. And _astroid_fields is usually a sequence that shows what AST children a node can have. There's also _other_fields which is useful as well, as it contains non-AST members of a node. For instance Attribute.attrname, which is the name of the attribute being accessed, will be found in _other_fields. Another useful utility is node.repr_tree() which shows exactly how the subtree looks for that node.
That would be all that's needed to get this going. Let me know if there's anything else I can do to help.
I would also like this check. Missing those dastardly little 'f's is a common occurrence in my experience, and while the edge cases noted in this thread are hypothetically valid, as best I can tell, not a single one of them apply to the code bases I worked in day to day. So I'd be more than happy to have at least a way to opt-in to this check (or, of course, having this be the default).
I think this would be a very useful addition to have - +1
Though it's worth noting that there are edge cases where { and } exist in a string that is not an f string, and that's a very valid case - i.e. s = "awk '{print $1}'"
I still think some analysis on these would be an improvement
The analysis would need to check if the contents are
Example of relevance: Many examples on https://thedailywtf.com/series/errord
One of the most common errors people run into is interpolation code not being interpolated, resulting in things like
Dear {customer.displayname},
...
or frustrating things like
An error has occurred: {errormessage}
Such examples have been posted from many sources, including some of the largest IT companies, so missing interpolation seems to be a widespread and very user-facing bug type.