Wemake-python-styleguide: Forbid to use variables from conditional blocks

Created on 29 Jul 2019  路  24Comments  路  Source: wemake-services/wemake-python-styleguide

Rule request

Thesis

This code does not raise a single violation that x might not be defined:

try:
    function_that_raises()
    x = 1
except:
    ...
print(x)

One more example:

if False:
    x = 1
print(x)

With for:

for i in []:
    x = 0
print(x)

With while:

while False:
    x = 0
print(x)

Reasoning

We need to forbid to use variables that are only defined in try, if, for body.
It might not be defined at all. mypy also does not complain about it.

Cases like:

x = 0

try:
    function_that_raises()
    x = 1
except:
    ...
print(x)

are fine.

Hacktoberfest help wanted advanced rule request

Most helpful comment

@sobolevn I'll be able to continue soon. That's what I have to do:

  • [ ] implement the collection of assigned variables in lambda and comprehension scopes. It doesn't collect them now, so access to the variables inside the corresponding expression is considered unsafe
  • [ ] decompose the SafeVars class to reduce the complexity of the code
  • [ ] edit the code to pass linters

I have a few questions, but I'll ask them when I get back. Sorry for long pauses.

All 24 comments

What about this cases?

if some:
   ...
   x = 1
else:
   ...
   x = 2
  1. It is pretty popular among python developers to write code like this one
  2. All branches are covered (I will need to check how mypy handles that, but do not fully rely on it)
  3. It might over-complicate the rule and implementation

The same applies to for/else, try/except/else/finally
Blocks with and while are always dangerous:

with suppress(ZeroDivisionError):
     x = 1 / 0

print(x)  # error, it is not defined.

Steps I see:

  1. All top-level AssignNodes are added to list of names that are free to use: {context: {var_name: True}}
  2. If branched (if, for, try) has all branches with the same assigned name it is added to the same list
  3. Partially assigned name is added to the list with False flag

Then we look at the usage pattern:

if some:
    x = 1
    print(x)  # ok
print(x)  # not ok

If variable is used within the same block of code, that we are fine. If it is used outside of the definition block and is not free to use: then we raise a violation.

Algorithm prototype that I am working on:

if some:
    if other:
        x = 1
    else:
        x = 2
    print(x)
else:
    x = 3
input(x)

"""
Steps:
1. visit_If: ``if some:``
2. visit_If: ``if other:``
3. visit_Assign: ``x = 1``
4. Set ``{ x: [coveredpath("if other:")] }`` on ``if other:``
5. Check all path covered for node: ``if other:``  - False

6. visit_If: ``else:``
7. visit_Assign: ``x = 2``
8. Set ``{ x: [coveredpath("if other:"), coveredpath("else:")] }`` on ``if other:``
9. Check all path covered for node: ``if other:``  - True
10. Set ``{ safe_vars: ["x"] }`` to ``if other:``
11. Set { x: [coveredpath("if some:")] } on ``if some:``

12. visit_If: ``else:``
13. visit_Assign: ``x = 3``
14. Set ``{ x: [coveredpath("if some:"), coveredpath("else:")] }`` on ``if some:``
15. Set ``{ safe_vars: ["x"] }`` to module context
"""

As a result:

  • visit_Name[ctx=ast.Load, id='x'] inside if some: check for node.id in block.safe_vars and allows this name to be used
  • visit_Name[ctx=ast.Load, id='x'] inside module: check for node.id in context.safe_vars and allows this name to be used

This is very hard. And I am quite tired to implement it.

@artemiy312 do you want to give this a try? This one is very hard, but it would be so satisfying to solve it!

@sobolevn Sure, I'd like to try it

PyLint has a rule for for:

cell-var-from-loop (W0640):
--
聽 | Cell variable %s defined in loop聽A variable used in a closure is defined in a loop. 
This will result in all closures using the same value for the closed-over variable

https://docs.pylint.org/en/1.6.0/features.html#id18

And

undefined-loop-variable (W0631):
--
聽 | Using possibly undefined loop variable %r聽Used when 
an loop variable (i.e. defined by a for loop or a list 
comprehension or a generator expression) is used outside the loop.

We can drop every loop (including body) one-by-one and see for undefined variables as usual.

@orsinium sorry, I did not quite get the last idea. What do you mean?

I am working on prototype for the issue, and I already have implemented safe_vars collector for For, AyncFor, If, While, FunctionDef, AsyncFunctionDef, ClassDef, Module scopes.
Progress can be found here

Awesome work, @artemiy312! Thanks!

@artemiy312 how's it going? Do you need any help from my side?

@sobolevn Sorry, I had to pause due lack of time. Fortunately, I will be able to continue during this week.

Awesome!

I am almost done with SafeVars class that collects variable assignments for a given node and reduces them to a set of safe variables. But there are cases I still haven't covered.

The next step will be implementation of visitor with this logic:
1) Collect safe vars
- Collect local safe vars to check vars from local scopes
- Collect safe vars for module, classes and functions to check vars from outer scopes
2) Check that the used variables in safe vars from 1)

e.g.

if ...                      # 1.1) collect outer safe vars: {outer_x, outer_fn}
    outer_x = ...           
else ...:                   
    outer_x = ...           

def outer_fn(outer_z):      # 1.2) collect outer safe vars: {outer_z, outer_y, local_fn}
    outer_y = ...           

    def local_fn(z):        # 1.3) collect outer safe vars: {z, y, somevar}
        y = ...              
        if ...:                
            if ...:         
                w = ...     
            else:           
                w = ...      

            # 1.4) collect local safe vars for each variable use below: {w, y, z}

            print(y)        # 2) local safe var from 1.4
            print(w)        # 2) local safe var from 1.4
            print(outer_x)  # 2) outer safe var from 1.1
            print(outer_z)  # 2) outer safe var from 1.2
            print(outer_y)  # 2) outer safe var from 1.2
            print(unknown)  # 2) unsafe
            print(somevar)  # 2) unsafe because var not in local vars and not in {outer vars} - {`local_fn` vars}
        else:
            ...
        somevar = ...

@artemiy312 do you need any help on this one?

@sobolevn No, thank you! The visitor is done. I only need to adapt the unit tests, which were convenient for development, to e2e tests and open PR.

@artemiy312 Thanks a lot! Then we will include this task into 0.13. It would be a great addition!

I have removed this task from 0.13, so @artemiy312 take your time. Do you need any help on this one?

@sobolevn I'll be able to continue soon. That's what I have to do:

  • [ ] implement the collection of assigned variables in lambda and comprehension scopes. It doesn't collect them now, so access to the variables inside the corresponding expression is considered unsafe
  • [ ] decompose the SafeVars class to reduce the complexity of the code
  • [ ] edit the code to pass linters

I have a few questions, but I'll ask them when I get back. Sorry for long pauses.

@artemiy312 thanks a lot! Feel free to ask for help and/or to send WIP PRs so we can discuss them.

Does this include with blocks? I think the following is common, no?

with open('js.json', 'r', encoding='utf8') as fd:
  imported_js = json.load(fd)
Was this page helpful?
0 / 5 - 0 ratings