Prefect: Mutating list while iterating

Created on 26 Apr 2020  路  9Comments  路  Source: PrefectHQ/prefect

https://github.com/PrefectHQ/prefect/blob/8c886b64df8e264791ec1d0e401cbbd6c4719af5/src/prefect/agent/local/agent.py#L75-L77

It's a no-no to mutate the list while iterating. Indeed, in my tests, a loop like this fails to examine the _next_ item in the list, so it's actually not working as intended. For instance:

x = [0,1,2,3,4,5,6,7,8]
for idx, y in enumerate(x):
    print(idx, y)
    if y == 3:
        z = x.pop(idx)
    print(idx, y)

yields this output; note that the 4 element in the list is never examined.

0 0
0 0
1 1
1 1
2 2
2 2
3 3
3 3
4 5
4 5
5 6
5 6
6 7
6 7
7 8
7 8

To be fair this is somewhat harmless, because the process will likely be examined in the next heartbeat call, or in the worst case after all of the processes earlier in the list are completed.

agent bug good first issue

Most helpful comment

Haha so many different ways you could do it! Here's my suggestion:

for idx, process in enumerate(list(self.processes)): 
     if process.poll() is not None: 
         self.processes.pop(idx) 

Make a new list() with self.processes 馃槈

All 9 comments

Nice catch! I'm going to label this as a bug because it _technically_ is and/or will be.

Hi! I'd like to help with this! May I?
Do you have any solution in mind?

Hi @trkohler - your help would definitely be appreciated! You could solve this by creating a new list to track the completed processes like:

    def heartbeat(self) -> None:
        completed_procs = []
        for idx, process in enumerate(self.processes):
            if process.poll() is not None:
                completed_procs.append(idx)
                if process.returncode:
                    self.logger.info(
                        "Process PID {} returned non-zero exit code".format(process.pid)
                    )
        for idx in completed_procs:
            self.processes.pop(idx)
        super().heartbeat()

or something to that effect

Heck, I'd just stuff the _non_-completed procs into a separate list and swap it into self.processes after the loop is done.

Haha so many different ways you could do it! Here's my suggestion:

for idx, process in enumerate(list(self.processes)): 
     if process.poll() is not None: 
         self.processes.pop(idx) 

Make a new list() with self.processes 馃槈

Hi there!
I did as @joshmeek suggested because I'm too afraid to broke something and get lost in unknown challenges :)

Unfortunately, I had issues with local testing. I did as mentioned in docs but still, there is a full console of failures. They can't be related to my change, mostly errors are about aws, google etc

Hm testing failures is weird especially as it looks like all is good on your PR 馃The tests should skip if certain dependencies are not installed

Closed with #2464

I'll try to discover deep test errors on my local machine tomorrow. Seems like I do something wrong.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

jameslamb picture jameslamb  路  3Comments

rej-jsa picture rej-jsa  路  4Comments

jlowin picture jlowin  路  3Comments

Trymzet picture Trymzet  路  4Comments

dkapitan picture dkapitan  路  3Comments