Ray: Progressbar (e.g. tqdm)

Created on 27 Aug 2019 · 9 Comments · Source: ray-project/ray

Is there a way to integrate Ray with tqdm (or another progress bar) to show task progress? I tried adding the progress bar inside the sub-function decorated with @ray.remote. As expected, the progress bar was re-displayed line by line, each line carrying a (pid = *) prefix.

Thank you so much!

All 9 comments

Unfortunately this is not straightforward to do in the remote process; you'll have to do everything in the driver.

Hi, do you have an example?
I tried something like this, but it didn't work:

results = ray.get([do_file.remote(file) for file in tqdm(files)])
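
A likely reason the one-liner above shows no useful progress: tqdm wraps the loop that submits the tasks, and .remote() returns immediately, so the bar fills at once and ray.get then blocks with no feedback. Below is a minimal sketch of one way to keep the submission-order results of ray.get while still ticking a bar. The helper name get_with_progress and its bar parameter (any object with an update(n) method, such as a tqdm instance) are illustrative assumptions, not part of Ray or tqdm.

```python
def get_with_progress(refs, bar):
    """Wait for every task, ticking `bar`, then return results in submission order."""
    # Local import so that defining this helper does not require Ray to be importable.
    import ray

    refs = list(refs)
    pending = refs
    while pending:
        # ray.wait returns (ready, not_ready); by default it waits for one ready ref.
        done, pending = ray.wait(pending)
        bar.update(len(done))
    # All tasks have finished, so this call no longer waits on computation.
    # The results come back in the same order as `refs`.
    return ray.get(refs)
```

With tqdm this might be used as: refs = [do_file.remote(file) for file in files], then with tqdm(total=len(refs)) as bar: results = get_with_progress(refs, bar).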

Any updates on this? It'd be an awesome feature!

Ah, I didn't see the followup. I think this shouldn't be hard (using ray.wait). I'll post a snippet later today or tomorrow (and feel free to ping if I forget!)

Any update on your snippet, @richardliaw? It was kind of serendipitous to come across the new messages here right when they're happening.

@wrosko, @alexkreidler, @davidweioct, @mazzma12 - is this what you're looking for?

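import time

import ray
from tqdm import tqdm


def test():
    time.sleep(1)
    return 1

test_r = ray.remote(test)

ray.init()

def to_iterator(obj_ids):
    # Yield each result as soon as its task finishes (completion order).
    while obj_ids:
        # ray.wait returns (ready, not_ready); by default one ref is ready.
        done, obj_ids = ray.wait(obj_ids)
        yield ray.get(done[0])

# With total: tqdm can show a percentage and an ETA.
obj_ids = [test_r.remote() for i in range(100)]
for x in tqdm(to_iterator(obj_ids), total=len(obj_ids)):
    pass

# Without total: tqdm only shows a running count.
obj_ids = [test_r.remote() for i in range(100)]
for x in tqdm(to_iterator(obj_ids)):
    pass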

Closing this for now; seems like people like the above solution.

Hi!

I'm trying your code but it doesn't seem to work for me. I've included a function that requires some arguments just to try.

import numpy as np
import ray
from tqdm import tqdm

@ray.remote
def test(x):
    return np.array([x**3] * x)

ray.init(ignore_reinit_error=True, num_cpus=5)

def to_iterator(obj_ids):
    while obj_ids:
        done, obj_ids = ray.wait(obj_ids)
        yield ray.get(done[0])

obj_ids = [test.remote(i) for i in range(30000)]
for x in tqdm(to_iterator(obj_ids), total=len(obj_ids)):
    pass

ray.shutdown()

Non-parallelized it takes nearly a minute. When I parallelize it without the bar it takes around 20s. With the bar it takes nearly two and a half minutes, so something must be wrong.

Thanks!

Interesting... Can you profile this and open a new issue?
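
The thread does not establish the cause of this slowdown. One possibility, offered only as an assumption to check while profiling, is the overhead of calling ray.wait and ray.get once per task on a list of 30000 references. A minimal sketch that asks Ray for several finished tasks per call follows; iter_completed and batch_size are hypothetical names, and whether this helps would need to be measured.

```python
def iter_completed(refs, batch_size=100):
    """Yield task results in completion batches of up to `batch_size`."""
    # Local import so that defining this helper does not require Ray to be importable.
    import ray

    pending = list(refs)
    while pending:
        # num_returns must not exceed the number of refs passed to ray.wait.
        k = min(batch_size, len(pending))
        # Blocks until k tasks are ready, then splits the list into (ready, not_ready).
        done, pending = ray.wait(pending, num_returns=k)
        # Fetch the whole batch with a single ray.get call.
        yield from ray.get(done)
```

It could replace to_iterator in the loop above: for x in tqdm(iter_completed(obj_ids), total=len(obj_ids)): pass. The trade-off is that the bar advances in jumps of up to batch_size, because each ray.wait call blocks until that many tasks have finished.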
