Is your feature request related to a problem? Please describe.
I realize my testing loop is not efficient at all. I need to understand where the bottleneck is and how I can make it faster.
Describe the solution you'd like
An option similar to fast_dev_run where the training OR validation OR testing loop is profiled, in order to see where the bottleneck is.
Describe alternatives you've considered
No alternative so far.
@Colanim great suggestion. I know the PyTorch team has been thinking about something like this, maybe @ezyang, @soumith have some suggestions? I'm hesitant to add something Lightning-specific for this; it might be more appropriate inside PyTorch.
how about torch.utils.bottleneck: https://pytorch.org/docs/stable/bottleneck.html?highlight=bottleneck
awesome. we'll point to this in our docs as well so people know it's available.
For future reference.
I couldn't use torch.utils.bottleneck; it gave me an OOM error...
I ended up using this SO answer:
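import cProfile
import functools

def profileit(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        datafn = func.__name__ + ".profile"  # Name the data file sensibly
        prof = cProfile.Profile()
        retval = prof.runcall(func, *args, **kwargs)  # run func under the profiler
        prof.dump_stats(datafn)  # write the stats to disk for later inspection
        return retval
    return wrapper

@profileit
def function_you_want_to_profile(...):
    ...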
So I could just do:
@profileit
def test_step(self, batch, batch_nb, dataloader_nb=None):
    ...  # existing test_step body, unchanged
And then I visualized it using snakeviz:
snakeviz test_step.profile
which gives a neat visualization:
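If snakeviz is not an option (for example on a remote machine), the same kind of dump can also be read as text with the standard library's pstats module. This is only a minimal sketch: `slow_sum` and the temporary file path are made-up stand-ins for the decorated function and its `.profile` file.

```python
import cProfile
import os
import pstats
import tempfile


def slow_sum(n):
    # Hypothetical stand-in workload; replace with the function you profiled.
    return sum(i * i for i in range(n))


# Produce a dump the same way the decorator does (runcall + dump_stats).
dump_path = os.path.join(tempfile.mkdtemp(), "slow_sum.profile")
prof = cProfile.Profile()
prof.runcall(slow_sum, 100_000)
prof.dump_stats(dump_path)

# Load the dump and print the 10 entries with the highest cumulative time.
stats = pstats.Stats(dump_path)
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
```

Sorting by "cumulative" puts the functions that account for the most total time (including their callees) at the top, which is usually the quickest way to spot the slow part of a step.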

@ian-13 @jeffling sounds like you guys were thinking about contributing something like this? Have you tried torch.utils.bottleneck (https://pytorch.org/docs/stable/bottleneck.html?highlight=bottleneck)?
@williamFalcon We've tried torch.utils.bottleneck for deep profiling. But like @Colanim, we have a similar method that does the same sort of thing, and we found it a bit more stable to use. With that we can also force CUDA syncs to get GPU-bound timings.
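For anyone wondering why the syncs matter: CUDA kernels are launched asynchronously, so a plain wall-clock timer can stop before the GPU has finished. A rough sketch of the idea follows. It is not the tool described above, and `timed`, `results`, and `sync` are illustrative names. The timer takes an optional `sync` callable so the snippet also runs without a GPU.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results, sync=None):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    if sync is not None:
        sync()  # drain work that was queued before the block started
    start = time.perf_counter()
    try:
        yield
    finally:
        if sync is not None:
            sync()  # wait for work launched inside the block to finish
        results[label] = time.perf_counter() - start


# CPU-only usage: no sync callable needed.
results = {}
with timed("sleep", results):
    time.sleep(0.01)
print(results)
```

On a GPU run you would pass `sync=torch.cuda.synchronize` (after checking `torch.cuda.is_available()`), so that the measured time covers the kernels launched inside the block.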
We also have something lighter weight (not using cProfile) so we can always have it on for all our runs. We use it in our testing framework to catch any speed regressions. It's a simple tool that we could contribute if that's within the scope of this framework :)
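To make the "always on" idea concrete, here is one way such a lightweight timer could look. This is a guess at the general shape, not the actual tool mentioned above: `SectionTimer`, its methods, and the per-section time budgets are all hypothetical.

```python
import time
from collections import defaultdict


class SectionTimer:
    """Accumulate wall-clock durations per named section."""

    def __init__(self):
        self.durations = defaultdict(list)
        self._starts = {}

    def start(self, name):
        self._starts[name] = time.perf_counter()

    def stop(self, name):
        elapsed = time.perf_counter() - self._starts.pop(name)
        self.durations[name].append(elapsed)

    def mean(self, name):
        values = self.durations[name]
        return sum(values) / len(values)

    def regressions(self, budgets):
        """Return {section: mean} for sections whose mean exceeds its budget (seconds)."""
        return {
            name: self.mean(name)
            for name, limit in budgets.items()
            if name in self.durations and self.mean(name) > limit
        }


timer = SectionTimer()
for _ in range(3):
    timer.start("step")
    time.sleep(0.01)  # stand-in for a real training or test step
    timer.stop("step")

print(timer.regressions({"step": 0.001}))  # "step" is over this budget
print(timer.regressions({"step": 1.0}))    # nothing is over this budget
```

Because it only calls `time.perf_counter()` twice per section, the overhead is small enough to leave enabled, and a test can fail whenever `regressions(...)` returns a non-empty dict.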