Incubator-mxnet: Memory profiling enhancements

Created on 12 Apr 2019 · 6 Comments · Source: apache/incubator-mxnet

I have created a prototype for visualizing the memory pools on the GPU. I have added a doc to the cwiki explaining the feature and how to use the prototype: https://cwiki.apache.org/confluence/display/MXNET/MXNet+Memory+Profiling+Enhancements

I would like some help getting this prototype ready for a PR.

More improvements are possible, as described in the cwiki. Some of them are listed here:

  1. Support for visualizing cuDNN memory allocations and frees
  2. Better visualization for CPU memory pools
  3. Support for MKLDNN memory allocation
  4. Parameter server, server and worker memory visualization

Let me know if you are interested.

Call for Contribution · Discussion

All 6 comments

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Feature

Thanks for this feature idea. I am interested in working on it and will post a PR once I get this working.

Thanks @ChaiBapchya. I think it requires some additional testing, and some sanity performance tests are needed as well. A switch to toggle it also needs to be added to the profiler API. It would be nice to test it in a distributed training setting too, though that is not strictly required.
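To illustrate the toggle idea, here is a minimal sketch in plain Python. It is not the prototype's code and not MXNet's profiler API: the class `PoolMemoryRecorder`, its methods, and the pool name `gpu0` are all hypothetical, chosen only to show a switch that is off by default so that recording costs almost nothing unless it is enabled.

```python
from collections import defaultdict


class PoolMemoryRecorder:
    """Hypothetical recorder for memory-pool events (illustration only)."""

    def __init__(self):
        self.enabled = False            # the switch: recording is off by default
        self.events = []                # recorded (kind, pool, nbytes) tuples
        self.in_use = defaultdict(int)  # bytes currently allocated, per pool

    def set_enabled(self, enabled):
        """Turn recording on or off."""
        self.enabled = bool(enabled)

    def record(self, kind, pool, nbytes):
        """Record an 'alloc' or 'free' event; a no-op while the switch is off."""
        if not self.enabled:
            return
        if kind not in ("alloc", "free"):
            raise ValueError("kind must be 'alloc' or 'free'")
        self.events.append((kind, pool, nbytes))
        self.in_use[pool] += nbytes if kind == "alloc" else -nbytes


recorder = PoolMemoryRecorder()
recorder.record("alloc", "gpu0", 1024)  # ignored, because the switch is off
recorder.set_enabled(True)
recorder.record("alloc", "gpu0", 4096)
recorder.record("free", "gpu0", 1024)
print(recorder.in_use["gpu0"], len(recorder.events))
```

Keeping the check at the top of `record` means the disabled path does a single attribute test, which is the kind of overhead the sanity performance tests would need to confirm.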

The system team from UofT developed https://github.com/tbd-ai/tbd-tools, which profiles the memory footprint for MXNet (see http://www.sysml.cc/doc/2019/demo_24.pdf).
@SerailHydra @olympian94 @izaakniksan @ArmageddonKnight
If possible, we can reuse it and avoid duplicating work.

Thanks @eric-haibin-lin for the pointer! Will take a look

Hi all,

Thanks for your interest in our memory tools! I started building this tool for benchmarking purposes, and it only targets version 0.11.0. Later, my colleagues used the same techniques to build it on newer versions of MXNet for optimization purposes.

The open-sourced one is based on a fairly old version, so I am not sure how helpful it is, since the codebase has changed a lot. I think @ArmageddonKnight has the memory profiling tool for a newer version. If you need some input from us in person, we would be happy to help integrate the tool into the main branch.
