Caffe2 - Python API
A deep learning, cross platform ML framework
torch.autograd.profiler.emit_nvtx Class Reference

Public Member Functions

def __init__ (self, enabled=True)
 
def __enter__ (self)
 
def __exit__ (self, exc_type, exc_val, exc_tb)
 

Public Attributes

 enabled
 
 entered
 

Detailed Description

Context manager that makes every autograd operation emit an NVTX range.

It is useful when running the program under nvprof::

    nvprof --profile-from-start off -o trace_name.prof -- <regular command here>

nvprof offers no way to force a flush of the data it has collected to disk,
so for CUDA profiling you have to use this context manager to annotate the
nvprof trace and wait for the process to exit before inspecting it.
You can then visualize the timeline in the NVIDIA Visual Profiler (nvvp), or
load the results with :func:`torch.autograd.profiler.load_nvprof` for inspection,
e.g. in a Python REPL.

.. warning::
    This context manager should not be called recursively, i.e. at most one
    instance should be enabled at any given time.

Arguments:
    enabled (bool, optional): Setting this to False makes this context manager a no-op.
        Default: ``True``.
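
The ``enabled`` and ``entered`` attributes listed above suggest how the no-op and
non-reentrant behaviour might fit together. The following is a minimal sketch under
that assumption: ``ToyEmitRanges`` is a hypothetical stand-in that only tracks state,
touches no profiler, and is not the actual ``emit_nvtx`` source.

```python
class ToyEmitRanges:
    """Simplified stand-in for emit_nvtx; it only tracks state (hypothetical)."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self.entered = False

    def __enter__(self):
        if not self.enabled:
            return None  # disabled: behave as a no-op
        if self.entered:
            raise RuntimeError("this context manager is not reentrant")
        self.entered = True
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.enabled:
            self.entered = False
        return False  # never swallow exceptions raised in the block


disabled = ToyEmitRanges(enabled=False)
with disabled:
    disabled_entered = disabled.entered  # stays False: nothing was switched on

active = ToyEmitRanges()
with active:
    try:
        with active:  # re-entering the same instance is rejected
            pass
        reentry_rejected = False
    except RuntimeError:
        reentry_rejected = True
```

Note that a per-instance flag like this only catches re-entry of the same object;
it would not detect two separate instances being enabled at once.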

Example:
    >>> with torch.cuda.profiler.profile():
    ...     model(x)  # Warm up the CUDA memory allocator and profiler
    ...     with torch.autograd.profiler.emit_nvtx():
    ...         model(x)

**Forward-backward correlation**

When viewing a profile created using :class:`emit_nvtx` in the NVIDIA Visual Profiler,
correlating each backward-pass op with the corresponding forward-pass op can be difficult.
To ease this task, :class:`emit_nvtx` appends sequence number information to the ranges it
generates.

During the forward pass, each function range is decorated with ``seq=<N>``.  ``seq`` is a running
counter, incremented each time a new backward Function object is created and stashed for backward.
Thus, the ``seq=<N>`` annotation associated with each forward function range tells you that
if a backward Function object is created by this forward function,
the backward object will receive sequence number N.
During the backward pass, the top-level range wrapping each C++ backward Function's
``apply()`` call is decorated with ``stashed seq=<M>``.  ``M`` is the sequence number that
the backward object was created with.  By comparing ``stashed seq`` numbers in backward with ``seq``
numbers in forward, you can track down which forward op created each backward Function.
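
The numbering scheme described above can be modelled in a few lines of plain Python.
This is a toy sketch, not PyTorch's implementation: ``ToyTracer``, the op names, and
the exact layout of the range strings are all assumptions made for illustration.

```python
class ToyTracer:
    """Toy model of the seq / stashed seq numbering (hypothetical names)."""

    def __init__(self):
        self.seq = 0       # running counter (thread-local in the real profiler)
        self.ranges = []   # range names in emission order

    def forward(self, name, creates_backward=True):
        # Every forward range carries the current counter value.
        self.ranges.append(f"{name}, seq={self.seq}")
        if not creates_backward:
            return None    # no backward object created, counter unchanged
        # The new backward object stashes the current value,
        # and only then does the counter advance.
        node = (f"{name}Backward", self.seq)
        self.seq += 1
        return node

    def backward(self, node):
        bw_name, stashed = node
        self.ranges.append(f"{bw_name}, stashed seq={stashed}")


tracer = ToyTracer()
mul_node = tracer.forward("mul")
tracer.forward("wrapper", creates_backward=False)  # delegates to the op below
add_node = tracer.forward("add")

for node in (add_node, mul_node):  # backward runs in reverse order
    tracer.backward(node)

print("\n".join(tracer.ranges))
```

In this model ``wrapper`` and ``add`` both carry ``seq=1``, because the counter only
advances when a backward object is actually created.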

Any functions executed during the backward pass are also decorated with ``seq=<N>``.  During
default backward (with ``create_graph=False``) this information is irrelevant, and in fact,
``N`` may simply be 0 for all such functions.  Only the top-level ranges associated with
backward Function objects' ``apply()`` methods are useful, as a way to correlate these Function
objects with the earlier forward pass.

**Double-backward**

If, on the other hand, a backward pass with ``create_graph=True`` is underway (in other words,
if you are setting up for a double-backward), each function's execution during backward
is given a nonzero, useful ``seq=<N>``.  Those functions may themselves create Function objects
to be executed later during double-backward, just as the original functions in the forward pass did.
The relationship between backward and double-backward is conceptually the same as the relationship
between forward and backward: The functions still emit current-sequence-number-tagged ranges,
the Function objects they create still stash those sequence numbers, and during the eventual
double-backward, the Function objects' ``apply()`` ranges are still tagged with ``stashed seq``
numbers, which can be compared to ``seq`` numbers from the backward pass.
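
The same toy model extends to double-backward. In the sketch below (an illustration
only; ``ToyGraph`` and its method names are hypothetical), applying a recorded object
with ``create_graph=True`` runs a function that is itself tagged with the current
counter value and records a new object for the next level.

```python
class ToyGraph:
    """Toy model of sequence numbers across differentiation levels."""

    def __init__(self):
        self.seq = 0
        self.ranges = []

    def run(self, name):
        # Run a function that records an object for the next level.
        self.ranges.append(f"{name}, seq={self.seq}")
        node = (f"{name}Backward", self.seq)
        self.seq += 1
        return node

    def apply(self, node, create_graph):
        # The apply() range is tagged with the number stashed at creation.
        name, stashed = node
        self.ranges.append(f"{name}, stashed seq={stashed}")
        if create_graph:
            # The work done here records an object for double-backward.
            return self.run(name)
        return None


g = ToyGraph()
fwd_node = g.run("mul")                          # forward pass
bwd_node = g.apply(fwd_node, create_graph=True)  # backward pass
g.apply(bwd_node, create_graph=False)            # double-backward pass
```

Reading ``g.ranges`` top to bottom, ``stashed seq=0`` in backward pairs with ``seq=0``
in forward, and ``stashed seq=1`` in double-backward pairs with ``seq=1`` emitted
during backward.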

.. warning::
    The sequence number is thread-local, and some forward functions don't create an associated
    backward Function object (instead delegating that to sub-functions further down the call chain).
    For these reasons, the correspondence of stashed sequence numbers in
    backward Function ``apply()`` ranges with ``seq`` numbers in forward-pass ranges is
    not guaranteed to be one-to-one.  The sequence numbers alone may not be enough to fully
    disambiguate which forward function created which
    backward Function object.  You may need to make a judgment based on analytic knowledge of what
    the expected correspondence should be.
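
When working from range names exported from a trace, a small helper can list every
forward candidate for each backward range instead of assuming a unique match. This is
a sketch under assumptions: ``correlate`` is a hypothetical helper, the input is a
plain list of range-name strings, and the ``<name>, seq=<N>`` layout (with optional
spaces around ``=``) is assumed rather than guaranteed.

```python
import re

# Assumed layout: "<name>, seq=<N>" or "<name>, stashed seq=<N>".
SEQ_RE = re.compile(
    r"^(?P<name>.+?),\s*(?P<stashed>stashed\s+)?seq\s*=\s*(?P<n>\d+)$"
)


def correlate(range_names):
    """Return (backward_name, seq, forward_candidates) for each stashed range."""
    forward = {}    # seq number -> names of untagged ("seq=") ranges
    backward = []   # (name, seq number) of "stashed seq=" ranges
    for text in range_names:
        match = SEQ_RE.match(text.strip())
        if match is None:
            continue  # not a sequence-tagged range
        n = int(match.group("n"))
        if match.group("stashed"):
            backward.append((match.group("name"), n))
        else:
            forward.setdefault(n, []).append(match.group("name"))
    return [(name, n, list(forward.get(n, []))) for name, n in backward]


names = [
    "mul, seq=0",
    "wrapper, seq=1",
    "add, seq = 1",
    "addBackward, stashed seq=1",
    "mulBackward, stashed seq = 0",
    "cudaLaunch",
]
matches = correlate(names)
```

Here ``addBackward`` ends up with two candidates, ``wrapper`` and ``add``, which is
exactly the case where the final choice has to come from knowledge of the model. The
helper treats every range without ``stashed`` as a forward candidate, so pass it
forward-pass ranges only if backward-pass ``seq`` ranges would add noise.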

Definition at line 225 of file profiler.py.

