NifTK  16.4.1 - 0798f20
CMIC's Translational Medical Imaging Platform
niftk::CUDAManager Class Reference

Public Member Functions

ScopedCUDADevice ActivateDevice (int dev)
 
cudaStream_t GetStream (const std::string &name)
 
ReadAccessor RequestReadAccess (const LightweightCUDAImage &lwci)
 
WriteAccessor RequestOutputImage (unsigned int width, unsigned int height, int FIXME_pixeltype)
 
LightweightCUDAImage Finalise (WriteAccessor &writeAccessor, cudaStream_t stream)
 
LightweightCUDAImage FinaliseAndAutorelease (WriteAccessor &writeAccessor, ReadAccessor &readAccessor, cudaStream_t stream)
 
void Autorelease (ReadAccessor &readAccessor, cudaStream_t stream)
 
void Autorelease (WriteAccessor &writeAccessor, cudaStream_t stream)
 

Static Public Member Functions

static CUDAManager * GetInstance ()
 

Protected Member Functions

 CUDAManager ()
 
virtual ~CUDAManager ()
 
void AllRefsDropped (LightweightCUDAImage &lwci)
 

Friends

class LightweightCUDAImage
 
struct impldetail::ModuleCleanup
 

Detailed Description

Singleton that owns all CUDA resources. It manages images in a copy-on-write-like fashion: you cannot write into an existing CUDA image; you can only read from existing images and write data into a newly allocated one.

To get access to an image living on the card, call DataNode::GetData() as usual and cast the result to CUDAImage. Then call CUDAImage::GetLightweightCUDAImage() to retrieve a handle to the actual bits in CUDA memory. Side note: even though LightweightCUDAImage has members, you should consider it opaque.

Pass this LightweightCUDAImage instance to RequestReadAccess() to obtain a device pointer that your kernel can read from. RequestReadAccess() increments a reference count for that image so that CUDAManager will not recycle it too early.

Then call RequestOutputImage() to get a device pointer to where you can write your kernel's output. From an API point of view, RequestOutputImage() will always give you a new memory block so that you never overwrite an existing image.
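
The read and write rules described above can be illustrated with a small CPU-only toy model. This is only a sketch under stated assumptions: ToyImageStore, DoubleImage, and every member name below are hypothetical stand-ins that use host memory and integer ids. They are not the NifTK classes and say nothing about how CUDAManager is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

// Toy, CPU-only model of the read/write rules. All names here are
// hypothetical stand-ins; this is not NifTK code and touches no GPU.
class ToyImageStore
{
public:
  // Always creates a new block, so an existing image is never overwritten.
  int RequestOutputImage(std::size_t pixelCount)
  {
    const int id = m_NextId++;
    m_Blocks[id].pixels.assign(pixelCount, 0.0f);
    return id;
  }

  // A block is writable only until it has been finalised.
  std::vector<float>& Writable(int id)
  {
    Block& block = m_Blocks.at(id);
    if (block.finalised)
    {
      throw std::logic_error("finalised images are read-only");
    }
    return block.pixels;
  }

  void Finalise(int id) { m_Blocks.at(id).finalised = true; }

  // Read access bumps a reference count so the block is not recycled early.
  const std::vector<float>& RequestReadAccess(int id)
  {
    Block& block = m_Blocks.at(id);
    if (!block.finalised)
    {
      throw std::logic_error("image is not finalised yet");
    }
    ++block.readers;
    return block.pixels;
  }

  void ReleaseReadAccess(int id) { --m_Blocks.at(id).readers; }

  int ReaderCount(int id) const { return m_Blocks.at(id).readers; }

private:
  struct Block
  {
    std::vector<float> pixels;
    int readers = 0;
    bool finalised = false;
  };

  std::map<int, Block> m_Blocks;
  int m_NextId = 0;
};

// Doubles every pixel of `input` into a freshly requested output image
// and returns the id of that new image. The input is left untouched.
int DoubleImage(ToyImageStore& store, int input)
{
  const std::vector<float>& src = store.RequestReadAccess(input);
  const int output = store.RequestOutputImage(src.size());
  std::vector<float>& dst = store.Writable(output);
  for (std::size_t i = 0; i < src.size(); ++i)
  {
    dst[i] = 2.0f * src[i];
  }
  store.Finalise(output);
  store.ReleaseReadAccess(input);
  return output;
}
```

In this model, a processing step never modifies its input: it reads from one finalised image and produces a second one, which is the property the description above attributes to RequestOutputImage().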

Call GetStream() with a name of your choice, or create your own stream, to synchronise CUDA tasks and run them in coarse-grained parallel.

Run your kernel on your stream, but do not synchronise on its completion!

When all your work has been submitted to the driver, call FinaliseAndAutorelease() to turn the output device pointer into a proper LightweightCUDAImage that you can attach to a DataNode. This function will also release your read-request on the input image at the right time so that it can eventually be returned to the memory pool. In addition, the Finalise*() functions queue a "ready" event that you can use on your stream to GPU-synchronise on completion of a previous processing step.

CUDAManager is thread-safe: all public methods can be called from any thread at any time.

Constructor & Destructor Documentation

niftk::CUDAManager::CUDAManager ( )
protected
niftk::CUDAManager::~CUDAManager ( )
protected virtual

Member Function Documentation

ScopedCUDADevice niftk::CUDAManager::ActivateDevice ( int  dev)
void niftk::CUDAManager::AllRefsDropped ( LightweightCUDAImage & lwci)
protected

Used by LightweightCUDAImage to notify us that all references to it have been dropped, and that it can be placed back onto m_AvailableImagePool for later re-use.
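
One common way to express "return the block to a pool once the last reference goes away" in standard C++ is a std::shared_ptr with a custom deleter. The sketch below illustrates that general technique only. RecyclingPool and Buffer are hypothetical names, the buffers are host memory, and nothing here is taken from the NifTK sources or claims to match how AllRefsDropped() and m_AvailableImagePool actually work.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical host-memory stand-in for a pooled device buffer.
using Buffer = std::vector<unsigned char>;

// Hypothetical pool; it must outlive every buffer it hands out,
// because the deleter below captures `this`.
class RecyclingPool
{
public:
  // Hands out a buffer whose last owner returns it to the pool
  // instead of freeing it.
  std::shared_ptr<Buffer> Acquire(std::size_t bytes)
  {
    Buffer* raw = nullptr;
    if (!m_Available.empty())
    {
      // Re-use a previously returned buffer.
      raw = m_Available.back().release();
      m_Available.pop_back();
      raw->resize(bytes);
    }
    else
    {
      raw = new Buffer(bytes);
    }
    // The deleter plays the role of an "all references dropped" notification.
    return std::shared_ptr<Buffer>(
      raw, [this](Buffer* buffer) { m_Available.emplace_back(buffer); });
  }

  std::size_t AvailableCount() const { return m_Available.size(); }

private:
  std::vector<std::unique_ptr<Buffer>> m_Available;
};
```

Copies of the returned shared_ptr share one reference count, so the buffer goes back to the pool only when the last copy is destroyed. A later Acquire() call then re-uses it rather than allocating again.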

void niftk::CUDAManager::Autorelease ( ReadAccessor & readAccessor,
cudaStream_t  stream 
)

Releases the read-request of an image once processing on stream has finished. This method does not block; it returns immediately. Make sure you call this method after Finalise(), or use FinaliseAndAutorelease().
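
The non-blocking behaviour can be pictured with a CPU-only model of an in-order stream: the release is queued behind the work already submitted, and the call itself returns straight away. ToyStream, ToyReadAccessor, and ToyAutorelease are hypothetical names invented for this sketch. A real implementation would rely on the CUDA runtime's stream ordering, and this page does not say which mechanism CUDAManager uses.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical CPU stand-in for an in-order stream: queued tasks run,
// in submission order, only when Drain() is called.
class ToyStream
{
public:
  void Enqueue(std::function<void()> task) { m_Tasks.push(std::move(task)); }

  void Drain()
  {
    while (!m_Tasks.empty())
    {
      std::function<void()> task = std::move(m_Tasks.front());
      m_Tasks.pop();
      task();
    }
  }

private:
  std::queue<std::function<void()>> m_Tasks;
};

// Hypothetical accessor that only points at the reference count it holds.
struct ToyReadAccessor
{
  int* refCount;
};

// Returns immediately. The reference is dropped only after all work that
// was queued on the stream before this call has run.
void ToyAutorelease(ToyReadAccessor accessor, ToyStream& stream)
{
  stream.Enqueue([accessor]() { --(*accessor.refCount); });
}
```

Because the release sits behind the kernel in the same queue, the input image stays referenced for as long as the kernel may still read from it, even though the caller never waits.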

void niftk::CUDAManager::Autorelease ( WriteAccessor & writeAccessor,
cudaStream_t  stream 
)
LightweightCUDAImage niftk::CUDAManager::Finalise ( WriteAccessor & writeAccessor,
cudaStream_t  stream 
)
LightweightCUDAImage niftk::CUDAManager::FinaliseAndAutorelease ( WriteAccessor & writeAccessor,
ReadAccessor & readAccessor,
cudaStream_t  stream 
)

Combines Finalise() and Autorelease() into a single call.

CUDAManager * niftk::CUDAManager::GetInstance ( )
static
Exceptions
std::runtime_error if CUDA is not available on the system.
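
Since the accessor can throw, callers may want to guard their first use of it and fall back to a CPU path. The helper below is a minimal sketch of that caller-side pattern. CanUseGpu, UnavailableManager, and AvailableManager are hypothetical names, and the two manager structs are fakes that exist only so the example runs without CUDA. With the real class one would presumably instantiate the template with niftk::CUDAManager, which is an assumption and not something this page states.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Fake accessor that always fails, mimicking a machine without CUDA.
struct UnavailableManager
{
  static UnavailableManager* GetInstance()
  {
    throw std::runtime_error("CUDA is not available");
  }
};

// Fake accessor that succeeds and returns a function-local singleton.
struct AvailableManager
{
  static AvailableManager* GetInstance()
  {
    static AvailableManager instance;
    return &instance;
  }
};

// Returns true if ManagerT::GetInstance() succeeds. On std::runtime_error
// it stores the message in `reason` and returns false, so the caller can
// choose a CPU fallback.
template <typename ManagerT>
bool CanUseGpu(std::string& reason)
{
  try
  {
    return ManagerT::GetInstance() != nullptr;
  }
  catch (const std::runtime_error& error)
  {
    reason = error.what();
    return false;
  }
}
```

A caller would check CanUseGpu once at start-up and log the stored reason when it returns false.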
cudaStream_t niftk::CUDAManager::GetStream ( const std::string & name)
WriteAccessor niftk::CUDAManager::RequestOutputImage ( unsigned int  width,
unsigned int  height,
int  FIXME_pixeltype 
)
ReadAccessor niftk::CUDAManager::RequestReadAccess ( const LightweightCUDAImage & lwci)
Exceptions
std::runtime_error if lwci is not valid.

Friends And Related Function Documentation

friend struct impldetail::ModuleCleanup
friend
friend class LightweightCUDAImage
friend

The documentation for this class was generated from the following files: