AsynchronousLoader

This dataloader behaves identically to the standard PyTorch DataLoader, but transfers data to the GPU asynchronously while training runs. You can also use it to wrap an existing dataloader.

Note

We rely on the community to keep these updated and working. If something doesn’t work, we’d really appreciate a contribution to fix!

Example:

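from torch.utils.data import DataLoader
from pl_bolts.datamodules.async_dataloader import AsynchronousLoader

# ds is a Dataset and device a torch.device, both defined elsewhere
dataloader = AsynchronousLoader(DataLoader(ds, batch_size=16), device=device)

for b in dataloader:
    ...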

class pl_bolts.datamodules.async_dataloader.AsynchronousLoader(data, device=device(type='cuda', index=0), q_size=10, num_batches=None, **kwargs)

Bases: object

Warning

The feature AsynchronousLoader is currently marked under review. Compatibility with other Lightning projects is not guaranteed, and the API and functionality may change without warning in future releases. More details: https://lightning-bolts.readthedocs.io/en/latest/stability.html

Class for asynchronously loading from CPU memory to device memory with DataLoader.

Note that this only works for single-GPU training. Multi-GPU training uses PyTorch’s DataParallel or DistributedDataParallel, which have their own code for transferring data across GPUs, so using this loader with either may break or slow things down.
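The idea of loading asynchronously through a bounded queue can be pictured with a small producer/consumer sketch. This is not the pl_bolts implementation: PrefetchLoader and its transfer argument are hypothetical names used only for illustration, and transfer stands in for the host-to-device copy so the example runs without a GPU.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the batch stream


class PrefetchLoader:
    """Hypothetical stand-in, not the pl_bolts class: prefetches batches in a thread."""

    def __init__(self, batches, transfer, q_size=10):
        self.batches = batches    # any iterable of batches
        self.transfer = transfer  # stand-in for the host-to-device copy
        self.q_size = q_size      # max number of batches held ahead of the consumer

    def _worker(self, q):
        # Producer: transfer each batch; put() blocks while the queue is full.
        try:
            for batch in self.batches:
                q.put(self.transfer(batch))
        finally:
            q.put(_DONE)  # always signal the end so the consumer does not hang

    def __iter__(self):
        q = queue.Queue(maxsize=self.q_size)
        threading.Thread(target=self._worker, args=(q,), daemon=True).start()
        # Consumer: yield batches as soon as the producer has them ready.
        while True:
            item = q.get()
            if item is _DONE:
                return
            yield item


loader = PrefetchLoader([[1, 2], [3, 4], [5, 6]], transfer=tuple, q_size=2)
result = list(loader)
print(result)
```

In this sketch q_size bounds how far the producer can run ahead, which is the role the q_size parameter is described as playing for data already loaded to the device.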

Parameters

data (Union[DataLoader, Dataset]) – The PyTorch Dataset or DataLoader we’re using to load.
device (device) – The PyTorch device we are loading to.
q_size (int) – Size of the queue used to store the data loaded to the device.
num_batches (Optional[int]) – Number of batches to load. This must be set if the dataloader doesn’t have a finite __len__. It will also override DataLoader.__len__ if set and DataLoader has a __len__. Otherwise it can be left as None.
**kwargs – Any additional arguments to pass to the dataloader if we’re constructing one here.
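The num_batches rule above can be restated as a small decision function. This is a sketch of the documented behaviour only: resolve_num_batches is a hypothetical helper, not part of pl_bolts, and raising ValueError when no length is available is an assumption about how the "must be set" case might be handled.

```python
def resolve_num_batches(loader, num_batches=None):
    """Hypothetical helper: pick the batch count following the num_batches rule."""
    if num_batches is not None:
        return num_batches  # an explicit value overrides len(loader)
    if hasattr(loader, "__len__"):
        return len(loader)  # otherwise fall back to the loader's own length
    # Assumption: fail loudly when neither source of a length is available.
    raise ValueError("num_batches must be set when the loader has no __len__")


def endless():
    # A generator has no __len__, like a loader over an unbounded stream.
    while True:
        yield 0


sized = [0, 1, 2]  # stands in for a DataLoader with a finite __len__
print(resolve_num_batches(sized))          # falls back to len(sized)
print(resolve_num_batches(sized, 2))       # explicit value wins
print(resolve_num_batches(endless(), 5))   # required for unsized loaders
```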