Caffe2 - C++ API
A deep learning, cross platform ML framework
torch::data::DataLoaderBase< Dataset, Batch, BatchRequest > Class Template Reference (abstract)

Data Structures

struct  Job
 A Job is either a BatchRequest (new indices to fetch data at) or a QuitWorker object, to indicate the worker should shut down. More...
 
struct  QuitWorker
 
struct  Result
 The finished result of a job. More...
 
struct  Sequenced
 Simple mix-in to give something a sequence number. More...
 

Public Types

using BatchType = Batch
 
using BatchRequestType = BatchRequest
 

Public Member Functions

 DataLoaderBase (DataLoaderOptions options, std::unique_ptr< Dataset > main_thread_dataset=nullptr)
 Constructs a new DataLoader from a dataset to sample from and options to configure the DataLoader with; the sampling strategy is supplied by the concrete subclass. More...
 
Iterator< Batch > begin ()
 Returns an iterator into the DataLoader. More...
 
Iterator< Batch > end ()
 Returns a special "sentinel" iterator that compares equal with a non-sentinel iterator once the DataLoader is exhausted. More...
 
void join ()
 Joins the DataLoader's worker threads and drains internal queues. More...
 
const FullDataLoaderOptions & options () const noexcept
 Returns the options with which the DataLoader was configured.
 

Protected Member Functions

virtual optional< BatchRequestType > get_batch_request ()=0
 Subclass hook for getting the next batch request. More...
 
virtual void reset ()
 Resets the internal state of the DataLoader, optionally pre-fetching new jobs. More...
 
void prefetch (size_t requested_jobs)
 Schedules requested_jobs many new batches to be fetched. More...
 
void prefetch ()
 Schedules the maximum number of jobs (based on the max_jobs option).
 
optional< BatchType > next ()
 Returns the next batch of data, or an empty optional if the DataLoader is exhausted. More...
 
void worker_thread (Dataset &dataset)
 The function that worker threads run.
 
template<typename T >
void push_job (T value)
 Convenience method that calls shuttle_.push_job() with the next sequence number. More...
 
optional< Result > pop_result ()
 Convenience method that gets the next result from the sequencer.
 
std::unique_ptr< detail::sequencers::Sequencer< Result > > new_sequencer ()
 Convenience method that creates a new sequencer based on the enforce_ordering option. More...
 

Protected Attributes

const FullDataLoaderOptions options_
 The options the DataLoader was configured with.
 
std::unique_ptr< Dataset > main_thread_dataset_
 The dataset for the main thread, only has a value if the number of worker threads was configured as zero, meaning the main thread has to do all the work (synchronously). More...
 
size_t sequence_number_ = 0
 The sequence number for the next batch to be retrieved from the dataset. More...
 
std::vector< std::thread > workers_
 The worker threads, running the worker_thread() method.
 
detail::DataShuttle< Job, Result > shuttle_
 The DataShuttle which takes care of the life cycle of a job.
 
std::unique_ptr< detail::sequencers::Sequencer< Result > > sequencer_
 The Sequencer, which handles optional ordering of batches.
 
bool joined_ = false
 True if the DataLoader has joined its worker threads.
 

Detailed Description

template<typename Dataset, typename Batch, typename BatchRequest>
class torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >

Definition at line 27 of file base.h.

Constructor & Destructor Documentation

template<typename Dataset, typename Batch, typename BatchRequest>
torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::DataLoaderBase ( DataLoaderOptions options, std::unique_ptr< Dataset > main_thread_dataset = nullptr )
inline

Constructs a new DataLoader from a dataset to sample from and options to configure the DataLoader with; the sampling strategy is supplied by the concrete subclass.

Definition at line 35 of file base.h.

Member Function Documentation

template<typename Dataset, typename Batch, typename BatchRequest>
Iterator<Batch> torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::begin ( )
inline

Returns an iterator into the DataLoader.

The lifetime of the iterator is bound to the DataLoader. In C++ standards language, the category of the iterator is InputIterator. See https://en.cppreference.com/w/cpp/named_req/InputIterator for what this means. In short: you may increment the iterator and dereference it, but cannot go back, or step forward more than one position at a time. When the DataLoader is exhausted, it will compare equal with the special "sentinel" iterator returned by DataLoader::end(). Most of the time, you should only use range-for loops to loop over the DataLoader, but standard algorithms like std::copy(dataloader.begin(), dataloader.end(), output_iterator) are supported too.

Definition at line 57 of file base.h.

template<typename Dataset, typename Batch, typename BatchRequest>
Iterator<Batch> torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::end ( )
inline

Returns a special "sentinel" iterator that compares equal with a non-sentinel iterator once the DataLoader is exhausted.

Definition at line 69 of file base.h.

template<typename Dataset, typename Batch, typename BatchRequest>
virtual optional<BatchRequestType> torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::get_batch_request ( )
protected, pure virtual

Subclass hook for getting the next batch request.

The stateless case will ask the sampler for a new batch request (e.g. a vector of indices), while the stateful one will simply return the batch size.

template<typename Dataset, typename Batch, typename BatchRequest>
void torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::join ( )
inline

Joins the DataLoader's worker threads and drains internal queues.

This function may only be invoked from the main thread (in which the DataLoader lives).

Definition at line 77 of file base.h.

template<typename Dataset, typename Batch, typename BatchRequest>
std::unique_ptr<detail::sequencers::Sequencer<Result> > torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::new_sequencer ( )
inline, protected

Convenience method that creates a new sequencer based on the enforce_ordering option.

Definition at line 212 of file base.h.

template<typename Dataset, typename Batch, typename BatchRequest>
optional<BatchType> torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::next ( )
inline, protected

Returns the next batch of data, or an empty optional if the DataLoader is exhausted.

This operation will block until a batch is available if one is still expected.

Definition at line 165 of file base.h.

template<typename Dataset, typename Batch, typename BatchRequest>
void torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::prefetch ( size_t  requested_jobs)
inline, protected

Schedules requested_jobs many new batches to be fetched.

The actual number of jobs scheduled may be less if the DataLoader exhausts.

Definition at line 147 of file base.h.

template<typename Dataset, typename Batch, typename BatchRequest>
template<typename T >
void torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::push_job ( T  value)
inline, protected

Convenience method that calls shuttle_.push_job() with the next sequence number.

Definition at line 200 of file base.h.

template<typename Dataset, typename Batch, typename BatchRequest>
virtual void torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::reset ( )
inline, protected, virtual

Resets the internal state of the DataLoader, optionally pre-fetching new jobs.

Definition at line 138 of file base.h.

Field Documentation

template<typename Dataset, typename Batch, typename BatchRequest>
std::unique_ptr<Dataset> torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::main_thread_dataset_
protected

The dataset for the main thread, only has a value if the number of worker threads was configured as zero, meaning the main thread has to do all the work (synchronously).

NOTE: We really want this to occupy no inline storage when empty, hence unique_ptr (heap-allocated when present) rather than optional.

Definition at line 227 of file base.h.

template<typename Dataset, typename Batch, typename BatchRequest>
size_t torch::data::DataLoaderBase< Dataset, Batch, BatchRequest >::sequence_number_ = 0
protected

The sequence number for the next batch to be retrieved from the dataset.

Definition at line 231 of file base.h.


The documentation for this class was generated from the following file:
base.h