afnio.utils.data
- class afnio.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, drop_last=False, seed=None)
Bases: Generic[T_co]
Data loader combines a dataset and a sampler, and provides an iterable over the given dataset.
The DataLoader supports both map-style and iterable-style datasets with single-process loading, customizing loading order and optional automatic batching (collation) and memory pinning.
See the afnio.utils.data documentation page for more details.
- Parameters:
dataset (Dataset) – dataset from which to load the data.
batch_size (int, optional) – how many samples per batch to load (default: 1).
shuffle (bool, optional) – set to True to have the data reshuffled at every epoch (default: False).
sampler (Sampler or Iterable, optional) – defines the strategy to draw samples from the dataset. Can be any Iterable with __len__ implemented. If specified, shuffle must not be specified.
drop_last (bool, optional) – set to True to drop the last incomplete batch if the dataset size is not divisible by the batch size. If False and the dataset size is not divisible by the batch size, the last batch will be smaller (default: False).
seed (int, optional) – if not None, this seed will be used by RandomSampler to generate random indexes (default: None).
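The effect of batch_size, shuffle, drop_last, and seed on the batches can be illustrated with a small pure-Python sketch. This is not afnio's implementation: the helper name iter_batches is hypothetical, and it assumes that shuffling is a seeded permutation of the sample indices. Unlike the real loader, which reshuffles every epoch, a fixed seed here yields the same order on every call.

```python
import random
from typing import Iterator, List, Optional


def iter_batches(
    num_samples: int,
    batch_size: int = 1,
    shuffle: bool = False,
    drop_last: bool = False,
    seed: Optional[int] = None,
) -> Iterator[List[int]]:
    """Yield lists of sample indices, one list per batch (hypothetical helper)."""
    indices = list(range(num_samples))
    if shuffle:
        # Assumption: shuffling is a seeded permutation of the indices.
        random.Random(seed).shuffle(indices)
    for start in range(0, num_samples, batch_size):
        batch = indices[start:start + batch_size]
        if drop_last and len(batch) < batch_size:
            return  # skip the final incomplete batch
        yield batch


# 10 samples with batch_size=3: the last batch holds a single index.
full = list(iter_batches(10, batch_size=3))
# With drop_last=True the incomplete batch is discarded.
trimmed = list(iter_batches(10, batch_size=3, drop_last=True))
```

With ten samples and a batch size of three, full ends with the one-element batch [9], while trimmed contains only the three complete batches.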
- class afnio.utils.data.Dataset
Bases: Generic[T_co]
An abstract class representing a Dataset.
All datasets that represent a map from keys to data samples should subclass it. All subclasses should override __getitem__(), which fetches a data sample for a given key, and __len__(), which the default options of DataLoader expect to return the size of the dataset. Subclasses may also implement __getitems__() to speed up loading of batched samples. This method accepts a list of sample indices for a batch and returns a list of samples.
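A minimal sketch of the three methods described above, assuming a toy dataset whose sample for key i is i * i. The class name SquaresDataset is hypothetical. In real use it would subclass afnio.utils.data.Dataset; the base class is left out here so the sketch runs without afnio installed.

```python
from typing import List, Sequence


class SquaresDataset:
    """Hypothetical map-style dataset: key i maps to the sample i * i."""

    def __init__(self, size: int) -> None:
        self.size = size

    def __len__(self) -> int:
        # Size of the dataset, as expected by the loader's default options.
        return self.size

    def __getitem__(self, index: int) -> int:
        # Fetch a single sample for the given key.
        if not 0 <= index < self.size:
            raise IndexError(index)
        return index * index

    def __getitems__(self, indices: Sequence[int]) -> List[int]:
        # Optional batched fetch: one call returns all samples of a batch.
        return [self[i] for i in indices]


dataset = SquaresDataset(5)
batch = dataset.__getitems__([0, 2, 4])
```

Here __getitems__() simply loops over __getitem__(); a real dataset would implement it only when a bulk fetch (for example, one query for many keys) is cheaper than fetching samples one by one.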
- class afnio.utils.data.RandomSampler(data_source, replacement=False, num_samples=None, seed=None)
Samples elements randomly. If without replacement, then sample from a shuffled dataset.
If with replacement, then the user can specify num_samples to draw.
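The two sampling modes can be sketched in plain Python as follows. This is an illustration, not the library's code: the function name random_indices is hypothetical, and the sketch assumes that num_samples falls back to the dataset size when it is None.

```python
import random
from typing import Iterator, Optional, Sized


def random_indices(
    data_source: Sized,
    replacement: bool = False,
    num_samples: Optional[int] = None,
    seed: Optional[int] = None,
) -> Iterator[int]:
    """Yield random indices into data_source (hypothetical helper)."""
    n = len(data_source)
    rng = random.Random(seed)
    if replacement:
        # Assumption: draw len(data_source) indices when num_samples is None.
        count = n if num_samples is None else num_samples
        for _ in range(count):
            yield rng.randrange(n)
    else:
        # Without replacement: a shuffled pass over every index exactly once.
        permutation = list(range(n))
        rng.shuffle(permutation)
        yield from permutation


data = ["a", "b", "c", "d", "e"]
once_each = list(random_indices(data, seed=1))
with_repeats = list(random_indices(data, replacement=True, num_samples=8, seed=1))
```

Without replacement every index appears exactly once, so the output is a permutation; with replacement the output may be longer than the dataset and may repeat indices.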
- class afnio.utils.data.Sampler
Bases: Generic[T_co]
Base class for all Samplers.
Every Sampler subclass has to provide an __iter__() method, providing a way to iterate over indices or lists of indices (batches) of dataset elements, and may provide a __len__() method that returns the length of the returned iterators.
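As a sketch of the contract described above, the hypothetical ReverseSampler below provides __iter__() and __len__() and yields indices from last to first. In real use it would subclass afnio.utils.data.Sampler; the base class is omitted so the example runs with the standard library alone.

```python
from typing import Iterator, Sized


class ReverseSampler:
    """Hypothetical sampler that yields indices from last to first."""

    def __init__(self, data_source: Sized) -> None:
        self.data_source = data_source

    def __iter__(self) -> Iterator[int]:
        # A fresh iterator on every call, so the sampler can be reused.
        return iter(range(len(self.data_source) - 1, -1, -1))

    def __len__(self) -> int:
        # Number of indices produced by one pass over the sampler.
        return len(self.data_source)


sampler = ReverseSampler(["a", "b", "c"])
order = list(sampler)
```

Because __iter__() builds a new iterator each time, the same sampler object can be iterated once per epoch.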
- class afnio.utils.data.SequentialSampler(data_source)
Samples elements sequentially, always in the same order.
- Parameters:
data_source (Dataset) – dataset to sample from
- class afnio.utils.data.WeightedRandomSampler(weights, num_samples, replacement=True, seed=None)
Samples elements from [0,..,len(weights)-1] with given probabilities (weights).
- Parameters:
weights (sequence) – a sequence of weights, not necessarily summing to one
num_samples (int) – number of samples to draw
replacement (bool) – if True, samples are drawn with replacement. If not, they are drawn without replacement, which means that when a sample index is drawn for a row, it cannot be drawn again for that row.
seed (int) – a number to set the seed for the random draws.
Example
>>> from afnio.utils.data import WeightedRandomSampler
>>> list(WeightedRandomSampler([0.1, 0.9, 0.4, 0.7, 3.0, 0.6], 5, replacement=True))
[4, 4, 1, 4, 5]
>>> list(WeightedRandomSampler([0.9, 0.4, 0.05, 0.2, 0.3, 0.1], 5, replacement=False))
[0, 1, 4, 3, 2]
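The difference between drawing with and without replacement can be sketched with the standard library. This is an illustration under stated assumptions, not the library's algorithm: the function name weighted_indices is hypothetical, and the without-replacement branch draws one index at a time from the remaining candidates in proportion to their weights.

```python
import random
from typing import List, Optional, Sequence


def weighted_indices(
    weights: Sequence[float],
    num_samples: int,
    replacement: bool = True,
    seed: Optional[int] = None,
) -> List[int]:
    """Draw indices in proportion to weights (hypothetical helper)."""
    rng = random.Random(seed)
    population = list(range(len(weights)))
    if replacement:
        # Weights need not sum to one; random.choices normalizes them.
        return rng.choices(population, weights=weights, k=num_samples)
    if num_samples > len(weights):
        raise ValueError("cannot draw more unique indices than there are weights")
    remaining = population
    remaining_weights = list(weights)
    drawn: List[int] = []
    for _ in range(num_samples):
        # Pick a position among the remaining candidates, then remove it
        # so the same index cannot be drawn again.
        pos = rng.choices(range(len(remaining)), weights=remaining_weights, k=1)[0]
        drawn.append(remaining.pop(pos))
        remaining_weights.pop(pos)
    return drawn


only_middle = weighted_indices([0.0, 1.0, 0.0], 4, replacement=True, seed=0)
unique = weighted_indices([0.9, 0.4, 0.05, 0.2, 0.3, 0.1], 5, replacement=False, seed=0)
```

An index with weight zero is never drawn with replacement, and without replacement every returned index is distinct. The sketch does not handle the case where all remaining weights are zero; random.choices raises ValueError there.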
Modules