Datasets
sentence_transformers.datasets contains classes to organize your training input examples.
ParallelSentencesDataset
ParallelSentencesDataset is used for multilingual training. For details, see multilingual training.
.. autoclass:: sentence_transformers.datasets.ParallelSentencesDataset
SentenceLabelDataset
SentenceLabelDataset can be used if you have labeled sentences and want to train with triplet loss.
.. autoclass:: sentence_transformers.datasets.SentenceLabelDataset
DenoisingAutoEncoderDataset
DenoisingAutoEncoderDataset is used for unsupervised training with the TSDAE method.
.. autoclass:: sentence_transformers.datasets.DenoisingAutoEncoderDataset
NoDuplicatesDataLoader
NoDuplicatesDataLoadercan be used together with MultipleNegativeRankingLoss to ensure that no duplicates are within the same batch.
.. autoclass:: sentence_transformers.datasets.NoDuplicatesDataLoader