This is a Cross Encoder model finetuned from prajjwal1/bert-tiny using the sentence-transformers library. It computes scores for pairs of texts, which can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
This model was trained using train_script.py.
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("cross-encoder-testing/reranker-bert-tiny-gooaq-bce")
# Get scores for pairs of texts
pairs = [
    ['are javascript developers in demand?', "JavaScript is the skill that is most in-demand for IT in 2020, according to a report from developer skills tester DevSkiller. The report, “Top IT Skills report 2020: Demand and Hiring Trends,” has JavaScript switching places with Java when compared to last year's report, with Java in third place this year, behind SQL."],
    ['are javascript developers in demand?', 'In one line difference between the two is: JavaScript is the programming language where as AngularJS is a framework based on JavaScript. ... It is also the basic for all java script based technologies like jquery, angular JS, bootstrap JS and so on. Angular JS is a framework written in javascript and uses MVC architecture.'],
    ['are javascript developers in demand?', 'Java applications are run in a virtual machine or web browser while JavaScript is run on a web browser. Java code is compiled whereas while JavaScript code is in text and in a web page. JavaScript is an OOP scripting language, whereas Java is an OOP programming language.'],
    ['are javascript developers in demand?', 'Things in the body tag are the things that should be displayed: the actual content. Javascript in the body is executed as it is read and as the page is rendered. Javascript in the head is interpreted before anything is rendered.'],
    ['are javascript developers in demand?', 'Web apps tend to be built using JavaScript, CSS and HTML5. Unlike mobile apps, there is no standard software development kit for building web apps. However, developers do have access to templates. Compared to mobile apps, web apps are usually quicker and easier to build — but they are much simpler in terms of features.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'are javascript developers in demand?',
    [
        "JavaScript is the skill that is most in-demand for IT in 2020, according to a report from developer skills tester DevSkiller. The report, “Top IT Skills report 2020: Demand and Hiring Trends,” has JavaScript switching places with Java when compared to last year's report, with Java in third place this year, behind SQL.",
        'In one line difference between the two is: JavaScript is the programming language where as AngularJS is a framework based on JavaScript. ... It is also the basic for all java script based technologies like jquery, angular JS, bootstrap JS and so on. Angular JS is a framework written in javascript and uses MVC architecture.',
        'Java applications are run in a virtual machine or web browser while JavaScript is run on a web browser. Java code is compiled whereas while JavaScript code is in text and in a web page. JavaScript is an OOP scripting language, whereas Java is an OOP programming language.',
        'Things in the body tag are the things that should be displayed: the actual content. Javascript in the body is executed as it is read and as the page is rendered. Javascript in the head is interpreted before anything is rendered.',
        'Web apps tend to be built using JavaScript, CSS and HTML5. Unlike mobile apps, there is no standard software development kit for building web apps. However, developers do have access to templates. Compared to mobile apps, web apps are usually quicker and easier to build — but they are much simpler in terms of features.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
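
The ranking output sketched in the comment above can be reproduced from raw scores. The following is a minimal sketch, not the library's implementation: `rank_from_scores` is a hypothetical helper, and the scores are made-up placeholders rather than real model output. It assumes that a higher score means a more relevant passage and that the result is sorted by descending score.

```python
def rank_from_scores(scores):
    """Sort passage indices by descending score, mirroring the
    [{'corpus_id': ..., 'score': ...}, ...] shape shown above."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": scores[i]} for i in order]


# Hypothetical scores for the five passages (NOT real model output).
example_scores = [0.91, 0.12, 0.08, 0.03, 0.05]

example_ranks = rank_from_scores(example_scores)
for entry in example_ranks:
    print(entry["corpus_id"], entry["score"])
```

Each `corpus_id` is the position of the passage in the input list, so the ranked passages can be recovered by indexing back into that list.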
Datasets: gooaq-dev, NanoMSMARCO, NanoNFCorpus and NanoNQ
Evaluator: CrossEncoderRerankingEvaluator

| Metric | gooaq-dev | NanoMSMARCO | NanoNFCorpus | NanoNQ |
|---|---|---|---|---|
| map | 0.5677 (+0.0366) | 0.4280 (-0.0616) | 0.3397 (+0.0787) | 0.4149 (-0.0047) |
| mrr@10 | 0.5558 (+0.0318) | 0.4129 (-0.0646) | 0.5196 (+0.0198) | 0.4132 (-0.0135) |
| ndcg@10 | 0.6157 (+0.0245) | 0.4772 (-0.0632) | 0.3308 (+0.0058) | 0.4859 (-0.0147) |
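
For readers unfamiliar with the three reranking metrics in the table, here is a minimal sketch of how each can be computed for a single query from a ranked list of binary relevance labels. It uses the standard textbook definitions; the function names are illustrative, and the evaluator's exact handling of edge cases (for example, relevant documents missing from the candidate list) may differ.

```python
import math


def average_precision(relevance):
    """Mean of precision@rank over the ranks that hold a relevant item."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mrr_at_k(relevance, k=10):
    """Reciprocal rank of the first relevant item within the top k."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(relevance, k=10):
    """DCG of the top k divided by the DCG of the ideal ordering."""
    def dcg(labels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(labels, start=1))

    ideal = sorted(relevance, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(relevance[:k]) / ideal_dcg if ideal_dcg else 0.0


# Hypothetical ranking: relevant passages at positions 2 and 4.
labels = [0, 1, 0, 1, 0]
ap = average_precision(labels)
mrr = mrr_at_k(labels)
ndcg = ndcg_at_k(labels)
print(ap, mrr, ndcg)
```

Averaging `average_precision` over all queries gives map; the other two are averaged over queries in the same way.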
Dataset: NanoBEIR_R100_mean
Evaluator: CrossEncoderNanoBEIREvaluator

| Metric | Value |
|---|---|
| map | 0.3942 (+0.0041) |
| mrr@10 | 0.4486 (-0.0194) |
| ndcg@10 | 0.4313 (-0.0241) |
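
The NanoBEIR_R100_mean values appear to be the arithmetic mean of the three Nano datasets reported in the per-dataset table (NanoMSMARCO, NanoNFCorpus, NanoNQ). This is an inference from the numbers, not something the card states; the quick check below reproduces the values to four decimal places.

```python
# Per-dataset values copied from the reranking table
# (order: NanoMSMARCO, NanoNFCorpus, NanoNQ).
nano_metrics = {
    "map": [0.4280, 0.3397, 0.4149],
    "mrr@10": [0.4129, 0.5196, 0.4132],
    "ndcg@10": [0.4772, 0.3308, 0.4859],
}

# Assumption: the "mean" row is a plain unweighted average.
nano_means = {
    name: round(sum(values) / len(values), 4)
    for name, values in nano_metrics.items()
}

for name, value in nano_means.items():
    print(f"{name}: {value:.4f}")
```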
Columns: question, answer, and label

| | question | answer | label |
|---|---|---|---|
| type | string | string | int |
| details | | | |
| question | answer | label |
|---|---|---|
| are javascript developers in demand? | JavaScript is the skill that is most in-demand for IT in 2020, according to a report from developer skills tester DevSkiller. The report, “Top IT Skills report 2020: Demand and Hiring Trends,” has JavaScript switching places with Java when compared to last year's report, with Java in third place this year, behind SQL. | 1 |
| are javascript developers in demand? | In one line difference between the two is: JavaScript is the programming language where as AngularJS is a framework based on JavaScript. ... It is also the basic for all java script based technologies like jquery, angular JS, bootstrap JS and so on. Angular JS is a framework written in javascript and uses MVC architecture. | 0 |
| are javascript developers in demand? | Java applications are run in a virtual machine or web browser while JavaScript is run on a web browser. Java code is compiled whereas while JavaScript code is in text and in a web page. JavaScript is an OOP scripting language, whereas Java is an OOP programming language. | 0 |
BinaryCrossEntropyLoss with these parameters:
{
    "activation_fct": "torch.nn.modules.linear.Identity",
    "pos_weight": 5
}
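
To illustrate what `pos_weight: 5` does, here is a minimal pure-Python sketch of binary cross-entropy on raw logits with a positive-class weight. It assumes the loss follows the usual "BCE with logits" formula, `-(pos_weight * y * log σ(x) + (1 - y) * log(1 - σ(x)))`, which is consistent with the Identity activation listed above (the model emits logits and the sigmoid is folded into the loss). The function names are illustrative, not part of the library.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(logit, label, pos_weight=5.0):
    """Binary cross-entropy on a raw logit, with positives up-weighted.

    log(sigmoid(x))     == -softplus(-x)
    log(1 - sigmoid(x)) == -softplus(x)
    """
    positive_term = pos_weight * label * softplus(-logit)
    negative_term = (1 - label) * softplus(logit)
    return positive_term + negative_term


# At logit 0 the model is undecided (sigmoid = 0.5):
loss_pos = weighted_bce_with_logits(0.0, 1)  # positive pair, weighted by 5
loss_neg = weighted_bce_with_logits(0.0, 0)  # negative pair, weight 1
print(loss_pos, loss_neg)
```

With this weighting, a misjudged positive pair costs five times as much as an equally misjudged negative, which can help when negatives outnumber positives in the training data.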
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 2048
- per_device_eval_batch_size: 2048
- learning_rate: 0.0005
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 2048
- per_device_eval_batch_size: 2048
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 0.0005
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | gooaq-dev_ndcg@10 | NanoMSMARCO_ndcg@10 | NanoNFCorpus_ndcg@10 | NanoNQ_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
|---|---|---|---|---|---|---|---|
| -1 | -1 | - | 0.0887 (-0.5025) | 0.0063 (-0.5341) | 0.3262 (+0.0012) | 0.0000 (-0.5006) | 0.1108 (-0.3445) |
| 0.0035 | 1 | 1.1945 | - | - | - | - | - |
| 0.0707 | 20 | 1.1664 | 0.4082 (-0.1830) | 0.1805 (-0.3600) | 0.3168 (-0.0083) | 0.2243 (-0.2763) | 0.2405 (-0.2149) |
| 0.1413 | 40 | 1.1107 | 0.5260 (-0.0652) | 0.3453 (-0.1951) | 0.3335 (+0.0085) | 0.3430 (-0.1576) | 0.3406 (-0.1147) |
| 0.2120 | 60 | 1.022 | 0.5623 (-0.0289) | 0.3929 (-0.1475) | 0.3512 (+0.0262) | 0.3472 (-0.1535) | 0.3638 (-0.0916) |
| 0.2827 | 80 | 0.973 | 0.5691 (-0.0221) | 0.4048 (-0.1356) | 0.3530 (+0.0280) | 0.3833 (-0.1174) | 0.3804 (-0.0750) |
| 0.3534 | 100 | 0.963 | 0.5814 (-0.0098) | 0.4385 (-0.1019) | 0.3471 (+0.0221) | 0.4227 (-0.0779) | 0.4028 (-0.0526) |
| 0.4240 | 120 | 0.9419 | 0.5963 (+0.0050) | 0.4106 (-0.1298) | 0.3540 (+0.0289) | 0.4843 (-0.0163) | 0.4163 (-0.0391) |
| 0.4947 | 140 | 0.9331 | 0.5953 (+0.0041) | 0.4310 (-0.1094) | 0.3367 (+0.0117) | 0.4163 (-0.0843) | 0.3947 (-0.0607) |
| 0.5654 | 160 | 0.9263 | 0.6070 (+0.0158) | 0.4626 (-0.0778) | 0.3443 (+0.0193) | 0.4823 (-0.0184) | 0.4297 (-0.0256) |
| 0.6360 | 180 | 0.9212 | 0.6069 (+0.0156) | 0.4602 (-0.0802) | 0.3391 (+0.0141) | 0.4782 (-0.0224) | 0.4258 (-0.0295) |
| 0.7067 | 200 | 0.901 | 0.6126 (+0.0214) | 0.4602 (-0.0803) | 0.3413 (+0.0162) | 0.4780 (-0.0227) | 0.4265 (-0.0289) |
| 0.7774 | 220 | 0.8997 | 0.6136 (+0.0224) | 0.4801 (-0.0604) | 0.3349 (+0.0098) | 0.4903 (-0.0103) | 0.4351 (-0.0203) |
| 0.8481 | 240 | 0.9021 | 0.6132 (+0.0220) | 0.4850 (-0.0554) | 0.3438 (+0.0188) | 0.4855 (-0.0151) | 0.4381 (-0.0173) |
| 0.9187 | 260 | 0.9013 | 0.6188 (+0.0276) | 0.4820 (-0.0584) | 0.3387 (+0.0137) | 0.4851 (-0.0156) | 0.4353 (-0.0201) |
| 0.9894 | 280 | 0.8996 | 0.6157 (+0.0245) | 0.4772 (-0.0632) | 0.3305 (+0.0054) | 0.4859 (-0.0147) | 0.4312 (-0.0242) |
| -1 | -1 | - | 0.6157 (+0.0245) | 0.4772 (-0.0632) | 0.3308 (+0.0058) | 0.4859 (-0.0147) | 0.4313 (-0.0241) |
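
The hyperparameters listed earlier (learning_rate 0.0005, lr_scheduler_type linear, warmup_ratio 0.1) imply a learning rate that ramps up linearly and then decays linearly to zero. The sketch below shows that shape under two assumptions that the card does not state: the run has 283 optimizer steps in total (inferred from step 280 falling at epoch 0.9894 in the log above), and the warmup length is the total step count times the warmup ratio, rounded up. `linear_schedule_lr` is an illustrative helper, not a library function.

```python
import math

BASE_LR = 0.0005
WARMUP_RATIO = 0.1
TOTAL_STEPS = 283  # assumption: inferred from the training log, not stated


def linear_schedule_lr(step, total_steps=TOTAL_STEPS,
                       base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at a few of the logged steps.
for logged_step in (1, 20, 40, 140, 280):
    print(logged_step, f"{linear_schedule_lr(logged_step):.6f}")
```

Under these assumptions the peak learning rate is reached at about step 29, which falls between the first two evaluation points in the log.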
Carbon emissions were measured using CodeCarbon.
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}