dinho1597/Telecom-QA-MultipleChoice
How to use dinho1597/phi-2-telecom-ft with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("dinho1597/phi-2-telecom-ft")
sentences = [
    "What problem can reconfigurable intelligent surfaces mitigate in light fidelity systems?",
    "The document mentions that blind channel estimation requires a large number of data symbols to improve accuracy, which may not be feasible in practice.",
    "Empirical evidence suggests that the power decay can even be exponential with distance.",
    "Reconfigurable intelligent surface-enabled environments can enhance light fidelity coverage by mitigating the dead-zone problem for users at the edge of the cell, improving link quality."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model fine-tuned from BAAI/bge-small-en-v1.5 on the telecom-qa-multiple_choice dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
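The module list above can be read as: take the vector of the first ([CLS]) token, then scale it to unit length. Below is a minimal pure-Python sketch of those last two stages. It uses made-up 4-dimensional token vectors instead of real 384-dimensional BERT outputs, and the function names are illustrative rather than part of the library.

```python
import math

def cls_pool(token_vectors):
    """Return the first token's vector, mirroring pooling_mode_cls_token=True."""
    return list(token_vectors[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length, mirroring the Normalize() step."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # leave an all-zero vector unchanged
    return [x / norm for x in vector]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical token embeddings for two sentences; row 0 plays the [CLS] role.
sentence_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
sentence_b = [[0.0, 4.0, 3.0, 0.0], [1.0, 2.0, 3.0, 4.0]]

emb_a = l2_normalize(cls_pool(sentence_a))
emb_b = l2_normalize(cls_pool(sentence_b))

# For unit-length vectors the dot product equals the cosine similarity.
cosine_ab = dot(emb_a, emb_b)
print(emb_a)
print(cosine_ab)
```

Because the embeddings are normalized, cosine similarity and dot product rank results identically, which is why either can be used for retrieval with this kind of pipeline.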
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("dinho1597/phi-2-telecom-ft")
# Run inference
sentences = [
    'What is the trade-off between privacy and convergence performance when using artificial noise obscuring in federated learning?',
    'The trade-off between privacy and convergence performance when using artificial noise obscuring in federated learning is that increasing the noise variance improves privacy but degrades convergence.',
    "The 'decrypt_error' alert indicates a handshake cryptographic operation failed, including being unable to verify a signature, decrypt a key exchange, or validate a finished message.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
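For semantic search, the similarity scores can be used to rank candidate passages against a query. The sketch below works on plain lists of floats so that it runs without downloading the model; in practice the vectors would come from `model.encode(...)`. The helper name `rank_by_cosine` and the toy 3-dimensional vectors are illustrative assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_cosine(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy stand-ins for encoded text (hypothetical values).
query = [1.0, 0.0, 1.0]
passages = [
    [0.0, 1.0, 0.0],    # orthogonal to the query
    [1.0, 0.1, 0.9],    # nearly parallel to the query
    [-1.0, 0.0, -1.0],  # opposite direction
]

ranking = rank_by_cosine(query, passages)
best_index, best_score = ranking[0]
print("best passage:", best_index, "score:", best_score)
```

With real embeddings, the index of the top-ranked passage would be used to look up the answer text for the query.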
Dataset telecom-ir-eval, evaluated with InformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.968 |
| cosine_accuracy@3 | 0.9916 |
| cosine_accuracy@5 | 0.9916 |
| cosine_accuracy@10 | 0.9924 |
| cosine_precision@1 | 0.968 |
| cosine_recall@1 | 0.968 |
| cosine_ndcg@10 | 0.9823 |
| cosine_mrr@10 | 0.9789 |
| cosine_map@100 | 0.9791 |
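The metrics above are standard ranking measures. As a rough sketch of how accuracy@k, MRR@k, and NDCG@k can be computed, the snippet below assumes that each query has exactly one relevant document. That is an assumption, although it would be consistent with precision@1 and recall@1 matching accuracy@1 in the table. The function names and toy rankings are illustrative, and this is not the InformationRetrievalEvaluator implementation.

```python
import math

def accuracy_at_k(ranked_ids, relevant_id, k):
    """1.0 if the relevant document appears in the top k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0

def mrr_at_k(ranked_ids, relevant_id, k):
    """Reciprocal rank of the relevant document within the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_ids, relevant_id, k):
    """NDCG with a single relevant document, so the ideal DCG equals 1."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == relevant_id:
            return 1.0 / math.log2(rank + 1)
    return 0.0

def mean(values):
    return sum(values) / len(values)

# Hypothetical (ranked results, relevant document) pairs for three queries.
queries = [
    (["d1", "d7", "d3"], "d1"),  # relevant document ranked first
    (["d4", "d2", "d9"], "d2"),  # relevant document ranked second
    (["d5", "d6", "d8"], "d0"),  # relevant document not retrieved
]

acc1 = mean([accuracy_at_k(r, rel, 1) for r, rel in queries])
acc3 = mean([accuracy_at_k(r, rel, 3) for r, rel in queries])
mrr10 = mean([mrr_at_k(r, rel, 10) for r, rel in queries])
ndcg10 = mean([ndcg_at_k(r, rel, 10) for r, rel in queries])
print(acc1, acc3, mrr10, ndcg10)
```

Under the single-relevant-document assumption, accuracy@1 is a lower bound on MRR@10 and NDCG@10, which matches the ordering of the reported values (0.968, 0.9789, 0.9823).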
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |

Samples:

| anchor | positive |
|---|---|
| What is multi-user multiple input, multiple output (MU-MIMO) in IEEE 802.11-2020? | MU-MIMO is a technique by which multiple stations (STAs) either simultaneously transmit to a single STA or simultaneously receive from a single STA independent data streams over the same radio frequencies. |
| What is the purpose of wireless network virtualization? | The purpose of wireless network virtualization is to improve resource utilization, support diverse services/use cases, and be cost-effective and flexible for new services. |
| What is the E2E (end-to-end) latency requirement for factory automation applications? | Factory automation applications require an E2E latency of 0.25-10 ms. |

MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
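MultipleNegativesRankingLoss treats, for each anchor, its paired positive as the correct class and the other positives in the batch as negatives, then applies cross-entropy to the scaled cosine similarities. A minimal pure-Python sketch of that idea with `scale = 20.0` follows. It is a simplified illustration on toy 2-dimensional vectors, not the library's implementation, and the name `mnr_loss` is hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        # Every positive in the batch is a candidate; all but index i are negatives.
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]  # each positive matches its own anchor
swapped = [[0.0, 1.0], [1.0, 0.0]]  # each positive matches the other anchor

loss_good = mnr_loss(anchors, aligned)
loss_bad = mnr_loss(anchors, swapped)
print(loss_good, loss_bad)
```

The loss is close to zero when each anchor is most similar to its own positive and grows toward `scale` when an in-batch negative is preferred instead. This is also why a large batch (256 here) and a `no_duplicates` batch sampler are natural companions to this loss: more in-batch negatives, and none that are accidentally identical to the positive.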
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |

Samples:

| anchor | positive |
|---|---|
| Which standard enables building Digital Twins of different Physical Twins using combinations of XML (eXtensible Markup Language) and C codes? | The functional mockup interface (FMI) is a standard that enables building Digital Twins of different Physical Twins using combinations of XML and C codes. |
| What algorithm is commonly used for digital signatures in S/MIME? | RSA is commonly used for digital signatures in S/MIME. |
| What are the three modes of operation based on the communication range and the SA (subarray) separation? | The three modes of operation based on the communication range and the SA separation are: (1) a mode where the channel paths are independent and the channel is always well-conditioned, (2) a mode where the channel is ill-conditioned, and (3) a mode where the channel is highly correlated. |

MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
Non-default hyperparameters:

eval_strategy: steps
per_device_train_batch_size: 256
per_device_eval_batch_size: 256
weight_decay: 0.01
num_train_epochs: 10
lr_scheduler_type: cosine_with_restarts
warmup_ratio: 0.1
fp16: True
load_best_model_at_end: True
batch_sampler: no_duplicates

All hyperparameters:

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 256
per_device_eval_batch_size: 256
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.01
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 10
max_steps: -1
lr_scheduler_type: cosine_with_restarts
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | Validation Loss | telecom-ir-eval_cosine_ndcg@10 |
|---|---|---|---|---|
| 0.7143 | 15 | 0.824 | 0.1333 | 0.9701 |
| 1.3810 | 30 | 0.1731 | 0.0759 | 0.9776 |
| 2.0476 | 45 | 0.0917 | 0.0657 | 0.9807 |
| 2.7619 | 60 | 0.0676 | 0.0609 | 0.9813 |
| 3.4286 | 75 | 0.0435 | 0.0596 | 0.9818 |
| 4.0952 | 90 | 0.038 | 0.0606 | 0.9814 |
| 4.8095 | 105 | 0.0332 | 0.0594 | 0.9820 |
| 5.4762 | 120 | 0.0269 | 0.0607 | 0.9817 |
| 6.1429 | 135 | 0.0219 | 0.0600 | 0.9819 |
| 6.8571 | 150 | 0.0244 | 0.0599 | 0.9823 |
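The log and the hyperparameters allow a small back-of-envelope check. Step 15 falls at epoch 0.7143, which implies about 21 optimizer steps per epoch and so roughly 210 steps for the configured 10 epochs; with `warmup_ratio: 0.1` that would be about 21 warmup steps. The sketch below reproduces that arithmetic and evaluates a linear-warmup plus cosine learning-rate multiplier. The schedule function is a simplified version with a single cycle (an assumption, since `lr_scheduler_kwargs` is empty) and the warmup count is rounded to the nearest step; it is meant to illustrate the shape of the curve, not to replicate the trainer's scheduler exactly.

```python
import math

def lr_multiplier(step, total_steps, warmup_steps, num_cycles=1):
    """Linear warmup, then cosine decay that restarts num_cycles - 1 times."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))

# Values taken from the training log and hyperparameters above.
steps_per_epoch = round(15 / 0.7143)        # step 15 is logged at epoch 0.7143
total_steps = steps_per_epoch * 10          # num_train_epochs: 10
warmup_steps = round(0.1 * total_steps)     # warmup_ratio: 0.1 (rounding is an assumption)
base_lr = 5e-05                             # learning_rate

for step in (0, warmup_steps, total_steps // 2, total_steps - 1):
    print(step, base_lr * lr_multiplier(step, total_steps, warmup_steps))
```

Under these assumptions the learning rate peaks at the end of the first epoch and decays afterwards, which is in line with the training loss dropping sharply between steps 15 and 45 and flattening later.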
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
BAAI/bge-small-en-v1.5