---
language:
- en
- zh
license: apache-2.0
pipeline_tag: audio-classification
tags:
- Language Identification
- LID
- Audio Classification
- VoxLingua107
- audio
- automatic-speech-recognition
- asr
---

<div align="center">
<h1>
    FireRedASR2S - FireRedLID
    <br>
    A SOTA Industrial-Grade Spoken Language Identification System
</h1>
</div>

[[Paper]](https://huggingface.co/papers/2603.10420)
[[Code]](https://github.com/FireRedTeam/FireRedASR2S)
[[Blog]](https://fireredteam.github.io/demos/firered_asr/)
[[Demo]](https://huggingface.co/spaces/FireRedTeam/FireRedASR)

FireRedLID is the Spoken Language Identification (LID) module of **FireRedASR2S**, a state-of-the-art (SOTA), industrial-grade, all-in-one ASR system. FireRedLID covers 100+ languages and 20+ Chinese dialects/accents, and achieves 97.18% utterance-level accuracy on the FLEURS benchmark, outperforming Whisper and SpeechBrain-LID.

This model was introduced in the paper [FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System](https://huggingface.co/papers/2603.10420).

## 🔥 News
- [2026.02.12] We release FireRedASR2S (FireRedASR2-AED, FireRedVAD, FireRedLID, and FireRedPunc) with model weights and inference code.

## Evaluation
### FireRedLID
Metric: Utterance-level LID Accuracy (%). Higher is better.

|Testset \ Model|Languages|FireRedLID|Whisper|SpeechBrain|Dolphin|
|:-----------------:|:---------:|:---------:|:-----:|:---------:|:-----:|
|FLEURS test|82 languages|**97.18**|79.41|92.91|-|
|CommonVoice test|74 languages|**92.07**|80.81|78.75|-|
|KeSpeech + MagicData|20+ Chinese dialects/accents|**88.47**|-|-|69.01|

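Utterance-level LID accuracy is simply the fraction of test utterances whose predicted language label matches the reference label. A minimal sketch of the metric, using made-up labels rather than actual benchmark data:

```python
def lid_accuracy(refs, hyps):
    """Utterance-level LID accuracy (%): share of utterances where the
    predicted language label exactly matches the reference label."""
    assert len(refs) == len(hyps), "refs and hyps must be paired per utterance"
    correct = sum(r == h for r, h in zip(refs, hyps))
    return 100.0 * correct / len(refs)

# Toy example (hypothetical labels, not benchmark data): 3 of 4 correct.
refs = ["zh", "en", "de", "fr"]
hyps = ["zh", "en", "de", "es"]
print(f"{lid_accuracy(refs, hyps):.2f}")  # 75.00
```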
## Sample Usage

To use this module independently, first clone the [GitHub repository](https://github.com/FireRedTeam/FireRedASR2S) and install the dependencies.

### Python API Usage
```python
from fireredasr2s.fireredlid import FireRedLid, FireRedLidConfig

batch_uttid = ["hello_zh", "hello_en"]
batch_wav_path = ["assets/hello_zh.wav", "assets/hello_en.wav"]

config = FireRedLidConfig(use_gpu=True, use_half=False)
model = FireRedLid.from_pretrained("FireRedTeam/FireRedLID", config)

results = model.process(batch_uttid, batch_wav_path)
print(results)
# [{'uttid': 'hello_zh', 'lang': 'zh mandarin', 'confidence': 0.996, 'dur_s': 2.32, 'rtf': '0.0741', 'wav': 'assets/hello_zh.wav'}, {'uttid': 'hello_en', 'lang': 'en', 'confidence': 0.996, 'dur_s': 2.24, 'rtf': '0.0741', 'wav': 'assets/hello_en.wav'}]
```
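In downstream pipelines it is common to keep only predictions above a confidence threshold. A minimal sketch of such post-processing, using a `results` list shaped like the example output above (the `confident_labels` helper and its threshold are illustrative, not part of the FireRedASR2S API):

```python
# Example `results`, copied from the sample output above.
results = [
    {"uttid": "hello_zh", "lang": "zh mandarin", "confidence": 0.996,
     "dur_s": 2.32, "rtf": "0.0741", "wav": "assets/hello_zh.wav"},
    {"uttid": "hello_en", "lang": "en", "confidence": 0.996,
     "dur_s": 2.24, "rtf": "0.0741", "wav": "assets/hello_en.wav"},
]

def confident_labels(results, threshold=0.9):
    """Map uttid -> predicted language, keeping only predictions whose
    confidence is at least `threshold`."""
    return {r["uttid"]: r["lang"] for r in results if r["confidence"] >= threshold}

print(confident_labels(results))  # {'hello_zh': 'zh mandarin', 'hello_en': 'en'}
```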

## Citation
```bibtex
@article{xu2026fireredasr2s,
  title={FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System},
  author={Xu, Kaituo and Jia, Yan and Huang, Kai and Chen, Junjie and Li, Wenpeng and Liu, Kun and Xie, Feng-Long and Tang, Xu and Hu, Yao},
  journal={arXiv preprint arXiv:2603.10420},
  year={2026}
}
```