How to use IbrahimAmin/hubert-arabic-spoken-dialect-classifier with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("audio-classification", model="IbrahimAmin/hubert-arabic-spoken-dialect-classifier")
# Load model directly
from transformers import AutoProcessor, AutoModelForAudioClassification
processor = AutoProcessor.from_pretrained("IbrahimAmin/hubert-arabic-spoken-dialect-classifier")
model = AutoModelForAudioClassification.from_pretrained("IbrahimAmin/hubert-arabic-spoken-dialect-classifier")
This model is a fine-tuned version of facebook/hubert-base-ls960 for Arabic spoken dialect classification. It identifies Modern Standard Arabic (MSA) and 17 regional Arabic dialects from raw audio.
This model is intended for use in tasks such as dialect identification, linguistic research, and dialect-aware speech processing systems.
Base model: facebook/hubert-base-ls960
Labels (id2label)
The model predicts one of the following 18 classes:
{
"0": "MSA", // Modern Standard Arabic
"1": "IRA", // Iraqi Arabic
"2": "EGY", // Egyptian Arabic
"3": "MAU", // Mauritanian Arabic
"4": "KSA", // Saudi Arabic
"5": "UAE", // Emirati Arabic
"6": "SYR", // Syrian Arabic
"7": "PAL", // Palestinian Arabic
"8": "LEB", // Lebanese Arabic
"9": "LIB", // Libyan Arabic
"10": "KUW", // Kuwaiti Arabic
"11": "ALG", // Algerian Arabic
"12": "OMA", // Omani Arabic
"13": "QAT", // Qatari Arabic
"14": "YEM", // Yemeni Arabic
"15": "SUD", // Sudanese Arabic
"16": "MOR", // Moroccan Arabic
"17": "JOR", // Jordanian Arabic
}
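For downstream code it can be convenient to hold this mapping as a Python dictionary together with its inverse. The following is a minimal sketch that copies the listing above by hand; the names ID2LABEL, LABEL2ID, and label_for are illustrative and not part of the model's API. In practice, model.config.id2label should be treated as the authoritative source.

```python
# Label table copied by hand from the id2label listing above.
# ID2LABEL, LABEL2ID, and label_for are illustrative names, not library API.
ID2LABEL = {
    0: "MSA", 1: "IRA", 2: "EGY", 3: "MAU", 4: "KSA", 5: "UAE",
    6: "SYR", 7: "PAL", 8: "LEB", 9: "LIB", 10: "KUW", 11: "ALG",
    12: "OMA", 13: "QAT", 14: "YEM", 15: "SUD", 16: "MOR", 17: "JOR",
}

# Inverse mapping, useful when preparing labelled data for evaluation.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the dialect code for a class index, or raise on an unknown index."""
    try:
        return ID2LABEL[class_id]
    except KeyError:
        raise ValueError(f"unknown class id: {class_id}") from None
```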
from transformers import Wav2Vec2FeatureExtractor, HubertForSequenceClassification
import torch
import torchaudio
# Load feature extractor and model
processor = Wav2Vec2FeatureExtractor.from_pretrained("IbrahimAmin/hubert-arabic-spoken-dialect-classifier")
model = HubertForSequenceClassification.from_pretrained("IbrahimAmin/hubert-arabic-spoken-dialect-classifier")
# Load audio; the model expects mono audio sampled at 16 kHz
waveform, sample_rate = torchaudio.load("your_audio.wav")
# Convert to mono if not already
if waveform.shape[0] > 1:
    waveform = torch.mean(waveform, dim=0, keepdim=True)
# Resample if needed to 16 kHz
if sample_rate != 16000:
    resampler = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=16000)
    waveform = resampler(waveform)
# Extract model inputs from the 1-D waveform
inputs = processor(waveform.squeeze(), sampling_rate=16_000, return_tensors="pt")
# Run inference
with torch.inference_mode():
logits = model(**inputs).logits
# Get predicted label
predicted_label = torch.argmax(logits, dim=-1).item()
print(f"Predicted Dialect: {model.config.id2label[predicted_label]}")
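The argmax above returns only the single best class. When a ranked list with confidence scores is more useful, the logits can be passed through a softmax first. The sketch below does this in plain Python so that it runs without the model; top_k_dialects is a hypothetical helper, and the three-entry mapping in the toy example is a stand-in for model.config.id2label, with made-up logits.

```python
import math


def top_k_dialects(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for one clip.

    logits: sequence of floats, one per class.
    id2label: mapping from integer class index to label string.
    """
    # Subtract the maximum logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy example: made-up logits for three classes, not real model output.
example = top_k_dialects([2.0, 0.5, -1.0], {0: "MSA", 1: "IRA", 2: "EGY"}, k=2)
```

With the model loaded as in the snippet above, the helper could be called as top_k_dialects(logits[0].tolist(), model.config.id2label), since the logits tensor has one row per input clip.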
This model was trained using:
If you use this model in your research or application, please cite:
@misc{amin2025hubertarabicdialect,
title={HuBERT Arabic Spoken Dialect Classifier},
author={Ibrahim Amin},
year={2025},
publisher = {Hugging Face},
howpublished={\url{https://huggingface.co/IbrahimAmin/hubert-arabic-spoken-dialect-classifier}},
}