A Unified Multimodal Data Quality Classifier for generating quality scores for both image-text caption data and interleaved document data
Weizhi Wang
weizhiwang
models 12
weizhiwang/UniFilter-Qwen3-0.6B
Image-Text-to-Text • 1B
weizhiwang/UniFilter-Qwen2.5-1.5B
Image-Text-to-Text • 2B
weizhiwang/Open-Qwen2VL
Image-Text-to-Text • 46 • 21
weizhiwang/mlm-filter-qwen2.5-1.5b-gpt4o
Text Generation • 2B • 4 • 3
weizhiwang/Open-Qwen2VL-base
Image-Text-to-Text • 4
weizhiwang/unifilter_mllm_pretrain_checkpoints
weizhiwang/unifilter_mllm_sft_checkpoints
weizhiwang/LLaVA-Video-Llama-3.1-8B
8B • 17 • 5
weizhiwang/llava-video-llama-3.1-8b-siglip-so-384-aapool-144-projector
weizhiwang/mlm-filter-llava-13b-gpt4v
Text Generation • 3 • 6
datasets 11
weizhiwang/unifilter_train_data
• 20
weizhiwang/OBELICS_HQ_5M_UniFilter
• 5.06M • 703
weizhiwang/cnsi-chatbot
• 40
weizhiwang/mlm_filter_instructions
• 27 • 5
weizhiwang/agent_eval
• 851 • 255
weizhiwang/Open-Qwen2VL-Data
• 13M • 857 • 23
weizhiwang/Open-Qwen2VL-Data-Interleaved
• 23.3M • 86 • 2
weizhiwang/mmc4_fewer_faces
• 7
weizhiwang/datacomp-hq
• 13
weizhiwang/llava_v15_instruction_images
• 93 • 6