
Qwen3-VL-8B-Instruct-abliterated-v2.0

Qwen3-VL-8B-Instruct-abliterated-v2.0 is an abliterated (v2.0) variant of Qwen3-VL-8B-Instruct, designed for abliterated reasoning and captioning. It is fine-tuned to produce detailed, reasoning-focused descriptions across a wide range of visual and multimodal contexts, including complex, sensitive, or nuanced content. It supports varied image resolutions and aspect ratios while keeping its descriptions coherent and accurate.

Key Highlights

  • Abliterated / Uncensored Captioning: Fine-tuned to bypass conventional content filters while preserving factual, descriptive, and reasoning-rich outputs.

  • High-Fidelity Reasoning and Descriptions: Generates in-depth captions and reasoning for general, artistic, technical, abstract, and low-context images.

  • Robust Across Aspect Ratios: Performs consistently on wide, tall, square, panoramic, and irregular image dimensions.

  • Variational Detail Control: Capable of generating outputs ranging from concise summaries to intricate, multi-level descriptive reasoning.

  • Foundation on Qwen3-VL-8B-Instruct Architecture: Built upon Qwen3-VL-8B-Instruct’s multimodal reasoning, comprehension, and instruction-following framework.

  • Multilingual Output Capability: Primarily outputs in English, but adaptable to multiple languages via prompt engineering.
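
The detail-control and multilingual points above come down to how the text prompt is phrased. A minimal sketch of that idea follows. The helper name `build_caption_messages`, the three detail levels, and the instruction wording are all hypothetical and not part of the model card; only the `messages` layout mirrors the Quick Start.

```python
def build_caption_messages(image, detail="detailed", language="English"):
    """Build a chat `messages` list for one image (hypothetical helper)."""
    # Illustrative instruction templates; the wording is an assumption,
    # not something prescribed by the model card.
    templates = {
        "brief": "Describe this image in one sentence.",
        "detailed": "Provide a detailed caption and reasoning for this image.",
        "exhaustive": (
            "Describe this image in exhaustive detail, covering subjects, "
            "setting, composition, and your reasoning."
        ),
    }
    if detail not in templates:
        raise ValueError(f"unknown detail level: {detail!r}")

    prompt = templates[detail]
    # Steer the output language through the prompt itself.
    if language != "English":
        prompt += f" Respond in {language}."

    # Same structure as the Quick Start: one user turn, image first, then text.
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": prompt},
            ],
        }
    ]


messages = build_caption_messages(
    "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
    detail="brief",
    language="French",
)
print(messages[0]["content"][1]["text"])
```

The returned list can be passed to `processor.apply_chat_template` in place of the hand-written `messages` in the Quick Start. How closely the model follows a length or language instruction depends on the prompt and image.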


Base Model Signatures:

This model has been re-sharded from the base model, https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-8B-Instruct-abliterated, and optimized for the latest Transformers version.


Quick Start with Transformers

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

model = Qwen3VLForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3-VL-8B-Instruct-abliterated-v2",
    torch_dtype="auto",
    device_map="auto"
)

processor = AutoProcessor.from_pretrained("prithivMLmods/Qwen3-VL-8B-Instruct-abliterated-v2")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Provide a detailed caption and reasoning for this image."},
        ],
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)

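# Move the inputs to the model's device; with device_map="auto" this avoids hard-coding "cuda".
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)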

generated_ids = model.generate(**inputs, max_new_tokens=128)

generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]

output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)

print(output_text)
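
The `generated_ids_trimmed` step above is there because `generate` returns each prompt followed by the newly generated tokens, so the prompt prefix has to be sliced off before decoding. A small sketch of that slicing, using plain Python lists with made-up token ids in place of tensors (the function name `trim_prompt_tokens` is hypothetical):

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the prompt prefix from each generated sequence."""
    # Each output row starts with its own prompt, so slice from len(prompt) onward.
    return [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]


# Toy ids for illustration only; real ids come from the processor and model.
prompt_ids = [[101, 7, 8], [101, 9, 10]]
full_ids = [[101, 7, 8, 42, 43], [101, 9, 10, 44]]

print(trim_prompt_tokens(prompt_ids, full_ids))  # [[42, 43], [44]]
```

Without this step, the decoded text would repeat the chat template and user prompt ahead of the caption.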

Intended Use

This model is suited for:

  • Generating detailed, unfiltered captions and reasoning for general-purpose and artistic datasets.
  • Research in content moderation, red-teaming, and generative safety analysis.
  • Enabling descriptive captioning and reasoning for datasets typically excluded from mainstream models.
  • Creative and exploratory applications such as storytelling, visual interpretation, and multimodal reasoning.
  • Captioning and reasoning for non-standard, stylized, or abstract visual content.

Limitations

  • May generate explicit, sensitive, or offensive content depending on the prompt and input image.
  • Not suitable for production environments that require strict content filtering or moderation.
  • Output tone, style, and reasoning depth can vary depending on phrasing and visual complexity.
  • May show variability in performance on synthetic or highly abstract visuals.

Model size: 9B params · Tensor type: BF16 · Format: Safetensors