HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive Review 2024
Lossless uncensored MoE model with extended context and multimodal support
Starting at: $0.00 (free download)
Billing: One-time download
Refund: N/A (open-weight model)
Our Take
A highly capable, unrestricted variant of the Qwen3.6-35B-A3B architecture, optimized for local deployment and specialized workflows requiring unfiltered outputs.
Is It Worth It?
Yes, for developers and researchers who require an open-weight, uncensored MoE model with extensive quantization options and strong reasoning capabilities.
Best Suited For
Local AI deployment, uncensored content generation, agentic coding workflows, and long-context reasoning tasks.
What We Loved
- ✓ Completely removes safety refusal filters
- ✓ Wide range of GGUF quantizations for flexible hardware deployment
- ✓ Strong coding and reasoning capabilities for its size
- ✓ Native multimodal and long-context support
- ✓ Free to download and self-host
What Bothered Us
- ✗ Requires substantial VRAM for higher-precision formats
- ✗ Lacks built-in content moderation, so external safeguards are needed (see the sketch after this list)
- ✗ No official vendor support or SLA
- ✗ Aggressive variant may produce unverified or harmful outputs without careful prompting
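Because the release ships with no moderation layer, any deployment needs its own output check. The sketch below is purely illustrative: BLOCKLIST and generate_reply are hypothetical stand-ins, not part of this release, and a production setup would use a real moderation classifier or API instead.

```python
# Minimal illustrative post-generation safeguard. BLOCKLIST and generate_reply
# are hypothetical stand-ins, not part of this release; swap in a real
# moderation classifier or API for production use.
BLOCKLIST = {"example banned phrase"}  # placeholder terms, not a real policy

def generate_reply(prompt: str) -> str:
    # Stand-in for a call to the locally hosted model (llama.cpp, vLLM, etc.).
    return f"(model output for: {prompt!r})"

def safe_generate(prompt: str) -> str:
    reply = generate_reply(prompt)
    if any(term in reply.lower() for term in BLOCKLIST):
        return "[response withheld by external safeguard]"
    return reply

print(safe_generate("Hello"))
```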
How It Performed
Output Quality
Maintains the strong reasoning, coding, and multimodal understanding of the base Qwen3.6 architecture across various quantization levels.
Intelligence
Competitive for its size, leveraging a 35B-parameter MoE design with ~3B active parameters, a 262k context window, and native tool-calling.
Speed Test
Inference speed depends heavily on hardware and quantization; MoE architecture typically offers faster token generation than dense models of similar parameter counts.
This model serves as a direct, lossless modification of the Qwen3.6-35B-A3B base, focusing on removing alignment filters while preserving original capabilities. It supports text, image, and video inputs, and features a 'thinking mode' that maintains chain-of-thought reasoning across extended sessions. The release includes a comprehensive set of GGUF quantizations, ranging from Q8-KP (~44 GB) down to IQ2-M (~11 GB), enabling deployment across a wide spectrum of hardware configurations. While the removal of safety filters provides maximum flexibility for developers, it also necessitates careful prompt engineering and output validation. The model integrates well with popular local inference frameworks and maintains competitive performance in coding and reasoning benchmarks relative to its parameter count.
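For local GGUF deployment, a minimal sketch using llama-cpp-python is shown below. The model filename is an assumption based on the quantization names above, and the context size is deliberately set well below the 262k maximum to keep memory use modest.

```python
# Minimal local-inference sketch using llama-cpp-python
# (pip install llama-cpp-python). The GGUF filename is illustrative;
# use whichever quantization fits your hardware.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-IQ2-M.gguf",
    n_ctx=32768,      # a fraction of the 262k maximum; raise if memory allows
    n_gpu_layers=-1,  # offload all layers to the GPU
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the MoE architecture in one line."}],
    temperature=0.6,
)
print(out["choices"][0]["message"]["content"])
```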
Ideal for developers building local AI agents, researchers studying unfiltered model behavior, and users requiring long-context multimodal processing without vendor-imposed restrictions. It is less suitable for enterprise environments requiring strict content moderation or compliance guarantees.
Competes with other uncensored community releases like Llama 3.2-Uncensored and Mixtral-8x7B-Uncensored, as well as the standard safety-aligned Qwen3.6-35B-A3B. Its MoE architecture provides a favorable balance of performance and resource efficiency compared to dense 35B models.
Frequently Asked Questions
What does "Uncensored" mean in the model name?
It indicates that all built-in safety refusal mechanisms have been removed, allowing the model to respond to prompts that would typically be blocked by aligned versions.
How much VRAM does it require?
Requirements vary by quantization. The Q8-KP version requires approximately 44 GB of VRAM, while the IQ2-M version can run on around 11 GB, making it accessible to consumer-grade GPUs.
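As a rough sizing aid, the two documented sizes above can drive a simple picker. This sketch uses only those two endpoints and a hypothetical 2 GB headroom allowance for KV cache and runtime overhead.

```python
# Rough VRAM-based quantization picker using only the two sizes quoted above.
QUANT_SIZES_GB = {"Q8-KP": 44, "IQ2-M": 11}  # documented endpoints only

def pick_quant(vram_gb: float, headroom_gb: float = 2.0) -> str | None:
    """Return the largest documented quant that fits, leaving headroom for KV cache."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s + headroom_gb <= vram_gb}
    return max(fitting, key=fitting.get) if fitting else None

print(pick_quant(24))  # -> 'IQ2-M' on a 24 GB consumer GPU
```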
Does it support image and video inputs?
Yes, it retains the multimodal capabilities of the base Qwen3.6 architecture, allowing it to process text, images, and video natively.
How much does it cost to run?
The model itself is open-weight and intended for self-hosting. However, cloud providers offering the base Qwen3.6 architecture typically charge around $0.78 per 1M input tokens and $3.90 per 1M output tokens.
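For comparison against self-hosting, the quoted hosted-API rates work out as follows (illustrative arithmetic only):

```python
# Cost estimate at the hosted-API rates quoted above ($ per 1M tokens).
INPUT_RATE, OUTPUT_RATE = 0.78, 3.90

def api_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * INPUT_RATE + output_tokens / 1e6 * OUTPUT_RATE

# e.g. a 50k-token prompt with a 2k-token reply:
print(f"${api_cost(50_000, 2_000):.4f}")  # -> $0.0468
```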
Is official support available?
No, this is a community-released modification. Support is handled through community channels like Hugging Face discussions and Discord.
Which inference frameworks does it work with?
It is optimized for and compatible with Transformers, vLLM, SGLang, KTransformers, LM Studio, and Ollama.
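A minimal text-only loading sketch via Transformers follows, assuming the standard AutoModelForCausalLM path works for this release; image and video inputs would require the release's own processor classes, which are not shown here.

```python
# Text-only loading sketch via Hugging Face Transformers. Assumes the standard
# causal-LM path applies to this release; multimodal use needs its processors.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Explain MoE routing in two sentences."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tok.decode(output[0], skip_special_tokens=True))
```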
What sampling settings are recommended?
For thinking mode, use temperature=1.0, top_p=0.95, top_k=20. For coding or precise tasks, lower the temperature to 0.6 and set presence_penalty to 0.
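Expressed as vLLM SamplingParams objects, the two recommended presets look like this; carrying top_p and top_k over to the coding preset is an assumption, since only temperature and presence_penalty are specified for it above.

```python
from vllm import SamplingParams

# Thinking mode: higher temperature keeps chain-of-thought exploration diverse.
thinking = SamplingParams(temperature=1.0, top_p=0.95, top_k=20)

# Coding / precise tasks: lower temperature, no presence penalty.
# top_p/top_k carried over from the thinking preset as an assumption.
coding = SamplingParams(temperature=0.6, top_p=0.95, top_k=20, presence_penalty=0.0)
```

Either object is then passed to `llm.generate(prompts, sampling_params)` when serving the model with vLLM.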