THUDM Models
Explore the THUDM language and embedding models available through our OpenAI Assistants API-compatible service.
THUDM: GLM 4.1V 9B Thinking
- Context Length:
- 65,536 tokens
- Architecture:
- text+image->text
- Max Output:
- 8,000 tokens
GLM-4.1V-9B-Thinking is a 9B parameter vision-language model developed by THUDM, based on the GLM-4-9B foundation. It introduces a reasoning-centric "thinking paradigm" enhanced with reinforcement learning to improve multimodal reasoning, long-context understanding (up to 64K tokens), and complex problem solving. It achieves state-of-the-art performance among models in its class, outperforming even larger models like Qwen-2.5-VL-72B on a majority of benchmark tasks.
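Since the model accepts a text+image→text architecture through an OpenAI-compatible API, a request combines text and image parts in one user message. The sketch below builds such a payload; the model slug `thudm/glm-4.1v-9b-thinking`, the image URL, and the endpoint path are illustrative assumptions — check your provider's listing for the exact identifiers.

```python
import json

# Hedged sketch of a multimodal chat-completion request body for
# GLM-4.1V-9B-Thinking on an OpenAI-compatible endpoint.
# The model slug and image URL below are assumptions, not confirmed values.
payload = {
    "model": "thudm/glm-4.1v-9b-thinking",  # assumed slug
    "max_tokens": 8000,  # the model's documented output cap
    "messages": [
        {
            "role": "user",
            # Content is a list of parts: text plus an image reference.
            "content": [
                {"type": "text",
                 "text": "Describe this chart and reason step by step."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},  # placeholder
            ],
        }
    ],
}

# Serialize as you would for a POST to /v1/chat/completions.
body = json.dumps(payload)
```

The same structure works with the official `openai` Python client by passing `messages` and `max_tokens` to `client.chat.completions.create(...)` with your provider's base URL.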
THUDM: GLM Z1 32B
- Context Length:
- 32,768 tokens
- Architecture:
- text->text
- Max Output:
- 32,768 tokens
GLM-Z1-32B-0414 is an enhanced reasoning variant of GLM-4-32B, built for deep mathematical, logical, and code-oriented problem solving. It applies extended reinforcement learning—both task-specific and general pairwise preference-based—to improve performance on complex multi-step tasks. Compared to the base GLM-4-32B model, Z1 significantly boosts capabilities in structured reasoning and formal domains.
The model supports enforced “thinking” steps via prompt engineering and offers improved coherence for long-form outputs. It’s optimized for use in agentic workflows, and includes support for long context (via YaRN), JSON tool calling, and fine-grained sampling configuration for stable inference. Ideal for use cases requiring deliberate, multi-step reasoning or formal derivations.
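Since the model advertises JSON tool calling, a request can attach function schemas in the standard OpenAI-compatible `tools` format. The sketch below builds such a payload; the model slug `thudm/glm-z1-32b` and the `solve_equation` tool are hypothetical placeholders, not part of the source.

```python
import json

# Hedged sketch of a JSON tool-calling request for GLM-Z1-32B on an
# OpenAI-compatible chat-completions API. The tool schema is illustrative.
tools = [
    {
        "type": "function",
        "function": {
            "name": "solve_equation",  # hypothetical tool name
            "description": "Solve a math equation and return the result.",
            "parameters": {
                "type": "object",
                "properties": {"equation": {"type": "string"}},
                "required": ["equation"],
            },
        },
    }
]

payload = {
    "model": "thudm/glm-z1-32b",  # assumed slug
    "messages": [
        {"role": "user",
         "content": "Solve 3x + 7 = 22. Think step by step."}
    ],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

body = json.dumps(payload)
```

If the model elects to call the tool, the response's `tool_calls` entries carry JSON-encoded arguments matching the `parameters` schema above, which you execute and return in a follow-up `tool` role message.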
Ready to build with THUDM?
Start using these powerful models in your applications with our flexible pricing plans.