Stepfun-ai Models

Explore the Stepfun-ai language and embedding models available through our OpenAI Assistants API-compatible service.

Stepfun-ai logo

StepFun: Step3

Context Length:
65,536 tokens
Architecture:
text+image->text
Max Output:
65,536 tokens

Pricing:

Prompt: $0.00000057
Completion: $0.00000142

Step3 is a cutting-edge multimodal reasoning model—built on a Mixture-of-Experts architecture with 321B total parameters and 38B active. It is designed end-to-end to minimize decoding costs while delivering top-tier performance in vision–language reasoning. Through the co-design of Multi-Matrix Factorization Attention (MFA) and Attention-FFN Disaggregation (AFD), Step3 maintains exceptional efficiency across both flagship and low-end accelerators.

Ready to build with Stepfun-ai?

Start using these powerful models in your applications with our flexible pricing plans.