OpenAI is shutting down the Assistants API on August 26, 2026. Learn how to migrate to Responses API + Conversations API or use wire-compatible alternatives.

OpenAI Assistants API Shutdown (August 26, 2026): Migration Guide & Wire-Compatible Alternatives

Last updated: January 28, 2026

OpenAI has deprecated the Assistants API and plans to shut it down on August 26, 2026. If you have production workloads built on /v1/assistants and /v1/threads, you need to migrate to OpenAI's Responses API + Conversations API (recommended) or move to a third-party, wire-compatible implementation before that date.


Key Timeline (Updated)

  • December 18, 2024: Assistants API v1 beta access discontinued (v2 only after this)
  • August 26, 2025: OpenAI notified developers of Assistants API deprecation
  • Today (January 28, 2026): Assistants API is still working, but deprecated
  • August 26, 2026: Assistants API shutdown / removal (requests will no longer work)

Table of Contents

  1. What the Deprecation Means
  2. What Changed Since 2025: The New Mental Model
  3. Impact on Existing Applications
  4. Migration Option 1: OpenAI Responses API + Conversations API
  5. Migration Option 2: Wire-Compatible Alternatives
  6. Data Migration Realities
  7. Migration Decision Framework
  8. Timeline and Action Plan (Jan 2026 → Aug 2026)
  9. Frequently Asked Questions

What the Deprecation Means

OpenAI's Assistants API is deprecated and is scheduled for removal on August 26, 2026. The official recommended replacements are the Responses API and Conversations API.

What This Means in Practice:

  • Your existing Assistants API integrations still work today, but you should treat them as end-of-life
  • New agent features and a simpler architecture are centered on Responses + Conversations, and the official migration path is documented
  • If you do nothing, your Assistants endpoints will stop working after the shutdown date

What Changed Since 2025: The New Mental Model

OpenAI's migration guide frames the platform shift as a conceptual replacement of core objects:

Before (Assistants API) → Now (Responses platform):

  • Assistants → Prompts: prompts hold configuration (model, tools, instructions) and are easier to version and update
  • Threads → Conversations: conversations store a stream of "items" (not just messages)
  • Runs → Responses: a response consumes input items (or a conversation) and yields output items; tool loops are explicit
  • Run steps → Items: tool calls, tool outputs, messages, etc. are unified as "items"

Prompts Replace Assistants

  • Assistants were persistent objects created/managed via API
  • Prompts replace them and are created in the dashboard (and can be versioned)

Conversations Replace Threads

  • If you want "thread-like" server-side state, use the Conversations API, then pass a conversation ID to the Responses API
  • You can also chain via previous_response_id, but it cannot be used at the same time as conversation

Impact on Existing Applications

Migration difficulty is mostly driven by how much you rely on:

  • Server-side thread state (Threads / Messages / Runs)
  • Tool orchestration (especially multi-step function calls)
  • Retrieval patterns (vector stores, metadata filtering, citations)

Apps Most Impacted:

  • Complex multi-turn assistants built around Threads/Runs polling
  • File-heavy RAG systems that rely on persistent vector stores and citations
  • Tool-heavy workflows that depend on run-steps and tool call lifecycle

Apps Less Impacted:

  • Stateless chat or single-turn generation
  • Systems already doing client-managed state

Migration Option 1: OpenAI Responses API + Conversations API

This is OpenAI's recommended replacement path for Assistants API.

Why This Path is the "Default" Recommendation

  • Officially supported replacement stack: Responses + Conversations
  • Better performance + newer agent capabilities (deep research, MCP, computer use) are emphasized in the migration guide
  • Built-in tools and remote MCP servers are integrated into Responses workflows

Step-by-Step: "Assistants → Prompts → Responses + Conversations"

1) Inventory What You Have

Capture:

  • Assistants (instructions, model, tools)
  • Threads and message volume (how much history you need to preserve)
  • Vector stores and file sources (what content must remain searchable)

2) Convert Important Assistants into Prompts (Dashboard)

OpenAI's migration guide explicitly recommends creating prompts from existing assistants in the dashboard.

3) Replace Threads with Conversations (Server-Side State)

Create a conversation:

import OpenAI from "openai";
const client = new OpenAI();

const conversation = await client.conversations.create({
  metadata: { topic: "support-bot" },
  items: [
    { type: "message", role: "user", content: "Hello!" }
  ],
});

console.log(conversation.id);

Then generate via Responses, attached to that conversation:

const response = await client.responses.create({
  model: "gpt-4.1",
  conversation: conversation.id,
  input: [{ role: "user", content: "How can you help me today?" }],
});

console.log(response.output_text);

Conversations persist "items" (messages, tool calls, outputs) and can be reused across sessions.

4) Decide How You'll Manage State (3 Valid Patterns)

A) Conversations API (closest to Threads)

  • Best for: long-running, durable conversations and server-side state

B) previous_response_id chaining (quick but less structured)

  • Best for: simple "continue from last response" flows

C) Client-managed history (stateless server)

  • Best for: total control; no server-side storage
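Pattern C can be sketched with a small helper like the one below. The `ClientHistory` class is hypothetical (it is not part of the OpenAI SDK); it just shows the shape of app-owned state that you replay as `input` on every Responses call:

```typescript
// Hypothetical helper for pattern C: the app owns the transcript and
// replays it as `input` on every Responses call (no server-side state).
type HistoryItem = { role: "user" | "assistant" | "system"; content: string };

class ClientHistory {
  private items: HistoryItem[] = [];

  constructor(private maxItems = 50) {}

  add(item: HistoryItem): void {
    this.items.push(item);
    // Naive truncation so the payload stays bounded; a real app would
    // trim by token count rather than item count.
    if (this.items.length > this.maxItems) {
      this.items = this.items.slice(-this.maxItems);
    }
  }

  // The array you would pass as `input` to client.responses.create(...)
  toInput(): HistoryItem[] {
    return [...this.items];
  }
}
```

Each turn, you would add the user message, call `client.responses.create({ model, input: history.toInput() })` without `conversation` or `previous_response_id`, then add the assistant's `output_text` back into the history.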

Note on retention: Response objects are stored with a default 30-day TTL unless you set store: false; conversation items are not subject to that 30-day TTL.


Tooling in Responses: What to Change

The Responses platform supports built-in tools plus function calling and remote MCP servers.

Web Search

const response = await client.responses.create({
  model: "gpt-5",
  tools: [{ type: "web_search" }],
  input: "Summarize a positive news story from today.",
});

console.log(response.output_text);

File Search (RAG) with Vector Stores

File search in Responses is driven by vector stores; you pass vector_store_ids in the tools config.

// Assuming you already created a vector store and uploaded files to it:
const response = await client.responses.create({
  model: "gpt-4.1",
  input: "What does our policy say about refunds?",
  tools: [
    {
      type: "file_search",
      vector_store_ids: ["vs_123"],
      // optional:
      // max_num_results: 5,
      // filters: { ... },
    },
  ],
  // optional: include raw search results
  // include: ["file_search_call.results"],
});

console.log(response.output_text);

Function Calling (Your Tools)

Responses still uses the tools array to describe callable functions, and you run your own code when the model requests it.
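The core of that loop can be sketched as below. The item shapes are simplified for illustration; the field names (`type`, `call_id`, `arguments`, `function_call_output`) follow the Responses API's function-call items, but treat the exact types here as assumptions:

```typescript
// Sketch of one round of the Responses function-calling loop: execute
// every function call the model requested and build the
// `function_call_output` items you send back as the next input.
type ModelItem =
  | { type: "function_call"; call_id: string; name: string; arguments: string }
  | { type: "message"; content: string };

type ToolFn = (args: any) => string;

function runToolRound(
  items: ModelItem[],
  tools: Record<string, ToolFn>
): { type: "function_call_output"; call_id: string; output: string }[] {
  const outputs: { type: "function_call_output"; call_id: string; output: string }[] = [];
  for (const item of items) {
    if (item.type !== "function_call") continue;
    const fn = tools[item.name];
    if (!fn) throw new Error(`Unknown tool: ${item.name}`);
    outputs.push({
      type: "function_call_output",
      call_id: item.call_id,
      // Arguments arrive as a JSON string; parse before dispatching.
      output: fn(JSON.parse(item.arguments)),
    });
  }
  return outputs;
}
```

In a real loop you pass these output items back to `client.responses.create` (chained via `previous_response_id` or a `conversation`) and repeat until the model returns a plain message instead of more function calls.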


Pricing Changes to Account For (Tool Costs)

OpenAI's tool costs (as of the current pricing docs):

  • Code Interpreter: priced per container/session (e.g., $0.03 for 1 GB default container)
  • File search: storage is $0.10/GB/day after the first free GB; tool calls are $2.50 / 1k calls (Responses only)
  • Web search: tool-call pricing varies by tool version and model class (commonly $10 / 1k calls, with search content tokens billed at the model's rates in many cases)
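As a back-of-envelope check, the file search figures above can be turned into a monthly estimate. The rates below are hardcoded from the numbers quoted in this article, not fetched from anywhere; always confirm against current pricing before budgeting:

```typescript
// Rough monthly file search cost from the figures quoted above:
// storage $0.10/GB/day after the first free GB, tool calls $2.50/1k.
// Rates are hardcoded illustratively -- confirm current pricing.
function estimateFileSearchMonthlyCost(
  storageGb: number,
  toolCallsPerMonth: number,
  days = 30
): number {
  const billableGb = Math.max(0, storageGb - 1); // first GB is free
  const storageCost = billableGb * 0.10 * days;
  const callCost = (toolCallsPerMonth / 1000) * 2.50;
  return Number((storageCost + callCost).toFixed(2));
}
```

For example, 5 GB of vector store storage plus 100k tool calls per month works out to (4 × $0.10 × 30) + (100 × $2.50) = $262.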

Migration Option 2: Wire-Compatible Alternatives

If you want to avoid rewriting your application around Prompts/Conversations/Responses before August 26, 2026, a wire-compatible platform can act as an interim or long-term substitute: you keep your Assistants API client code and swap the base URL + API key.

What "Wire-Compatible" Usually Means

Your existing calls like openai.beta.assistants.create(), openai.beta.threads.create(), openai.beta.threads.runs.create(), etc. can continue working with minimal change (often just baseURL).

When Wire Compatibility is a Good Fit

  • You have a large Assistants integration and can't justify a full refactor immediately
  • You need more time to redesign architecture or validate new agent flows
  • You want to preserve the Assistants "Threads/Runs" mental model during transition

Tradeoffs to Be Explicit About

  • You may not get the newest OpenAI-native agent features that are being shipped around Responses + Conversations (deep research, MCP, computer use, etc.)
  • You should validate tool behavior parity: streaming semantics, run-step lifecycles, file search behavior, and error codes can differ between implementations

Example: Ragwalla (Wire-Compatible Assistants API)

Ragwalla's docs describe an Assistants API-compatible endpoint and show configuring the standard OpenAI SDK with a custom baseURL.

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.RAGWALLA_API_KEY,
  baseURL: "https://example.ai.ragwalla.com/v1",
});

// Existing Assistants API code paths remain the same:
const thread = await openai.beta.threads.create();

Other Wire-Compatible Approaches

There are also open-source drop-in implementations described as Assistants API compatible (for example, DataStax's "astra-assistants-api" project).


Data Migration Realities

If You Migrate Within OpenAI (Assistants → Responses/Conversations)

  • Vector stores are still first-class objects and power file search; you can continue to use them through the Retrieval/File Search stack
  • You should plan to migrate "thread-like" state from Threads into Conversations if you rely on server-side persistence

If You Migrate Away from OpenAI (to a Wire-Compatible Alternative or Other Platform)

  • Plan to re-index/re-embed from your original files. Vector stores are described as containers of chunked/embedded content backing retrieval; you should treat the embedded index as platform-specific
  • Make sure you have a clean source-of-truth copy of every document you uploaded

Practical Approach to Preserving "Thread History"

If you need to preserve conversational history after Assistants shutdown:

  1. Export existing thread messages now (while Assistants still works)
  2. Create a Conversation and seed it with message items (Conversations API supports creating a conversation with initial items)
  3. Continue via Responses using conversation: <id>
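Step 2 amounts to a small mapping from exported thread messages to conversation items. The mapper below is hypothetical: the input shape is a simplified version of the Assistants message format (content parts with `{ type: "text", text: { value } }`), and you should adapt it to whatever your export actually contains:

```typescript
// Hypothetical mapper from exported Assistants thread messages to the
// `items` array used when seeding a Conversation. The input shape is a
// simplified version of the Assistants message format.
type ThreadMessage = {
  role: "user" | "assistant";
  content: { type: string; text?: { value: string } }[];
};

type ConversationItem = { type: "message"; role: string; content: string };

function toConversationItems(messages: ThreadMessage[]): ConversationItem[] {
  return messages.map((m) => ({
    type: "message" as const,
    role: m.role,
    // Keep only text parts; join multi-part messages with newlines.
    content: m.content
      .filter((p) => p.type === "text" && p.text)
      .map((p) => p.text!.value)
      .join("\n"),
  }));
}
```

The result is what you would pass as `items` when calling `client.conversations.create({ items: toConversationItems(exported) })`.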

Migration Decision Framework

Quick Decision Tree

Are you currently using /v1/assistants or /v1/threads?
├─ No  → You likely don't need this guide
└─ Yes → Continue
    ├─ Do you need the fastest "keep code working" path?
    │   ├─ Yes → Consider wire-compatible alternatives (temporary or permanent)
    │   └─ No  → Migrate to OpenAI Responses + Conversations
    ├─ Do you want OpenAI's newest agent stack (deep research, MCP, computer use)?
    │   ├─ Yes → Migrate to Responses + Conversations (and Prompts)
    │   └─ No  → Wire-compatible may be acceptable short-term
    └─ Do you rely on server-side conversation persistence?
        ├─ Yes → Use Conversations API (closest to Threads)
        └─ No  → Use client-managed state or previous_response_id chaining

Recommendation Matrix (Updated)

  • You want to follow OpenAI's long-term direction → Responses + Conversations + Prompts (the official replacement path and mental model)
  • You must minimize refactor risk right now → Wire-compatible alternative (keeps Assistants-style code paths while you plan)
  • You need thread-like server-side state → Conversations API (a durable state object replacing Threads)
  • You have heavy RAG usage → Either, but plan carefully (file search is vector-store based; costs include storage plus tool calls)

Timeline and Action Plan (Jan 2026 → Aug 2026)

Now (Late Jan → Feb 2026)

  • Inventory Assistants usage (assistants, threads, runs, tools, vector stores)
  • Decide: OpenAI migration vs wire-compatible as a bridge
  • Create at least one end-to-end prototype in Responses API
  • If you need persistent state, prototype Conversations API early

March → April 2026

  • Convert your most important Assistants into dashboard Prompts
  • Implement tool-loop handling for function calling and multi-step workflows
  • Implement cost monitoring for file search + web search tool calls

May → June 2026

  • Migrate production traffic gradually (feature flags / per-tenant cutover)
  • Run shadow traffic comparisons (Assistants vs Responses) to detect regressions
  • Confirm retrieval quality and citations behavior under file search
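A shadow comparison can be as simple as flagging output pairs that diverge beyond a crude similarity threshold for human review. The helper below is a hypothetical sketch (token overlap is a blunt instrument; embedding similarity or task-specific checks would be better in practice):

```typescript
// Hypothetical shadow-traffic comparator: given the same prompt sent to
// both stacks, flag response pairs that diverge beyond a crude word
// overlap threshold so they can be reviewed by hand.
function tokenOverlap(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (ta.size === 0 && tb.size === 0) return 1;
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / Math.max(ta.size, tb.size);
}

function flagRegression(
  assistantsOutput: string,
  responsesOutput: string,
  threshold = 0.5
): boolean {
  return tokenOverlap(assistantsOutput, responsesOutput) < threshold;
}
```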

July → Early Aug 2026

  • Finish cutover of all Assistants API endpoints
  • Export any remaining thread history you must keep
  • Leave buffer for last-minute parity gaps or unexpected behavior changes

Hard Deadline

  • August 26, 2026: Assistants API removed. Plan to be fully migrated well before this date

Frequently Asked Questions

When exactly will the Assistants API stop working?

OpenAI's published shutdown date is August 26, 2026.

What's the official replacement for Assistants?

OpenAI recommends migrating to Responses API + Conversations API and using Prompts for assistant configuration.

Do I have to use Conversations, or can I keep state myself?

You can manage state in multiple ways:

  • Client-managed message history
  • previous_response_id chaining
  • Conversations API (durable server-side)

All three are documented as supported approaches.

Can I use both conversation and previous_response_id?

No — the Responses API docs note you can't use previous_response_id while using a conversation.

Does file search still use vector stores?

Yes. File search is built on vector stores; in Responses you pass vector_store_ids to the file_search tool config.

What are the key tool costs I should plan for?

From OpenAI's pricing docs:

  • File search storage: $0.10/GB/day after 1GB free
  • File search tool calls (Responses only): $2.50/1k calls
  • Web search tool calls: priced per tool version/model class, commonly $10/1k calls plus search content tokens billed at model rates

Always confirm current prices before finalizing budgets.

What if I don't migrate by the deadline?

Your Assistants API endpoints will stop working after the shutdown date.


Conclusion

You now have a fixed, official deadline: August 26, 2026.

If you want OpenAI's newest agent stack and the supported long-term architecture, migrate to:

  • Prompts (dashboard) for configuration
  • Conversations API for durable state
  • Responses API for generation + tool use

If you need to keep existing Assistants code running with minimal refactor while you plan a larger redesign, a wire-compatible implementation can be a practical bridge—but you should validate feature parity and understand what you may lose compared to the native Responses/Conversations stack.