SWE-1
SWE-1 is our family of in-house frontier models built specifically for software engineering tasks. Based on our internal evals, it has performance nearing that of frontier models from the foundation labs.

- SWE-1: High-reasoning, tool-capable, and Cascade-optimized. Claude 3.5-level performance at a fraction of the cost.
- SWE-1-mini: Powers passive suggestions in Windsurf Tab, optimized for real-time latency.
Bring your own key (BYOK)
This is only available to free and paid individual users.
Note that this is different from API Pricing.
To input your API key, navigate to the subscription settings and add your key.
If you have not configured your API key, attempting to use a BYOK model will return an error.
Currently, we only support BYOK for these models:

- Claude 4 Sonnet
- Claude 4 Sonnet (Thinking)
- Claude 4 Opus
- Claude 4 Opus (Thinking)
API Pricing
Unlike flat rate pricing, where a fixed number of credits is used for each user prompt, API pricing charges a fixed number of credits per token processed (i.e., proportional to compute). The number of credits per token varies based on the model selected. Both API pricing and flat rate pricing consume the same pool of credits, and models with API pricing are clearly marked in the model selector.

We charge the model's API price plus a 20% margin. Each credit corresponds to $0.04.

We utilize the same tokenizers as the model providers (Anthropic's for Claude models, OpenAI's for GPT models, etc.) to ensure accurate and consistent token counting and pricing. See OpenAI's tokenizer demo.
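The per-token math above can be sketched in a few lines. This is illustrative only: `prompt_credits` is a hypothetical helper, and the default rates are the Claude Sonnet 4 rates from the table below (credits per million tokens).

```python
# Sketch of the API-pricing math: each token class is billed at its own
# per-million-token rate, and all classes draw from the same credit pool.
CREDIT_USD = 0.04  # each credit corresponds to $0.04

def prompt_credits(input_tokens, cache_read_tokens, output_tokens,
                   input_rate=90, cache_read_rate=9, output_rate=450):
    """Credits charged for one model call (rates are credits / 1M tokens)."""
    return (input_tokens * input_rate
            + cache_read_tokens * cache_read_rate
            + output_tokens * output_rate) / 1_000_000

# e.g. 10k fresh input tokens and 2k output tokens:
credits = prompt_credits(10_000, 0, 2_000)
print(credits, "credits =", f"${credits * CREDIT_USD:.4f}")
```

Note that the example conversation below also involves cache-write tokens, which are billed differently from plain input tokens; the sketch covers only the three token classes listed in the pricing table.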
| Model | Plans with API Pricing | Input Tokens (Credits / Million Tokens)¹ | Cache Read Tokens (Credits / Million Tokens)² | Output Tokens (Credits / Million Tokens) |
|---|---|---|---|---|
| Claude Sonnet 4 | Pro, Teams, Enterprise (contracted), Enterprise (self-serve) | 90 | 9 | 450 |
| Claude Sonnet 4 (Thinking) | Pro, Teams, Enterprise (contracted), Enterprise (self-serve) | 90 | 9 | 450 |
² The prompt cache has a limited TTL (time to live) determined by the model provider (e.g., approximately 5 minutes on Anthropic). Even within the TTL, a prompt cache hit is not guaranteed. Prompt cache misses are charged as input tokens.
Example Conversation
To show how API pricing works in practice, let us walk through an example conversation with Cascade using Claude Sonnet 4 directly.

| Role | Message | Tokens | Note | Cost per message |
|---|---|---|---|---|
| User | Refactor @my_function | 20k | Input (cache write). Note: includes the full shared timeline, editor context, and system prompt. | 2.25 Credits |
| Windsurf | Let me first analyze my_function to come up with a plan to refactor it. | 1k | Output tokens. | 0.45 Credits |
| tool_call | Analyze my_function | 23k | Input (cache read) + Input (cache write). | 0.42 Credits |
| Windsurf | Here is a plan to refactor my_function […] do you want me to continue with implementing? | 2k | Output tokens. | 0.90 Credits |
| User | Yes, continue. | 46k | Input (cache read) + Input (cache write). | 0.52 Credits |
| tool_call | Edit foo.py | 50k | Input (cache read) + Output tokens. | 2.22 Credits |
| tool_call | Add bar.py | 56k | Input (cache read) + Output tokens. | 3.15 Credits |
| Windsurf | I am done refactoring my_function. Here is a summary of my changes: […] | 2k | Output tokens. | 0.90 Credits |
| Total | | 200k | | 10.81 Credits |
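As a sanity check, the totals in the table above can be reproduced directly from the per-message figures:

```python
# Per-message tokens and credits, taken straight from the example
# conversation table above.
tokens_per_message = [20_000, 1_000, 23_000, 2_000, 46_000, 50_000, 56_000, 2_000]
credits_per_message = [2.25, 0.45, 0.42, 0.90, 0.52, 2.22, 3.15, 0.90]

total_tokens = sum(tokens_per_message)
total_credits = round(sum(credits_per_message), 2)
print(total_tokens, total_credits)  # 200000 10.81
```

At $0.04 per credit, the whole conversation comes to roughly $0.43.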