DeepSeek v4 is not just another model launch; it is a tooling and economics event. The reason builders care is simple: for many workloads you can now get competitive capability at a much lower inference cost, and integration friction is low because DeepSeek supports OpenAI-compatible and Anthropic-compatible formats. If your startup's margins depend on token costs, this is a “change your routing this week” moment.
This setup guide shows how to wire DeepSeek v4 across the major dev surfaces: Claude Code, Cursor, Zed, direct API, Bedrock-style routing, and Vertex-style orchestration. The snippets are intentionally practical so you can copy, adapt, and ship.
Before all tools: model IDs and base URLs
DeepSeek v4 introduces two main model targets:
{
"models": {
"throughput_default": "deepseek-v4-flash",
"high_capability": "deepseek-v4-pro"
},
"base_urls": {
"openai_compatible": "https://api.deepseek.com",
"anthropic_compatible": "https://api.deepseek.com/anthropic"
}
}
Use deepseek-v4-flash for broad traffic and deepseek-v4-pro for high-value complex tasks. Keep a rollback route to your current production model until metrics prove the switch.
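The flash/pro/rollback split above can be sketched as a small routing helper. A minimal sketch, assuming the model IDs from this guide; the set of "complex" task kinds is illustrative, not a production heuristic:

```python
# Route tasks to a DeepSeek v4 model, keeping a rollback target alongside
# until metrics prove the switch. Model IDs are taken from this guide;
# the task-kind classification below is a placeholder for your own logic.

MODELS = {
    "default": "deepseek-v4-flash",   # broad traffic
    "complex": "deepseek-v4-pro",     # high-value complex tasks
    "rollback": "gpt-5.4",            # current production model
}

# Illustrative set of task kinds worth pro-level cost.
COMPLEX_KINDS = {"multi_file_refactor", "complex_debugging", "agentic_plan"}

def pick_model(task_kind: str, use_rollback: bool = False) -> str:
    """Return the model ID for a task; honor the rollback route on demand."""
    if use_rollback:
        return MODELS["rollback"]
    return MODELS["complex"] if task_kind in COMPLEX_KINDS else MODELS["default"]
```

The point of keeping the rollback entry in the same table is that reverting is a one-line config change, not a redeploy.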
Claude Code
In Claude Code, add DeepSeek as a provider profile and split default vs deep-work profiles. This enables fast migration without forcing every task onto pro-level pricing.
{
"providers": {
"deepseek": {
"type": "openai-compatible",
"baseUrl": "https://api.deepseek.com",
"apiKeyEnv": "DEEPSEEK_API_KEY"
}
},
"profiles": {
"default": {
"provider": "deepseek",
"model": "deepseek-v4-flash"
},
"complex-agent": {
"provider": "deepseek",
"model": "deepseek-v4-pro"
},
"rollback": {
"provider": "openai",
"model": "gpt-5.4"
}
}
}
Exact change: set your default Claude Code profile model to deepseek-v4-flash and add a separate complex-agent profile for deepseek-v4-pro.
Cursor
Cursor should be configured with model overrides by task type. If you set everything to v4-pro, you erase the cost-efficiency advantage.
{
"cursor.ai.provider": "openai-compatible",
"cursor.ai.baseUrl": "https://api.deepseek.com",
"cursor.ai.apiKeyEnv": "DEEPSEEK_API_KEY",
"cursor.ai.defaultModel": "deepseek-v4-flash",
"cursor.ai.modelOverrides": {
"multi_file_refactor": "deepseek-v4-pro",
"complex_debugging": "deepseek-v4-pro",
"quick_edits": "deepseek-v4-flash"
},
"cursor.ai.maxOutputTokens": 8192
}
Exact change: switch base URL to DeepSeek and map high-complexity tasks to deepseek-v4-pro, leaving low-risk tasks on flash.
Zed
Zed migration is easiest with profile-based separation. Keep “everyday” and “deep-work” profiles so teams can weigh capability against cost intentionally.
{
"assistant": {
"provider": "openai-compatible",
"base_url": "https://api.deepseek.com",
"api_key_env": "DEEPSEEK_API_KEY",
"default_model": "deepseek-v4-flash",
"profiles": {
"everyday": {
"model": "deepseek-v4-flash"
},
"deep-work": {
"model": "deepseek-v4-pro"
}
}
}
}
Exact change: set assistant.default_model to flash and assistant.profiles.deep-work.model to pro.
Direct API integration
If your app already uses OpenAI-style chat completions, DeepSeek can slot in with mostly configuration updates. Add reasoning controls deliberately so latency and spend remain predictable.
{
"url": "https://api.deepseek.com/chat/completions",
"headers": {
"Authorization": "Bearer ${DEEPSEEK_API_KEY}",
"Content-Type": "application/json"
},
"body": {
"model": "deepseek-v4-pro",
"messages": [
{"role": "system", "content": "You are a precise engineering assistant."},
{"role": "user", "content": "Find root cause and fix this production bug."}
],
"thinking": {"type": "enabled"},
"reasoning_effort": "high",
"stream": false
}
}
Recommended environment variables:
DEEPSEEK_API_KEY=***
MODEL_DEFAULT=deepseek-v4-flash
MODEL_COMPLEX=deepseek-v4-pro
MODEL_ROLLBACK=gpt-5.4
Exact change: replace base URL and model routing while preserving your existing request orchestration.
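The request above translates to plain Python as follows. A minimal sketch, assuming the endpoint and environment variables listed in this section; nothing is sent over the network here, and the placeholder key is only for illustration:

```python
import json
import os

# Placeholder so the sketch runs standalone; set a real key in production.
os.environ.setdefault("DEEPSEEK_API_KEY", "sk-placeholder")

def build_chat_request(model: str, system: str, user: str) -> tuple:
    """Build (url, headers, body) for an OpenAI-style chat completion call."""
    url = "https://api.deepseek.com/chat/completions"
    headers = {
        "Authorization": f"Bearer {os.environ['DEEPSEEK_API_KEY']}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "stream": False,
    })
    return url, headers, body

# To actually send it:
#   import urllib.request
#   req = urllib.request.Request(url, body.encode(), headers)
#   resp = urllib.request.urlopen(req)
```

Because the format is OpenAI-compatible, this is the same request shape your existing orchestration already produces; only the base URL and model names change.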
Bedrock-style routing layer
Many teams use an internal multi-provider gateway even if they call it “Bedrock routing.” Add DeepSeek as a first-class provider and route by policy.
{
"router": {
"providers": {
"deepseek": {
"type": "openai-compatible",
"base_url": "https://api.deepseek.com",
"api_key_env": "DEEPSEEK_API_KEY"
},
"openai": {
"type": "openai",
"api_key_env": "OPENAI_API_KEY"
}
},
"routes": {
"default": {"provider": "deepseek", "model": "deepseek-v4-flash"},
"complex": {"provider": "deepseek", "model": "deepseek-v4-pro"},
"regulated": {"provider": "openai", "model": "gpt-5.5"}
}
}
}
Exact change: add DeepSeek provider and route complex workloads there while preserving a compliance-safe alternative lane.
Vertex-style orchestration
On Vertex-centric stacks, a common approach is to keep orchestration in Vertex while delegating model calls through an internal gateway. This avoids rewiring every app.
{
"vertex_orchestrator": {
"backend": "external_llm_gateway",
"gateway_url": "https://llm-gateway.internal/v1/chat/completions",
"headers": {
"X-Provider": "deepseek"
},
"model_map": {
"default": "deepseek-v4-flash",
"complex": "deepseek-v4-pro",
"rollback": "gpt-5.4"
}
}
}
Exact change: update model map entries in your gateway-backed Vertex orchestration config instead of rewriting business logic.
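On the gateway side, the delegation above reduces to a lookup plus a header. A minimal sketch, assuming the internal gateway URL, X-Provider header, and model map from the config; mapping the rollback route to the "openai" provider is an inference from the earlier sections, not something the config states:

```python
# Build the call spec the gateway needs for a given route name.
# URL, header, and model map come from the orchestration config above.

MODEL_MAP = {
    "default": "deepseek-v4-flash",
    "complex": "deepseek-v4-pro",
    "rollback": "gpt-5.4",
}

def gateway_call_spec(route: str) -> dict:
    """Vertex-side code only picks a route name; everything else is resolved here."""
    return {
        "url": "https://llm-gateway.internal/v1/chat/completions",
        "headers": {"X-Provider": "openai" if route == "rollback" else "deepseek"},
        "model": MODEL_MAP.get(route, MODEL_MAP["default"]),
    }
```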
Shared cost and risk controls you should add immediately
DeepSeek’s advantage is cost-efficient performance. Protect that advantage with hard controls from day one.
{
"controls": {
"daily_token_cap": 5000000,
"per_task_token_cap": 200000,
"timeout_ms": 120000,
"retry_limit": 2,
"escalation_rule": "flash_first_then_pro_if_confidence<0.7"
},
"compliance": {
"require_data_classification": true,
"require_region_policy_check": true
},
"metrics": [
"completion_rate",
"retry_count",
"tokens_per_completed_task",
"cost_per_completed_task"
]
}
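The escalation rule above (flash first, pro only below a confidence threshold) can be sketched as a wrapper. The `run` callable and its confidence score are stand-ins for your own completion step plus whatever self-grading or verifier signal you already collect:

```python
# Flash-first escalation: run the cheap model, re-run on pro only when the
# first pass reports confidence below the threshold from the config (0.7).

def escalate(run, prompt: str, threshold: float = 0.7) -> dict:
    """`run(model, prompt)` must return a dict with a 'confidence' float."""
    first = run("deepseek-v4-flash", prompt)
    if first["confidence"] >= threshold:
        return first
    return run("deepseek-v4-pro", prompt)
```

This pattern spends pro-level tokens only on the traffic that demonstrably needs them, which is what keeps the cost advantage intact at scale.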
Geopolitical risk is now an architectural concern. If you ignore policy and procurement constraints, you can win every technical evaluation and still lose deployment approval.
Final rollout checklist
{
"preflight": [
"DeepSeek API key and model IDs verified",
"Route-based canary enabled (10% traffic)",
"Rollback path tested under load",
"Parser/schema compatibility validated",
"Compliance review completed"
]
}
If all five are true, scale. If not, keep the rollout narrow. DeepSeek v4 can be a major margin and capability lever, but only when deployed with routing discipline and policy awareness.
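The 10% route-based canary from the checklist can be sketched as a deterministic hash bucket, so the same user stays in the same cohort across requests. The key format and rollback model are illustrative:

```python
import hashlib

def in_canary(key: str, percent: int = 10) -> bool:
    """Deterministically assign ~`percent`% of stable keys to the canary cohort."""
    bucket = int(hashlib.sha256(key.encode()).hexdigest(), 16) % 100
    return bucket < percent

def route_for(key: str) -> str:
    """Canary traffic goes to the new model; everything else stays on production."""
    return "deepseek-v4-flash" if in_canary(key) else "gpt-5.4"
```

Hash-based bucketing beats random sampling here: it is reproducible, so you can compare the same users before and after widening the canary.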
Now you know more than 99% of people. — Sara Plaintext