Add model settings for DeepSeek R1 and V3 family on Azure AI Foundry #5169

Open
Cyberfilo wants to merge 1 commit into Aider-AI:main from Cyberfilo:feat/add-azure-ai-deepseek-models

@Cyberfilo
Contributor

Summary

Adds four entries to `aider/resources/model-settings.yml` for the DeepSeek family hosted on Azure AI Foundry (`azure_ai/` provider — distinct from the native `deepseek/` provider):

| Model | `weak_model_name` |
| --- | --- |
| `azure_ai/deepseek-r1` (reasoner) | `azure_ai/deepseek-v3` |
| `azure_ai/deepseek-v3` (chat) | none (uses the default) |
| `azure_ai/deepseek-v3-0324` (chat, pinned) | none (uses the default) |
| `azure_ai/deepseek-v3.2` (chat, newer) | none (uses the default) |

These models exist in litellm's pricing JSON (aider's upstream source for model metadata via `ModelInfoManager.MODEL_INFO_URL`) but were missing project-side settings. Without them, aider falls back to defaults that don't match the DeepSeek family conventions (`reminder: sys`, `caches_by_default: true`, `use_temperature: false` for the reasoner).

Settings shape

Mirrors the existing native `deepseek/deepseek-reasoner` and `deepseek/deepseek-chat` blocks exactly:

  • `edit_format: diff` (the DeepSeek family's preferred edit grammar)
  • `use_repo_map: true`
  • `examples_as_sys_msg: true`
  • `caches_by_default: true` (DeepSeek-style implicit caching)
  • `reminder: sys` for chat variants
  • `use_temperature: false` + `max_tokens: 64000` for the reasoner; `max_tokens: 8192` for chat variants
  • Reasoner routes to v3 for weak/editor model duties
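
As a rough sketch of what the bullets above describe, the reasoner entry and one chat entry might look like the following. The values come from the list above; nesting `max_tokens` under `extra_params` and the key name `editor_model_name` are assumptions based on how existing blocks in `model-settings.yml` are typically laid out, not text copied from this diff.

```yaml
# Sketch only: field placement is assumed, values are taken from the PR description.
- name: azure_ai/deepseek-r1
  edit_format: diff
  weak_model_name: azure_ai/deepseek-v3
  use_repo_map: true
  examples_as_sys_msg: true
  use_temperature: false        # reasoner does not take a temperature
  caches_by_default: true
  extra_params:
    max_tokens: 64000           # assumed to sit under extra_params
  editor_model_name: azure_ai/deepseek-v3   # assumed key for the editor role

- name: azure_ai/deepseek-v3
  edit_format: diff
  use_repo_map: true
  reminder: sys                 # chat variants only
  examples_as_sys_msg: true
  caches_by_default: true
  extra_params:
    max_tokens: 8192            # assumed to sit under extra_params
```

The `azure_ai/deepseek-v3-0324` and `azure_ai/deepseek-v3.2` entries would presumably repeat the chat block with only `name` changed.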

Test plan

  • YAML validity: `python -c "import yaml; yaml.safe_load(open('aider/resources/model-settings.yml'))"` parses cleanly.
  • Diff is purely additive: 40 lines in model-settings.yml + 1 line in HISTORY.md.

Litellm's pricing JSON tracks azure_ai/deepseek-r1, azure_ai/deepseek-v3,
azure_ai/deepseek-v3-0324, and azure_ai/deepseek-v3.2 (DeepSeek hosted
via Azure AI Foundry — a managed alternative to the native DeepSeek
API). Aider had no `azure_ai/` provider entries, so users routing
DeepSeek through Foundry got the wrong defaults.

Mirrors the existing `deepseek/deepseek-reasoner` and
`deepseek/deepseek-chat` blocks:

- azure_ai/deepseek-r1 — reasoner-style, `max_tokens: 64000`,
  `use_temperature: false`, weak model routes to azure_ai/deepseek-v3
- azure_ai/deepseek-v3 / v3-0324 / v3.2 — chat-style, `max_tokens: 8192`,
  `reminder: sys`, `caches_by_default: true`

All entries use the standard `edit_format: diff` with repo map and
sys-message examples.