fix(vllm-metal): enable tool calling support in backend args by doringeman · Pull Request #783 · docker/model-runner

doringeman · 2026-03-24T11:35:56Z

Fixes the following error, which occurs after #771:

$ ldm run huggingface.co/mlx-community/SmolLM2-135M-Instruct hi
Failed to generate a response: error response: status=400 body={"error":{"message":"\"auto\" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set","type":"BadRequestError","param":null,"code":400}}

Signed-off-by: Dorin Geman <dorin.geman@docker.com>

sourcery-ai

Hey - I've left some high-level feedback:

- The `--enable-auto-tool-choice` and `--tool-call-parser hermes` flags are now always enabled for the metal backend; consider making these options configurable or conditional so non-tool-calling use cases are not forced into this behavior.
- If other vllm backends (non-metal) also need tool-calling support, it may be worth aligning the argument construction across backends to avoid inconsistent behavior between runtimes.

gemini-code-assist

Code Review

This pull request enables tool calling support for the vllm-metal backend by adding the necessary command-line arguments. While this fixes the reported issue, the current implementation hardcodes these arguments, which applies them to all models. My review includes a suggestion to make this feature configurable to ensure flexibility and prevent potential issues with models that do not support tool calling.

fix(vllm-metal): enable tool calling support in backend args

5d92360

Signed-off-by: Dorin Geman <dorin.geman@docker.com>

sourcery-ai Bot reviewed Mar 24, 2026

gemini-code-assist Bot reviewed Mar 24, 2026

Comment thread pkg/inference/backends/vllm/vllm_metal.go

ilopezluna approved these changes Mar 24, 2026

doringeman merged commit 1047b07 into docker:main Mar 24, 2026
7 checks passed

doringeman merged 1 commit into docker:main from doringeman:vllm-metal-tool-calling
