fix(vllm-metal): enable tool calling support in backend args#783

Merged
doringeman merged 1 commit into
docker:mainfrom
doringeman:vllm-metal-tool-calling
Mar 24, 2026

Conversation

@doringeman
Contributor

Fixes

$ ldm run huggingface.co/mlx-community/SmolLM2-135M-Instruct hi
Failed to generate a response: error response: status=400 body={"error":{"message":"\"auto\" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set","type":"BadRequestError","param":null,"code":400}}

after #771.

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high-level feedback:

  • The `--enable-auto-tool-choice` and `--tool-call-parser hermes` flags are now always enabled for the metal backend; consider making these options configurable or conditional so non-tool-calling use cases are not forced into this behavior.
  • If other vLLM backends (non-metal) also need tool-calling support, it may be worth aligning the argument construction across backends to avoid inconsistent behavior between runtimes.
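
One way the "configurable or conditional" suggestion could look is sketched below. This is only an illustration, not the code in this PR: `ToolCallingConfig`, `appendToolCallingArgs`, and the placeholder base arguments are hypothetical names, and defaulting to the `hermes` parser simply mirrors the flags this PR hardcodes.

```go
package main

import "fmt"

// ToolCallingConfig is a hypothetical option struct for illustration only;
// it is not part of the actual backend code in this PR.
type ToolCallingConfig struct {
	Enabled bool   // whether to pass the tool-calling flags at all
	Parser  string // value for --tool-call-parser; empty falls back to "hermes"
}

// appendToolCallingArgs returns a copy of args with the tool-calling flags
// appended when cfg.Enabled is true. The input slice is never modified.
func appendToolCallingArgs(args []string, cfg ToolCallingConfig) []string {
	out := append([]string{}, args...)
	if !cfg.Enabled {
		return out
	}
	parser := cfg.Parser
	if parser == "" {
		parser = "hermes" // assumed default, matching the parser named in this PR
	}
	return append(out, "--enable-auto-tool-choice", "--tool-call-parser", parser)
}

func main() {
	// Placeholder for whatever arguments the backend already builds.
	base := []string{"<existing-backend-args>"}

	fmt.Println(appendToolCallingArgs(base, ToolCallingConfig{Enabled: true}))
	fmt.Println(appendToolCallingArgs(base, ToolCallingConfig{}))
}
```

Keeping the flag construction in one helper like this would also make it easier to share between the metal and non-metal backends, if both end up needing it.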

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enables tool calling support for the vllm-metal backend by adding the necessary command-line arguments. While this fixes the reported issue, the current implementation hardcodes these arguments, which applies them to all models. My review includes a suggestion to make this feature configurable to ensure flexibility and prevent potential issues with models that do not support tool calling.

Comment thread pkg/inference/backends/vllm/vllm_metal.go
@doringeman doringeman merged commit 1047b07 into docker:main Mar 24, 2026
7 checks passed