Skip to content

Conversation

@dtrawins
Copy link
Collaborator

@dtrawins dtrawins commented Dec 9, 2025

🛠 Summary

JIRA/Issue if applicable.
Describe the changes.

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.
    ``

DevstralToolParser() = delete;
DevstralToolParser(ov::genai::Tokenizer& tokenizer, const ToolsSchemas_t& toolSchemas) :
BaseOutputParser(tokenizer),
argsTokenId(tokenizer.encode("[ARGS]", {{"add_special_tokens", false}}).input_ids.data<int64_t>()[0]),
Copy link
Collaborator

@dkalinowski dkalinowski Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how are we ensured that [ARGS] / [TOOL_CALLS] are single tokens, treated as special, not as string, for example [AR and GS]?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are a special tokens.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that doesnt answer my question

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

devstral parser is setting requiresStreamingWithSpecialTokens() as true

DevstralToolParser(ov::genai::Tokenizer& tokenizer, const ToolsSchemas_t& toolSchemas) :
BaseOutputParser(tokenizer),
argsTokenId(tokenizer.encode("[ARGS]", {{"add_special_tokens", false}}).input_ids.data<int64_t>()[0]),
botTokenId(tokenizer.encode("[TOOL_CALLS]", {{"add_special_tokens", false}}).input_ids.data<int64_t>()[0]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate if input_ids token count is == 1?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, we could also do that for argsTokenId and fail if in any of those cases we have more than one token from encoding

ToolCall toolCall;
std::string tool_name = tokenizer.decode(tool_name_tokens, ov::AnyMap{ov::genai::skip_special_tokens(true)});
if (this->toolSchemas.find(tool_name) == this->toolSchemas.end()) {
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Tool name '{}' not valid.", tool_name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is behavior we havent implemented in other parsers, is it really worth return early? if we return function name that is not part of the toolschemas spec, we might be able to debug it in bfcl

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not in line with current behavior in other parsers. I wouldn't do that check if it's only for this parser. Either drop it or create a task for alignment of other parsers.

if (pos == 0) {
this->streamContent.clear();
} else {
this->streamContent = this->streamContent.substr(pos + 13); // "[TOOLS_CALLS]" length is 13
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should avoid magic numbers, this way if we change this->streamingParsingToolCallsStartTag to another value, this part will be incorrect

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can look up how Adrian handles that in qwen coder

this->lastProcessedPosition = pos + Qwen3CoderToolParser::PARAMETER_END_TAG.length();

ToolCall toolCall;
toolCall.arguments = arguments;
toolCall.name = this->toolName;
return sendFullDelta(toolCall);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldntg we stream partial function argument chunks? if i understand correctly you send full delta at the end of generation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already accepted such approach for qwen3 coder, so I suppose we can have it in other parsers as well unless there are specific requirements for "real" streaming.

bool requiresStreamingWithSpecialTokens() const {
return (reasoningParser && reasoningParser->requiresStreamingWithSpecialTokens()) &&
return (reasoningParser && reasoningParser->requiresStreamingWithSpecialTokens()) ||
(toolParser && toolParser->requiresStreamingWithSpecialTokens());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mzegla when i implemented it i remember your comment why it should really be && instead of ||, do you remember what was the reason?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been implemented this way to make sure we don't allow two parsers with different special tokens approach as they will receive the same model output, so they must both either require it or not.
I guess for this case, where we don't have reasoning parser but want to require special tokens for tool parser we should modify this function like:

if (!reasoningParser) {
	return toolParser && toolParser->requiresStreamingWithSpecialTokens();
} else if (!toolParser) {
	return reasoningParser && reasoningParser->requiresStreamingWithSpecialTokens();
} else {
	return (reasoningParser && reasoningParser->requiresStreamingWithSpecialTokens()) &&
    	(toolParser && toolParser->requiresStreamingWithSpecialTokens());
}

}

TEST_F(DevstralOutputParserTest, ParseToolCallOutputWithSingleToolCall_MissingEndTag) {
std::string testInput = "Reasoninig before tool call [TOOL_CALLS]example_tool[ARGS]{\"arg1\":\"value1\",\"arg2\":42}";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add test for scenarios with whitespace between the tags? i saw other models often put spaces or new lines before/after the function name

{"{\"", ov::genai::GenerationFinishReason::NONE, std::nullopt},
{"city\":", ov::genai::GenerationFinishReason::NONE, std::nullopt},
{" \"Paris", ov::genai::GenerationFinishReason::NONE, std::nullopt},
// Last chunk is added in the for loop below
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing test for enclosing scenario

DevstralToolParser(ov::genai::Tokenizer& tokenizer, const ToolsSchemas_t& toolSchemas) :
BaseOutputParser(tokenizer),
argsTokenId(tokenizer.encode("[ARGS]", {{"add_special_tokens", false}}).input_ids.data<int64_t>()[0]),
botTokenId(tokenizer.encode("[TOOL_CALLS]", {{"add_special_tokens", false}}).input_ids.data<int64_t>()[0]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, we could also do that for argsTokenId and fail if in any of those cases we have more than one token from encoding

Comment on lines +31 to +36
const int64_t argsTokenId; // [ARGS]
const int64_t botTokenId; // [TOOL_CALLS]

// in streaming mode we can rely on tags in string format as tokens are not available
const std::string streamingParsingArgsStartTag = "[ARGS]";
const std::string streamingParsingToolCallsStartTag = "[TOOL_CALLS]";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those tags/tokens are not specific to streaming, so I think we can drop streamingParsing prefix
Those are variables are about the same thing - please unify naming:
either botToken or toolCallsStart:
toolCallsStartTokenId, toolCallsStartTag or
botTokenId, botTag
and either argsTokenId or ArgsStartTag:
argsStartTokenId, argsStartTag or
argsTokenId, argsTag

bool requiresStreamingWithSpecialTokens() const {
return (reasoningParser && reasoningParser->requiresStreamingWithSpecialTokens()) &&
return (reasoningParser && reasoningParser->requiresStreamingWithSpecialTokens()) ||
(toolParser && toolParser->requiresStreamingWithSpecialTokens());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been implemented this way to make sure we don't allow two parsers with different special tokens approach as they will receive the same model output, so they must both either require it or not.
I guess for this case, where we don't have reasoning parser but want to require special tokens for tool parser we should modify this function like:

if (!reasoningParser) {
	return toolParser && toolParser->requiresStreamingWithSpecialTokens();
} else if (!toolParser) {
	return reasoningParser && reasoningParser->requiresStreamingWithSpecialTokens();
} else {
	return (reasoningParser && reasoningParser->requiresStreamingWithSpecialTokens()) &&
    	(toolParser && toolParser->requiresStreamingWithSpecialTokens());
}

EXPECT_EQ(parsedOutput.reasoning, "");
ASSERT_EQ(parsedOutput.toolCalls.size(), 0);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


void DevstralToolParser::parse(ParsedOutput& parsedOutput, const std::vector<int64_t>& generatedTokens) {
std::vector<std::string> tools;
// Parser will consume entire model output only if the first generated token is the beginning of tools token.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not look like this comment is true for this parser

}
}
if (this->internalState == AWAITING_ARGS_TAG) {
// check if [ARGS] tag is present in the chunk and update state accordingly
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// check if [ARGS] tag is present in the chunk and update state accordingly
// check if [ARGS] tag is present in the streamContent and update state accordingly

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically we check streamContent but it will be the case only if [ARGS] is added in the chunk. Otherwise it would be different state

if (pos != std::string::npos) {
this->internalState = PROCESSING_ARGS;
this->toolName = this->streamContent.substr(0, pos);
if (this->toolSchemas.find(this->toolName) == this->toolSchemas.end()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the unary part - this check is unique to this parser and I don't think it's a good idea to have different behavior for different parsers. Either remove or create a task for alignment of other parsers.

SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Tool name '{}' not valid.", this->toolName);
return std::nullopt;
}
this->streamContent = this->streamContent.substr(pos + 6); // "[ARGS]" length is 6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic number

}
}
if (finishReason != ov::genai::GenerationFinishReason::NONE) {
size_t end_pos = this->streamContent.find("</s>");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this token? If it has some significant value for the parsing it should be a member of the parser class like args and tool calls token. Also:

Suggested change
size_t end_pos = this->streamContent.find("</s>");
size_t endPos = this->streamContent.find("</s>");

ToolCall toolCall;
toolCall.arguments = arguments;
toolCall.name = this->toolName;
return sendFullDelta(toolCall);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already accepted such approach for qwen3 coder, so I suppose we can have it in other parsers as well unless there are specific requirements for "real" streaming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants