Tool Use

New in v14.29.0

Tool use lets the LLM call functions in your app (running calculations, fetching data, querying models) before composing its final response. For simple chat without tool calling, see the Viktor LLM overview.

Viktor LLM supports OpenAI's function-calling schema via the Responses API. The API maintains conversation context on the server side: instead of re-sending the full message history on each call, you chain requests by passing the previous response's ID as previous_response_id.

Tool schema definition

Define tools using the standard OpenAI function schema:

WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Retrieve the current temperature for a given city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city_name": {
                "type": "string",
                "description": "The name of the city to get the weather for.",
            },
            "unit": {
                "type": "string",
                "enum": ["Celsius", "Fahrenheit", "Kelvin"],
                "description": "The temperature unit to return the result in.",
            },
        },
        "required": ["city_name", "unit"],
    },
}
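The model returns tool arguments as a JSON string. It usually follows the schema, but the arguments are not guaranteed to validate against it, so a defensive check before dispatching can help. A minimal sketch, assuming the schema above; the validate_weather_args helper and ALLOWED_UNITS set are illustrative, not part of the Viktor API:

```python
import json

# Mirrors the "enum" list in the schema above.
ALLOWED_UNITS = {"Celsius", "Fahrenheit", "Kelvin"}


def validate_weather_args(raw_arguments: str) -> dict:
    """Parse the model's JSON arguments and check them against the schema."""
    args = json.loads(raw_arguments)
    missing = [key for key in ("city_name", "unit") if key not in args]
    if missing:
        raise ValueError(f"Missing required argument(s): {missing}")
    if args["unit"] not in ALLOWED_UNITS:
        raise ValueError(f"Unsupported unit: {args['unit']!r}")
    return args
```

On failure you can feed the error message back to the model as the tool output instead of raising, giving it a chance to retry with corrected arguments.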

Tool handler

Implement a Python function that executes when the model requests the tool:

def call_weather_tool(city_name: str, unit: str) -> str:
    # Replace with your actual data source or calculation
    return f"22 {unit}"
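If the app grows beyond one tool, a dispatch table keeps the agentic loop free of per-tool if/elif branches. A sketch under that assumption; TOOL_HANDLERS and dispatch_tool are illustrative names, not part of the Viktor API (the handler is redefined here only to keep the sketch self-contained):

```python
def call_weather_tool(city_name: str, unit: str) -> str:
    # Replace with your actual data source or calculation
    return f"22 {unit}"


# Map each tool name from the schema to the function that handles it.
TOOL_HANDLERS = {
    "get_weather": lambda args: call_weather_tool(
        city_name=args["city_name"], unit=args["unit"]
    ),
}


def dispatch_tool(name: str, args: dict) -> str:
    """Look up and run the handler for a requested tool."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        # Returned as the tool output so the model can recover gracefully.
        return f"Unknown tool: {name}"
    return handler(args)
```

Inside the agentic loop, the per-tool branch then collapses to a single `dispatch_tool(output_item.name, args)` call.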

Agentic loop

The agentic loop runs non-streamed calls until all tool requests are resolved, then streams the final answer.

import json

import openai
import viktor as vkt
from openai import OpenAI


class Parametrization(vkt.Parametrization):
    chat = vkt.Chat("Chat", method="submit_responses")


class Controller(vkt.Controller):
    parametrization = Parametrization

    def submit_responses(self, params, **kwargs) -> vkt.ChatResult:
        client = OpenAI(
            base_url=vkt.ViktorOpenAI.get_base_url(version="v1"),
            api_key=vkt.ViktorOpenAI.get_api_key(),
        )

        conversation = params.chat
        input_messages = conversation.get_messages()

        tool_call_display = ""
        previous_response_id = None

        try:
            # Initial non-streamed call
            response = client.responses.create(
                model="openai.gpt-oss-120b",
                input=input_messages,
                tools=[WEATHER_TOOL],
                max_output_tokens=96000,
            )
            previous_response_id = response.id

            # Agentic loop: resolve tool calls until none remain
            while any(item.type == "function_call" for item in response.output):
                tool_inputs = []

                for output_item in response.output:
                    if output_item.type != "function_call":
                        continue

                    args = json.loads(output_item.arguments)

                    if output_item.name == "get_weather":
                        result = call_weather_tool(
                            city_name=args["city_name"],
                            unit=args["unit"],
                        )
                        tool_call_display += (
                            f"Tool called: {output_item.name}, result: {result}\n\n"
                        )
                        tool_inputs.append({
                            "type": "function_call_output",
                            "call_id": output_item.call_id,
                            "output": result,
                        })

                if not tool_inputs:
                    break  # No recognised tool calls; avoid an infinite loop

                # Feed tool results back, chaining via previous_response_id
                response = client.responses.create(
                    model="openai.gpt-oss-120b",
                    input=tool_inputs,
                    previous_response_id=previous_response_id,
                    tools=[WEATHER_TOOL],
                    max_output_tokens=96000,
                )
                previous_response_id = response.id

        except openai.APIStatusError as exc:
            raise vkt.UserError(f"API error {exc.status_code}: {exc.message}") from exc
        except openai.APIConnectionError as exc:
            raise vkt.UserError(f"Network error while calling the API: {exc}") from exc

        def response_generator():
            try:
                if tool_call_display:
                    yield tool_call_display

                # Stream the final answer
                with client.responses.create(
                    model="openai.gpt-oss-120b",
                    input=[],
                    previous_response_id=previous_response_id,
                    tools=[WEATHER_TOOL],
                    max_output_tokens=96000,
                    stream=True,
                ) as stream:
                    for event in stream:
                        if event.type == "response.output_text.delta":
                            yield event.delta

            except openai.APIStatusError as exc:
                raise vkt.UserError(f"API error {exc.status_code}: {exc.message}") from exc
            except openai.APIConnectionError as exc:
                raise vkt.UserError(f"Network error while calling the API: {exc}") from exc

        return vkt.ChatResult(conversation, response_generator())

Key patterns

  • Non-streamed tool loop — use client.responses.create(stream=False) for tool-call iterations so you can inspect response.output items synchronously.
  • previous_response_id chaining — pass the ID of the previous response to maintain context without re-sending the full history.
  • function_call_output items — feed tool results back as input in the next call.
  • Streamed final answer — once all tool calls are resolved, make one final streamed call with input=[] and the last previous_response_id.
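The function_call_output shape in the third bullet is easy to get wrong: each tool result must echo the call_id of the function_call item it answers, or the API rejects the input. A small helper that builds the item (illustrative, not part of the Viktor API):

```python
def function_call_output(call_id: str, result: str) -> dict:
    """Build the input item that returns a tool result to the Responses API."""
    return {
        "type": "function_call_output",
        "call_id": call_id,  # must match the function_call item being answered
        "output": result,
    }
```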