# Integrating with LLMs
Viktor LLM is a built-in large language model service — no external API key or third-party account required. It uses an OpenAI-compatible API, so you can use the standard `openai` Python client directly in your app.
## Why use Viktor LLM

- No setup — no API key, no third-party account, no environment variables to manage
- OpenAI-compatible — use the standard `openai` Python client
- Works in App Builder — the `openai` package is pre-installed
## How chat completions work
LLMs are stateless — the model has no memory of previous messages. Every request must include the complete conversation history from the beginning of the session. The model reads the full history, generates the next reply, and discards all state. The next message will need to include that reply in the history too.
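As a minimal sketch of this accumulation, using a hypothetical two-turn exchange (the questions and answers are illustrative, not real model output):

```python
# Turn 1: the request carries only the first user message.
history = [
    {"role": "user", "content": "What is the yield strength of S235 steel?"},
]

# The model replies and then forgets everything, so the app must
# append the reply to its own copy of the history.
history.append({"role": "assistant", "content": "235 MPa (nominal)."})

# Turn 2: the new request repeats the full history plus the new question.
history.append({"role": "user", "content": "And for S355?"})
# The second request now carries all three messages.
```

Each request is self-contained: delete `history` and the model has no way to recover the earlier turns.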
Messages are passed as an ordered list, each with a `role` and `content`:
| Role | Who sends it | Purpose |
|---|---|---|
| `system` | You (developer) | Sets the model's persona, constraints, and task context. Sent first, before any user messages. |
| `user` | The end user | The user's input in each turn. |
| `assistant` | The model | The model's reply from previous turns. |
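Put together, the message list for a second user turn might look like the following sketch (the contents are made up for illustration):

```python
messages = [
    # Sent first: sets persona and constraints for the whole conversation.
    {"role": "system", "content": "You are a concise engineering assistant."},
    # Previous turn: the user's question and the model's earlier reply.
    {"role": "user", "content": "How many metres are in a mile?"},
    {"role": "assistant", "content": "One mile is 1609.344 metres."},
    # Current turn: the new input the model should respond to.
    {"role": "user", "content": "And how many feet?"},
]
```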
VIKTOR's `ChatConversation` (from `params.chat`) tracks only the user and assistant turns — it does not store system messages. Call `conversation.get_messages()` to retrieve the full conversation history, then prepend your system message before passing it to the API.
## Basic usage
The following example shows a complete chat integration with a system prompt and streaming. The `ViktorOpenAI` helper provides the base URL and API key needed to configure the client:
```python
import openai
import viktor as vkt
from openai import OpenAI  # add openai to requirements.txt if not using App Builder

client = OpenAI(
    base_url=vkt.ViktorOpenAI.get_base_url(version="v1"),
    api_key=vkt.ViktorOpenAI.get_api_key(),
)


class Parametrization(vkt.Parametrization):
    chat = vkt.Chat("Chat", method="call_llm")


class Controller(vkt.Controller):
    parametrization = Parametrization

    def call_llm(self, params, **kwargs):
        conversation = params.chat
        if not conversation:
            return None

        # Prepend the system message: ChatConversation stores only
        # the user and assistant turns.
        messages = [
            {"role": "system", "content": "You are a helpful assistant."},
            *conversation.get_messages(),
        ]

        try:
            stream = client.chat.completions.create(
                model="openai.gpt-oss-120b",
                messages=messages,
                stream=True,
            )
        except openai.RateLimitError:
            raise vkt.UserError("Fair usage limit reached. Please try again later.")

        # Yield only the chunks that carry text; the final chunk's delta is empty.
        text_stream = (
            chunk.choices[0].delta.content
            for chunk in stream
            if chunk.choices[0].delta.content is not None
        )
        return vkt.ChatResult(conversation, text_stream)
```
The VIKTOR SDK supports only the `user` and `assistant` roles in `ChatConversation`. Prepend system messages manually, as shown above.
The service enforces fair-usage limits. If your app exceeds the allowed token usage, the API returns a `429 Too Many Requests` response, which the `openai` library raises as `openai.RateLimitError`. Catch it and raise a `UserError` to surface a clear message, as shown above.
## Next step
To give the LLM access to your app's functions — running calculations, fetching data, querying models — see Tool Use.