Prompts and LLM Configuration

Use LlmModel and Prompt to control which model your agent uses, how it generates responses, and what instructions it follows.

Prerequisites

  • AgenticAI Core SDK installed and configured.
  • A valid connection configured for your LLM provider (OpenAI, Anthropic, or Azure OpenAI).

Configure the LLM model

Basic configuration

from agenticai_core.designtime.models.llm_model import LlmModel, LlmModelConfig

llm = LlmModel(
    model="gpt-4o",
    provider="Open AI",
    connection_name="Default Connection",
    max_timeout="60 Secs",
    max_iterations="25",
    modelConfig=LlmModelConfig(
        temperature=0.7,
        max_tokens=1600,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0
    )
)

Builder pattern

Use LlmModelBuilder and LlmModelConfigBuilder for a fluent configuration style:
from agenticai_core.designtime.models.llm_model import (
    LlmModelBuilder, LlmModelConfigBuilder
)

# Build config
config_dict = LlmModelConfigBuilder() \
    .set_temperature(0.7) \
    .set_max_tokens(1600) \
    .set_top_p(0.9) \
    .build()

config = LlmModelConfig(**config_dict)

# Build model
llm_dict = LlmModelBuilder() \
    .set_model("gpt-4o") \
    .set_provider("Open AI") \
    .set_connection_name("Default") \
    .set_model_config(config) \
    .build()

llm = LlmModel(**llm_dict)
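
The two-step pattern above (build a plain dict, then unpack it into the model class) is easy to miss. A minimal sketch of how such a fluent builder typically works, using a hypothetical `ConfigBuilder` stand-in rather than the SDK's actual implementation:

```python
# Hypothetical stand-in for the builder pattern shown above (not SDK code):
# each setter records a value and returns self, enabling method chaining;
# build() hands back a plain dict ready for ** unpacking.
class ConfigBuilder:
    def __init__(self):
        self._values = {}

    def set_temperature(self, value):
        self._values["temperature"] = value
        return self  # returning self is what enables the fluent chain

    def set_max_tokens(self, value):
        self._values["max_tokens"] = value
        return self

    def build(self):
        return dict(self._values)

config_dict = ConfigBuilder().set_temperature(0.7).set_max_tokens(1600).build()
# config_dict == {"temperature": 0.7, "max_tokens": 1600}
```

Because `build()` returns a dict, the result can be inspected, logged, or merged before the final `LlmModelConfig(**config_dict)` call.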

Supported providers

OpenAI

llm = LlmModel(
    model="gpt-4o",
    provider="Open AI",
    connection_name="OpenAI Connection",
    modelConfig=LlmModelConfig(
        temperature=0.7,
        max_tokens=1600,
        frequency_penalty=0.0,
        presence_penalty=0.0,
        top_p=1.0
    )
)

Anthropic (Claude)

llm = LlmModel(
    model="claude-3-5-sonnet-20240620",
    provider="Anthropic",
    connection_name="Anthropic Connection",
    modelConfig=LlmModelConfig(
        temperature=1.0,
        max_tokens=1024,
        top_p=0.7,
        top_k=5  # Anthropic-specific
    )
)

Azure OpenAI

llm = LlmModel(
    model="gpt-4",
    provider="Azure OpenAI",
    connection_name="Azure Connection",
    modelConfig=LlmModelConfig(
        temperature=0.8,
        max_tokens=2048
    )
)

LLM parameters

Temperature (0.0–2.0)

Controls output randomness. Lower values produce more predictable responses; higher values produce more varied ones.
Range   | Behavior               | Use for
0.0–0.3 | Deterministic, focused | Factual queries, data extraction
0.4–0.7 | Balanced               | General-purpose agents
0.8–1.5 | Creative, diverse      | Brainstorming, content generation
1.6–2.0 | Highly random          | Experimental use cases
# Factual task
config = LlmModelConfig(temperature=0.1)

# Balanced
config = LlmModelConfig(temperature=0.7)

# Creative
config = LlmModelConfig(temperature=1.2)
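
To see why lower values are more deterministic, note that temperature divides the model's logits before the softmax. This illustrative (provider-independent) sketch shows the effect:

```python
# Illustrative only: temperature rescales the model's output distribution.
# Low T sharpens it toward the top token; high T flattens it.
import math

def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.1)   # top token takes nearly all mass
high = softmax_with_temperature(logits, 2.0)  # mass spread across tokens
```

At `temperature=0.1` the highest-logit token dominates almost completely; at `2.0` the distribution is much closer to uniform, which is the source of the "creative" behavior.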

Max tokens

Sets the maximum number of tokens the model generates per response.
Response type      | Recommended range
Short answers      | 500–1000
Detailed responses | 1000–2000
Long-form content  | 2000–4000
config = LlmModelConfig(
    max_tokens=1600  # Moderate response length
)
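
When picking a budget, a common rule of thumb (an approximation only, roughly four characters per token for English text) can help size `max_tokens` before you measure with a real tokenizer:

```python
# Rough sizing heuristic, assuming ~4 characters per token for English text.
# Use your provider's tokenizer for billing-accurate counts; this only helps
# choose a ballpark max_tokens value.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

expected_reply = "A detailed answer of roughly three short paragraphs." * 20
budget = estimate_tokens(expected_reply)  # then round up with some headroom
```

Setting `max_tokens` slightly above the estimate avoids truncated responses without paying for unused headroom.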

Top P (0.0–1.0)

Nucleus sampling parameter — controls the token pool the model samples from.
  • 0.1–0.5: Focused, less diverse sampling.
  • 0.6–0.9: Balanced diversity.
  • 0.95–1.0: Maximum diversity.
config = LlmModelConfig(top_p=0.9)
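
Conceptually, nucleus sampling keeps the smallest set of tokens whose cumulative probability reaches `top_p` and renormalizes over that set. An illustrative sketch (not any provider's actual sampler):

```python
# Sketch of nucleus (top-p) filtering: rank tokens by probability, keep the
# smallest prefix whose cumulative mass reaches top_p, then renormalize.
def top_p_filter(probs, top_p):
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}

probs = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}
filtered = top_p_filter(probs, 0.75)  # only "the" and "a" survive the cutoff
```

Low `top_p` values prune the long tail aggressively, which is why they produce focused, less diverse output.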

Penalties (−2.0 to 2.0)

Reduce repetition in responses:
  • frequency_penalty: Penalizes tokens that appear frequently in the output.
  • presence_penalty: Encourages the model to introduce new topics.
config = LlmModelConfig(
    frequency_penalty=0.5,  # Penalize frequent tokens
    presence_penalty=0.3    # Encourage topic diversity
)
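
OpenAI's API reference describes the mechanism roughly as follows: each candidate token's logit is reduced by `frequency_penalty` times the number of times it has already appeared, plus `presence_penalty` once if it has appeared at all. A minimal sketch of that formula (illustrative, not provider code):

```python
# Sketch of the penalty formula as documented by OpenAI: repeated tokens are
# penalized proportionally to their count (frequency), plus a flat penalty
# for having appeared at all (presence). Unseen tokens are unaffected.
def apply_penalties(logit, count, frequency_penalty, presence_penalty):
    penalty = frequency_penalty * count
    if count > 0:
        penalty += presence_penalty
    return logit - penalty

seen = apply_penalties(2.0, 3, 0.5, 0.3)    # 2.0 - (0.5*3 + 0.3) ≈ 0.2
unseen = apply_penalties(2.0, 0, 0.5, 0.3)  # unchanged: 2.0
```

This is why `frequency_penalty` curbs verbatim repetition while `presence_penalty` nudges the model toward tokens (and hence topics) it has not used yet.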

Configure prompts

System prompt

Sets the base role for the agent:
from agenticai_core.designtime.models.prompt import Prompt

prompt = Prompt(
    system="You are a helpful assistant."
)

Custom prompt

Provides detailed instructions and context beyond the system role:
prompt = Prompt(
    system="You are a helpful assistant.",
    custom="""You are an intelligent banking assistant designed to help
    customers manage their financial needs efficiently and securely.

    ## Your Capabilities
    - Check account balances
    - Process transactions
    - Answer banking policy questions
    - Provide loan information

    ## Customer Context
    You have access to:
    {{memory.accountInfo.accounts}}

    Use this information for quick responses.
    """
)

Instructions

Pass structured rules as a list. Use instructions for compliance, tone, and handling guidelines — especially for sensitive domains:
prompt = Prompt(
    system="You are a banking assistant.",
    custom="Help customers with account management.",
    instructions=[
        """### Security Protocols
        - Never ask for passwords, PINs, or CVV numbers
        - If request seems suspicious, politely decline""",

        """### Speaking Style
        - Use natural, conversational language
        - Keep responses concise
        - Provide key information first""",

        """### Handling Requests
        1. Greet the customer warmly
        2. Identify their need
        3. Execute the request efficiently
        4. Summarize and ask if anything else needed"""
    ]
)
Security guidance: Always include a security instruction block for apps that handle sensitive data:
instructions=[
    """### Security
    - Never ask for passwords, PINs, CVV, or OTPs
    - Verify unusual requests
    - Escalate suspicious activity"""
]
Voice agent guidance: For voice or audio agents, add a speaking style instruction:
instructions=[
    """### Speaking Style
    - Use natural, conversational language
    - Avoid markdown formatting
    - Speak numbers clearly
    - Use pauses with commas
    - Keep responses concise"""
]

Template variables

Prompts support runtime variable substitution using {{variable}} syntax:
Variable                | Description
{{app_name}}            | Application name.
{{app_description}}     | Application description.
{{agent_name}}          | Current agent name.
{{memory.store.field}}  | Access memory store data.
{{session_id}}          | Current session identifier.
prompt = Prompt(
    custom="""You are acting as {{agent_name}} for the application "{{app_name}}".

    Application Description:
    {{app_description}}

    Customer Account Information:
    {{memory.accountInfo.accounts}}

    Use the above context to provide quick, accurate responses.
    """
)
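
To make the substitution behavior concrete, here is an illustrative resolver (a hypothetical helper, not the SDK's actual template engine) showing how dotted paths like `memory.accountInfo.accounts` can walk nested data:

```python
# Hypothetical sketch of {{variable}} substitution: dotted paths are split
# on "." and each segment indexes one level deeper into the context dict.
import re

def render(template, context):
    def resolve(match):
        value = context
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", resolve, template)

context = {
    "agent_name": "BillingAgent",
    "memory": {"accountInfo": {"accounts": "2 active accounts"}},
}
text = render("Acting as {{agent_name}}: {{memory.accountInfo.accounts}}", context)
# -> "Acting as BillingAgent: 2 active accounts"
```

The key point for prompt authors: a variable is replaced with the string form of whatever the path resolves to at runtime, so structure your memory store with that in mind.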

Orchestrator prompts

For supervisor or orchestrator agents, define routing rules in the custom prompt:
supervisor_prompt = Prompt(
    system="You are a helpful assistant.",
    custom="""You are an AI Supervisor for "{{app_name}}".

    ### Your Team
    You manage multiple workers:
    - BillingAgent: Handles payments and billing
    - SupportAgent: General customer support
    - TechnicalAgent: Technical issues

    ### Routing Rules
    1. **Small-talk**: Route to user with friendly response
    2. **Direct Routing**: Match requests to worker expertise
    3. **Follow-up**: Route responses to same worker
    4. **Route to user**: When unrelated or complete
    5. **Multi-Intent**: Break into sequential requests
    """
)
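
In production the supervisor LLM performs this routing from the prompt alone, but the decision structure it is being asked to follow can be sketched in plain code. This is a hypothetical keyword-based illustration, not how the SDK routes:

```python
# Hypothetical illustration of the routing rules above: match a request
# against each worker's expertise, fall back to replying to the user
# directly for small talk or unrelated requests. The real supervisor does
# this reasoning via the LLM, not keyword matching.
WORKERS = {
    "BillingAgent": ["payment", "invoice", "billing", "refund"],
    "SupportAgent": ["help", "question", "account"],
    "TechnicalAgent": ["error", "bug", "crash", "install"],
}

def route(request: str) -> str:
    text = request.lower()
    for worker, keywords in WORKERS.items():
        if any(k in text for k in keywords):
            return worker
    return "user"  # rule 1 and rule 4: respond to the user directly

destination = route("I need a refund for my last invoice")  # "BillingAgent"
```

Writing the rules in the prompt as explicitly as this code is written tends to make the supervisor's routing more predictable.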

Task-specific configurations

Match your LlmModelConfig to the nature of the agent's task.

Factual tasks: use low temperature for consistent, accurate responses:
LlmModelConfig(
    temperature=0.1,  # Low for consistency
    max_tokens=800
)
Creative tasks: use higher temperature for varied output:
LlmModelConfig(
    temperature=1.0,  # Higher for creativity
    max_tokens=2000
)
Balanced (general-purpose):
LlmModelConfig(
    temperature=0.7,
    max_tokens=1600,
    top_p=0.9
)
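
If you maintain several agents, it can help to centralize these presets. A hypothetical convenience mapping (not part of the SDK) from which you can unpack into `LlmModelConfig`:

```python
# Hypothetical preset table mirroring the three profiles above; pick a
# starting point by task type, then tune individual parameters from there.
TASK_PRESETS = {
    "factual":  {"temperature": 0.1, "max_tokens": 800},
    "creative": {"temperature": 1.0, "max_tokens": 2000},
    "balanced": {"temperature": 0.7, "max_tokens": 1600, "top_p": 0.9},
}

def preset_for(task: str) -> dict:
    return dict(TASK_PRESETS[task])  # copy so callers can tweak safely

# e.g. config = LlmModelConfig(**preset_for("balanced"))
```

Keeping presets in one place makes it easy to adjust all factual agents at once when you retune for a new model version.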

Optimization tips

Cost
  • Use smaller models for simple, repetitive tasks.
  • Set max_tokens to the minimum needed for the expected response length.
  • Set max_iterations to limit unnecessary tool calls.
  • Configure reasonable timeouts to avoid runaway sessions.
Quality
  • Use the latest model versions for your provider.
  • Increase max_tokens when detailed responses are required.
  • Lower temperature for tasks that require consistency.
  • Increase max_iterations for complex multi-step workflows.