Class: Llm

class Llm:
    def generate(
        self,
        messages: list[dict[str, str]],
        *,
        model: str | None = None,
        max_tokens: int | None = None,
        temperature: float | None = None,
        provider_options: dict | None = None,
    ) -> LlmResponse: ...

    def generate_object(
        self,
        messages: list[dict[str, str]],
        schema: dict,
        *,
        model: str | None = None,
        max_tokens: int | None = None,
        temperature: float | None = None,
        provider_options: dict | None = None,
    ) -> LlmResponse: ...

Methods

generate()

Generate text from an LLM.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| messages | list[dict[str, str]] | Yes | Conversation messages with role and content |
| model | str \| None | No | Model identifier (resolution order applies) |
| max_tokens | int \| None | No | Maximum tokens to generate |
| temperature | float \| None | No | Sampling temperature (0.0 - 2.0) |
| provider_options | dict \| None | No | Provider-specific options passthrough |

Returns: LlmResponse

Raises: LlmError on generation failure

Example:
result = ctx.llm.generate(
    messages=[{"role": "user", "content": "Summarise this article"}],
    model="anthropic:claude-sonnet-4-6",
    max_tokens=1000,
    temperature=0.7,
)
print(result.text)

generate_object()

Generate structured output conforming to a JSON Schema.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| messages | list[dict[str, str]] | Yes | Conversation messages |
| schema | dict | Yes | JSON Schema for output structure |
| model | str \| None | No | Model identifier |
| max_tokens | int \| None | No | Maximum tokens |
| temperature | float \| None | No | Sampling temperature |
| provider_options | dict \| None | No | Provider-specific options |

Returns: LlmResponse with .object populated

Raises: LlmError on generation failure

Example:
schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary"],
}

result = ctx.llm.generate_object(
    messages=[{"role": "user", "content": "Analyse this"}],
    schema=schema,
    model="anthropic:claude-haiku-4-5",
)

data = result.object  # Parsed JSON object
print(data["summary"])
print(data.get("tags", []))
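
Because only "summary" is listed under "required" in the schema above, optional keys such as "tags" may be absent. As a defensive measure, a small helper can report which required keys are missing from a returned object. This is a minimal sketch: `missing_required` is a hypothetical helper, not part of the SDK, and it only inspects the top-level "required" list rather than performing full JSON Schema validation.

```python
from __future__ import annotations


def missing_required(schema: dict, data: dict) -> list[str]:
    """Return top-level keys listed in schema["required"] that data lacks."""
    return [key for key in schema.get("required", []) if key not in data]


schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary"],
}

# Stand-ins for result.object; in an agent this would come from generate_object().
complete = {"summary": "Short text"}
incomplete = {"tags": ["a", "b"]}

print(missing_required(schema, complete))
print(missing_required(schema, incomplete))
```

In an agent, the same check could be applied to `result.object` before the data is passed downstream.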

Model Resolution

Resolution order (first match wins):
  1. Fully qualified per-call - model="anthropic:claude-sonnet-4-6" used directly
  2. Bare per-call + decorator default - model="claude-sonnet-4-6" + @agent(llm={"provider": "anthropic"}) resolved to full identifier
  3. Decorator default only - @agent(llm={"provider": "anthropic", "model": "claude-sonnet-4-6"}) used when no model specified
  4. Error - No model specified and no decorator default
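
The resolution order above can be sketched as a plain function. This is an illustrative re-implementation, not the SDK's code: `resolve_model` is a hypothetical name, it assumes a colon separates provider from model in a fully qualified identifier, and it assumes a bare per-call model with no decorator provider is an error (the list above does not cover that case).

```python
from __future__ import annotations


def resolve_model(per_call: str | None, decorator_llm: dict | None) -> str:
    """Illustrative sketch of the documented resolution order (not SDK code)."""
    defaults = decorator_llm or {}
    provider = defaults.get("provider")
    default_model = defaults.get("model")

    if per_call is not None:
        # 1. A fully qualified per-call identifier is used directly.
        if ":" in per_call:
            return per_call
        # 2. A bare per-call name is combined with the decorator's provider.
        if provider:
            return f"{provider}:{per_call}"
        # Assumption: a bare name without a provider cannot be resolved.
        raise ValueError(f"Bare model {per_call!r} needs a decorator provider")

    # 3. No per-call model: fall back to the decorator default.
    if provider and default_model:
        return f"{provider}:{default_model}"

    # 4. Nothing to resolve from.
    raise ValueError("No model specified and no decorator default")


print(resolve_model("claude-sonnet-4-6", {"provider": "anthropic"}))
```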

LlmResponse

@dataclass
class LlmResponse:
    text: str | None           # Generated text (None for generate_object)
    object: dict | None        # Structured output dict (None for generate)
    model: str                 # Model identifier used (e.g., "anthropic:claude-sonnet-4-6")
    usage: dict                # {"input_tokens": 120, "output_tokens": 250}
    finish_reason: str         # "stop", "length", "content_filter", etc.
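
The fields above lend themselves to simple post-call checks, such as totalling token usage or detecting a cut-off response. The sketch below uses a local stand-in dataclass that mirrors the documented fields so it runs without the SDK. `summarize_response` is a hypothetical helper, and treating a finish_reason of "length" as "output was truncated" is an assumption based on common provider conventions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class LlmResponse:
    """Local stand-in mirroring the documented fields (not the SDK class)."""
    text: str | None
    object: dict | None
    model: str
    usage: dict
    finish_reason: str


def summarize_response(resp: LlmResponse) -> dict:
    """Collect the model name, total token count, and a truncation flag."""
    input_tokens = resp.usage.get("input_tokens", 0)
    output_tokens = resp.usage.get("output_tokens", 0)
    return {
        "model": resp.model,
        "total_tokens": input_tokens + output_tokens,
        # Assumption: "length" means generation stopped at the token limit.
        "truncated": resp.finish_reason == "length",
    }


example = LlmResponse(
    text="Partial answer",
    object=None,
    model="anthropic:claude-sonnet-4-6",
    usage={"input_tokens": 120, "output_tokens": 250},
    finish_reason="length",
)
print(summarize_response(example))
```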

Error Handling

from friday_agent_sdk import LlmError, agent, err, ok

@agent(id="resilient", version="1.0.0", description="Handles LLM failures")
def execute(prompt, ctx):
    try:
        result = ctx.llm.generate(..., model="expensive-model")
    except LlmError as e:
        # Error message from host (e.g., "Rate limit exceeded", "Invalid API key")
        return err(f"Primary model failed: {e}")

    return ok({"output": result.text})
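
Catching LlmError also makes it possible to try a second model before giving up. The following is a minimal sketch of that pattern, not an SDK feature: `generate_with_fallback` and `fake_generate` are hypothetical, and `LlmError` is redefined locally as a stand-in so the example runs on its own. Inside an agent you would pass `ctx.llm.generate` as the callable and import LlmError from friday_agent_sdk.

```python
class LlmError(Exception):
    """Local stand-in for friday_agent_sdk.LlmError."""


def generate_with_fallback(generate, messages, models):
    """Try each model in order; return the first successful result."""
    errors = []
    for model in models:
        try:
            return generate(messages=messages, model=model)
        except LlmError as e:
            # Remember why this model failed, then move on to the next one.
            errors.append(f"{model}: {e}")
    raise LlmError("All models failed: " + "; ".join(errors))


def fake_generate(*, messages, model):
    """Fake host call: the first model always fails, the second succeeds."""
    if model == "anthropic:claude-sonnet-4-6":
        raise LlmError("Rate limit exceeded")
    return f"reply from {model}"


result = generate_with_fallback(
    fake_generate,
    [{"role": "user", "content": "Hello!"}],
    ["anthropic:claude-sonnet-4-6", "anthropic:claude-haiku-4-5"],
)
print(result)
```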

Provider Options

Pass provider-specific configuration:
result = ctx.llm.generate(
    messages=[...],
    model="anthropic:claude-sonnet-4-6",
    provider_options={
        "anthropic": {
            "thinking": {"type": "enabled", "budgetTokens": 4000},
        },
    },
)
Options vary by provider. Common patterns:

Anthropic provider:
  • thinking - Enable extended reasoning with {"type": "enabled", "budgetTokens": <int>}
Claude Code provider:
  • systemPrompt - Either {"type": "preset", "preset": "..."} or {"type": "custom", "content": "..."}
  • effort - "low", "medium", "high"
  • fallbackModel - Model to use if primary fails
  • repo - Repository to clone and work in

Message Format

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "Analyse this code..."},
]
Valid roles: system, user, assistant
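
A message list can be checked against this format before it is sent. The helper below is a sketch under stated assumptions: `validate_messages` is a hypothetical function, not part of the SDK, and it only checks the role and content rules described above (each message is a dict with a valid role and a string content).

```python
VALID_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {index}: not a dict")
            continue
        role = message.get("role")
        if role not in VALID_ROLES:
            problems.append(f"message {index}: invalid role {role!r}")
        if not isinstance(message.get("content"), str):
            problems.append(f"message {index}: content must be a string")
    return problems


good = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
bad = [{"role": "tool", "content": "Hello!"}, {"role": "user"}]

print(validate_messages(good))
print(validate_messages(bad))
```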

Limitations

  • No streaming responses - The full response is returned at once; streaming is not yet supported
  • 5 MB implicit limit - Platform constraints on response size impose an implicit 5 MB cap

Why Host-Managed?

Agents run as native Python processes, and you can pip install additional packages into the agent environment. Even so, host-provided LLM calls are preferred because the host handles credential management, rate limiting, provider routing, and audit logging.

See Also

  • How to Call LLMs - Task-oriented guide
  • AgentContext - Parent context object