Runtime

The core execution environment — pipeline orchestration, registries, failure handling, and configuration.

Version: v0.5.x (Model Provider System)

Overview

The COREtex runtime is the core execution environment of the CortX platform. It provides the primitives needed to execute intelligent request pipelines while remaining completely independent of any specific module implementation.

Runtime code lives in coretex/.


Runtime Responsibilities

The runtime is responsible for exactly these systems:

System                  Description
Execution lifecycle     Receiving requests, managing context, returning responses
Pipeline orchestration  Running the classifier → router → worker → executor flow
Module loading          Importing modules and calling their register() function
Registry management     Holding registered classifiers, routers, workers, tools, model providers
Event emission          Emitting structured log events for observability
Configuration           Loading settings from environment or .env file

The runtime must never contain integrations, tools, model providers, or application logic.


Module Architecture

Modules extend the runtime by registering capabilities at startup. The runtime then looks up these capabilities by name during pipeline execution.

Dependency direction is always:

modules → runtime

The runtime never imports from modules/. All coupling is through registry lookups.

Module Structure

Each module lives in modules/<module_name>/ and must expose a module.py file with a register() function:

# Import paths follow the registry locations listed in the Registries section below.
from coretex.registry.model_registry import ModelProviderRegistry
from coretex.registry.module_registry import ModuleRegistry
from coretex.registry.tool_registry import ToolRegistry

def register(
    module_registry: ModuleRegistry,
    tool_registry: ToolRegistry,
    model_registry: ModelProviderRegistry,
) -> None:
    # Register this module's classifiers, routers, workers, tools, or providers here.
    ...

See the Module Development guide for full details.
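
The module-loading responsibility amounts to importing each module package and invoking this hook. The following is a minimal sketch, assuming importlib-based discovery; load_modules and the directory scan are illustrative, not the runtime's actual loader:

import importlib
from pathlib import Path

def load_modules(module_registry, tool_registry, model_registry) -> None:
    # Import each modules/<module_name>/module.py and call its register() hook.
    for module_dir in sorted(Path("modules").iterdir()):
        if not (module_dir / "module.py").exists():
            continue
        mod = importlib.import_module(f"modules.{module_dir.name}.module")
        mod.register(module_registry, tool_registry, model_registry)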


Registries

Registries are the extension points between the runtime and modules.

Registry               Location                                Holds
ModuleRegistry         coretex/registry/module_registry.py    Classifiers, Routers, Workers
ToolRegistry           coretex/registry/tool_registry.py      Tools
ModelProviderRegistry  coretex/registry/model_registry.py     Model backends
PipelineRegistry       coretex/registry/pipeline_registry.py  Named pipelines (v0.4+)

Registry Safety Rules

Every registry rejects duplicate names at registration time and fails loudly when a lookup misses, as the sketch after this list illustrates. Only the error wording differs:

  • ToolRegistry and ModuleRegistry report "Component already registered" / "Unknown component".
  • ModelProviderRegistry reports "Model provider already registered" / "Unknown model provider".
  • PipelineRegistry reports "Pipeline already registered" / "Unknown pipeline".
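
A minimal sketch of the shape these rules imply, assuming a dict-backed registry (class internals and exception types here are illustrative):

from typing import Any, Dict

class ToolRegistry:
    """Maps tool names to tool instances; duplicate and unknown names fail loudly."""

    def __init__(self) -> None:
        self._tools: Dict[str, Any] = {}

    def register(self, name: str, tool: Any) -> None:
        if name in self._tools:
            raise ValueError(f"Component already registered: {name}")
        self._tools[name] = tool

    def get(self, name: str) -> Any:
        if name not in self._tools:
            raise KeyError(f"Unknown component: {name}")
        return self._tools[name]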

Pipeline Execution Flow

Every request follows this deterministic pipeline:

ExecutionContext created
    │
    ▼
event=request_received
    │
    ▼
classifier.classify(user_input)
    │
    ├─► event=classifier_start
    ├─► event=classifier_complete  (includes duration_ms, intent, confidence)
    │
    ▼
router.route(intent)
    │
    ├─► event=router_selected
    │
    ▼
[if handler == "clarify"]
    └─► return clarification response
    │
[if handler == "worker"]
    ├─► event=worker_start  (includes model_provider)
    ├─► worker.generate(user_input, intent)
    ├─► event=worker_complete  (includes duration_ms)
    │
    ├─► parse_agent_output(response)  → AgentAction
    │
    ├─► executor.execute(action)
    │     ├─► action=respond   → return content directly
    │     └─► action=tool      → event=tool_execute → tool.execute() → event=tool_execute_complete
    │
    ▼
event=request_complete  (includes intent, confidence, handler, all latencies)
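
Condensed into code, the diagram reduces to roughly the following. This is an illustrative sketch, not the real PipelineRunner: the classify/route/generate/execute calls and parse_agent_output come from the diagram, while run_pipeline, the return shapes, and the clarification wording are assumptions:

def run_pipeline(classifier, router, worker, executor, events, user_input: str) -> str:
    ctx = ExecutionContext(user_input=user_input)
    events.emit("request_received", request_id=ctx.request_id)

    ctx.intent, ctx.confidence = classifier.classify(ctx.user_input)  # assumed return shape
    ctx.handler = router.route(ctx.intent)

    if ctx.handler == "clarify":
        return "Could you clarify what you mean?"  # clarification response (wording illustrative)

    response = worker.generate(ctx.user_input, ctx.intent)
    action = parse_agent_output(response)  # AgentAction: respond or tool call
    result = executor.execute(action)

    events.emit("request_complete", request_id=ctx.request_id,
                intent=ctx.intent, confidence=ctx.confidence, handler=ctx.handler)
    return result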

ExecutionContext

ExecutionContext carries per-request state through the pipeline:

import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class ExecutionContext:
    user_input: str
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex)  # auto-generated UUID hex
    intent: Optional[str] = None    # set after classification
    confidence: float = 0.0         # set after classification
    handler: Optional[str] = None   # set after routing
    t_start: float = field(default_factory=time.monotonic)  # monotonic timestamp at creation
    timestamp: float = field(default_factory=time.time)     # wall-clock time at creation
    metadata: Optional[Dict[str, Any]] = None                # optional module metadata
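
Assuming the auto-generated defaults shown above, the runtime needs only the user input to create a context at the start of a request:

ctx = ExecutionContext(user_input="what time is it?")
assert ctx.intent is None   # not yet classified
print(ctx.request_id)       # fresh UUID hex for this request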

Model Provider Flow

v0.5 formalises inference behind ModelProvider.generate() and ModelProvider.chat(). The default model_provider_ollama module registers "ollama" once at bootstrap, and the classifier and worker receive that provider by explicit injection during module registration.

This means:

  • the runtime still never depends on a concrete backend
  • pipeline orchestration stays deterministic
  • future providers can be added without changing PipelineRunner
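
The contract behind that decoupling can be pictured as a small interface. A minimal sketch, assuming a Protocol-style definition; the exact signatures in coretex may differ:

from typing import List, Protocol

class ModelProvider(Protocol):
    def generate(self, prompt: str, model: str) -> str:
        """Single-shot completion (used by the worker, per the log events below)."""
        ...

    def chat(self, messages: List[dict], model: str) -> str:
        """Multi-turn chat completion (used by the classifier, per the log events below)."""
        ...

# At bootstrap, the default module registers its backend exactly once
# (OllamaProvider is a stand-in name for the module's provider class):
# model_registry.register("ollama", OllamaProvider(base_url=OLLAMA_BASE_URL))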

Failure Behaviour

All failure modes are handled gracefully — no pipeline error produces an unhandled exception.

Failure                   Event Logged                        Response
Classifier HTTP failure   event=pipeline_classifier_failure   Fallback: intent=ambiguous, clarification response
Worker HTTP failure       event=pipeline_worker_failure       Worker failure response, intent=ambiguous
Tool lookup failure       event=pipeline_tool_failure         Worker failure response
Tool runtime exception    event=pipeline_tool_failure         Worker failure response
Agent JSON parse failure  event=pipeline_agent_parse_failure  Raw LLM output treated as plain text
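
The classifier row, for example, comes down to a guard like this sketch (exception breadth and the fallback wording are assumptions; the event name and fallback intent come from the table above):

try:
    ctx.intent, ctx.confidence = classifier.classify(ctx.user_input)
except Exception as exc:  # e.g. the classifier's HTTP call to the model backend failed
    events.emit_error("pipeline_classifier_failure",
                      request_id=ctx.request_id,
                      error_type=type(exc).__name__)
    ctx.intent, ctx.confidence = "ambiguous", 0.0
    return "Could you clarify what you mean?"  # fallback clarification response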

Structured Logging

All runtime log events use structured key=value format:

event=classifier_start request_id=abc123 classifier=classifier_basic model_provider=ollama
event=model_provider_chat_start request_id=abc123 model_provider=ollama model=llama3.2:3b
event=model_provider_chat_complete request_id=abc123 model_provider=ollama model=llama3.2:3b duration_ms=312
event=classifier_complete request_id=abc123 intent=execution confidence=0.92 duration_ms=312
event=router_selected request_id=abc123 intent=execution handler=worker
event=worker_start request_id=abc123 worker=worker_llm intent=execution model_provider=ollama
event=model_provider_generate_start request_id=abc123 model_provider=ollama model=llama3.2:3b
event=model_provider_generate_complete request_id=abc123 model_provider=ollama model=llama3.2:3b duration_ms=1450
event=worker_complete request_id=abc123 duration_ms=1450
event=request_complete request_id=abc123 intent=execution confidence=0.92 handler=worker classifier_latency_ms=312 worker_latency_ms=1450 total_latency_ms=1765

All log events include request_id for full request traceability.


Configuration

Runtime configuration is loaded from environment variables or a .env file:

Setting             Default                            Description
OLLAMA_BASE_URL     http://host.docker.internal:11434  Ollama API endpoint
CLASSIFIER_MODEL    llama3.2:3b                        Model used by the classifier
WORKER_MODEL        llama3.2:3b                        Model used by the worker
CLASSIFIER_TIMEOUT  60                                 Classifier HTTP timeout (seconds)
WORKER_TIMEOUT      300                                Worker HTTP timeout (seconds)
MAX_TOKENS          256                                Maximum tokens per LLM response
INGRESS_PORT        8000                               FastAPI ingress port
LOG_LEVEL           INFO                               Python logging level
DEBUG_ROUTER        False                              Emit event=router_decision at DEBUG level
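
A sketch of how these settings could be loaded, assuming python-dotenv handles the .env file; the Settings class is an illustrative name, not the runtime's actual config object:

import os
from dataclasses import dataclass
from dotenv import load_dotenv

load_dotenv()  # .env fills in anything not already set in the environment

@dataclass(frozen=True)
class Settings:
    ollama_base_url: str = os.getenv("OLLAMA_BASE_URL", "http://host.docker.internal:11434")
    classifier_model: str = os.getenv("CLASSIFIER_MODEL", "llama3.2:3b")
    worker_model: str = os.getenv("WORKER_MODEL", "llama3.2:3b")
    classifier_timeout: int = int(os.getenv("CLASSIFIER_TIMEOUT", "60"))
    worker_timeout: int = int(os.getenv("WORKER_TIMEOUT", "300"))
    max_tokens: int = int(os.getenv("MAX_TOKENS", "256"))
    ingress_port: int = int(os.getenv("INGRESS_PORT", "8000"))
    log_level: str = os.getenv("LOG_LEVEL", "INFO")
    debug_router: bool = os.getenv("DEBUG_ROUTER", "False").lower() == "true"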

EventBus

The EventBus (coretex/runtime/events.py) provides structured log emission:

event_bus.emit("classifier_complete", request_id=request_id, intent=intent, duration_ms=ms)
event_bus.emit_warning("module_registered_nothing", module="my_module")
event_bus.emit_error("pipeline_classifier_failure", request_id=request_id, error_type=...)

In v0.5.x the EventBus is still a structured log wrapper. A fuller event system is planned for a later phase.
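
Since the current EventBus is a thin wrapper over Python's logging module, its core can be sketched in a few lines (a minimal sketch; the real class may differ in detail):

import logging

class EventBus:
    def __init__(self, logger_name: str = "coretex") -> None:
        self._log = logging.getLogger(logger_name)

    def _format(self, event: str, **fields) -> str:
        # Render the structured key=value lines shown throughout this page.
        pairs = " ".join(f"{k}={v}" for k, v in fields.items())
        return f"event={event} {pairs}".strip()

    def emit(self, event: str, **fields) -> None:
        self._log.info(self._format(event, **fields))

    def emit_warning(self, event: str, **fields) -> None:
        self._log.warning(self._format(event, **fields))

    def emit_error(self, event: str, **fields) -> None:
        self._log.error(self._format(event, **fields))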