Runtime
The core execution environment — pipeline orchestration, registries, failure handling, and configuration.
Version: v0.5.x (Model Provider System)
Overview
The COREtex runtime is the core execution environment of the CortX platform. It provides the primitives needed to execute intelligent request pipelines while remaining completely independent of any specific module implementation.
Runtime code lives in coretex/.
Runtime Responsibilities
The runtime is responsible for exactly these systems:
| System | Description |
|---|---|
| Execution lifecycle | Receiving requests, managing context, returning responses |
| Pipeline orchestration | Running the classifier → router → worker → executor flow |
| Module loading | Importing modules and calling their register() function |
| Registry management | Holding registered classifiers, routers, workers, tools, model providers |
| Event emission | Emitting structured log events for observability |
| Configuration | Loading settings from environment or .env file |
The runtime must never contain integrations, tools, model providers, or application logic.
Module Architecture
Modules extend the runtime by registering capabilities at startup. The runtime then looks up these capabilities by name during pipeline execution.
Dependency direction is always:
modules → runtime
The runtime never imports from modules/. All coupling is through registry lookups.
Module Structure
Each module lives in modules/<module_name>/ and must expose a module.py file with a register() function:
from coretex.registry.model_registry import ModelProviderRegistry
from coretex.registry.module_registry import ModuleRegistry
from coretex.registry.tool_registry import ToolRegistry


def register(
    module_registry: ModuleRegistry,
    tool_registry: ToolRegistry,
    model_registry: ModelProviderRegistry,
) -> None:
    ...
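For illustration, a hypothetical module might fill in that function as sketched below (reusing the imports shown above). The echo_module name, the echo_tool function, and the tool_registry.register(name, tool) method are assumptions for the example, not documented runtime API.

# modules/echo_module/module.py -- hypothetical example module
def echo_tool(text: str) -> str:
    """Trivial tool used only to illustrate registration."""
    return text


def register(
    module_registry: ModuleRegistry,
    tool_registry: ToolRegistry,
    model_registry: ModelProviderRegistry,
) -> None:
    # The name registered here ("echo") is what the runtime later resolves
    # via a registry lookup during pipeline execution.
    tool_registry.register("echo", echo_tool)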
See the Module Development guide for full details.
Registries
Registries are the extension points between the runtime and modules.
| Registry | Location | Holds |
|---|---|---|
| ModuleRegistry | coretex/registry/module_registry.py | Classifiers, Routers, Workers |
| ToolRegistry | coretex/registry/tool_registry.py | Tools |
| ModelProviderRegistry | coretex/registry/model_registry.py | Model backends |
| PipelineRegistry | coretex/registry/pipeline_registry.py | Named pipelines (v0.4+) |
Registry Safety Rules
Each registry rejects duplicate registrations and unknown lookups, using consistent error messages:
- ToolRegistry and ModuleRegistry use "Component already registered" / "Unknown component".
- ModelProviderRegistry uses "Model provider already registered" / "Unknown model provider".
- PipelineRegistry uses "Pipeline already registered" / "Unknown pipeline".
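As a rough sketch of how a registry might enforce these rules (the class layout, method names, and exception types here are illustrative, not the actual coretex implementation):

class ToolRegistry:
    """Illustrative sketch only; the real registry lives in coretex/registry/tool_registry.py."""

    def __init__(self) -> None:
        self._tools: dict[str, object] = {}

    def register(self, name: str, tool: object) -> None:
        if name in self._tools:
            raise ValueError(f"Component already registered: {name}")   # duplicate registration
        self._tools[name] = tool

    def get(self, name: str) -> object:
        if name not in self._tools:
            raise KeyError(f"Unknown component: {name}")                 # failed lookup
        return self._tools[name]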
Pipeline Execution Flow
Every request follows this deterministic pipeline:
ExecutionContext created
│
▼
event=request_received
│
▼
classifier.classify(user_input)
│
├─► event=classifier_start
├─► event=classifier_complete (includes duration_ms, intent, confidence)
│
▼
router.route(intent)
│
├─► event=router_selected
│
▼
[if handler == "clarify"]
└─► return clarification response
│
[if handler == "worker"]
├─► event=worker_start (includes model_provider)
├─► worker.generate(user_input, intent)
├─► event=worker_complete (includes duration_ms)
│
├─► parse_agent_output(response) → AgentAction
│
├─► executor.execute(action)
│ ├─► action=respond → return content directly
│ └─► action=tool → event=tool_execute → tool.execute() → event=tool_execute_complete
│
▼
event=request_complete (includes intent, confidence, handler, all latencies)
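A simplified sketch of that orchestration (most events and all error handling omitted; helper names and return-value shapes not shown in the flow above are assumptions):

def run_pipeline(user_input, classifier, router, worker, executor, events):
    # Illustrative sketch only -- not the actual PipelineRunner code.
    ctx = new_context(user_input)   # hypothetical helper; see ExecutionContext below
    events.emit("request_received", request_id=ctx.request_id)

    # The (intent, confidence) tuple return is an assumption.
    ctx.intent, ctx.confidence = classifier.classify(ctx.user_input)
    ctx.handler = router.route(ctx.intent)
    events.emit("router_selected", request_id=ctx.request_id,
                intent=ctx.intent, handler=ctx.handler)

    if ctx.handler == "clarify":
        return "Could you clarify what you mean?"   # clarification response

    response = worker.generate(ctx.user_input, ctx.intent)
    action = parse_agent_output(response)           # -> AgentAction
    result = executor.execute(action)               # respond directly or run a tool

    events.emit("request_complete", request_id=ctx.request_id,
                intent=ctx.intent, handler=ctx.handler)
    return result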
ExecutionContext
ExecutionContext carries per-request state through the pipeline:
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ExecutionContext:
    user_input: str
    request_id: str                      # auto-generated UUID hex
    intent: Optional[str]                # set after classification
    confidence: float                    # set after classification
    handler: Optional[str]               # set after routing
    t_start: float                       # monotonic timestamp at creation
    timestamp: float                     # wall-clock time at creation
    metadata: Optional[Dict[str, Any]]   # optional module metadata
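Based on the field comments above, constructing a context presumably looks roughly like this (a sketch, not the runtime's actual factory code):

import time
import uuid

ctx = ExecutionContext(
    user_input="what time is it?",
    request_id=uuid.uuid4().hex,   # auto-generated UUID hex
    intent=None,                   # filled in after classification
    confidence=0.0,
    handler=None,                  # filled in after routing
    t_start=time.monotonic(),      # monotonic clock, used for latency measurement
    timestamp=time.time(),         # wall-clock time, used for logging
    metadata=None,
)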
Model Provider Flow
v0.5 formalises inference behind ModelProvider.generate() and ModelProvider.chat(). The default model_provider_ollama module registers "ollama" once at bootstrap, and the classifier and worker receive that provider by explicit injection during module registration.
This means:
- the runtime still never depends on a concrete backend
- pipeline orchestration stays deterministic
- future providers can be added without changing PipelineRunner
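The ModelProvider interface itself is not reproduced in this document; as a sketch, it could look like the following, where the parameter names and return types are assumptions:

from typing import Protocol


class ModelProvider(Protocol):
    # Sketch only; the real interface defines generate() and chat(), but the
    # exact signatures shown here are assumptions.
    def generate(self, prompt: str, model: str, max_tokens: int) -> str:
        ...

    def chat(self, messages: list[dict[str, str]], model: str, max_tokens: int) -> str:
        ...

The classifier and worker hold whichever provider instance was injected at registration time, so swapping backends is a registration change, not a change to pipeline orchestration.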
Failure Behaviour
All failure modes are handled gracefully — no pipeline error produces an unhandled exception.
| Failure | Event Logged | Response |
|---|---|---|
| Classifier HTTP failure | event=pipeline_classifier_failure | Fallback: intent=ambiguous, clarification response |
| Worker HTTP failure | event=pipeline_worker_failure | Worker failure response, intent=ambiguous |
| Tool lookup failure | event=pipeline_tool_failure | Worker failure response |
| Tool runtime exception | event=pipeline_tool_failure | Worker failure response |
| Agent JSON parse failure | event=pipeline_agent_parse_failure | Raw LLM output treated as plain text |
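For example, the classifier failure path in the run_pipeline sketch above could be wrapped roughly like this (the exception type caught and the fallback confidence value are assumptions):

try:
    ctx.intent, ctx.confidence = classifier.classify(ctx.user_input)
except Exception as exc:   # e.g. HTTP failure reaching the model backend
    events.emit_error(
        "pipeline_classifier_failure",
        request_id=ctx.request_id,
        error_type=type(exc).__name__,
    )
    ctx.intent, ctx.confidence = "ambiguous", 0.0   # fall back to a clarification response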
Structured Logging
All runtime log events use structured key=value format:
event=classifier_start request_id=abc123 classifier=classifier_basic model_provider=ollama
event=model_provider_chat_start request_id=abc123 model_provider=ollama model=llama3.2:3b
event=model_provider_chat_complete request_id=abc123 model_provider=ollama model=llama3.2:3b duration_ms=312
event=classifier_complete request_id=abc123 intent=execution confidence=0.92 duration_ms=312
event=router_selected request_id=abc123 intent=execution handler=worker
event=worker_start request_id=abc123 worker=worker_llm intent=execution model_provider=ollama
event=model_provider_generate_start request_id=abc123 model_provider=ollama model=llama3.2:3b
event=model_provider_generate_complete request_id=abc123 model_provider=ollama model=llama3.2:3b duration_ms=1450
event=worker_complete request_id=abc123 duration_ms=1450
event=request_complete request_id=abc123 intent=execution confidence=0.92 handler=worker classifier_latency_ms=312 worker_latency_ms=1450 total_latency_ms=1765
All log events include request_id for full request traceability.
Configuration
Runtime configuration is loaded from environment variables or a .env file:
| Setting | Default | Description |
|---|---|---|
| OLLAMA_BASE_URL | http://host.docker.internal:11434 | Ollama API endpoint |
| CLASSIFIER_MODEL | llama3.2:3b | Model used by the classifier |
| WORKER_MODEL | llama3.2:3b | Model used by the worker |
| CLASSIFIER_TIMEOUT | 60 | Classifier HTTP timeout (seconds) |
| WORKER_TIMEOUT | 300 | Worker HTTP timeout (seconds) |
| MAX_TOKENS | 256 | Maximum tokens per LLM response |
| INGRESS_PORT | 8000 | FastAPI ingress port |
| LOG_LEVEL | INFO | Python logging level |
| DEBUG_ROUTER | False | Emit event=router_decision at DEBUG level |
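A sketch of how these settings could be read with the standard library (the runtime's actual settings module, and any .env loading such as python-dotenv, is not shown):

import os

# Illustrative only: reads the documented variables with their documented defaults.
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://host.docker.internal:11434")
CLASSIFIER_MODEL = os.getenv("CLASSIFIER_MODEL", "llama3.2:3b")
WORKER_MODEL = os.getenv("WORKER_MODEL", "llama3.2:3b")
CLASSIFIER_TIMEOUT = int(os.getenv("CLASSIFIER_TIMEOUT", "60"))
WORKER_TIMEOUT = int(os.getenv("WORKER_TIMEOUT", "300"))
MAX_TOKENS = int(os.getenv("MAX_TOKENS", "256"))
INGRESS_PORT = int(os.getenv("INGRESS_PORT", "8000"))
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
DEBUG_ROUTER = os.getenv("DEBUG_ROUTER", "False").lower() in ("1", "true", "yes")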
EventBus
The EventBus (coretex/runtime/events.py) provides structured log emission:
event_bus.emit("classifier_complete", request_id=request_id, intent=intent, duration_ms=ms)
event_bus.emit_warning("module_registered_nothing", module="my_module")
event_bus.emit_error("pipeline_classifier_failure", request_id=request_id, error_type=...)
In v0.5.x the EventBus is still a structured log wrapper. A fuller event system is planned for a later phase.
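A minimal sketch of such a wrapper, matching the calls and the key=value output shown above (the real coretex/runtime/events.py may differ):

import logging


class EventBus:
    # Illustrative sketch of a structured-log wrapper; not the actual implementation.
    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger

    def _format(self, event: str, **fields: object) -> str:
        parts = [f"event={event}"] + [f"{key}={value}" for key, value in fields.items()]
        return " ".join(parts)

    def emit(self, event: str, **fields: object) -> None:
        self._logger.info(self._format(event, **fields))

    def emit_warning(self, event: str, **fields: object) -> None:
        self._logger.warning(self._format(event, **fields))

    def emit_error(self, event: str, **fields: object) -> None:
        self._logger.error(self._format(event, **fields))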