Database tables
The Atreides database is ClickHouse. It stores control-plane metadata and operational records only; snapshots, prepared frames, and model artifacts stay in filesystem/S3-backed stores.
The schema source of truth is src/infra/persistence/clickhouse/migrations.py.
Running just db-init applies pending migrations and records the applied versions in
schema_migrations.
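The apply-and-record behaviour can be sketched roughly as follows. This is an illustration only, not the code in migrations.py: the ClickHouseMigration fields, the execute callback, and the inlined INSERT statement are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ClickHouseMigration:
    # Hypothetical shape; the real class may carry different fields.
    version: str
    description: str
    statements: tuple[str, ...]


def pending_migrations(
    migrations: Iterable[ClickHouseMigration], applied_versions: set[str]
) -> list[ClickHouseMigration]:
    """Return migrations missing from the ledger, in declared order."""
    return [m for m in migrations if m.version not in applied_versions]


def apply_pending(
    migrations: Iterable[ClickHouseMigration],
    applied_versions: set[str],
    execute: Callable[[str], None],
) -> list[str]:
    """Run each pending migration, then record it in schema_migrations."""
    newly_applied = []
    for migration in pending_migrations(migrations, applied_versions):
        for statement in migration.statements:
            execute(statement)
        # Record the version only after its statements have run.
        # Values are inlined for brevity; a real runner should bind parameters.
        execute(
            "INSERT INTO schema_migrations (version, description) "
            f"VALUES ('{migration.version}', '{migration.description}')"
        )
        newly_applied.append(migration.version)
    return newly_applied
```

Passing a list's append method as execute is enough to inspect which statements would run without touching a database.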
Current migration-created tables
These tables are created by the current ClickHouse migrations.
| Table | Owner | Purpose |
|---|---|---|
| schema_migrations | migration runner | Records which ClickHouse schema migrations have already been applied. |
| model_definitions | modeling repository | Stores the latest persisted snapshots of ModelDefinition aggregates. |
| promotion_audits | modeling repository | Stores promotion attempts and their audit payloads. |
| forecast_runs | forecast result persistence job | Stores requested forecast runs, execution status, execution reference, and canonical forecast frames. |
| demand_snapshots | snapshot repository | Stores snapshot materialization command state and manifest URI. |
| configuration_revisions | configuration revision repository | Stores runtime configuration revisions by tenant, scope, kind, and status. |
schema_migrations
Internal migration ledger used by infra.persistence.clickhouse.migrations.
| Column | Type | Meaning |
|---|---|---|
| version | String | Migration version identifier. |
| description | String | Human-readable migration description. |
| applied_at | DateTime64(6, 'UTC') | Time when the migration was recorded. Defaults to now64(6). |
Engine: ReplacingMergeTree(applied_at)
Sort key: version
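For orientation, a DDL statement consistent with the columns, engine, and sort key listed for schema_migrations might look like the sketch below, held as a Python string the way a migration module could carry it. The constant name and the IF NOT EXISTS clause are assumptions; the authoritative statement lives in migrations.py and may differ.

```python
# Illustrative DDL derived from the documented columns, engine, and sort key.
# The constant name and IF NOT EXISTS are assumptions, not the project's code.
SCHEMA_MIGRATIONS_DDL = """
CREATE TABLE IF NOT EXISTS schema_migrations
(
    version String,
    description String,
    applied_at DateTime64(6, 'UTC') DEFAULT now64(6)
)
ENGINE = ReplacingMergeTree(applied_at)
ORDER BY version
""".strip()

print(SCHEMA_MIGRATIONS_DDL)
```

With applied_at as the ReplacingMergeTree version column, rows that share a version are collapsed to the one with the latest applied_at when ClickHouse merges parts.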
model_definitions
Durable write-side state for model-definition aggregates. The table keeps an opaque JSON aggregate payload plus top-level columns used for tenant and entity lookup.
| Column | Type | Meaning |
|---|---|---|
| id | UUID | Model definition id. |
| organization_id | UUID | Tenant id. This must match the command tenant scope. |
| payload | String | JSON-encoded ModelDefinitionRecord. |
| created_at | DateTime64(6, 'UTC') | Timestamp written by the repository for the saved aggregate. |
| updated_at | DateTime64(6, 'UTC') | Timestamp written by the repository for the saved aggregate. |
| recorded_at | DateTime64(6, 'UTC') | ClickHouse replacement/version timestamp. Defaults to now64(6). |
Engine: ReplacingMergeTree(recorded_at)
Sort key: organization_id, id
Notes:
- payload includes model versions, strategy specs, validation policy, projection recipe, feature recipe, champion reuse policy, trained instances, serving references, and metric snapshots.
- Reads fetch the newest row for id by recorded_at.
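ReplacingMergeTree only removes superseded rows during background merges, so several rows for one id can be visible at read time. The newest-row rule can be sketched in plain Python as below; the function name and the short string ids are illustrative assumptions, and the real repository presumably expresses the same rule in its ClickHouse query.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def newest_row(rows: list[dict[str, Any]], row_id: str) -> Optional[dict[str, Any]]:
    """Return the row with the greatest recorded_at among rows sharing an id."""
    candidates = [row for row in rows if row["id"] == row_id]
    return max(candidates, key=lambda row: row["recorded_at"], default=None)


# Two saves of the same aggregate; both rows can coexist until a merge runs.
# Real ids are UUIDs; short strings keep the example readable.
rows = [
    {"id": "m-1", "payload": '{"name": "v1"}',
     "recorded_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": "m-1", "payload": '{"name": "v2"}',
     "recorded_at": datetime(2024, 1, 2, tzinfo=timezone.utc)},
]
latest = newest_row(rows, "m-1")
```

Here latest is the second row, because its recorded_at is later.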
promotion_audits
Append-oriented audit table for model promotion attempts. The top-level columns
support tenant, model, series, version, and status access. The full immutable
audit detail is stored in payload.
| Column | Type | Meaning |
|---|---|---|
| id | UUID | Promotion audit id. |
| organization_id | UUID | Tenant id. |
| model_definition_id | UUID | Promoted model definition id. |
| series_key | String | Canonical serialized series identity. |
| target_version | UInt32 | Candidate model instance version targeted by promotion. |
| status | LowCardinality(String) | Promotion audit status. |
| payload | String | JSON-encoded PromotionAuditRecordPayload. |
| created_at | DateTime64(6, 'UTC') | Audit creation time. |
| completed_at | Nullable(DateTime64(6, 'UTC')) | Completion time for finished attempts. |
| recorded_at | DateTime64(6, 'UTC') | ClickHouse replacement/version timestamp. Defaults to now64(6). |
Engine: ReplacingMergeTree(recorded_at)
Sort key: organization_id, model_definition_id, series_key,
target_version, id
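The split between lookup columns and the JSON payload could be sketched like this. The input dictionary shape, the "detail" key standing in for PromotionAuditRecordPayload, and the status value in the example are hypothetical; only the column names come from the table above.

```python
import json
from datetime import datetime, timezone
from typing import Any


def to_promotion_audit_row(audit: dict[str, Any]) -> dict[str, Any]:
    """Split an audit into lookup columns and a JSON payload string."""
    return {
        "id": audit["id"],
        "organization_id": audit["organization_id"],
        "model_definition_id": audit["model_definition_id"],
        "series_key": audit["series_key"],
        "target_version": audit["target_version"],
        "status": audit["status"],
        # Hypothetical: "detail" stands in for PromotionAuditRecordPayload.
        "payload": json.dumps(audit["detail"], sort_keys=True),
        "created_at": audit["created_at"],
        # None maps to NULL for attempts that have not finished.
        "completed_at": audit.get("completed_at"),
        # recorded_at is omitted so the now64(6) column default applies.
    }


example_row = to_promotion_audit_row({
    "id": "00000000-0000-0000-0000-000000000001",
    "organization_id": "00000000-0000-0000-0000-0000000000aa",
    "model_definition_id": "00000000-0000-0000-0000-0000000000bb",
    "series_key": "sku=123|store=7",   # illustrative serialization
    "target_version": 3,
    "status": "pending",               # hypothetical status value
    "detail": {"reason": "example"},
    "created_at": datetime(2024, 1, 1, tzinfo=timezone.utc),
})
```

Keeping the detail in one JSON string means the audit record can evolve without a schema change, while the top-level columns remain available for filtering and sorting.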
forecast_runs
Stores write-side forecast run results produced by the prediction job path. The
table keeps queryable run identifiers and stores forecast details in payload.
| Column | Type | Meaning |
|---|---|---|
| id | UUID | Forecast run id. |
| organization_id | UUID | Tenant id. |
| model_definition_id | UUID | Model definition used for the forecast. |
| model_version | UInt32 | Requested model version. |
| series_key | String | Canonical serialized series identity. |
| status | LowCardinality(String) | Forecast run status. |
| payload | String | JSON payload with horizon, unit, execution reference, and canonical forecast frame columns. |
| requested_at | DateTime64(6, 'UTC') | Time the forecast was requested. |
| recorded_at | DateTime64(6, 'UTC') | ClickHouse replacement/version timestamp. Defaults to now64(6). |
Engine: ReplacingMergeTree(recorded_at)
Sort key: organization_id, model_definition_id, series_key,
model_version, id
Completed forecast runs require an execution_reference object in payload.
It records the active model instance version, strategy key, artifact URI,
resolved config hash, validation policy hash, projection recipe hash, feature
recipe hash, strategy params hash, and selection policy hash used by the
forecast execution.
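A check for this requirement might look like the sketch below. The JSON key names inside execution_reference and the "completed" status string are assumptions inferred from the prose above; the actual payload schema may name them differently.

```python
import json

# Hypothetical key names, one per item listed in the paragraph above.
REQUIRED_EXECUTION_REFERENCE_KEYS = frozenset({
    "active_model_instance_version",
    "strategy_key",
    "artifact_uri",
    "resolved_config_hash",
    "validation_policy_hash",
    "projection_recipe_hash",
    "feature_recipe_hash",
    "strategy_params_hash",
    "selection_policy_hash",
})


def missing_execution_reference_keys(status: str, payload_json: str) -> set[str]:
    """Return the execution_reference keys a completed run is missing."""
    if status != "completed":  # assumed status value
        return set()
    payload = json.loads(payload_json)
    reference = payload.get("execution_reference")
    if not isinstance(reference, dict):
        # No object at all: every required key is missing.
        return set(REQUIRED_EXECUTION_REFERENCE_KEYS)
    return set(REQUIRED_EXECUTION_REFERENCE_KEYS) - set(reference)
```

An empty result means the row satisfies the requirement; runs in any other status are not checked.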
Forecast frame payloads are produced by execute_model_forecasts, written as
Parquet artifacts, and then loaded by persist_forecast_results. The final
write uses Spark’s ClickHouse catalog writer against forecast_runs.
demand_snapshots
Tracks the state of snapshot materialization commands. The control plane stores the source locator and materialization state here; the canonical frame and manifest stay in the configured demand storage backend.
| Column | Type | Meaning |
|---|---|---|
| id | UUID | Snapshot id. |
| organization_id | UUID | Tenant id. |
| input_path | String | Original input locator supplied to the command. |
| source_format | LowCardinality(String) | Declared source format, currently csv or parquet. |
| status | LowCardinality(String) | Snapshot materialization status. |
| snapshot_manifest_uri | Nullable(String) | URI to the canonical snapshot manifest after successful materialization. |
| error_message | Nullable(String) | Failure detail for failed materializations. |
| created_at | DateTime64(6, 'UTC') | Snapshot creation time. |
| updated_at | DateTime64(6, 'UTC') | Last status update time. |
| completed_at | Nullable(DateTime64(6, 'UTC')) | Completion time for terminal states. |
| recorded_at | DateTime64(6, 'UTC') | ClickHouse replacement/version timestamp. Defaults to now64(6). |
Engine: ReplacingMergeTree(recorded_at)
Sort key: organization_id, id
Reads fetch the newest row for id by recorded_at.
configuration_revisions
Stores runtime configuration revisions that can be activated or deprecated by tenant, scope, and configuration kind.
| Column | Type | Meaning |
|---|---|---|
| id | UUID | Configuration revision id. |
| organization_id | UUID | Tenant id. |
| scope_type | LowCardinality(String) | Runtime config scope type. |
| scope_key | String | Runtime config scope key. |
| config_kind | LowCardinality(String) | Kind of configuration stored in the payload. |
| payload_hash | String | Stable hash of the JSON payload. |
| status | LowCardinality(String) | Revision status, such as active or deprecated. |
| created_by | String | Actor that created the revision. |
| payload | String | JSON object with the effective runtime configuration. |
| created_at | DateTime64(6, 'UTC') | Revision creation time. |
| activated_at | Nullable(DateTime64(6, 'UTC')) | Activation time for active revisions. |
| recorded_at | DateTime64(6, 'UTC') | ClickHouse replacement/version timestamp. Defaults to now64(6). |
Engine: ReplacingMergeTree(recorded_at)
Sort key: organization_id, scope_type, scope_key, config_kind,
status, id
Reads fetch the newest active row by organization_id, scope_type,
scope_key, and config_kind, ordered by recorded_at.
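The read rule and the stable payload hash can be sketched together in plain Python. The function names, the use of SHA-256 over a key-sorted JSON encoding, and the sample scope values are assumptions for illustration; the repository may hash and query differently.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Optional


def payload_hash(payload: dict[str, Any]) -> str:
    """Hash a canonical JSON encoding so equal payloads hash equally."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def newest_active_revision(
    rows: list[dict[str, Any]],
    organization_id: str,
    scope_type: str,
    scope_key: str,
    config_kind: str,
) -> Optional[dict[str, Any]]:
    """Return the active row with the greatest recorded_at for one scope."""
    wanted = (organization_id, scope_type, scope_key, config_kind)
    matches = [
        row for row in rows
        if (row["organization_id"], row["scope_type"],
            row["scope_key"], row["config_kind"]) == wanted
        and row["status"] == "active"
    ]
    return max(matches, key=lambda row: row["recorded_at"], default=None)


def _row(rev_id: str, status: str, day: int) -> dict[str, Any]:
    # Helper for the example only; scope values are made up.
    return {
        "id": rev_id, "organization_id": "org-1", "scope_type": "tenant",
        "scope_key": "default", "config_kind": "forecasting", "status": status,
        "recorded_at": datetime(2024, 1, day, tzinfo=timezone.utc),
    }


revisions = [_row("r1", "active", 1), _row("r2", "active", 2), _row("r3", "deprecated", 3)]
current = newest_active_revision(revisions, "org-1", "tenant", "default", "forecasting")
```

In this example current is revision r2: r3 is newer but deprecated, and r1 is active but older.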
Adding or changing tables
When a persistence adapter adds a new table or column:
- Add an ordered ClickHouseMigration in src/infra/persistence/clickhouse/migrations.py.
- Keep table ownership in src/infra/persistence/clickhouse/* adapters; do not leak ClickHouse details into application or domain code.
- Update this page and any affected operation docs.
- Run the narrow persistence tests first, then the relevant quality gate.
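As a sketch of what "ordered" could mean in the first step, the check below rejects a migration list whose versions are duplicated or out of sequence. The ClickHouseMigration fields, the zero-padded string versions, and the example ALTER statement with its column name are hypothetical and not taken from migrations.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClickHouseMigration:
    # Hypothetical fields for illustration only.
    version: str
    description: str
    statements: tuple[str, ...]


def check_ordered(migrations: list[ClickHouseMigration]) -> None:
    """Raise ValueError if versions are duplicated or not strictly increasing.

    Assumes zero-padded string versions, so string comparison matches
    the intended numeric order.
    """
    versions = [m.version for m in migrations]
    for previous, current in zip(versions, versions[1:]):
        if current <= previous:
            raise ValueError(
                f"migration {current!r} must sort after {previous!r}"
            )


existing = [ClickHouseMigration("0001", "create schema_migrations", ("...",))]
# A new column goes in as a new migration appended at the end of the list.
new = ClickHouseMigration(
    "0002",
    "add example column",
    ("ALTER TABLE demand_snapshots ADD COLUMN example_note String",),
)
check_ordered(existing + [new])
```

A check of this kind fits naturally into the narrow persistence tests mentioned in the last step.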