dbt Pipeline¶
dbt powers Kiket’s tenant-scoped analytics (we ship the Fusion binary, exposed as the `dbt` executable). Each organisation gets its own schema (e.g. `analytics_org_acme`) populated by dbt models so dashboards and alerts can query data safely.
Project layout¶
The dbt project lives in `analytics/dbt/`:
- `dbt_project.yml` – project configuration, points models at `analytics_org_<slug>` schemas.
- `models/staging/` – raw tables wrapped with tenancy filters.
- `models/marts/` – fact/dimension models consumed by dashboards.
- `macros/` – helpers for tenant schema resolution and filters.
- `profiles/` – example `profiles.yml` for local use.
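For orientation, here is a minimal sketch of what `dbt_project.yml` could look like under this layout (the project/profile names and materialization choices are illustrative assumptions, not the shipped file):

```yaml
# Hypothetical sketch only; names and materializations are assumptions.
name: kiket_analytics
version: "1.0.0"
profile: kiket

model-paths: ["models"]
macro-paths: ["macros"]

models:
  kiket_analytics:
    staging:
      +materialized: view    # raw tables wrapped with tenancy filters
    marts:
      +materialized: table   # fact/dimension models for dashboards
```

In a layout like this, the per-tenant `analytics_org_<slug>` schema is typically resolved by a `generate_schema_name`-style override living in `macros/`, rather than being hard-coded in the project file.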
Cloud Build workflow¶
Scheduled runs use `cloudbuild.analytics.yaml`, which performs these steps:
- Build the runner image (reuses the `test` stage from `Dockerfile`).
- Invoke the analytics runner (`analytics:dbt:run`) for every tenant (emits OTEL logs by default).
- Optionally (`_DBT_GENERATE_DOCS=true`) generate `dbt docs` and publish the site artefacts.
- Upload logs (`dbt.log`, `dbt-otel.json`), manifests, run_results, catalogs, and docs to `gs://${_DBT_ARTIFACT_BUCKET}/runs/${BUILD_ID}`.
- Propagate the exit status so Cloud Logging / Monitoring can alert on failures.
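As a rough sketch, those steps map onto a Cloud Build config along these lines (image names, the runner invocation, and the substitution defaults are assumptions; the real file will differ in detail):

```yaml
# Hypothetical sketch of cloudbuild.analytics.yaml; image names, the runner
# invocation, and the substitution defaults are assumptions.
substitutions:
  _DBT_GENERATE_DOCS: "false"
  _DBT_ARTIFACT_BUCKET: "example-dbt-artifacts"

steps:
  # 1. Build the runner image from the Dockerfile's test stage.
  - name: gcr.io/cloud-builders/docker
    args: ["build", "--target", "test", "-t", "dbt-runner", "."]

  # 2. Invoke the analytics runner for every tenant (emits OTEL logs by default).
  - name: dbt-runner
    args: ["analytics:dbt:run"]

  # 3. Optionally generate dbt docs (assumes bash and dbt exist in the image).
  - name: dbt-runner
    entrypoint: bash
    args: ["-c", 'if [ "$_DBT_GENERATE_DOCS" = "true" ]; then dbt docs generate; fi']

  # 4. Upload logs, manifest, run_results, catalog, and docs for this build.
  #    (Assumes dbt's default target/ and logs/ output paths.)
  - name: gcr.io/cloud-builders/gsutil
    args: ["-m", "cp", "-r", "target", "logs", "gs://${_DBT_ARTIFACT_BUCKET}/runs/${BUILD_ID}"]

# 5. No explicit step is needed to propagate failures: a non-zero exit in any
#    step fails the whole build, which Cloud Logging / Monitoring can alert on.
```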
The main `cloudbuild.yaml` now runs `dbt parse` on each commit, and both pipelines use the lowest available machine tier by default. Increase capacity only if runtimes exceed your SLAs.
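The parse gate can be as small as one extra step in the main pipeline (a sketch; the image name follows the hypothetical runner above):

```yaml
# Hypothetical step in the main cloudbuild.yaml.
- name: dbt-runner
  args: ["dbt", "parse"]   # fails the build early if models or refs are invalid
```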
Running dbt¶
Workspace administrators can opt in to the analytics pipeline; each run materialises models into the organisation schema according to the configured schedules and tenants. Once enabled, the system refreshes dbt models automatically on the published cadence; no manual steps are required. Manual refreshes can be triggered via the platform’s analytics settings (API/CLI routes are described in the admin runbook).
Profiles & credentials¶
Terraform creates the `kiket_dbt` Cloud SQL user via the Google Cloud API. PostgreSQL-level grants (schema permissions, table access) are managed manually via SQL because Cloud Build lacks VPC access to the Cloud SQL private IP. The connection string is published to Secret Manager (`dbt-database-url`). Platform services continue to rely on the primary `database-url` secret for write access, while customer tooling can safely consume the read-only URI.
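The manual grants are ordinary PostgreSQL statements run against each schema; a hedged example of what they might look like (the exact grant set and the source schema name are assumptions; only the `kiket_dbt` user and the `analytics_org_<slug>` convention come from this page):

```sql
-- Hypothetical grants; the exact set depends on which schemas dbt reads and writes.
-- Let dbt create and populate one tenant's analytics schema...
GRANT CREATE, USAGE ON SCHEMA analytics_org_acme TO kiket_dbt;
-- ...and read the source tables it models (source schema name is an assumption).
GRANT USAGE ON SCHEMA public TO kiket_dbt;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO kiket_dbt;
```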
Set the following environment variables when invoking dbt directly (the defaults already align with our managed pipeline):
- `DBT_HOST`, `DBT_PORT`, `DBT_DATABASE`
- `DBT_USER` / `DBT_PASSWORD` (default to the app credentials unless overridden)
- `DBT_SHARED_SCHEMA` (`analytics_shared`)

Each run writes to `analytics_org_<slug>` so data remains isolated per organisation.
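For local use, the example profile under `profiles/` presumably wires these variables through dbt’s `env_var()` function; a minimal sketch (the profile and target names are assumptions):

```yaml
# Hypothetical profiles.yml sketch using the variables above.
kiket:
  target: dev
  outputs:
    dev:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      port: "{{ env_var('DBT_PORT', '5432') | as_number }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: "{{ env_var('DBT_DATABASE') }}"
      schema: "{{ env_var('DBT_SHARED_SCHEMA', 'analytics_shared') }}"
```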
Scheduling & cadence controls¶
Every organisation stores analytics preferences in `analytics_settings`:

- `enabled` lets admins pause scheduled analytics completely.
- `refresh_interval_hours` overrides the plan-based default (starter 24h, team 12h, business 6h, enterprise 1h).
- `last_run_at` is maintained automatically to prevent back-to-back executions before the interval elapses.
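Taken together, those columns reduce the scheduler’s decision to a single eligibility predicate; in SQL it could look like this (the `organisation_id` column and the exact table shape beyond the fields above are assumptions):

```sql
-- Hypothetical "due for refresh" check built from the columns described above.
SELECT organisation_id
FROM analytics_settings
WHERE enabled
  AND (last_run_at IS NULL
       OR last_run_at < now() - make_interval(hours => refresh_interval_hours));
```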
When a run succeeds, the runner records `UsageEvent[analytics_dbt_run]` (model id, status, duration) so Ops and Billing can trace high-cost tenants.
Adding models¶
When you ship new analytics content (via definition repositories or templates), dbt models and dashboards are bundled together. Kiket validates those assets during sync and publishes them automatically.
Troubleshooting¶
- Missing data – confirm the relevant dbt models are included in your definition repositories and that the latest sync completed successfully.
- Delayed refreshes – check the analytics usage dashboard for recent run status; large datasets may take longer to rebuild.
- Permission errors – ensure the analytics role has access to the required schemas; contact support if the default role was rotated or revoked.
- Changelog comparisons – use the analytics export automation to produce automated-versus-human changelog evaluations, which writes `docs-site/data/changelog_evaluations.json` for downstream docs dashboards (see Changelog Evaluation Metrics).