Data Warehouse

A governed Snowflake warehouse that standardizes raw operational data into bronze, silver, and gold layers for planning, operations, BI, and AI workloads.

Problems we had to solve

  • Operational teams used different extracts, spreadsheets, and source-system queries for the same coal logistics metrics.
  • Definitions for contracts, shipments, stock position, vendor performance, and plant demand were inconsistent across reports.
  • Downstream systems needed stable, tested tables instead of rebuilding joins and cleansing logic in every application.

What we implemented

  • Built a medallion warehouse pattern: bronze for raw ingestion, silver for cleaned source-aligned entities, and gold for business-ready models.
  • Standardized entity keys and metric definitions across contracts, suppliers, vessels, plants, calorific quality, and delivery status.
  • Added data quality checks, lineage-friendly transformations, and role-based access so analytics and applications can consume the same trusted layer.
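The quality checks above can be sketched as a minimal quality gate: rows reach the trusted layer only when entity keys are present and unique. This is an illustrative sketch, not the project's actual code; the function name `check_unique_keys` and the sample rows are hypothetical, and the real warehouse runs equivalent checks in dbt and Elementary Data.

```python
# Illustrative silver-layer quality gate (hypothetical helper, not project code):
# rows pass only when the entity key is non-null and unique; the rest are
# quarantined for inspection instead of silently dropped.

def check_unique_keys(rows, key):
    """Partition rows into (passed, quarantined) for a not-null + uniqueness check."""
    seen, passed, quarantined = set(), [], []
    for row in rows:
        value = row.get(key)
        if value is None or value in seen:
            quarantined.append(row)
        else:
            seen.add(value)
            passed.append(row)
    return passed, quarantined

contracts = [
    {"contract_id": "C-001", "supplier": "S1"},
    {"contract_id": "C-002", "supplier": "S2"},
    {"contract_id": "C-001", "supplier": "S3"},  # duplicate key -> quarantined
    {"contract_id": None, "supplier": "S4"},     # missing key -> quarantined
]

ok, quarantined = check_unique_keys(contracts, "contract_id")
print(len(ok), len(quarantined))  # → 2 2
```

Quarantining rather than deleting failing rows keeps the gate auditable: analysts can see exactly which source rows were held back and why.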

Tools used

A stack selected for production data work, not a demo.

AWS
Terraform
Snowflake
dbt
Airflow
SQL
Python
dlt
Elementary Data
Git

Architecture

How the project is structured technically.

01

Bronze ingestion

Raw tables preserve source payloads and load metadata so failed downstream transformations can be replayed without asking each source owner for another extract.
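The bronze envelope can be sketched as wrapping each untouched source payload with load metadata. Field names (`_source`, `_loaded_at`, `_payload_hash`) and the helper `to_bronze_record` are assumptions for illustration; the real pipeline does this in dlt and Snowflake.

```python
# Sketch of a bronze-layer record envelope (hypothetical, for illustration):
# the raw payload is preserved verbatim, with load metadata attached so any
# batch can be identified and replayed without a fresh source extract.
import hashlib
import json
from datetime import datetime, timezone

def to_bronze_record(raw_payload: dict, source: str) -> dict:
    body = json.dumps(raw_payload, sort_keys=True)
    return {
        "raw": body,  # source payload, byte-for-byte reproducible
        "_source": source,
        "_loaded_at": datetime.now(timezone.utc).isoformat(),
        "_payload_hash": hashlib.sha256(body.encode()).hexdigest(),
    }

rec = to_bronze_record({"shipment_id": 42, "qty_tons": 7500}, "tms_api")
print(sorted(rec))  # → ['_loaded_at', '_payload_hash', '_source', 'raw']
```

The payload hash makes replays idempotent: re-loading the same extract produces the same hash, so duplicates are cheap to detect downstream.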

02

Silver normalization

Type casting, deduplication, entity resolution, and source-specific validation happen before data is exposed to application or dashboard layers.

03

Gold semantic models

Business tables expose stable dimensions and facts for shipment monitoring, contract realization, coal stock, supplier performance, and planning inputs.
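A gold-layer build can be sketched as joining a cleaned fact to a conformed dimension so every consumer sees the same attributes. Table and column names here (`dim_supplier`, `fct_shipments`) are hypothetical stand-ins for the project's actual models, which are built in dbt.

```python
# Sketch of a gold fact build (hypothetical names): enrich each shipment
# fact with conformed supplier attributes, so dashboards and applications
# never re-derive the join themselves.
dim_supplier = {
    "S1": {"supplier_name": "Alpha Coal"},
    "S2": {"supplier_name": "Borneo Mining"},
}

fct_shipments = [
    {"shipment_id": "SH-1", "supplier_id": "S1", "qty_tons": 7500.0},
    {"shipment_id": "SH-2", "supplier_id": "S2", "qty_tons": 4200.0},
]

gold = [{**fact, **dim_supplier[fact["supplier_id"]]} for fact in fct_shipments]
print(gold[0]["supplier_name"])  # → Alpha Coal
```

Keeping the join in one governed model is what makes the dimensions "stable": a renamed supplier changes in one place and flows to every report.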

Work showcase

Sanitized project screens

End-to-end warehouse monitoring

Unified observability across freshness, data quality checks, and the data catalog: a single view of overall warehouse health.
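The freshness side of that monitoring can be sketched as an SLA check: flag any table whose latest successful load is older than its allowed lag. The helper `stale_tables` and the table names are illustrative assumptions; in production this is handled by Elementary Data on top of dbt.

```python
# Sketch of a freshness check (hypothetical helper): report every table
# whose last load exceeds its SLA, relative to a given evaluation time.
from datetime import datetime, timedelta, timezone

def stale_tables(last_loaded: dict, sla: timedelta, now=None) -> list:
    now = now or datetime.now(timezone.utc)
    return sorted(t for t, ts in last_loaded.items() if now - ts > sla)

now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
loads = {
    "gold.fct_shipments": now - timedelta(hours=2),   # fresh
    "gold.dim_supplier": now - timedelta(hours=30),   # past the 24h SLA
}
print(stale_tables(loads, timedelta(hours=24), now=now))  # → ['gold.dim_supplier']
```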
