Warning

🚧 Work in Progress: This page is currently under construction. Content may be incomplete or subject to change. To contribute, see the contribution guide.

Data Modeling Standards

Medallion architecture

The entire data platform follows a three-layer architecture:

| Layer | Dataset | Responsibility | Transformations |
| --- | --- | --- | --- |
| Raw | `raw_*` | Raw data in the original source format | None (ingestion only) |
| Stage | `stage_*` | Clean, standardized data | Typing, deduplication, normalization |
| Gold | `gold_*` | Modeled data for consumption | Aggregations, joins, business metrics |
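
The dataset prefixes in the table can be checked mechanically. The following is a minimal sketch, assuming dataset names are plain strings; the function name `layer_for_dataset` and the rule that a prefix must be followed by at least one character are illustrative assumptions, not part of the standard.

```python
from typing import Optional

# Prefix -> layer mapping, copied from the table above.
LAYER_PREFIXES = {
    "raw_": "Raw",
    "stage_": "Stage",
    "gold_": "Gold",
}


def layer_for_dataset(dataset: str) -> Optional[str]:
    """Return the layer a dataset belongs to, or None if no prefix matches.

    Hypothetical helper: the name and the "non-empty suffix" rule are assumptions.
    """
    for prefix, layer in LAYER_PREFIXES.items():
        # Require something after the prefix: "raw_crm" matches, "raw_" alone does not.
        if dataset.startswith(prefix) and len(dataset) > len(prefix):
            return layer
    return None
```

A check like this could run in CI to reject datasets whose names do not map to one of the three layers.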

Fact × dimension tables (star schema)

For analytical datasets in the gold layer:

fact_fundraising          dim_fund
├── fund_id (FK) ──────→  ├── fund_id (PK)
├── reference_date        ├── fund_name
├── raised_amount         ├── strategy
└── _load_date            └── inception_date

dim_period
├── date (PK)
├── year
├── month
└── quarter
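
To illustrate how the fact table is meant to be queried through its dimensions, here is a small sketch that uses an in-memory SQLite database as a stand-in for the warehouse. The table and column names come from the diagram; the sample rows, the SQLite column types, and the join between `fact_fundraising.reference_date` and `dim_period.date` are assumptions made for the example (the diagram only draws the `fund_id` relationship).

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create the three tables from the diagram and load illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_fund (
            fund_id TEXT PRIMARY KEY, fund_name TEXT, strategy TEXT, inception_date TEXT
        );
        CREATE TABLE dim_period (
            date TEXT PRIMARY KEY, year INTEGER, month INTEGER, quarter INTEGER
        );
        CREATE TABLE fact_fundraising (
            fund_id TEXT REFERENCES dim_fund (fund_id),
            reference_date TEXT, raised_amount REAL, _load_date TEXT
        );
        """
    )
    # Made-up sample data, for illustration only.
    conn.executemany(
        "INSERT INTO dim_fund VALUES (?, ?, ?, ?)",
        [("F1", "Fund A", "Private Equity", "2020-01-01"),
         ("F2", "Fund B", "Infrastructure", "2021-06-01")],
    )
    conn.executemany(
        "INSERT INTO dim_period VALUES (?, ?, ?, ?)",
        [("2024-01-31", 2024, 1, 1), ("2024-02-29", 2024, 2, 1), ("2024-04-30", 2024, 4, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_fundraising VALUES (?, ?, ?, ?)",
        [("F1", "2024-01-31", 100.0, "2024-05-01"),
         ("F1", "2024-02-29", 50.0, "2024-05-01"),
         ("F2", "2024-04-30", 200.0, "2024-05-01")],
    )
    return conn


def raised_by_strategy_and_quarter(conn: sqlite3.Connection) -> list:
    """Aggregate the fact table, grouping by attributes taken from both dimensions."""
    query = """
        SELECT d.strategy, p.year, p.quarter, SUM(f.raised_amount) AS total_raised
        FROM fact_fundraising AS f
        JOIN dim_fund AS d ON d.fund_id = f.fund_id
        JOIN dim_period AS p ON p.date = f.reference_date  -- assumed join key
        GROUP BY d.strategy, p.year, p.quarter
        ORDER BY d.strategy, p.year, p.quarter
    """
    return conn.execute(query).fetchall()
```

The fact table holds only keys and measures, while descriptive attributes such as `strategy` and `quarter` live in the dimensions and are brought in at query time.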

Mandatory rules for gold layer

Every fact table must have a surrogate primary key named {entity}_id and populated with GENERATE_UUID().
Every table must carry load metadata columns: _load_date TIMESTAMP and _source STRING.
Tables larger than 1 GB must be partitioned by a date column.
Tables must be clustered on the columns most frequently used in WHERE and JOIN clauses.
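
The rules above lend themselves to an automated check. The sketch below is one possible shape for such a check, not an existing tool: `TableSpec` and `check_gold_table` are hypothetical names, "1 GB" is read as 2**30 bytes, "date column" is read as a DATE, DATETIME, or TIMESTAMP column, and the clustering rule is reduced to "at least one existing column is declared", since query frequency cannot be verified from a schema alone.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

ONE_GB = 1024 ** 3  # assumption: "1 GB" means 2**30 bytes
DATE_TYPES = {"DATE", "DATETIME", "TIMESTAMP"}  # assumption: what counts as a date column


@dataclass
class TableSpec:
    """Hypothetical, simplified description of a gold-layer table."""
    name: str
    columns: Dict[str, str]  # column name -> type
    size_bytes: int = 0
    surrogate_key: Optional[str] = None
    partition_column: Optional[str] = None
    clustering_columns: List[str] = field(default_factory=list)


def check_gold_table(spec: TableSpec) -> List[str]:
    """Return a list of rule violations; an empty list means the spec passes."""
    violations = []

    # Rule 1: fact tables need a surrogate key column named {entity}_id.
    if spec.name.startswith("fact_"):
        key = spec.surrogate_key
        if key is None or key not in spec.columns or not key.endswith("_id"):
            violations.append("fact table needs a surrogate key column named {entity}_id")

    # Rule 2: load metadata columns with the exact types from the standard.
    for column, expected in (("_load_date", "TIMESTAMP"), ("_source", "STRING")):
        if spec.columns.get(column) != expected:
            violations.append(f"missing load metadata column {column} {expected}")

    # Rule 3: tables above 1 GB must be partitioned by a date column.
    if spec.size_bytes > ONE_GB:
        partition_type = spec.columns.get(spec.partition_column) if spec.partition_column else None
        if partition_type not in DATE_TYPES:
            violations.append("table larger than 1 GB must be partitioned by a date column")

    # Rule 4: at least one clustering column, and each must exist in the table.
    if not spec.clustering_columns or any(c not in spec.columns for c in spec.clustering_columns):
        violations.append("table must declare clustering columns that exist in the schema")

    return violations
```

For example, a spec with `fundraising_id`, `_load_date`, and `_source` columns, partitioned by `reference_date` and clustered on `fund_id`, would return an empty list.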
