Warning

🚧 Work in Progress: This page is currently under construction. Content may be incomplete or subject to change. To contribute, see the contribution guide.

Data & Analytics

Patria’s data platform is built on Google Cloud Platform, with BigQuery as the centralized data lake and Airflow for pipeline orchestration.


Platform overview

flowchart LR
    subgraph Sources["Data sources"]
        OP[Operational systems]
        EXT[External sources]
        ARQ[Files / SharePoint]
    end

    subgraph GCP["GCP Platform"]
        CR["Cloud Run<br/>Ingestion APIs"]
        AF["Airflow<br/>Orchestration"]
        BQ["BigQuery<br/>Data lake"]
    end

    subgraph Consumption
        BI[BI & Dashboards]
        ML[AI & Models]
        API[Data APIs]
    end

    Sources --> GCP
    CR --> BQ
    AF --> BQ
    BQ --> Consumption
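
As a rough illustration of how to read the diagram, the sketch below encodes the same edges as a plain Python mapping and walks everything downstream of a given node. It is not part of the platform: the `downstream` helper and the dictionary names are hypothetical, and it assumes that an edge drawn between subgraphs (for example `Sources --> GCP`) connects every member of one group to every member of the other.

```python
from collections import deque

# Subgraph membership, copied from the flowchart above.
MEMBERS = {
    "Sources": ["Operational systems", "External sources", "Files / SharePoint"],
    "GCP": ["Cloud Run", "Airflow", "BigQuery"],
    "Consumption": ["BI & Dashboards", "AI & Models", "Data APIs"],
}

# Edges as drawn in the flowchart; some endpoints are subgraph names.
EDGES = [
    ("Sources", "GCP"),
    ("Cloud Run", "BigQuery"),
    ("Airflow", "BigQuery"),
    ("BigQuery", "Consumption"),
]


def expand(name):
    """Return the members of a subgraph, or the node itself if it is not a subgraph."""
    return MEMBERS.get(name, [name])


def build_graph():
    """Expand subgraph-level edges into node-level edges (an assumption, see lead-in)."""
    graph = {}
    for src, dst in EDGES:
        for s in expand(src):
            for d in expand(dst):
                targets = graph.setdefault(s, [])
                if d not in targets:
                    targets.append(d)
    return graph


def downstream(start):
    """List every node reachable from `start`, in breadth-first order."""
    graph = build_graph()
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    # Everything that depends, directly or indirectly, on the Airflow pipelines.
    print(downstream("Airflow"))
```

Under these assumptions, calling `downstream("Airflow")` lists BigQuery first and then the three consumption targets, which matches the left-to-right reading of the diagram.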

Sections

| Section | What you’ll find |
| --- | --- |
| Data Architecture | Overview, medallion layers, principles |
| Data Catalog | Datasets and tables by domain |
| Pipelines (Airflow) | Production DAGs, development standards |
| Governance | Quality, LGPD, data classification |
| AI & Automation | Production models, UiPath and n8n flows |