Data Engineer · Azure · Databricks · Microsoft Fabric · ETL & Cloud Pipelines
I build reliable, observable, scalable data systems.
Data Engineer with 5+ years of experience turning fragmented, inconsistent data into reliable platforms across healthcare, retail, and enterprise systems. I focus on the spots where small data inconsistencies have real business impact, and I design pipelines that stay trustworthy as complexity grows.
Currently focused on the move from legacy ETL to modern cloud lakehouse architectures on Azure, Fabric, and Databricks.
📍 Vancouver, Canada · 🇨🇦 Open to remote / hybrid roles
What I work on: end-to-end pipelines (ingestion → reporting) · medallion & dimensional modeling · data quality, validation & monitoring · multi-source integration · cloud lakehouse migrations.
Medallion-architecture pipeline that turns messy, multi-source raw data into validated, observable, ML-ready feature tables.
- Bronze → Silver → Gold with embedded data-quality checks at every layer
- Per-run DQ report (JSON + Markdown) for observability
- ML consumer example showing the DE → ML handoff with scikit-learn
Stack: Python · Pandas · SQL · Medallion · Data Quality
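A minimal sketch of the embedded-checks idea above, in pandas. Function and column names here are illustrative assumptions, not the project's actual schema or code:

```python
import pandas as pd


def run_dq_checks(df: pd.DataFrame, required: list[str], key: str) -> dict:
    """Run layer-level data-quality checks and return a per-run report dict.

    Checks: non-empty layer, no nulls in required columns, no duplicate keys.
    The report dict can be dumped to JSON (or rendered as Markdown) per run.
    """
    report = {
        "row_count": len(df),
        "null_counts": {col: int(df[col].isna().sum()) for col in required},
        "duplicate_keys": int(df.duplicated(subset=[key]).sum()),
    }
    report["passed"] = (
        report["row_count"] > 0
        and all(count == 0 for count in report["null_counts"].values())
        and report["duplicate_keys"] == 0
    )
    return report
```

Running a check like this at each Bronze → Silver → Gold boundary is what makes the pipeline observable: a failed layer produces a report that says which invariant broke, instead of silently propagating bad rows downstream.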
End-to-end data warehouse on real GTFS transit data using a medallion architecture.
- Bronze → Silver → Gold layers with embedded data-quality checks
- Handled domain edge cases like GTFS times beyond 24:00
- Dimensional models built for time-based ridership analysis
Stack: Python · SQL · PySpark · Medallion architecture
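On the "times beyond 24:00" edge case: GTFS `stop_times` uses clock values past 24:00:00 for trips that continue after midnight (e.g. `25:30:00` is 1:30 AM on the next service day). A small sketch of handling this; the function names are my own, not the project's:

```python
def parse_gtfs_time(value: str) -> int:
    """Convert a GTFS HH:MM:SS string (hours may exceed 23) to seconds past midnight.

    Standard datetime parsers reject hours > 23, so GTFS times need manual parsing.
    """
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def normalize_gtfs_time(value: str) -> tuple[str, int]:
    """Split an over-24h GTFS time into a wall-clock time plus a service-day offset."""
    total = parse_gtfs_time(value)
    day_offset, remainder = divmod(total, 86400)  # 86400 seconds per day
    hours, rest = divmod(remainder, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}", day_offset
```

Keeping the raw seconds-past-midnight value in the Silver layer preserves correct trip ordering, while the normalized form feeds the time dimensions used for ridership analysis.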
Multi-region retail lakehouse with unified customer / product / sales models.
- Standardized ingestion & transformation across regions
- Consistent datasets for scalable Power BI reporting
Stack: Microsoft Fabric · OneLake · Power BI · Medallion
Medallion-based pipeline using Delta Lake + Unity Catalog.
- Governed access, scalable processing, reusable transformations
Stack: Databricks · Delta Lake · Unity Catalog · PySpark
Airflow + Spark + AWS Pipeline
Containerized ETL reflecting production patterns: orchestration, retries, and scheduling.
Stack: Apache Airflow · Spark · AWS S3 · Docker
- ✅ Microsoft Certified: Azure Data Fundamentals (DP-900)
- 📚 In progress: Microsoft Fabric Data Engineer (DP-700)
- 🔧 Currently building hands-on lakehouse projects on Fabric & Databricks
I'm open to Data Engineering roles (full-time, contract, or remote). Reach out if you're hiring, collaborating, or just want to talk pipelines.
LinkedIn Β· Portfolio Β· Email
Build systems that remain reliable as complexity grows.


