bashoori/README.md

Hi, I'm Bita 👋

Data Engineer · Azure · Databricks · Microsoft Fabric · ETL & Cloud Pipelines
I build reliable, observable, scalable data systems.

LinkedIn · Portfolio · Email


🎯 About

Data Engineer with 5+ years of experience turning fragmented, inconsistent data into reliable platforms across healthcare, retail, and enterprise systems. I focus on the places where small data inconsistencies have real business impact, and I design pipelines that stay trustworthy as complexity grows.

Currently focused on the move from legacy ETL to modern cloud lakehouse architectures on Azure, Fabric, and Databricks.

📍 Vancouver, Canada · 🇨🇦 Open to remote / hybrid roles


🛠 Tech Stack

What I work on: end-to-end pipelines (ingestion → reporting) · medallion & dimensional modeling · data quality, validation & monitoring · multi-source integration · cloud lakehouse migrations.


📌 Featured Projects

Medallion-architecture pipeline that turns messy, multi-source raw data into validated, observable, ML-ready feature tables.

  • Bronze → Silver → Gold with embedded data-quality checks at every layer
  • Per-run DQ report (JSON + Markdown) for observability
  • ML consumer example showing the DE β†’ ML handoff with scikit-learn

Stack: Python · Pandas · SQL · Medallion · Data Quality
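
As a rough illustration of the layered checks and the per-run report described above, here is a minimal, standard-library-only sketch. It is not the project's actual code: `CheckResult`, `run_check`, `to_json`, `to_markdown`, and the sample rows are all hypothetical names and data chosen for this example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CheckResult:
    layer: str   # "bronze", "silver", or "gold"
    check: str   # human-readable name of the check
    passed: int  # rows that satisfied the check
    failed: int  # rows that violated the check


def run_check(layer, name, rows, predicate):
    """Apply a row-level predicate and count passes and failures."""
    passed = sum(1 for row in rows if predicate(row))
    return CheckResult(layer, name, passed, len(rows) - passed)


def to_json(results):
    """Machine-readable report, e.g. for alerting or trend tracking."""
    return json.dumps([asdict(r) for r in results], indent=2)


def to_markdown(results):
    """Human-readable report rendered as a Markdown table."""
    lines = ["| Layer | Check | Passed | Failed |", "| --- | --- | --- | --- |"]
    for r in results:
        lines.append(f"| {r.layer} | {r.check} | {r.passed} | {r.failed} |")
    return "\n".join(lines)


# Hypothetical bronze rows: one missing key and one negative amount.
bronze = [
    {"customer_id": "c1", "amount": "10.5"},
    {"customer_id": None, "amount": "3.0"},
    {"customer_id": "c3", "amount": "-4"},
]

results = [
    run_check("bronze", "customer_id not null", bronze,
              lambda r: r["customer_id"] is not None)
]

# Silver keeps only rows with a key and casts amounts to float.
silver = [
    {"customer_id": r["customer_id"], "amount": float(r["amount"])}
    for r in bronze
    if r["customer_id"] is not None
]
results.append(run_check("silver", "amount >= 0", silver,
                         lambda r: r["amount"] >= 0))

print(to_json(results))
print(to_markdown(results))
```

Keeping the results in one structure and rendering them twice means the JSON and Markdown reports cannot drift apart; a real pipeline would likely write both to a run-specific location.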

End-to-end data warehouse on real GTFS transit data using a medallion architecture.

  • Bronze → Silver → Gold layers with embedded data-quality checks
  • Handled domain edge cases like GTFS times beyond 24:00
  • Dimensional models built for time-based ridership analysis

Stack: Python · SQL · PySpark · Medallion architecture
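
The "times beyond 24:00" edge case comes from GTFS itself: a trip that runs past midnight keeps counting hours within the same service day, so `25:30:00` is a valid stop time. A minimal sketch of one way to handle it follows. The function names `parse_gtfs_time` and `to_timestamp` are hypothetical, and anchoring the offset at midnight is a simplifying assumption that ignores daylight-saving transitions.

```python
from datetime import date, datetime, time, timedelta


def parse_gtfs_time(value):
    """Convert a GTFS 'HH:MM:SS' string into an offset from the start
    of the service day. Hours of 24 or more are allowed."""
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"invalid GTFS time: {value!r}")
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def to_timestamp(service_date, gtfs_time):
    """Resolve a GTFS time against its service date (assumes the service
    day starts at local midnight, with no DST adjustment)."""
    return datetime.combine(service_date, time.min) + parse_gtfs_time(gtfs_time)


# A stop at 25:30:00 on the 2024-03-01 service day happens at 01:30 on 2024-03-02.
print(to_timestamp(date(2024, 3, 1), "25:30:00"))
```

Storing the offset as a `timedelta` rather than a clock time avoids the failure that a plain time-of-day parser would hit on any hour above 23.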

Multi-region retail lakehouse with unified customer / product / sales models.

  • Standardized ingestion & transformation across regions
  • Consistent datasets for scalable Power BI reporting

Stack: Microsoft Fabric · OneLake · Power BI · Medallion

Medallion-based pipeline using Delta Lake + Unity Catalog.

  • Governed access, scalable processing, reusable transformations

Stack: Databricks · Delta Lake · Unity Catalog · PySpark

Containerized ETL reflecting production patterns: orchestration, retries, and scheduling.

Stack: Apache Airflow · Spark · AWS S3 · Docker


📊 GitHub Stats


🎓 Certifications & Learning

  • ✅ Microsoft Certified: Azure Data Fundamentals (DP-900)
  • 📘 In progress: Microsoft Fabric Data Engineer (DP-700)
  • 📚 Currently building hands-on lakehouse projects on Fabric & Databricks

📫 Let's Connect

I'm open to Data Engineering roles (full-time, contract, or remote). Reach out if you're hiring, collaborating, or just want to talk about pipelines.

LinkedIn · Portfolio · Email

Build systems that remain reliable as complexity grows.

Popular repositories

  1. SQL Public

    TSQL 1

  2. TSQL-Scripts Public

    Forked from SQL-Server-projects/TSQL-Scripts

    🐸 Various scripts I use for SQL Server databases. These include Reporting Services, Primavera P6, and general administration T-SQL backup and restore, etc.

    TSQL 1

  3. email-icon Public

    Forked from ErickSimoes/email-icon

    Directory for storing icons for email signature

    1

  4. data-scientist-roadmap Public

    Forked from ahull002/data-scientist-roadmap

    Tutorial accompanying the "data science roadmap" graph.

    Python 1

  5. English-Fake-News-Project Public

    Forked from bijaykahar/English-Fake-News-Project

    Jupyter Notebook 1

  6. Python_Tutorials Public

    Forked from mGalarnyk/Python_Tutorials

    Python tutorials in both Jupyter Notebook and YouTube format.

    Jupyter Notebook 1