AugustAllYear

August Vollbrecht

Data Scientist | Machine Learning | Cloud | Python

Building production‑ready pipelines and turning data into actionable insights. Specializing in NLP, predictive modeling, and scalable data architectures.


Technical Skills

  • Languages: Python, SQL, Java
  • Data Science & ML: Pandas, NumPy, scikit‑learn, XGBoost, PyCaret, BERTopic, spaCy, Transformers, NLTK, SHAP
  • Deep Learning: PyTorch, TensorFlow, Keras, Autoencoders
  • Big Data: PySpark, Hadoop, SparkML, SparkDL
  • Orchestration & MLOps: Apache Airflow, MLflow, Docker, GitHub Actions
  • Cloud: AWS (S3, EC2, Lambda), Azure Data Factory, Azure Databricks
  • Databases: PostgreSQL, Redis, SQLite
  • APIs & Visualization: FastAPI, Uvicorn, Streamlit, Plotly, Matplotlib, Seaborn
  • Testing & CI/CD: pytest, flake8, GitHub Actions
  • Other: BeautifulSoup, Requests, statsmodels, scipy, horovod, Git, Jupyter Notebook, joblib, PyYAML

Featured Projects

Trendscape Analysis for Partnership Development

Production pipeline that ingests daily news and Reddit posts, detects emerging trends using BERTopic, and generates partnership recommendations. Orchestrated with Airflow, served via FastAPI, visualized with Streamlit.
Tech: Python, BERTopic, spaCy, Airflow, MLflow, Docker, GitHub Actions
Repository

Sentinel_AI – Real‑Time Fraud Detection Pipeline

End‑to‑end machine learning system for transaction fraud detection, featuring both Random Forest and deep learning autoencoder models. Includes automated retraining, drift monitoring, SHAP explainability, and a Streamlit dashboard. Deployed with CI/CD (GitHub Actions) and MLflow tracking.
Tech: Python, scikit‑learn, PyTorch, SHAP, MLflow, Docker, GitHub Actions, Streamlit, pandas, numpy, seaborn, matplotlib
Repository

Propensity‑Based Audience Optimization

Developed a predictive model (Random Forest / XGBoost) to identify the top 30% of customers most likely to open marketing emails. The model increased reach by 25% while maintaining send volume, validated through A/B testing and an automated retraining pipeline. Included hyperparameter tuning (GridSearchCV), MLflow tracking, and a six‑month simulation.
Tech: scikit‑learn, XGBoost, MLflow, pandas, matplotlib, seaborn, SHAP, FastAPI, Streamlit, GitHub Actions
Repository
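
As a rough illustration of the targeting step in the propensity project, the sketch below ranks customers by predicted open probability and keeps the top 30%. It is a minimal example, not the project's actual code: the function name `top_fraction`, the customer IDs, and the scores are all hypothetical, and in practice the scores would come from a fitted classifier (for example, the positive-class column of scikit‑learn's `predict_proba`).

```python
import math


def top_fraction(scores, fraction=0.30):
    """Return customer IDs in the top `fraction` by predicted open probability.

    `scores` maps a customer ID to a propensity score in [0, 1].
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Round up so a non-empty input always selects at least one customer.
    k = math.ceil(len(scores) * fraction)
    # Highest score first; ties are broken by customer ID for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [customer_id for customer_id, _ in ranked[:k]]


# Hypothetical scores standing in for model output.
scores = {
    "c01": 0.91, "c02": 0.12, "c03": 0.45, "c04": 0.78, "c05": 0.33,
    "c06": 0.05, "c07": 0.86, "c08": 0.51, "c09": 0.27, "c10": 0.64,
}
selected = top_fraction(scores)
print(selected)
```

Selecting by rank rather than by a fixed probability cutoff keeps the audience size stable even if the model's score calibration shifts between retraining runs.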

Instasight – Multi‑Agent Instagram Intelligence

Built a multi‑agent system (Google ADK) that ingests Instagram data via CSV or Meta API, normalizes it, and applies LLM‑powered analytics (engagement rates, posting patterns, forecasting). Outputs BI‑ready datasets for PowerBI and Tableau with a Streamlit interface.
Tech: Python, Google ADK (Gemini), pandas, Streamlit, Meta Graph API
Repository


Certifications

  • General Assembly - Data Science Bootcamp
    Verification
  • Microsoft - Build and Operate Machine Learning Solutions with Azure
    Verification
  • Microsoft - Create Machine Learning Models in Microsoft Azure
    Verification
  • Google - AI Agent Intensive Course
    Verification
  • Microsoft - Azure Machine Learning for Data Scientists
    Verification
  • Microsoft - Perform Data Science with Azure Databricks
    Verification
  • IBM - Big Data with Spark and Hadoop Essentials
    Verification
  • University of California, Davis - SQL for Data Science
    Verification
