AugustAllYear/README.md

AugustAllYear

August Vollbrecht

Data Scientist | Machine Learning | Cloud | Python

Building production‑ready pipelines and turning data into actionable insights. Specializing in NLP, predictive modeling, and scalable data architectures.


Technical Skills

  • Languages: Python, SQL, Java
  • Data Science & ML: Pandas, NumPy, scikit‑learn, XGBoost, PyCaret, BERTopic, spaCy, Transformers, NLTK
  • Big Data: PySpark, Hadoop, SparkML, SparkDL
  • Orchestration & MLOps: Apache Airflow, MLflow, Docker, GitHub Actions
  • Cloud: AWS (S3, EC2, Lambda), Azure Data Factory, Azure Databricks
  • Databases: PostgreSQL, Redis, SQLite
  • APIs & Visualization: FastAPI, Streamlit, Plotly, Matplotlib, Seaborn
  • Other: BeautifulSoup, Requests, statsmodels, Keras, PyTorch, TensorFlow, Horovod, Agile methodologies, Git, Jupyter Notebook

Featured Projects

Trendscape Analysis for Partnership Development

Production pipeline that ingests daily news and Reddit posts, detects emerging trends using BERTopic, and generates partnership recommendations. Orchestrated with Airflow, served via FastAPI, visualized with Streamlit.
Tech: Python, BERTopic, spaCy, Airflow, MLflow, Docker, GitHub Actions
Repository

Propensity‑Based Audience Optimization

Developed a predictive model (Random Forest / XGBoost) to identify the top 30% of customers most likely to open marketing emails. The model increased reach by 25% while maintaining send volume, validated through A/B testing and an automated retraining pipeline.
Tech: scikit‑learn, XGBoost, MLflow, pandas, matplotlib, seaborn
Repository
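
The audience-selection step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `select_top_fraction`, the customer ids, and the scores are all hypothetical, and the scores stand in for whatever open-probability the trained model would output.

```python
def select_top_fraction(scores, fraction=0.30):
    """Return the ids of the highest-scoring `fraction` of customers.

    `scores` maps a customer id to a predicted open probability.
    The audience size is rounded to the nearest whole customer,
    with a minimum of one.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(scores) * fraction))
    # Rank customer ids by score, highest first.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical propensity scores for ten customers (illustrative only).
scores = {
    "c01": 0.91, "c02": 0.12, "c03": 0.47, "c04": 0.78, "c05": 0.33,
    "c06": 0.05, "c07": 0.86, "c08": 0.59, "c09": 0.21, "c10": 0.64,
}

audience = select_top_fraction(scores, fraction=0.30)
print(audience)
```

Ranking by score and cutting at a fixed fraction keeps the send volume constant regardless of how the score distribution shifts between retraining runs, which is one plausible reason to prefer it over a fixed probability threshold.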

Instasight – Multi‑Agent Instagram Intelligence

Built a multi‑agent system (Google ADK) that ingests Instagram data via CSV or the Meta API, normalizes it, and applies LLM‑powered analytics (engagement rates, posting patterns, forecasting). Outputs BI‑ready datasets for Power BI and Tableau, with a Streamlit interface.
Tech: Python, Google ADK (Gemini), pandas, Streamlit, Meta Graph API
Repository


Certifications

  • General Assembly - Data Science Bootcamp
    Verification
  • Microsoft - Build and Operate Machine Learning Solutions with Azure
    Verification
  • Microsoft - Create Machine Learning Models in Microsoft Azure
    Verification
  • Google - AI Agent Intensive Course
    Verification
  • Microsoft - Azure Machine Learning for Data Scientists
    Verification
  • Microsoft - Perform Data Science with Azure Databricks
    Verification
  • IBM - Big Data with Spark and Hadoop Essentials
    Verification
  • University of California, Davis - SQL for Data Science
    Verification

Contact


GitHub Statistics

August's GitHub stats

Top Languages

Pinned

  1. Trendscape_Analysis_for_Partnership_Development (Public)

    Python

  2. Propensity-Based_Audience_Optimization (Public)

    HTML

  3. Emotion-Voice-Recognition (Public)

    Emotion voice recognition model trained on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS).

    Jupyter Notebook

  4. Instasight_Multiagent (Public)

    A start-to-finish data science multi-agent system that delivers Instagram user insights beyond what Meta Business provides, customizable to user-defined KPIs.

    Python

  5. Vox_Pop_Analytics_Historic_Presidential_Contests (Public)

    Jupyter Notebook