<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Home on Apache Airflow</title>
    <link>/</link>
    <description>Recent content in Home on Apache Airflow</description>
    <generator>Hugo</generator>
    <language>en</language>
    <atom:link href="/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Agentic Workloads on Airflow: Observable, Retryable, and Auditable by Design</title>
      <link>/blog/agentic-workloads-airflow-3/</link>
      <pubDate>Wed, 15 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/blog/agentic-workloads-airflow-3/</guid>
      <description>&lt;p&gt;A question like &amp;ldquo;How does AI tool usage vary across Airflow versions?&amp;rdquo; has a natural SQL shape: one cross-tabulation, one result. A question like &amp;ldquo;What does a typical Airflow deployment look like for practitioners who are actively using AI in their workflow?&amp;rdquo; does not. It requires querying executor type, deployment method, cloud provider, and Airflow version independently, each filtered to the same respondent group, then synthesizing the results into a coherent picture. No single query returns the answer. The answer emerges from the relationship between all of them.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Ask Your Survey Anything: Building AI Analysis Pipelines with Airflow 3</title>
      <link>/blog/ai-survey-analysis-pipelines/</link>
      <pubDate>Wed, 15 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/blog/ai-survey-analysis-pipelines/</guid>
      <description>&lt;p&gt;The &lt;a href=&#34;https://airflow.apache.org/survey/&#34;&gt;2025 Airflow Community Survey&lt;/a&gt; collected responses&#xA;from nearly 6,000 practitioners across 168 questions. You can open a spreadsheet and filter,&#xA;or write SQL by hand. But what if you could just ask a question and have Airflow figure out&#xA;the query, run it, and bring the result back for your approval?&lt;/p&gt;&#xA;&lt;p&gt;This post builds two pipelines that do exactly that, using the&#xA;&lt;a href=&#34;https://pypi.org/project/apache-airflow-providers-common-ai/&#34;&gt;&lt;code&gt;apache-airflow-providers-common-ai&lt;/code&gt;&lt;/a&gt;&#xA;provider for Airflow 3.&lt;/p&gt;&#xA;&lt;p&gt;The first pipeline is &lt;strong&gt;interactive&lt;/strong&gt;: a human reviews the question before it reaches the LLM&#xA;and approves the result before the DAG finishes. The second is &lt;strong&gt;scheduled&lt;/strong&gt;: it downloads&#xA;fresh survey data, validates the schema, runs the query unattended, and emails the result.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Introducing the Common AI Provider: LLM and AI Agent Support for Apache Airflow</title>
      <link>/blog/common-ai-provider/</link>
      <pubDate>Tue, 14 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/blog/common-ai-provider/</guid>
      <description>&lt;p&gt;At &lt;a href=&#34;https://airflowsummit.org/sessions/2025/airflow-as-an-ai-agents-toolkit-unlocking-1000-integrations-with-mcp/&#34;&gt;Airflow Summit 2025&lt;/a&gt;, we previewed what native AI integration in Apache Airflow could look like. Today we&amp;rsquo;re shipping it.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;&lt;a href=&#34;https://pypi.org/project/apache-airflow-providers-common-ai/&#34;&gt;&lt;code&gt;apache-airflow-providers-common-ai&lt;/code&gt;&lt;/a&gt; 0.1.0&lt;/strong&gt; adds LLM and agent capabilities directly to Airflow. Not a wrapper around another framework, but a provider package that plugs into the orchestrator you already run. It&amp;rsquo;s built on &lt;a href=&#34;https://ai.pydantic.dev/&#34;&gt;Pydantic AI&lt;/a&gt; and supports 20+ model providers (OpenAI, Anthropic, Google, Azure, Bedrock, Ollama, and more) through a single install.&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;pip install &lt;span class=&#34;s1&#34;&gt;&amp;#39;apache-airflow-providers-common-ai&amp;#39;&lt;/span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Requires Apache Airflow 3.0+.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 3.2.0: Data-Aware Workflows at Scale</title>
      <link>/blog/airflow-3.2.0/</link>
      <pubDate>Tue, 07 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-3.2.0/</guid>
      <description>&lt;p&gt;We&amp;rsquo;re proud to announce the release of &lt;strong&gt;Apache Airflow 3.2.0&lt;/strong&gt;! Airflow 3.1 puts humans at the center of automated workflows. 3.2 brings that same precision to data: Asset partitioning for granular pipeline orchestration, multi-team deployments for enterprise scale, synchronous deadline alert callbacks, and continued progress toward full Task SDK separation.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/3.2.0/&#34;&gt;https://pypi.org/project/apache-airflow/3.2.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/3.2.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/3.2.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠️ Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/3.2.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/3.2.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: &lt;code&gt;docker pull apache/airflow:3.2.0&lt;/code&gt; &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-3.2.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-3.2.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;h1 id=&#34;-asset-partitioning-aip-76-only-the-right-work-gets-triggered&#34;&gt;🗂️ Asset Partitioning (AIP-76): Only the Right Work Gets Triggered&lt;/h1&gt;&#xA;&lt;p&gt;Asset partitioning has been one of the most requested additions to data-aware scheduling. If you work with date-partitioned S3 paths, Hive table partitions, BigQuery partitions, or really any partitioned data store, you&amp;rsquo;ve dealt with this: An upstream task updates one partition, and every downstream Dag fires regardless of which slice actually changed. It&amp;rsquo;s wasteful, and for large deployments it creates real operational noise.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Introducing the Apache Airflow Registry</title>
      <link>/blog/airflow-registry/</link>
      <pubDate>Thu, 19 Mar 2026 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-registry/</guid>
      <description>&lt;p&gt;Today we&amp;rsquo;re launching the &lt;strong&gt;&lt;a href=&#34;https://airflow.apache.org/registry/&#34;&gt;Apache Airflow Registry&lt;/a&gt;&lt;/strong&gt; — a searchable catalog of every official Airflow provider and its modules, live at &lt;a href=&#34;https://airflow.apache.org/registry/&#34;&gt;airflow.apache.org/registry/&lt;/a&gt;.&lt;/p&gt;&#xA;&lt;p&gt;Need an S3 operator? A Snowflake hook? An OpenAI sensor? The Registry helps you find, compare, and configure the right components for your data pipelines — without digging through docs or PyPI pages.&lt;/p&gt;&#xA;&lt;p&gt;&lt;img src=&#34;/blog/airflow-registry/images/registry-homepage.png&#34; alt=&#34;Registry Homepage&#34;&gt;&lt;/p&gt;&#xA;&lt;h2 id=&#34;by-the-numbers&#34;&gt;By the Numbers&lt;/h2&gt;&#xA;&lt;table&gt;&#xA;  &lt;thead&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;th&gt;&lt;/th&gt;&#xA;          &lt;th&gt;&lt;/th&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/thead&gt;&#xA;  &lt;tbody&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;&lt;strong&gt;98&lt;/strong&gt;&lt;/td&gt;&#xA;          &lt;td&gt;Official providers&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;&lt;strong&gt;1,602&lt;/strong&gt;&lt;/td&gt;&#xA;          &lt;td&gt;Modules (operators, hooks, sensors, triggers, transfers, and more)&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;&lt;strong&gt;329M+&lt;/strong&gt;&lt;/td&gt;&#xA;          &lt;td&gt;Monthly PyPI downloads across all providers&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;&lt;strong&gt;125+&lt;/strong&gt;&lt;/td&gt;&#xA;          &lt;td&gt;Integrations with cloud platforms, databases, ML tools, and messaging services&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/tbody&gt;&#xA;&lt;/table&gt;&#xA;&lt;h2 id=&#34;search-everything&#34;&gt;Search Everything&lt;/h2&gt;&#xA;&lt;p&gt;Hit &lt;strong&gt;Cmd+K&lt;/strong&gt; from any page and start typing. Results show up instantly, grouped by Providers and Modules, with type badges so you can tell a hook from an operator at a glance.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Airflow Survey 2025</title>
      <link>/blog/airflow-survey-2025/</link>
      <pubDate>Thu, 22 Jan 2026 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-survey-2025/</guid>
      <description>&lt;p&gt;&lt;img src=&#34;/blog/airflow-survey-2025/images/Airflow-Survey-2025-Results.png&#34; alt=&#34;Airflow Survey 2025&#34; title=&#34;airflow_survey_2025&#34;&gt;&lt;/p&gt;&#xA;&lt;div style=&#34;display: flex; align-items: flex-start; gap: 0.75rem;&#34;&gt;&#xA;  &lt;a href=&#34;https://www.astronomer.io/&#34; style=&#34;flex-shrink: 0;&#34;&gt;&lt;img src=&#34;images/astronomer-logo.svg&#34; alt=&#34;Astronomer&#34; width=&#34;40&#34; height=&#34;40&#34; /&gt;&lt;/a&gt;&#xA;  &lt;div&gt;&#xA;    &lt;p style=&#34;margin: 0;&#34;&gt;The interactive report is hosted by &lt;a href=&#34;https://www.astronomer.io/&#34;&gt;Astronomer&lt;/a&gt;. The Apache Airflow community thanks &lt;a href=&#34;https://www.astronomer.io/&#34;&gt;Astronomer&lt;/a&gt; for running this survey, for sponsoring it and providing the report in this form, and for their effort in marketing, analysis, and preparing the graphics.&lt;/p&gt;&#xA;  &lt;/div&gt;&#xA;&lt;/div&gt;&#xA;&lt;hr style=&#34;margin: 1rem 0; border: none; border-top: 1px solid #ccc;&#34; /&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://astronomer.typeform.com/reports/01KESPS8SJ2Y80THJAEYCECE5B&#34;&gt;View raw data&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;/data/survey-responses/airflow-user-survey-responses-2025.csv.zip&#34;&gt;Download survey responses (CSV)&lt;/a&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow CTL aka airflowctl 0.1.0</title>
      <link>/blog/airflowctl-0.1.0/</link>
      <pubDate>Wed, 15 Oct 2025 00:00:00 +0000</pubDate>
      <guid>/blog/airflowctl-0.1.0/</guid>
      <description>&lt;p&gt;We are thrilled to announce the first release of &lt;strong&gt;&lt;code&gt;airflowctl&lt;/code&gt; 0.1.0&lt;/strong&gt;, the new &lt;strong&gt;secure, API-driven command-line interface (CLI)&lt;/strong&gt; for Apache Airflow — built under &lt;a href=&#34;https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-81&amp;#43;Enhanced&amp;#43;Security&amp;#43;in&amp;#43;CLI&amp;#43;via&amp;#43;Integration&amp;#43;of&amp;#43;API&#34;&gt;&lt;strong&gt;AIP-81&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;&#xA;&lt;p&gt;With this release, the CLI adopts Airflow&amp;rsquo;s general posture of communicating through the API, bringing the Airflow CLI into the modern era of secure, auditable, and remote-first operations.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 &lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href=&#34;https://pypi.org/project/apache-airflow-ctl/0.1.0/&#34;&gt;https://pypi.org/project/apache-airflow-ctl/0.1.0/&lt;/a&gt;  &lt;br&gt;&#xA;🛠️ &lt;strong&gt;Release Notes:&lt;/strong&gt; &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow-ctl/stable/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow-ctl/stable/release_notes.html&lt;/a&gt;  &lt;br&gt;&#xA;🪶 &lt;strong&gt;Source Code:&lt;/strong&gt; &lt;a href=&#34;https://github.com/apache/airflow/tree/main/airflow-ctl&#34;&gt;https://github.com/apache/airflow/tree/main/airflow-ctl&lt;/a&gt;&lt;/p&gt;&#xA;&lt;h2 id=&#34;-what-is-airflowctl&#34;&gt;🎯 What is airflowctl?&lt;/h2&gt;&#xA;&lt;p&gt;&lt;code&gt;airflowctl&lt;/code&gt; is a new command-line interface for Apache Airflow that interacts exclusively with the Airflow REST API.&#xA;It provides a secure, auditable, and consistent way to manage Airflow deployments — without direct access to the metadata database.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 3.1.0: Human-Centered Workflows</title>
      <link>/blog/airflow-3.1.0/</link>
      <pubDate>Thu, 25 Sep 2025 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-3.1.0/</guid>
      <description>&lt;p&gt;We are thrilled to announce the release of &lt;strong&gt;Apache Airflow 3.1.0&lt;/strong&gt;, an update that puts humans at the center of data&#xA;workflows. This release introduces powerful new capabilities for human decision-making in automated&#xA;processes, comprehensive internationalization support, and significant developer experience improvements.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/3.1.0/&#34;&gt;https://pypi.org/project/apache-airflow/3.1.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Core Airflow Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/3.1.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/3.1.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Task SDK Docs: &lt;a href=&#34;https://airflow.apache.org/docs/task-sdk/1.1.0/&#34;&gt;https://airflow.apache.org/docs/task-sdk/1.1.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠️ Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/3.1.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/3.1.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🪶 Sources: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/3.1.0/installation/installing-from-sources.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/3.1.0/installation/installing-from-sources.html&lt;/a&gt; &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-3.1.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-3.1.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;h2 id=&#34;-human-in-the-loop-hitl-when-automation-meets-human-judgment&#34;&gt;🤝 Human-in-the-Loop (HITL): When Automation Meets Human Judgment&lt;/h2&gt;&#xA;&lt;p&gt;This powerful capability bridges the gap between automated processes and human expertise, making Airflow invaluable for:&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow® 3 is Generally Available!</title>
      <link>/blog/airflow-three-point-oh-is-here/</link>
      <pubDate>Tue, 22 Apr 2025 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-three-point-oh-is-here/</guid>
      <description>&lt;p&gt;At the Airflow Summit in September 2024, we announced our intent to focus on Apache Airflow® 3.0 as the next big milestone for the Airflow project. We are delighted to announce that Airflow 3.0 is now released!&lt;/p&gt;&#xA;&lt;h2 id=&#34;a-major-release-four-years-in-the-making&#34;&gt;A Major Release, Four Years in the Making&lt;/h2&gt;&#xA;&lt;p&gt;Airflow 3.0 is the biggest release in Airflow’s history—2.0 was released in 2020, and the last four years have seen incremental updates and releases every quarter with version 2.10 released in Q4 2024. With over 30 million monthly downloads (up over 30x since 2020) and 80,000 organizations (up from 25,000 in 2020) now using Airflow, we’ve seen incredible growth in popularity since 2.0.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Airflow Survey 2024</title>
      <link>/blog/airflow-survey-2024/</link>
      <pubDate>Thu, 27 Feb 2025 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-survey-2024/</guid>
      <description>&lt;p&gt;&lt;img src=&#34;/blog/airflow-survey-2024/images/Airflow-Survey-2024-Results-v2.png&#34; alt=&#34;Airflow Survey 2024&#34; title=&#34;airflow_survey_2024&#34;&gt;&lt;/p&gt;&#xA;&lt;div style=&#34;display: flex; align-items: flex-start; gap: 0.75rem;&#34;&gt;&#xA;  &lt;a href=&#34;https://www.astronomer.io/&#34; style=&#34;flex-shrink: 0;&#34;&gt;&lt;img src=&#34;images/astronomer-logo.svg&#34; alt=&#34;Astronomer&#34; width=&#34;40&#34; height=&#34;40&#34; /&gt;&lt;/a&gt;&#xA;  &lt;div&gt;&#xA;    &lt;p style=&#34;margin: 0;&#34;&gt;The interactive report is hosted by &lt;a href=&#34;https://www.astronomer.io/&#34;&gt;Astronomer&lt;/a&gt;. The Apache Airflow community thanks &lt;a href=&#34;https://www.astronomer.io/&#34;&gt;Astronomer&lt;/a&gt; for running this survey, for sponsoring it and providing the report in this form, and for their effort in marketing, analysis, and preparing the graphics.&lt;/p&gt;&#xA;  &lt;/div&gt;&#xA;&lt;/div&gt;&#xA;&lt;hr style=&#34;margin: 1rem 0; border: none; border-top: 1px solid #ccc;&#34; /&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://astronomer.typeform.com/report/SF2VGNTc/fRSeRcKKJ3kgYXVl&#34;&gt;View raw data&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;/data/survey-responses/airflow-user-survey-responses-2024.csv.zip&#34;&gt;Download survey responses (CSV)&lt;/a&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 2.10.0 is here</title>
      <link>/blog/airflow-2.10.0/</link>
      <pubDate>Thu, 08 Aug 2024 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.10.0/</guid>
      <description>&lt;p&gt;I&amp;rsquo;m happy to announce that Apache Airflow 2.10.0 is now available, bringing an array of noteworthy enhancements and new features that will greatly serve our community.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.10.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.10.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.10.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.10.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠 Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.10.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.10.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: &lt;code&gt;docker pull apache/airflow:2.10.0&lt;/code&gt; &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.10.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.10.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;h2 id=&#34;airflow-now-collects-telemetry-data-by-default&#34;&gt;Airflow now collects Telemetry data by default&lt;/h2&gt;&#xA;&lt;p&gt;With the release of Airflow 2.10.0, we’ve introduced the collection of basic telemetry data, as outlined &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.10.0/faq.html#does-airflow-collect-any-telemetry-data&#34;&gt;here&lt;/a&gt;. This data will play a crucial role in helping Airflow maintainers gain a deeper understanding of how Airflow is utilized across various deployments. The insights derived from this information are invaluable in guiding the prioritization of patches, minor releases, and security fixes. Moreover, this data will inform key decisions regarding the development roadmap, ensuring that Airflow continues to evolve in line with community needs.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 2.9.0: Dataset and UI Improvements</title>
      <link>/blog/airflow-2.9.0/</link>
      <pubDate>Mon, 08 Apr 2024 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.9.0/</guid>
      <description>&lt;p&gt;I’m happy to announce that Apache Airflow 2.9.0 has been released! This time around we have new features for data-aware scheduling and a bunch of UI-related improvements.&lt;/p&gt;&#xA;&lt;p&gt;Apache Airflow 2.9.0 contains over 550 commits, which include 38 new features, 70 improvements, 31 bug fixes, and 18 documentation changes.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.9.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.9.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.9.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.9.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠 Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.9.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.9.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: &lt;code&gt;docker pull apache/airflow:2.9.0&lt;/code&gt; &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.9.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.9.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;Airflow 2.9.0 is also the first release that supports Python 3.12. However, Pendulum 2 does not support Python 3.12, so you’ll need to use &lt;a href=&#34;https://pendulum.eustace.io/blog/announcing-pendulum-3-0-0.html&#34;&gt;Pendulum 3&lt;/a&gt; if you upgrade to Python 3.12.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Vulnerability in long deprecated OpenID authentication method in Flask AppBuilder</title>
      <link>/blog/fab-oid-vulnerability/</link>
      <pubDate>Mon, 26 Feb 2024 00:00:00 +0000</pubDate>
      <guid>/blog/fab-oid-vulnerability/</guid>
      <description>&lt;h1 id=&#34;vulnerability-in-long-deprecated-openid-authentication-method-in-flask-appbuilder&#34;&gt;Vulnerability in long deprecated OpenID authentication method in Flask AppBuilder&lt;/h1&gt;&#xA;&lt;p&gt;Recently &lt;a href=&#34;https://www.linkedin.com/in/islam-rzayev&#34;&gt;Islam Rzayev&lt;/a&gt; made us aware of a vulnerability in the&#xA;long deprecated OpenID authentication method in Flask AppBuilder. This vulnerability allowed a malicious user&#xA;to take over the identity of any Airflow UI user by forging a specially crafted request and implementing&#xA;their own OpenID service. While this is an old, deprecated, and rarely used authentication method, we still&#xA;took the issue seriously.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 2.8.0 is here</title>
      <link>/blog/airflow-2.8.0/</link>
      <pubDate>Fri, 15 Dec 2023 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.8.0/</guid>
      <description>&lt;p&gt;I am thrilled to announce the release of Apache Airflow 2.8.0, featuring a host of significant enhancements and new features that will greatly benefit our community.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.8.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.8.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.8.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.8.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠 Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.8.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.8.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: &lt;code&gt;docker pull apache/airflow:2.8.0&lt;/code&gt; &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.8.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.8.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;h2 id=&#34;airflow-object-storage-aip-58&#34;&gt;Airflow Object Storage (AIP-58)&lt;/h2&gt;&#xA;&lt;p&gt;&lt;em&gt;This feature is experimental and subject to change.&lt;/em&gt;&lt;/p&gt;&#xA;&lt;p&gt;Airflow now offers a generic abstraction layer over various object stores like S3, GCS, and Azure Blob Storage, enabling the use of different storage systems in DAGs without code modification.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Airflow Survey 2023</title>
      <link>/blog/airflow-survey-2023/</link>
      <pubDate>Thu, 21 Sep 2023 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-survey-2023/</guid>
      <description>&lt;p&gt;&lt;img src=&#34;/blog/airflow-survey-2023/images/Astronomer_Demographics.png&#34; alt=&#34;Demographics&#34; title=&#34;airflow_usage&#34;&gt;&#xA;&lt;img src=&#34;/blog/airflow-survey-2023/images/Astronomer_Community_and_Contribution.png&#34; alt=&#34;Community and Contribution&#34; title=&#34;community_and_contributions&#34;&gt;&#xA;&lt;img src=&#34;/blog/airflow-survey-2023/images/Airflow-Survey-2023-Results--Airflow-Usage-Page-1-Revised.png&#34; alt=&#34;Airflow Usage Page 1&#34;&gt;&#xA;&lt;img src=&#34;/blog/airflow-survey-2023/images/Astronomer-Airflow-Survey-2023-Results-Airflow-Usage-Page-2-Landscape.png&#34; alt=&#34;Airflow Usage Page 2&#34;&gt;&#xA;&lt;img src=&#34;/blog/airflow-survey-2023/images/Airflow-Survey-2023-Results-Airflow-Usage-Page-3-Revised-Landscape@2x.png&#34; alt=&#34;Airflow Usage Page 3&#34;&gt;&#xA;&lt;img src=&#34;/blog/airflow-survey-2023/images/Astronomer-Airflow-Survey-2023-Results-Future-Landscape@2x.png&#34; alt=&#34;Future&#34;&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://docs.google.com/forms/d/1wYm6c5Gn379zkg7zD7vcWB-1fCjnOocT0oZm-tjft_Q/viewanalytics&#34;&gt;View Raw Data&lt;/a&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 2.7.0 is here</title>
      <link>/blog/airflow-2.7.0/</link>
      <pubDate>Fri, 18 Aug 2023 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.7.0/</guid>
      <description>&lt;p&gt;I’m happy to announce that Apache Airflow 2.7.0 has been released! Some notable features have been added that we are excited for the community to use.&lt;/p&gt;&#xA;&lt;p&gt;Apache Airflow 2.7.0 contains over 500 commits, which include 40 new features, 49 improvements, 53 bug fixes, and 15 documentation changes.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.7.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.7.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.7.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.7.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠 Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.7.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.7.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: &lt;code&gt;docker pull apache/airflow:2.7.0&lt;/code&gt; &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.7.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.7.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;Airflow 2.7.0 is a release that focuses on security. The Airflow security team, working together with security researchers, identified a number of areas that required strengthening of security. This resulted in, among other things, an improved description of the &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/stable/security/security_model/&#34;&gt;Airflow security model&lt;/a&gt;, a better explanation of our &lt;a href=&#34;https://github.com/apache/airflow/security/policy&#34;&gt;security policy&lt;/a&gt; and the disabling of certain, potentially dangerous, features by default - like, for example, connection testing (#32052).&lt;/p&gt;</description>
    </item>
    <item>
      <title>Introducing Setup and Teardown tasks</title>
      <link>/blog/introducing_setup_teardown/</link>
      <pubDate>Fri, 18 Aug 2023 00:00:00 +0000</pubDate>
      <guid>/blog/introducing_setup_teardown/</guid>
      <description>&lt;p&gt;In data pipelines, commonly we need to create infrastructure resources, like a cluster or GPU nodes in an existing cluster, before doing the actual “work” and delete them after the work is done. Airflow 2.7 adds “setup” and “teardown” tasks to better support this type of pipeline. This blog post aims to highlight the key features so you know what’s possible. For full documentation on how to use setup and teardown tasks, see the &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.7.0/howto/setup-and-teardown.html&#34;&gt;setup and teardown docs&lt;/a&gt;.&lt;/p&gt;</description>
    </item>
    <item>
      <title>What&#39;s new in Apache Airflow 2.6.0</title>
      <link>/blog/airflow-2.6.0/</link>
      <pubDate>Sun, 30 Apr 2023 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.6.0/</guid>
      <description>&lt;p&gt;I am excited to announce that Apache Airflow 2.6.0 has been released, bringing many minor features and improvements to the community.&lt;/p&gt;&#xA;&lt;p&gt;Apache Airflow 2.6.0 contains over 500 commits, which include 42 new features, 58 improvements, 38 bug fixes, and 17 documentation changes.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.6.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.6.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.6.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.6.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠 Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.6.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.6.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: &lt;code&gt;docker pull apache/airflow:2.6.0&lt;/code&gt; &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.6.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.6.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;As the changelog is quite large, the following are some notable new features that shipped in this release.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 2.5.0: Tick-Tock</title>
      <link>/blog/airflow-2.5.0/</link>
      <pubDate>Fri, 02 Dec 2022 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.5.0/</guid>
      <description>&lt;p&gt;Apache Airflow 2.5 has just been released, barely two and a half months after 2.4!&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.5.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.5.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.5.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.5.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠️ Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.5.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.5.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: &lt;code&gt;docker pull apache/airflow:2.5.0&lt;/code&gt; &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.5.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.5.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;This quicker release cadence is a departure from our previous habit of releasing every five-to-seven months and was a deliberate effort to listen to you, our users, and get the changes and improvements into your workflows earlier.&lt;/p&gt;&#xA;&lt;h2 id=&#34;usability-improvements-to-the-datasets-ui&#34;&gt;Usability improvements to the Datasets UI&lt;/h2&gt;&#xA;&lt;p&gt;When we released Dataset-aware scheduling in September, we knew the tools we gave you to manage Datasets were very much a Minimum Viable Product, and in the last two months the committers and contributors have been hard at work making the UI much more usable when it comes to Datasets.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 2.4.0: That Data Aware Release</title>
      <link>/blog/airflow-2.4.0/</link>
      <pubDate>Mon, 19 Sep 2022 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.4.0/</guid>
      <description>&lt;p&gt;Apache Airflow 2.4.0 contains over 650 &amp;ldquo;user-facing&amp;rdquo; commits (excluding commits to providers or chart) and over 870 total. That includes 46 new features, 39 improvements, 52 bug fixes, and several documentation changes.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.4.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.4.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.4.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.4.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠️ Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.4.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.4.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: docker pull apache/airflow:2.4.0 &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.4.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.4.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;h2 id=&#34;data-aware-scheduling-aip-48&#34;&gt;Data-aware scheduling (AIP-48)&lt;/h2&gt;&#xA;&lt;p&gt;This one is big. Airflow now has the ability to schedule DAGs based on other tasks updating datasets.&lt;/p&gt;&#xA;&lt;p&gt;What does this mean, exactly? This is a great new feature that lets DAG authors create smaller, more self-contained DAGs, which chain together into a larger data-based workflow. If you are currently using &lt;code&gt;ExternalTaskSensor&lt;/code&gt; or &lt;code&gt;TriggerDagRunOperator&lt;/code&gt; you should take a look at datasets &amp;ndash; in most cases you can replace them with something that will speed up the scheduling!&lt;/p&gt;</description>
    </item>
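The data-aware scheduling described above can be sketched in a few lines of DAG code. A minimal illustration of the Airflow 2.4 Dataset API; the DAG ids, dates, and dataset URI are illustrative, and this is a pipeline-definition fragment meant to run under an Airflow 2.4+ scheduler, not standalone:

```python
# Sketch of dataset-aware scheduling (Airflow 2.4+). Names are illustrative.
import pendulum
from airflow import DAG, Dataset
from airflow.decorators import task

orders = Dataset("s3://example-bucket/orders.parquet")

with DAG(
    dag_id="produce_orders",
    start_date=pendulum.datetime(2022, 9, 19, tz="UTC"),
    schedule="@daily",
):
    @task(outlets=[orders])
    def write_orders():
        ...  # a successful run of this task records an update to `orders`

    write_orders()

# This DAG is scheduled by the dataset, not by time: it runs whenever
# produce_orders updates `orders` -- no ExternalTaskSensor needed.
with DAG(
    dag_id="consume_orders",
    start_date=pendulum.datetime(2022, 9, 19, tz="UTC"),
    schedule=[orders],
):
    @task
    def load_orders():
        ...

    load_orders()
```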
    <item>
      <title>Airflow Survey 2022</title>
      <link>/blog/airflow-survey-2022/</link>
      <pubDate>Fri, 17 Jun 2022 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-survey-2022/</guid>
      <description>&lt;h1 id=&#34;airflow-user-survey-2022&#34;&gt;Airflow User Survey 2022&lt;/h1&gt;&#xA;&lt;p&gt;This year’s survey has come and gone, and with it we’ve got a new batch of data for everyone! We collected 210 responses over two weeks. We continue to see growth in both contributions and downloads over the last two years, and expect that trend will continue through 2022.&lt;/p&gt;&#xA;&lt;p&gt;The raw response data will be made available here soon; in the meantime, feel free to email &lt;a href=&#34;mailto:john.thomas@astronomer.io&#34;&gt;john.thomas@astronomer.io&lt;/a&gt; for a copy.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Airflow Summit 2022</title>
      <link>/blog/airflow_summit_2022/</link>
      <pubDate>Mon, 16 May 2022 00:00:00 +0000</pubDate>
      <guid>/blog/airflow_summit_2022/</guid>
      <description>&lt;p&gt;The biggest Airflow Event of the Year returns May 23–27! Airflow Summit 2022 will bring together the global&#xA;community of Apache Airflow practitioners and data leaders.&lt;/p&gt;&#xA;&lt;h3 id=&#34;whats-on-the-agenda&#34;&gt;What’s on the Agenda&lt;/h3&gt;&#xA;&lt;p&gt;During the free conference, you will hear about Apache Airflow best practices, trends in building data&#xA;pipelines, data governance, Airflow and machine learning, and the future of Airflow. There will also be&#xA;a series of presentations on non-code contributions driving the open-source project.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 2.3.0 is here</title>
      <link>/blog/airflow-2.3.0/</link>
      <pubDate>Sat, 30 Apr 2022 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.3.0/</guid>
      <description>&lt;p&gt;Apache Airflow 2.3.0 contains over 700 commits since 2.2.0 and includes 50 new features, 99 improvements, 85 bug fixes, and several doc changes.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.3.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.3.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.3.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.3.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠️ Release Notes: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.3.0/release_notes.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.3.0/release_notes.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: docker pull apache/airflow:2.3.0 &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.3.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.3.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;As the changelog is quite large, the following are some notable new features that shipped in this release.&lt;/p&gt;&#xA;&lt;h2 id=&#34;dynamic-task-mappingaip-42&#34;&gt;Dynamic Task Mapping (AIP-42)&lt;/h2&gt;&#xA;&lt;p&gt;There&amp;rsquo;s now first-class support for dynamic tasks in Airflow. What this means is that you can generate tasks dynamically at runtime. Much like using a &lt;code&gt;for&lt;/code&gt; loop&#xA;to create a list of tasks, here you can create the same tasks without having to know the exact number of tasks ahead of time.&lt;/p&gt;</description>
    </item>
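As a rough sketch of what dynamic task mapping looks like in DAG code, assuming Airflow 2.3 (task names and the returned values are illustrative; this fragment is meant to be parsed by an Airflow scheduler rather than run directly):

```python
# Sketch of dynamic task mapping (Airflow 2.3+). Names are illustrative.
import pendulum
from airflow.decorators import dag, task

@dag(start_date=pendulum.datetime(2022, 4, 30, tz="UTC"), schedule_interval=None)
def mapped_example():
    @task
    def make_batches():
        # The number of batches is only known at runtime
        return [[1, 2], [3, 4], [5, 6, 7]]

    @task
    def process(batch):
        return sum(batch)

    # expand() creates one `process` task instance per element
    # returned by make_batches() -- the fan-out happens at runtime.
    process.expand(batch=make_batches())

mapped_example()
```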
    <item>
      <title>What&#39;s new in Apache Airflow 2.2.0</title>
      <link>/blog/airflow-2.2.0/</link>
      <pubDate>Mon, 11 Oct 2021 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-2.2.0/</guid>
      <description>&lt;p&gt;I’m proud to announce that Apache Airflow 2.2.0 has been released. It contains over 600 commits since 2.1.4 and includes 30 new features, 84 improvements, 85 bug fixes, and many internal and doc changes.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;p&gt;📦 PyPI: &lt;a href=&#34;https://pypi.org/project/apache-airflow/2.2.0/&#34;&gt;https://pypi.org/project/apache-airflow/2.2.0/&lt;/a&gt; &lt;br&gt;&#xA;📚 Docs: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.2.0/&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.2.0/&lt;/a&gt; &lt;br&gt;&#xA;🛠️ Changelog: &lt;a href=&#34;https://airflow.apache.org/docs/apache-airflow/2.2.0/changelog.html&#34;&gt;https://airflow.apache.org/docs/apache-airflow/2.2.0/changelog.html&lt;/a&gt; &lt;br&gt;&#xA;🐳 Docker Image: docker pull apache/airflow:2.2.0 &lt;br&gt;&#xA;🚏 Constraints: &lt;a href=&#34;https://github.com/apache/airflow/tree/constraints-2.2.0&#34;&gt;https://github.com/apache/airflow/tree/constraints-2.2.0&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;As the changelog is quite large, the following are some notable new features that shipped in this release.&lt;/p&gt;&#xA;&lt;h2 id=&#34;custom-timetables-aip-39&#34;&gt;Custom Timetables (AIP-39)&lt;/h2&gt;&#xA;&lt;p&gt;Airflow has historically used cron expressions and timedeltas to represent when a DAG should run. This worked for a lot of use cases, but not all. For example, running daily on Monday-Friday, but not on weekends wasn’t possible.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Airflow Summit 2021</title>
      <link>/blog/airflow_summit_2021/</link>
      <pubDate>Sun, 21 Mar 2021 00:00:00 +0000</pubDate>
      <guid>/blog/airflow_summit_2021/</guid>
      <description>&lt;h2 id=&#34;airflow-summit-2021-is-here&#34;&gt;Airflow Summit 2021 is here!&lt;/h2&gt;&#xA;&lt;p&gt;The summit will be held online, July 8-16, 2021. Join us from all over the world to find&#xA;out how Airflow is being used by leading companies, what its roadmap is, and how you can&#xA;participate in its development.&lt;/p&gt;&#xA;&lt;h2 id=&#34;useful-information&#34;&gt;Useful information:&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;The official website: &lt;a href=&#34;https://airflowsummit.org&#34;&gt;https://airflowsummit.org&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;Call for proposals is open until &lt;strong&gt;12 April 2021&lt;/strong&gt;. To submit your talk, go to &lt;a href=&#34;https://sessionize.com/airflow-summit-2021/&#34;&gt;https://sessionize.com/airflow-summit-2021/&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;In case of any questions, reach out to us via &lt;a href=&#34;mailto:info@airflowsummit.org&#34;&gt;info@airflowsummit.org&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;</description>
    </item>
    <item>
      <title>Airflow Survey 2020</title>
      <link>/blog/airflow-survey-2020/</link>
      <pubDate>Tue, 09 Mar 2021 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-survey-2020/</guid>
      <description>&lt;h1 id=&#34;apache-airflow-survey-2020&#34;&gt;Apache Airflow Survey 2020&lt;/h1&gt;&#xA;&lt;p&gt;The world of data processing tools is growing steadily, and Apache Airflow already seems to be considered a&#xA;crucial component of this complex ecosystem. We observe steady growth in the number of users as well as in the&#xA;number of active contributors, so listening to and understanding our community is of high importance.&lt;/p&gt;&#xA;&lt;p&gt;It&amp;rsquo;s worth noting that the 2020 survey was still mostly about the 1.10.x versions of Apache Airflow, and&#xA;many of the drawbacks raised may already have been addressed in the 2.0 release of December 2020. Whether&#xA;that is the case, we will find out next year!&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 2.0 is here!</title>
      <link>/blog/airflow-two-point-oh-is-here/</link>
      <pubDate>Thu, 17 Dec 2020 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-two-point-oh-is-here/</guid>
      <description>&lt;p&gt;I am proud to announce that Apache Airflow 2.0.0 has been released.&lt;/p&gt;&#xA;&lt;p&gt;The full changelog is about 3,000 lines long (already excluding everything backported to 1.10), so for now I&amp;rsquo;ll simply share some of the major features in 2.0.0 compared to 1.10.14:&lt;/p&gt;&#xA;&lt;h2 id=&#34;a-new-way-of-writing-dags-the-taskflow-api-aip-31&#34;&gt;A new way of writing DAGs: the TaskFlow API (AIP-31)&lt;/h2&gt;&#xA;&lt;p&gt;(Known in the 2.0.0 alphas as Functional DAGs.)&lt;/p&gt;&#xA;&lt;p&gt;DAGs are now much nicer to author, especially when using PythonOperator. Dependencies are handled more clearly, and XCom is nicer to use.&lt;/p&gt;</description>
    </item>
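A minimal sketch of what the TaskFlow API looks like in practice, assuming Airflow 2.0 (the DAG id, task names, and return values are illustrative, and the fragment is meant to be parsed by the scheduler, not run standalone):

```python
# Sketch of a TaskFlow-API DAG (Airflow 2.0+). Names are illustrative.
from datetime import datetime
from airflow.decorators import dag, task

@dag(start_date=datetime(2020, 12, 17), schedule_interval=None)
def taskflow_sketch():
    @task
    def extract():
        return {"order_id": 42}

    @task
    def load(record: dict):
        print(record["order_id"])

    # Passing the return value wires up both the task dependency and
    # the XCom hand-off -- no explicit xcom_push/xcom_pull calls.
    load(extract())

taskflow_sketch()
```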
    <item>
      <title>Journey with Airflow as an Outreachy Intern</title>
      <link>/blog/experience-with-airflow-as-an-outreachy-intern/</link>
      <pubDate>Sun, 30 Aug 2020 00:00:00 +0000</pubDate>
      <guid>/blog/experience-with-airflow-as-an-outreachy-intern/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://www.outreachy.org/&#34;&gt;Outreachy&lt;/a&gt; is a program which organises three-month paid internships with FOSS&#xA;projects for people who are typically underrepresented in those projects.&lt;/p&gt;&#xA;&lt;h3 id=&#34;contribution-period&#34;&gt;Contribution Period&lt;/h3&gt;&#xA;&lt;p&gt;The first thing I had to do was choose a project under an organisation. After going through all the projects,&#xA;I chose “Extending the REST API of Apache Airflow”, because I had a good idea of what REST APIs are, so I&#xA;thought it would be easier to get started with the contributions. The next step was to set up Airflow’s dev&#xA;environment, which, thanks to &lt;a href=&#34;https://github.com/apache/airflow/blob/master/BREEZE.rst&#34;&gt;Breeze&lt;/a&gt;, was a breeze.&#xA;Since I had never contributed to FOSS before, this part was overwhelming, but there were plenty of issues&#xA;labelled “good first issue” with detailed descriptions, and some even had code snippets, which luckily nudged&#xA;me in the right direction. These things about Airflow and the positive vibes from the community were the reasons&#xA;why I chose to stick with Airflow as my Outreachy project.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 1.10.12</title>
      <link>/blog/airflow-1.10.12/</link>
      <pubDate>Tue, 25 Aug 2020 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-1.10.12/</guid>
      <description>&lt;p&gt;Airflow 1.10.12 contains 113 commits since 1.10.11 and includes 5 new features, 23 improvements, 23 bug fixes,&#xA;and several doc changes.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;a href=&#34;https://pypi.org/project/apache-airflow/1.10.12/&#34;&gt;https://pypi.org/project/apache-airflow/1.10.12/&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href=&#34;https://airflow.apache.org/docs/1.10.12/&#34;&gt;https://airflow.apache.org/docs/1.10.12/&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Changelog&lt;/strong&gt;: &lt;a href=&#34;http://airflow.apache.org/docs/1.10.12/changelog.html&#34;&gt;http://airflow.apache.org/docs/1.10.12/changelog.html&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;p&gt;&lt;strong&gt;Airflow 1.10.11 has breaking changes with respect to&#xA;KubernetesExecutor &amp;amp; KubernetesPodOperator so I recommend users to directly upgrade to Airflow 1.10.12 instead&lt;/strong&gt;.&lt;/p&gt;&#xA;&lt;p&gt;Some of the noteworthy new features (user-facing) are:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/8560&#34;&gt;Allow defining custom XCom class&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/9645&#34;&gt;Get Airflow configs with sensitive data from Secret Backends&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/10282&#34;&gt;Add AirflowClusterPolicyViolation support to Airflow local settings&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h3 id=&#34;allow-defining-custom-xcom-class&#34;&gt;Allow defining Custom XCom class&lt;/h3&gt;&#xA;&lt;p&gt;Until Airflow 1.10.11, the XCom data was only stored in Airflow Metadatabase. From Airflow 1.10.12, users&#xA;would be able to define custom XCom classes. 
This allows users to transfer larger data between tasks.&#xA;An example would be storing XComs in an S3 or GCS bucket when the data that needs to be stored is larger&#xA;than &lt;code&gt;XCom.MAX_XCOM_SIZE&lt;/code&gt; (48 KB).&lt;/p&gt;</description>
    </item>
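A rough sketch of what such a custom XCom class might look like, assuming Airflow 1.10.12's `BaseXCom` hooks; the bucket name and the `is_large` / `upload_to_s3` / `download_from_s3` helpers are hypothetical placeholders, not part of Airflow:

```python
# Hedged sketch of a custom XCom backend (Airflow 1.10.12+).
# `is_large`, `upload_to_s3`, and `download_from_s3` are hypothetical
# helpers you would implement yourself, e.g. with boto3.
from airflow.models.xcom import BaseXCom

class S3XComBackend(BaseXCom):
    BUCKET = "my-xcom-bucket"  # illustrative bucket name

    @staticmethod
    def serialize_value(value):
        if is_large(value):  # hypothetical size check vs XCom.MAX_XCOM_SIZE
            key = upload_to_s3(S3XComBackend.BUCKET, value)  # hypothetical
            value = {"s3_key": key}  # store only a pointer in the metadata DB
        return BaseXCom.serialize_value(value)

    @staticmethod
    def deserialize_value(result):
        value = BaseXCom.deserialize_value(result)
        if isinstance(value, dict) and "s3_key" in value:
            value = download_from_s3(S3XComBackend.BUCKET, value["s3_key"])  # hypothetical
        return value
```

The class would then be activated by pointing the `xcom_backend` option in the `[core]` section of `airflow.cfg` at its dotted path.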
    <item>
      <title>Apache Airflow For Newcomers</title>
      <link>/blog/apache-airflow-for-newcomers/</link>
      <pubDate>Mon, 17 Aug 2020 00:00:00 +0000</pubDate>
      <guid>/blog/apache-airflow-for-newcomers/</guid>
      <description>&lt;p&gt;Apache Airflow is a platform to programmatically author, schedule, and monitor workflows.&#xA;A workflow is a sequence of tasks that processes a set of data. You can think of a workflow as the&#xA;path that describes how tasks go from being undone to done. Scheduling, on the other hand, is the&#xA;process of planning, controlling, and optimizing when a particular task should be done.&lt;/p&gt;&#xA;&lt;h3 id=&#34;authoring-workflow-in-apache-airflow&#34;&gt;Authoring Workflows in Apache Airflow&lt;/h3&gt;&#xA;&lt;p&gt;Airflow makes it easy to author workflows using Python scripts. A &lt;a href=&#34;https://en.wikipedia.org/wiki/Directed_acyclic_graph&#34;&gt;Directed Acyclic Graph&lt;/a&gt;&#xA;(DAG) represents a workflow in Airflow. It is a collection of tasks organized to show each task&amp;rsquo;s&#xA;relationships and dependencies. You can have as many DAGs as you want, and Airflow will execute&#xA;them according to the tasks&amp;rsquo; relationships and dependencies. If task B depends on the successful&#xA;execution of another task A, Airflow will run task A and only run task B after task A completes successfully.&#xA;This dependency is very easy to express in Airflow. For example, the above scenario is expressed as&lt;/p&gt;</description>
    </item>
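The A-before-B dependency described in the excerpt above can be sketched as a small DAG file; a minimal illustration for the Airflow 1.10.x era this post covers (the DAG id, dates, and bash commands are illustrative, and the fragment is meant to be picked up by an Airflow scheduler):

```python
# Sketch of a two-task dependency (Airflow 1.10.x import path).
from datetime import datetime
from airflow import DAG
from airflow.operators.bash_operator import BashOperator

with DAG(
    dag_id="a_then_b",
    start_date=datetime(2020, 8, 17),
    schedule_interval="@daily",
) as dag:
    task_a = BashOperator(task_id="task_a", bash_command="echo A")
    task_b = BashOperator(task_id="task_b", bash_command="echo B")

    # task_b runs only after task_a has completed successfully
    task_a >> task_b
```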
    <item>
      <title>Implementing Stable API for Apache Airflow</title>
      <link>/blog/implementing-stable-api-for-apache-airflow/</link>
      <pubDate>Sun, 19 Jul 2020 00:00:00 +0000</pubDate>
      <guid>/blog/implementing-stable-api-for-apache-airflow/</guid>
      <description>&lt;p&gt;My &lt;a href=&#34;https://outreachy.org&#34;&gt;Outreachy internship&lt;/a&gt; is coming to its end, which is also the best time to look back and&#xA;reflect on the progress so far.&lt;/p&gt;&#xA;&lt;p&gt;The goal of my project is to extend and improve the Apache Airflow REST API. In this post,&#xA;I will be sharing my progress so far.&lt;/p&gt;&#xA;&lt;p&gt;We started implementing the REST API a bit late because it took time for the OpenAPI 3.0&#xA;specification we were to use for the project to be merged. Thanks to &lt;a href=&#34;https://github.com/mik-laj&#34;&gt;Kamil&lt;/a&gt;,&#xA;who paved the way for us to start implementing the REST API endpoints. Below are the endpoints I&#xA;implemented and the challenges I encountered, including how I overcame them.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 1.10.10</title>
      <link>/blog/airflow-1.10.10/</link>
      <pubDate>Thu, 09 Apr 2020 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-1.10.10/</guid>
      <description>&lt;p&gt;Airflow 1.10.10 contains 199 commits since 1.10.9 and includes 11 new features, 43 improvements, 44 bug fixes, and several doc changes.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;a href=&#34;https://pypi.org/project/apache-airflow/1.10.10/&#34;&gt;https://pypi.org/project/apache-airflow/1.10.10/&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href=&#34;https://airflow.apache.org/docs/1.10.10/&#34;&gt;https://airflow.apache.org/docs/1.10.10/&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Changelog&lt;/strong&gt;: &lt;a href=&#34;http://airflow.apache.org/docs/1.10.10/changelog.html&#34;&gt;http://airflow.apache.org/docs/1.10.10/changelog.html&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;p&gt;Some of the noteworthy new features (user-facing) are:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/8046&#34;&gt;Allow user to choose timezone to use in the RBAC UI&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/7832&#34;&gt;Add Production Docker image support&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;http://airflow.apache.org/docs/1.10.10/howto/use-alternative-secrets-backend.html&#34;&gt;Allow Retrieving Airflow Connections &amp;amp; Variables from various Secrets backend&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;http://airflow.apache.org/docs/1.10.10/dag-serialization.html&#34;&gt;Stateless Webserver using DAG Serialization&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/7880&#34;&gt;Tasks with Dummy Operators are no longer sent to executor&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/7312&#34;&gt;Allow passing DagRun conf when triggering dags via UI&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h3 
id=&#34;allow-user-to-choose-timezone-to-use-in-the-rbac-ui&#34;&gt;Allow user to choose timezone to use in the RBAC UI&lt;/h3&gt;&#xA;&lt;p&gt;By default the Web UI will show times in UTC. It is possible to change the timezone shown by using the menu in the top&#xA;right (click on the clock to activate it):&lt;/p&gt;</description>
    </item>
    <item>
      <title>Apache Airflow 1.10.8 &amp; 1.10.9</title>
      <link>/blog/airflow-1.10.8-1.10.9/</link>
      <pubDate>Sun, 23 Feb 2020 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-1.10.8-1.10.9/</guid>
      <description>&lt;p&gt;Airflow 1.10.8 contains 160 commits since 1.10.7 and includes 4 new features, 42 improvements, 36 bug fixes, and several doc changes.&lt;/p&gt;&#xA;&lt;p&gt;We released 1.10.9 on the same day as one of the Flask dependencies (Werkzeug) released 1.0 which broke Airflow 1.10.8.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Details&lt;/strong&gt;:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;a href=&#34;https://pypi.org/project/apache-airflow/1.10.9/&#34;&gt;https://pypi.org/project/apache-airflow/1.10.9/&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href=&#34;https://airflow.apache.org/docs/1.10.9/&#34;&gt;https://airflow.apache.org/docs/1.10.9/&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Changelog (1.10.8)&lt;/strong&gt;: &lt;a href=&#34;http://airflow.apache.org/docs/1.10.8/changelog.html#airflow-1-10-8-2020-01-07&#34;&gt;http://airflow.apache.org/docs/1.10.8/changelog.html#airflow-1-10-8-2020-01-07&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;strong&gt;Changelog (1.10.9)&lt;/strong&gt;: &lt;a href=&#34;http://airflow.apache.org/docs/1.10.9/changelog.html#airflow-1-10-9-2020-02-10&#34;&gt;http://airflow.apache.org/docs/1.10.9/changelog.html#airflow-1-10-9-2020-02-10&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;p&gt;Some of the noteworthy new features (user-facing) are:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/6489&#34;&gt;Add tags to DAGs and use it for filtering in the UI (RBAC only)&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;http://airflow.apache.org/docs/1.10.9/executor/debug.html&#34;&gt;New Executor: DebugExecutor for Local debugging from your IDE&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/apache/airflow/pull/7281&#34;&gt;Allow passing conf in &amp;ldquo;Add DAG Run&amp;rdquo; (Triggered Dags) view&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a 
href=&#34;https://github.com/apache/airflow/pull/7038&#34;&gt;Allow dags to run for future execution dates for manually triggered DAGs (only if &lt;code&gt;schedule_interval=None&lt;/code&gt;)&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://airflow.apache.org/docs/1.10.9/configurations-ref.html&#34;&gt;Dedicated page in documentation for all configs in airflow.cfg&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h3 id=&#34;add-tags-to-dags-and-use-it-for-filtering-in-the-ui&#34;&gt;Add tags to DAGs and use it for filtering in the UI&lt;/h3&gt;&#xA;&lt;p&gt;In order to filter DAGs (e.g. by team), you can add tags in each dag. The filter is saved in a cookie and can be reset by the reset button.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Experience in Google Season of Docs 2019 with Apache Airflow</title>
      <link>/blog/experience-in-google-season-of-docs-2019-with-apache-airflow/</link>
      <pubDate>Fri, 20 Dec 2019 00:00:00 +0000</pubDate>
      <guid>/blog/experience-in-google-season-of-docs-2019-with-apache-airflow/</guid>
      <description>&lt;p&gt;I came across &lt;a href=&#34;https://developers.google.com/season-of-docs&#34;&gt;Google Season of Docs&lt;/a&gt; (GSoD) almost by accident, thanks to my extensive HackerNews and Twitter addiction. I was familiar with the Google Summer of Code but not with this program.&#xA;It turns out this was its inaugural run. I read the details, and the process felt a lot like GSoC, except that this one was about documentation.&lt;/p&gt;&#xA;&lt;h2 id=&#34;about-me&#34;&gt;About Me&lt;/h2&gt;&#xA;&lt;p&gt;I have been writing tech articles on Medium as well as my blog for the past 1.5 years. Blogging helps me test my understanding of concepts, as untangling the toughest of ideas in simple sentences requires a considerable time investment.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Airflow Survey 2019</title>
      <link>/blog/airflow-survey/</link>
      <pubDate>Wed, 11 Dec 2019 00:00:00 +0000</pubDate>
      <guid>/blog/airflow-survey/</guid>
      <description>&lt;h1 id=&#34;apache-airflow-survey-2019&#34;&gt;Apache Airflow Survey 2019&lt;/h1&gt;&#xA;&lt;p&gt;Apache Airflow is &lt;a href=&#34;https://www.astronomer.io/blog/why-airflow/&#34;&gt;growing faster than ever&lt;/a&gt;.&#xA;Thus, receiving and adjusting to our users’ feedback is a must. We created&#xA;&lt;a href=&#34;https://forms.gle/XAzR1pQBZiftvPQM7&#34;&gt;survey&lt;/a&gt; and we got &lt;strong&gt;308&lt;/strong&gt; responses.&#xA;Let’s see who Airflow users are, how they play with it, and what they miss.&lt;/p&gt;&#xA;&lt;h1 id=&#34;overview-of-the-user&#34;&gt;Overview of the user&lt;/h1&gt;&#xA;&lt;p&gt;&lt;strong&gt;What best describes your current occupation?&lt;/strong&gt;&lt;/p&gt;&#xA;&lt;table&gt;&#xA;  &lt;thead&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;th&gt;&lt;/th&gt;&#xA;          &lt;th&gt;No.&lt;/th&gt;&#xA;          &lt;th&gt;%&lt;/th&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/thead&gt;&#xA;  &lt;tbody&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Data Engineer&lt;/td&gt;&#xA;          &lt;td&gt;194&lt;/td&gt;&#xA;          &lt;td&gt;62.99%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Developer&lt;/td&gt;&#xA;          &lt;td&gt;34&lt;/td&gt;&#xA;          &lt;td&gt;11.04%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Architect&lt;/td&gt;&#xA;          &lt;td&gt;23&lt;/td&gt;&#xA;          &lt;td&gt;7.47%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Data Scientist&lt;/td&gt;&#xA;          &lt;td&gt;19&lt;/td&gt;&#xA;          &lt;td&gt;6.17%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Data Analyst&lt;/td&gt;&#xA;          &lt;td&gt;13&lt;/td&gt;&#xA;          &lt;td&gt;4.22%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;DevOps&lt;/td&gt;&#xA;          &lt;td&gt;13&lt;/td&gt;&#xA;          &lt;td&gt;4.22%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;   
       &lt;td&gt;IT Administrator&lt;/td&gt;&#xA;          &lt;td&gt;2&lt;/td&gt;&#xA;          &lt;td&gt;0.65%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Machine Learning Engineer&lt;/td&gt;&#xA;          &lt;td&gt;2&lt;/td&gt;&#xA;          &lt;td&gt;0.65%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Manager&lt;/td&gt;&#xA;          &lt;td&gt;2&lt;/td&gt;&#xA;          &lt;td&gt;0.65%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Operations&lt;/td&gt;&#xA;          &lt;td&gt;2&lt;/td&gt;&#xA;          &lt;td&gt;0.65%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Chief Data Officer&lt;/td&gt;&#xA;          &lt;td&gt;1&lt;/td&gt;&#xA;          &lt;td&gt;0.32%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Engineering Manager&lt;/td&gt;&#xA;          &lt;td&gt;1&lt;/td&gt;&#xA;          &lt;td&gt;0.32%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Intern&lt;/td&gt;&#xA;          &lt;td&gt;1&lt;/td&gt;&#xA;          &lt;td&gt;0.32%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Product owner&lt;/td&gt;&#xA;          &lt;td&gt;1&lt;/td&gt;&#xA;          &lt;td&gt;0.32%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;Quant&lt;/td&gt;&#xA;          &lt;td&gt;1&lt;/td&gt;&#xA;          &lt;td&gt;0.32%&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/tbody&gt;&#xA;&lt;/table&gt;&#xA;&lt;p&gt;&lt;strong&gt;In your day to day job, what do you use Airflow for?&lt;/strong&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <title>New Airflow website</title>
      <link>/blog/announcing-new-website/</link>
      <pubDate>Wed, 11 Dec 2019 00:00:00 +0000</pubDate>
      <guid>/blog/announcing-new-website/</guid>
      <description>&lt;p&gt;The brand &lt;a href=&#34;https://airflow.apache.org/&#34;&gt;new Airflow website&lt;/a&gt; has arrived! Those who have been following the process know that the journey to update &lt;a href=&#34;https://airflow.readthedocs.io/en/1.10.6/&#34;&gt;the old Airflow website&lt;/a&gt; started at the beginning of the year.&#xA;Thanks to sponsorship from the Cloud Composer team at Google, we were able to&#xA;collaborate with &lt;code&gt;Polidea&lt;/code&gt; and their design studio &lt;code&gt;Utilo&lt;/code&gt; to deliver an awesome website.&lt;/p&gt;&#xA;&lt;p&gt;Documentation of open source projects is key to engaging new contributors in the maintenance,&#xA;development, and adoption of software. We want the Apache Airflow community to have&#xA;the best possible experience contributing to and using the project. We also took this opportunity to make the project&#xA;more accessible, and in doing so, increase its reach.&lt;/p&gt;</description>
    </item>
    <item>
      <title>ApacheCon Europe 2019 — Thoughts and Insights by Airflow Committers</title>
      <link>/blog/apache-con-europe-2019-thoughts-and-insights-by-airflow-committers/</link>
      <pubDate>Fri, 22 Nov 2019 00:00:00 +0000</pubDate>
      <guid>/blog/apache-con-europe-2019-thoughts-and-insights-by-airflow-committers/</guid>
      <description>&lt;p&gt;Is it possible to create an organization that delivers tens of projects used by millions, where nearly no one is paid for their work, and that has still been fruitfully carrying on for more than 20 years? The Apache Software Foundation proves it is possible. For the last two decades, the ASF has been crafting a model called the Apache Way—a way of organizing and leading tech open source projects. Thanks to this approach, which is strongly based on the “community over code” motto, we can enjoy such awesome projects as Apache Spark, Flink, Beam, or Airflow (and many more).&lt;/p&gt;</description>
    </item>
    <item>
      <title>Documenting using local development environment</title>
      <link>/blog/documenting-using-local-development-environments/</link>
      <pubDate>Fri, 22 Nov 2019 00:00:00 +0000</pubDate>
      <guid>/blog/documenting-using-local-development-environments/</guid>
      <description>&lt;h2 id=&#34;documenting-local-development-environment-of-apache-airflow&#34;&gt;Documenting the local development environment of Apache Airflow&lt;/h2&gt;&#xA;&lt;p&gt;From September to November 2019 I participated in a wonderful initiative, &lt;a href=&#34;https://developers.google.com/season-of-docs&#34;&gt;Google Season of Docs&lt;/a&gt;.&lt;/p&gt;&#xA;&lt;p&gt;I had the pleasure of contributing to the Apache Airflow open source project as a technical writer.&#xA;My initial assignment was an extension to the GitHub-based contribution guide.&lt;/p&gt;&#xA;&lt;p&gt;From the very first days I was closely involved in inter-project communications&#xA;via email/Slack and had regular 1:1s with my mentor, Jarek Potiuk.&lt;/p&gt;</description>
    </item>
    <item>
      <title>It&#39;s a &#34;Breeze&#34; to develop Apache Airflow</title>
      <link>/blog/its-a-breeze-to-develop-apache-airflow/</link>
      <pubDate>Fri, 22 Nov 2019 00:00:00 +0000</pubDate>
      <guid>/blog/its-a-breeze-to-develop-apache-airflow/</guid>
      <description>&lt;h2 id=&#34;the-story-behind-the-airflow-breeze-tool&#34;&gt;The story behind the Airflow Breeze tool&lt;/h2&gt;&#xA;&lt;p&gt;Initially, we started contributing to this fantastic open-source project [Apache Airflow] with a team of three, which then grew to five. When we kicked it off a year ago, I realized pretty quickly where the biggest bottlenecks and areas for improvement in terms of productivity were. Even with the help of our client, who provided us with a “homegrown” development environment, it took us literally days to set it up and learn some basics.&lt;/p&gt;</description>
    </item>
    <item>
      <title></title>
      <link>/community/resources/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/community/resources/</guid>
      <description>&lt;div class=&#34;resources--grid&#34;&gt;&#xA;    &lt;div class=&#34;community--dev&#34;&gt;&#xA;        &lt;div class=&#34;community--accordion-container&#34;&gt;&#xA;            &#xA;&#xA;&#xA;&lt;details class=&#34;accordion&#34; &gt;&#xA;    &lt;summary&gt;&#xA;        &lt;h4 class=&#34;accordion__summary-content--header  subtitle__large--greyish-brown &#34;&gt;Promo Materials&lt;/h4&gt;&#xA;        &lt;div class=&#34;accordion__arrow&#34;&gt;&#xA;            &lt;svg xmlns=&#34;http://www.w3.org/2000/svg&#34; width=&#34;18.904&#34; height=&#34;11.451&#34;&gt;&lt;path d=&#34;M16.905 1.414 9.452 8.867 1.999 1.414l-.585.585 7.453 7.453.585.585.585-.585 7.453-7.453z&#34; data-name=&#34;Path 684&#34; fill=&#34;#017cee&#34; stroke=&#34;#017cee&#34; stroke-width=&#34;2&#34;/&gt;&lt;/svg&gt;&#xA;        &lt;/div&gt;&#xA;        &lt;div class=&#34;accordion__summary-content&#34;&gt;&#xA;            &#xA;                &lt;div class=&#34;accordion__summary-content--icon&#34;&gt;&#xA;                    &lt;svg xmlns=&#34;http://www.w3.org/2000/svg&#34; width=&#34;60&#34; height=&#34;42.748&#34; viewBox=&#34;0 0 60 42.748&#34;&gt;&lt;g id=&#34;Group_1455&#34; data-name=&#34;Group 1455&#34; transform=&#34;translate(853.723 -652.626)&#34;&gt;&lt;g id=&#34;Group_1453&#34; data-name=&#34;Group 1453&#34;&gt;&lt;path id=&#34;Path_1235&#34; d=&#34;M-804.022 695.374h-46.324a3.224 3.224.0 01-3.377-3.037.627.627.0 01.02-.155l6.9-26.219a3.236 3.236.0 013.376-2.954h46.327a3.224 3.224.0 013.377 3.037.627.627.0 01-.02.155l-6.9 26.22a3.236 3.236.0 01-3.379 2.953zm-48.483-2.965a2.032 2.032.0 002.159 1.749h46.324a2.015 2.015.0 002.161-1.821.584.584.0 01.02-.155l6.9-26.208a2.032 2.032.0 00-2.159-1.749h-46.324a2.015 2.015.0 00-2.161 1.821.627.627.0 01-.02.155zm58.174-26.363z&#34; fill=&#34;#017cee&#34; data-name=&#34;Path 1235&#34;/&gt;&lt;/g&gt;&lt;g id=&#34;Group_1454&#34; data-name=&#34;Group 1454&#34;&gt;&lt;path id=&#34;Path_1236&#34; d=&#34;M-815.831 
695.374h-34.515a3.381 3.381.0 01-3.377-3.374v-38.766a.608.608.0 01.608-.608h14.665a.61.61.0 01.477.231l2.919 3.691h31.032a3.381 3.381.0 013.377 3.377v3.692a.608.608.0 01-.608.608.608.608.0 01-.608-.608v-3.692a2.163 2.163.0 00-2.161-2.16h-31.326a.61.61.0 01-.477-.231l-2.92-3.692h-13.762V692a2.163 2.163.0 002.161 2.161h34.515a.608.608.0 01.608.608.608.608.0 01-.608.605z&#34; fill=&#34;#017cee&#34; data-name=&#34;Path 1236&#34;/&gt;&lt;/g&gt;&lt;/g&gt;&lt;/svg&gt;&#xA;                &lt;/div&gt;&#xA;            &#xA;            &lt;div class=&#34;accordion__summary-content--text&#34;&gt;&#xA;                &lt;span class=&#34;bodytext__medium--brownish-grey&#34;&gt;Download official Apache Airflow branding materials, including logos and banners, to accurately represent and promote the project.&lt;/span&gt;&#xA;            &lt;/div&gt;&#xA;        &lt;/div&gt;&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;accordion__content indented&#34;&gt;&#xA;                &lt;ul class=&#34;ticks-blue mx-auto&#34;&gt;&#xA;                    &lt;li&gt;&lt;a href=&#34;https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+logos&#34;&gt;Airflow logos&lt;/a&gt;&#xA;                    &lt;/li&gt;&#xA;                    &lt;li&gt;Logo SVGs (light):&#xA;                        &lt;a href=&#34;/images/airflow-logo.svg&#34;&gt;Horizontal&lt;/a&gt;,&#xA;                        &lt;a href=&#34;/images/airflow-logo-small.svg&#34;&gt;Small&lt;/a&gt;,&#xA;                        &lt;a href=&#34;/images/airflow-icon.svg&#34;&gt;Icon&lt;/a&gt;&#xA;                    &lt;/li&gt;&#xA;                    &lt;li&gt;Logo SVGs (dark):&#xA;                        &lt;a href=&#34;/images/airflow-logo-dark.svg&#34;&gt;Horizontal&lt;/a&gt;,&#xA;                        &lt;a href=&#34;/images/airflow-logo-small-dark.svg&#34;&gt;Small&lt;/a&gt;,&#xA;                        &lt;a href=&#34;/images/airflow-icon-dark.svg&#34;&gt;Icon&lt;/a&gt;&#xA;                    &lt;/li&gt;&#xA;            
        &lt;li&gt;&lt;a href=&#34;https://cwiki.apache.org/confluence/display/AIRFLOW/Brandbook&#34;&gt;Brandbook&lt;/a&gt;&lt;/li&gt;&#xA;                    &lt;li&gt;&lt;a href=&#34;https://cwiki.apache.org/confluence/display/AIRFLOW/Drawio+Diagrams&#34;&gt;Drawio&#xA;                            Diagrams&lt;/a&gt;&lt;/li&gt;&#xA;                    &lt;li&gt;&lt;a href=&#34;https://cwiki.apache.org/confluence/display/AIRFLOW/Lucidchart+Diagrams&#34;&gt;Lucidchart&#xA;                            Diagrams&lt;/a&gt;&lt;/li&gt;&#xA;                    &lt;li&gt;&lt;a href=&#34;https://cwiki.apache.org/confluence/display/AIRFLOW/Promo+stuff&#34;&gt;Promo stuff&lt;/a&gt;&lt;/li&gt;&#xA;                    &lt;li&gt;&lt;a href=&#34;https://cwiki.apache.org/confluence/display/AIRFLOW/Proposed+Logo+Redesign&#34;&gt;Proposed&#xA;                            Logo Redesign&lt;/a&gt;&lt;/li&gt;&#xA;                    &lt;/ul&gt;&#xA;                    &lt;/div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;        &lt;/div&gt;&#xA;    &lt;/div&gt;&#xA;&lt;/div&gt;</description>
    </item>
    <item>
      <title>adjoe</title>
      <link>/use-cases/adjoe/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/adjoe/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;Before adopting Airflow at adjoe, we handled job scheduling in two main ways: by setting up Kubernetes cronjobs or building AWS Lambda functions. While both approaches had their benefits, they also came with limitations, especially when it came to managing more complex workloads. As our data science teams&amp;rsquo; needs evolved, it became clear that we needed a more robust and flexible orchestration tool.&lt;/p&gt;&#xA;&lt;h5 id=&#34;how-did-apache-airflow-help-to-solve-this-problem&#34;&gt;How did Apache Airflow help to solve this problem?&lt;/h5&gt;&#xA;&lt;p&gt;With the creation of a new AWS environment for the data science teams, we introduced Airflow on Kubernetes as our primary orchestration solution, addressing both stability and scalability requirements.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Adobe</title>
      <link>/use-cases/adobe/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/adobe/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;Modern big data platforms need sophisticated data pipelines connecting to many backend services enabling complex workflows. These workflows need to be deployed, monitored, and run either on regular schedules or triggered by external events. Adobe Experience Platform component services architected and built an orchestration service to enable their users to author, schedule, and monitor complex hierarchical (including sequential and parallel) workflows for Apache Spark (TM) and non-Spark jobs.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Adyen</title>
      <link>/use-cases/adyen/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/adyen/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;Many years ago we started out with our own orchestration framework. Given all the required custom functionality, it made sense at the time. However, we quickly realized that creating an orchestration tool is not to be underestimated. With the rapidly increasing number of users and teams, time spent fixing issues grew, severely limiting development speed. Furthermore, because it was not open source, we constantly had to make the effort ourselves to stay up to date with industry standards and tools. We needed a tool for our Big Data Platform to schedule and execute many ETL jobs while, at the same time, giving our users the possibility to redo or undo their tasks.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Big Fish Games</title>
      <link>/use-cases/big-fish-games/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/big-fish-games/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;The main challenge is the lack of standardized ETL workflow orchestration tools. PowerShell and Python-based ETL frameworks built in-house are currently used for scheduling and running analytical workloads. However, there is no web UI through which we can monitor these workflows, and maintaining this framework requires additional effort. These scheduled jobs, based on external dependencies, are not well suited to modern Big Data platforms and their complex workflows. Although we experimented with Apache Oozie for certain workflows, it did not handle failed jobs properly, and these tools are not flexible enough to enforce retries for failed jobs when data arrives late.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Business Operations</title>
      <link>/use-cases/business_operations/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/business_operations/</guid>
      <description>&lt;div style=&#34;display: flex; justify-content: center; align-items: center;&#34;&gt;&#xA;&lt;h1 id=&#34;use-airflow-for-business-operations-pipelines&#34;&gt;Use Airflow for Business operations pipelines&lt;/h1&gt;&#xA;&lt;/div&gt;&#xA;&lt;p&gt;Airflow can be the starting point for your business idea! For many companies, Airflow delivers the data that powers their core business applications. Whether you need to aggregate user data to power personalized recommendations, display analytics in a user-facing dashboard, or prepare the input data for an LLM, Airflow is the perfect orchestrator.&lt;/p&gt;&#xA;&lt;p&gt;This video shows an example of using Airflow to run the pipelines that power a customer-facing analytics dashboard. You can find the code shown in this example &lt;a href=&#34;https://github.com/astronomer/business-operations-structure-example&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Dish</title>
      <link>/use-cases/dish/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/dish/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;We faced increasing complexity managing lengthy crontabs, and scheduling became an issue: timing had to be planned carefully around resource constraints and usage patterns, and custom code was needed for retry logic, in particular to verify the success of previous jobs and/or steps before running the next. Furthermore, time to results is important, but we increasingly relied on buffers for processing, where things effectively sat idle, not processing, waiting for the next stage, in an effort to rely less on custom code/logic.&lt;/p&gt;</description>
    </item>
    <item>
      <title>ETL/ELT</title>
      <link>/use-cases/etl_analytics/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/etl_analytics/</guid>
      <description>&lt;div style=&#34;display: flex; justify-content: center; align-items: center;&#34;&gt;&#xA;&lt;h1 id=&#34;use-airflow-for-etlelt-pipelines&#34;&gt;Use Airflow for ETL/ELT pipelines&lt;/h1&gt;&#xA;&lt;/div&gt;&#xA;&lt;p&gt;Extract-Transform-Load (ETL) and Extract-Load-Transform (ELT) data pipelines are the most common use case for Apache Airflow. 90% of respondents in the 2023 Apache Airflow survey are using Airflow for ETL/ELT to power analytics use cases.&lt;/p&gt;&#xA;&lt;p&gt;The video below shows a simple ETL/ELT pipeline in Airflow that extracts climate data from a CSV file, as well as weather data from an API, runs transformations and then loads the results into a database to power a dashboard.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Experity</title>
      <link>/use-cases/experity/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/experity/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;We had to deploy our complex, flagship app to multiple nodes in multiple ways. This required tasks to communicate across Windows nodes and coordinate timing perfectly. We did not want to buy an expensive enterprise scheduling tool and needed ultimate flexibility.&lt;/p&gt;&#xA;&lt;h5 id=&#34;how-did-apache-airflow-help-to-solve-this-problem&#34;&gt;How did Apache Airflow help to solve this problem?&lt;/h5&gt;&#xA;&lt;p&gt;Ultimately we decided flexible, multi-node, DAG-capable tooling was key, and Airflow was one of the few tools that fit the bill. Its open-source foundation and Python codebase were large factors that upheld our core principles. At the time, Airflow was missing a Windows hook and operator, so we contributed the WinRM hook and operator back to the community. Given its flexibility, we also use DAG generators to have our metadata drive our DAGs and keep maintenance costs down.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Infrastructure Management</title>
      <link>/use-cases/infrastructure-management/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/infrastructure-management/</guid>
      <description>&lt;div style=&#34;display: flex; justify-content: center; align-items: center;&#34;&gt;&#xA;&lt;h1 id=&#34;use-airflow-for-infrastructure-management&#34;&gt;Use Airflow for Infrastructure Management&lt;/h1&gt;&#xA;&lt;/div&gt;&#xA;&lt;p&gt;Airflow can interact with any API, which makes it a great tool to manage your infrastructure, such as Kubernetes or Spark clusters running in any cloud. As of Airflow 2.7, the setup/teardown feature is available, a special type of task with intelligent behavior to spin up and tear down infrastructure at the exact time you need it.&lt;/p&gt;&#xA;&lt;p&gt;Infrastructure management is often needed within the context of other use cases, such as MLOps, or implementing data quality checks. This video shows an example of how it might be used for an MLOps pipeline. You can find the code shown in this example &lt;a href=&#34;https://github.com/astronomer/use-case-setup-teardown-data-quality&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;</description>
    </item>
    <item>
      <title>MLOps</title>
      <link>/use-cases/mlops/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/mlops/</guid>
      <description>&lt;div style=&#34;display: flex; justify-content: center; align-items: center;&#34;&gt;&#xA;&lt;h1 id=&#34;use-airflow-for-machine-learning-operations-mlops&#34;&gt;Use Airflow for Machine Learning Operations (MLOps)&lt;/h1&gt;&#xA;&lt;/div&gt;&#xA;&lt;p&gt;Machine Learning Operations (MLOps) is a broad term encompassing everything needed to run machine learning models in production. MLOps is a rapidly evolving field with many different best practices and behavioral patterns, with Apache Airflow providing tool-agnostic orchestration capabilities for all steps. An emerging subset of MLOps is Large Language Model Operations (LLMOps), which focuses on developing pipelines around applications of large language models like GPT-4 or Command.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Onefootball</title>
      <link>/use-cases/onefootball/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/onefootball/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;With millions of daily active users, managing the complexity of data engineering at Onefootball is a constant challenge. Lengthy crontabs, a multiplication of custom API clients, erosion of confidence in the analytics served, increasing heroism (&amp;ldquo;only one person can solve this issue&amp;rdquo;): those are the challenges that most teams face unless they consciously invest in their tools and processes.&lt;/p&gt;&#xA;&lt;p&gt;On top of that, new data tools appear each month: third-party data sources, cloud provider solutions, different storage technologies&amp;hellip; Managing all those integrations is costly and brittle, especially for small data engineering teams that are trying to do more with less.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Plarium Krasnodar</title>
      <link>/use-cases/plarium-krasnodar/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/plarium-krasnodar/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;Our Research &amp;amp; Development department carries out various experiments, and in all of them, we need to create workflow orchestrations for solving tasks in game dev. Previously, we didn&amp;rsquo;t have any suitable tools with a sufficient number of built-in functions, and we had to orchestrate processes manually and entirely from scratch every time. This led to difficulties with dependencies and monitoring when building complex workflows. We needed a tool that would provide a more centralized approach so that we could see all the logs, the number of retries, and the task performance time. The most important thing that we lacked was the ability to backfill historical data and restart failed tasks.&lt;/p&gt;</description>
    </item>
    <item>
      <title>RancherBySUSE</title>
      <link>/use-cases/suse/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/suse/</guid>
      <description>&lt;h4 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h4&gt;&#xA;&lt;p&gt;Our aim was to build, package, test, and distribute curated and trusted containers at scale in an automated way. Those containers can be of any nature, meaning that we need a solution that allows us to build any kind of software with any kind of build tools, such as Maven, Rust, Java, Ant, or Go.&lt;/p&gt;&#xA;&lt;p&gt;The construction of these containers requires the installation of several libraries (which may even conflict) and the orchestration of complex workflows with several integrations, executed either on a scheduled basis or triggered by events from external systems.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Search Results</title>
      <link>/search/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/search/</guid>
      <description></description>
    </item>
    <item>
      <title>Seniorlink</title>
      <link>/use-cases/seniorlink/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/seniorlink/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;Here at Seniorlink, we provide services, support, and technology that engages family caregivers. One of our focuses is using data to bolster our knowledge and improve the experience of our users. Like many looking to build an effective data stack, we adopted a Python, Spark, Redshift, and Tableau core toolset.&lt;/p&gt;&#xA;&lt;p&gt;We had built a robust stack of batch processes to deliver value to the business, deploying these data services in AWS using a mixture of EMR, ECS, Lambda, and EC2. Moving fast, as many new endeavors do, we ultimately ended up with one monolithic batch process with many smaller satellite jobs. Given the scale and quantity of jobs, we began to lose transparency as to what was happening. Additionally, many jobs were launched in a single EMR cluster and so tightly coupled that a failure in one job required the recompute of all the jobs run on that cluster. These behaviors are highly inefficient, difficult to debug and result in long iteration periods given the duration of these batch jobs.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Sift</title>
      <link>/use-cases/sift/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/sift/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;At Sift, we’re constantly training machine learning models that feed into the core of Sift’s Digital Trust &amp;amp; Safety platform. The platform gives our customers a way to discern suspicious online behavior from trustworthy behavior, allowing our customers to protect their online transactions, maintain the integrity of their content platforms, and keep their users’ accounts secure. To make this possible, we’ve built model training pipelines that consist of hundreds of steps in MapReduce and Spark, with complex requirements between them.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Snapp</title>
      <link>/use-cases/snapp/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      <guid>/use-cases/snapp/</guid>
      <description>&lt;h5 id=&#34;what-was-the-problem&#34;&gt;What was the problem?&lt;/h5&gt;&#xA;&lt;p&gt;As the Map team at Snapp, one of the largest and fastest-growing internet companies in the Middle East, we have experienced significant growth over the past couple of years, expanding from a team of 7 to a team of 60. However, with this growth came the realization that some of our crucial tasks were being performed manually. This manual approach consumed valuable time and hindered our ability to execute these tasks efficiently.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
