Apache Airflow Documentation — Airflow Documentation (2024)

Airflow is a platform to programmatically author, schedule, and monitor workflows.

Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.
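To make "executes your tasks while following the specified dependencies" concrete, here is a minimal sketch in plain Python. It does not use Airflow's own API; it relies on the standard library's `graphlib` module (Python 3.9+) as a stand-in, and the task names (`extract`, `transform`, `load`, `report`) and the `run_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a task, each value is the set of
# tasks that must finish before it may start.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"transform"},
}


def run_order(deps):
    """Return one valid execution order that respects every dependency.

    TopologicalSorter raises graphlib.CycleError if the graph contains a
    cycle, which is why the graph has to be acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

In this sketch `load` and `report` depend only on `transform`, so either may come first in the returned order. A real scheduler could hand such independent tasks to different workers at the same time.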

When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative.

Beyond the Horizon

Airflow is not a data streaming solution. Tasks do not move data from one to the other (though tasks can exchange metadata!). Airflow is not in the Spark Streaming or Storm space; it is more comparable to Oozie or Azkaban.
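The distinction between moving data and exchanging metadata can be illustrated with a small plain-Python sketch. This is not Airflow's mechanism; the `metadata` dictionary, the task functions, and the file name are all hypothetical stand-ins. The upstream task writes its bulk output to storage and shares only the location, and the downstream task reads the data from that location.

```python
import json
import os
import tempfile

# Hypothetical shared metadata store (a stand-in, not Airflow's API).
metadata = {}


def extract(workdir):
    """Write the bulk data to storage and record only where it lives."""
    path = os.path.join(workdir, "rows.json")
    with open(path, "w") as f:
        json.dump([{"id": 1}, {"id": 2}, {"id": 3}], f)
    metadata["rows_path"] = path  # small metadata, not the data itself


def count_rows():
    """Look up the location in the metadata, then read the data from storage."""
    with open(metadata["rows_path"]) as f:
        return len(json.load(f))


with tempfile.TemporaryDirectory() as workdir:
    extract(workdir)
    row_count = count_rows()

print(row_count)
```

The only value passed between the two functions is a short path string, so the amount of metadata stays small however large the file grows.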

Workflows are expected to be mostly static or slowly changing. You can think of the structure of the tasks in your workflow as slightly more dynamic than a database structure would be. Airflow workflows are expected to look similar from one run to the next, which allows for clarity around the unit of work and continuity.
