
Live Online Apache Airflow Course for Data Engineering

Original price: ₹45,000.00. Current price: ₹25,000.00.

Duration: 4 Weeks | Total Time: 40 Hours

Format: Live online sessions on Google Meet or MS Teams, led by an industry expert, with hands-on coding, mini-projects, and a capstone project.
Target Audience: College students; professionals in finance, HR, marketing, and operations; analysts; and entrepreneurs
Tools Required: Laptop with an internet connection
Trainer: Industry professional with hands-on expertise

Live Course Module: Apache Airflow Course for Data Engineering

Total Duration: 40 Hours (4 Weeks)


Week 1: Introduction to Apache Airflow & Core Concepts

Duration: 8 hours (4 sessions × 2 hrs)

Topics:

  1. Introduction to Workflow Orchestration (2 hrs)

    • What is orchestration?

    • Role of Airflow in Data Engineering

    • Airflow vs Luigi vs Prefect comparison

    • Airflow architecture: Scheduler, Executor, Worker, Web Server

  2. Airflow Installation & Environment Setup (2 hrs)

    • Installing Airflow using pip and Docker

    • Understanding Airflow components

    • Navigating Airflow UI (DAGs, Logs, Tasks, Graphs)

  3. Understanding DAGs & Tasks (2 hrs)

    • Creating a simple DAG in Python

    • Operators: PythonOperator, BashOperator, EmptyOperator (formerly DummyOperator)

    • Dependencies: set_upstream(), set_downstream(), >> and <<

  4. Mini Project + Q&A (2 hrs)

    • Build a simple ETL DAG to extract and transform CSV data (sketched below)

    • Schedule and run through the Airflow UI
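
A minimal sketch of this week's mini-project, assuming Airflow 2.x; the DAG and task names are illustrative, not course materials:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator

    def transform_csv():
        # Placeholder transform step; a real task would read and clean a CSV here.
        print("transforming CSV data")

    with DAG(
        dag_id="simple_etl_demo",            # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = BashOperator(
            task_id="extract",
            bash_command="echo 'pretend this downloads a CSV'",
        )
        transform = PythonOperator(
            task_id="transform",
            python_callable=transform_csv,
        )
        extract >> transform                 # same edge as extract.set_downstream(transform)

The >> on the last line expresses the same dependency as set_upstream()/set_downstream(); most modern DAGs use the operator form.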


Week 2: Building & Managing Complex DAGs

Duration: 10 hours (5 sessions × 2 hrs)

Topics:

  1. Advanced DAG Design (2 hrs)

    • DAG parameters, default_args, retries, SLAs

    • Dynamic task generation

    • Branching and SubDAGs

  2. Using Airflow Operators (2 hrs)

    • FileSensor, EmailOperator, SimpleHttpOperator, PostgresOperator

    • Working with external APIs and SQL databases

  3. XComs and Data Sharing (2 hrs)

    • Passing data between tasks

    • Using XComs effectively in data pipelines (see the sketch after this list)

  4. Error Handling & Task Monitoring (2 hrs)

    • Handling task failures and retries

    • Alerting & notifications (Slack/Email integration)

  5. Mini Project + Q&A (2 hrs)

    • Build a multi-stage DAG integrating API extraction + data transformation + DB loading
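
A rough sketch tying together two Week 2 topics, retries via default_args and XCom data passing, again assuming Airflow 2.x; all names are made up for illustration:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    default_args = {
        "retries": 2,                        # re-run a failed task twice
        "retry_delay": timedelta(minutes=5), # wait between attempts
    }

    def push_row_count():
        # A plain return value is pushed to XCom under the key "return_value".
        return 1234

    def pull_row_count(ti):
        # "ti" (the TaskInstance) is injected from the Airflow context.
        count = ti.xcom_pull(task_ids="push_count")
        print(f"upstream task reported {count} rows")

    with DAG(
        dag_id="xcom_retry_demo",
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,              # trigger manually from the UI
        default_args=default_args,
    ) as dag:
        push = PythonOperator(task_id="push_count", python_callable=push_row_count)
        pull = PythonOperator(task_id="pull_count", python_callable=pull_row_count)
        push >> pull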


Week 3: Airflow with Big Data & Cloud Integration

Duration: 10 hours (5 sessions × 2 hrs)

Topics:

  1. Airflow with Apache Spark (2 hrs)

    • Submitting Spark jobs using Airflow

    • Using SparkSubmitOperator for batch data pipelines (see the sketch after this list)

  2. Airflow with Hadoop & HDFS (2 hrs)

    • Managing data in HDFS

    • Using Airflow for daily ingestion & transformation jobs

  3. Airflow with AWS / GCP / Azure (2 hrs)

    • AWS S3, Redshift, BigQuery, and Azure Blob Storage integrations

    • Using Airflow Hooks and Connections

  4. Airflow with Kafka & Streaming Data (2 hrs)

    • Triggering workflows from Kafka topics

    • Simulating a near-real-time pipeline with frequent batch runs

  5. Mini Project + Q&A (2 hrs)

    • Build a batch pipeline integrating Airflow + Spark + S3
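
An illustrative sketch of the Spark piece of this week's mini-project, using SparkSubmitOperator from the apache-airflow-providers-apache-spark package; the script path and connection id are assumptions:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    with DAG(
        dag_id="spark_batch_demo",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        run_spark_job = SparkSubmitOperator(
            task_id="run_spark_job",
            application="/opt/jobs/transform.py",  # hypothetical PySpark script
            conn_id="spark_default",               # Spark connection configured in the UI
            verbose=True,
        )

In the full mini-project, this task would sit between an S3 download step and an upload step, wired with >> as in Week 1.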


Week 4: Airflow in Production, Scaling & Capstone Project

Duration: 12 hours (6 sessions × 2 hrs)

Topics:

  1. Scheduling, Triggers, and Backfills (2 hrs)

    • Airflow scheduling and cron expressions

    • Manual triggers and backfilling DAG runs (see the sketch after this week's outline)

  2. Airflow in Production Environments (2 hrs)

    • Airflow Executors: Sequential, Local, Celery, Kubernetes

    • Configuring Airflow for scalability and high availability

  3. CI/CD and Version Control (2 hrs)

    • DAG versioning using Git

    • Deploying Airflow pipelines through CI/CD tools (GitHub Actions, Jenkins)

  4. Monitoring, Logging & Security (2 hrs)

    • Airflow metrics and logging; Prometheus and Grafana integration

    • Authentication & Role-Based Access Control (RBAC)

  5. Capstone Project Development (2 hrs)

    • Design and build an end-to-end data pipeline using Airflow and Cloud Storage

  6. Capstone Presentation & Feedback (2 hrs)

    • Present final DAG and pipeline workflow

    • Instructor feedback and best practices discussion
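
A small sketch of topic 1 above, scheduling and backfills, assuming Airflow 2.3+ (for EmptyOperator); the DAG name and cron expression are illustrative:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.empty import EmptyOperator

    with DAG(
        dag_id="nightly_report",
        start_date=datetime(2024, 1, 1),
        schedule_interval="30 2 * * *",  # cron: every day at 02:30
        catchup=True,                    # create runs for all past intervals
    ) as dag:
        EmptyOperator(task_id="placeholder")

    # Past date ranges can also be backfilled explicitly from the CLI:
    #   airflow dags backfill -s 2024-01-01 -e 2024-01-07 nightly_report

With catchup=True, the scheduler creates a run for every interval between start_date and now; the CLI command does the same for an explicit date range.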


🧩 Capstone Project Example

Project Title: Automated Data Pipeline for E-Commerce Analytics
Goal:
Extract transactional data from APIs → Load into AWS S3 → Transform using Spark → Load into Redshift → Orchestrate with Airflow
Tech Stack: Airflow, Python, Spark, AWS S3, Redshift
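
One possible skeleton for this capstone, under heavy assumptions: the extraction logic, S3 bucket, PySpark script, and Redshift table are all placeholders, and the Amazon and Spark provider packages are assumed to be installed:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    def extract_orders():
        # Placeholder: call the transactions API and write raw JSON to S3.
        pass

    with DAG(
        dag_id="ecommerce_analytics",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
        transform = SparkSubmitOperator(
            task_id="transform_orders",
            application="/opt/jobs/clean_orders.py",   # hypothetical PySpark job
            conn_id="spark_default",
        )
        load = S3ToRedshiftOperator(
            task_id="load_to_redshift",
            s3_bucket="analytics-bucket",              # placeholder bucket
            s3_key="orders/clean/",
            schema="public",
            table="orders",
            copy_options=["FORMAT AS PARQUET"],
            redshift_conn_id="redshift_default",
            aws_conn_id="aws_default",
        )
        extract >> transform >> load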
