Apache Kafka for Data Science – Live Course Module
Total Duration: 2 Weeks (15 Hours)
Day 1: Introduction & Setup (2 Hours)
- Role of Real-Time Data in Data Science (10 min)
- Kafka Basics: Producers, Consumers, Brokers, Topics (30 min)
- Kafka Architecture Overview (20 min)
- Hands-on: Install Kafka & Run First Producer/Consumer (60 min)
Day 2: Producers & Consumers (2 Hours)
- Producer API & Partitioning Strategies (30 min)
- Consumer API, Groups & Offsets (30 min)
- Hands-on: Write Python Producer & Consumer (60 min)
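To make the Day 2 ideas of key-based partitioning and consumer offsets concrete before the lab, here is a minimal in-memory sketch. It does not talk to a broker or use any Kafka client library. The partition count, the function names, and the use of CRC32 are assumptions made for illustration; Kafka's Java client hashes keys with murmur2 by default, so real partition numbers will differ.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for an illustrative topic


def choose_partition(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition index.

    CRC32 is a stand-in hash for this sketch, not the hash used by
    Kafka's Java client (murmur2).
    """
    return zlib.crc32(key) % num_partitions


def send_all(records):
    """Append (key, value) records to in-memory partition logs."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[choose_partition(key)].append((key, value))
    return partitions


def poll(log, committed_offset, max_records=2):
    """Return the next batch after the committed offset and the new offset."""
    batch = log[committed_offset:committed_offset + max_records]
    return batch, committed_offset + len(batch)


if __name__ == "__main__":
    events = [(b"sensor-a", 1), (b"sensor-b", 2), (b"sensor-a", 3)]
    for partition, log in sorted(send_all(events).items()):
        print(partition, log)
```

Records that share a key always land in the same partition and keep their relative order there, which is the property the partitioning topic builds on. The `poll` helper shows why a consumer that restarts from its committed offset resumes where it left off.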
Day 3: Data Pipelines with Kafka (2 Hours)
- Kafka Connect for ETL & Database Integration (30 min)
- Integration with Spark Streaming / Flink (30 min)
- Hands-on: Stream Data into Kafka & Process with Spark (60 min)
Day 4: Kafka Streams API (2 Hours)
- Kafka Streams Concepts: KStreams, KTables, Windowing (30 min)
- Real-Time Processing for ML (30 min)
- Hands-on: Build a Streaming App (e.g., Word Count / Anomaly Detection) (60 min)
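The windowing idea behind the Day 4 word-count lab can be previewed in plain Python. The sketch below groups events into tumbling (fixed-size, non-overlapping) windows by event timestamp and counts words per window. It is not the Kafka Streams API, which is a Java library; the window size, function names, and sample events are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

WINDOW_SIZE_MS = 60_000  # hypothetical one-minute tumbling window


def window_start(timestamp_ms: int, size_ms: int = WINDOW_SIZE_MS) -> int:
    """Align an event timestamp to the start of its tumbling window."""
    return timestamp_ms - (timestamp_ms % size_ms)


def windowed_word_count(events, size_ms: int = WINDOW_SIZE_MS):
    """Count lowercase words per window from (timestamp_ms, text) events."""
    counts = defaultdict(Counter)
    for timestamp_ms, text in events:
        counts[window_start(timestamp_ms, size_ms)].update(text.lower().split())
    return counts


if __name__ == "__main__":
    sample = [
        (5_000, "kafka streams"),
        (42_000, "Kafka tables"),
        (61_000, "kafka windowing"),
    ]
    for start, counter in sorted(windowed_word_count(sample).items()):
        print(start, dict(counter))
```

Each window's running count behaves like a small table keyed by word, which is a rough way to think about how a stream of events turns into continuously updated table state.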
Day 5: Capstone Project & Wrap-Up (2 Hours)
- Build a Mini Real-Time Data Science Pipeline (90 min)
  - Ingest live data (stock prices / IoT / social media)
  - Process with Kafka Streams/Spark
  - Apply ML model (sentiment / anomaly detection)
  - Visualize results
- Q&A + Best Practices (30 min)
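For the "Apply ML model" step of the capstone, one possible starting point is a rolling z-score detector that flags values far from the recent mean. This is a simple statistical baseline and only a sketch, not the course's prescribed model. The class name, window length, and threshold are hypothetical, and the input is a plain Python list standing in for values consumed from a topic.

```python
from collections import deque
from statistics import mean, pstdev


class RollingZScoreDetector:
    """Flag values that deviate strongly from a sliding window of recent values."""

    def __init__(self, window: int = 5, threshold: float = 3.0):
        self.values = deque(maxlen=window)  # most recent observations
        self.threshold = threshold          # z-score cutoff (assumed value)

    def update(self, value: float) -> bool:
        """Score the value against the current window, then add it to the window."""
        is_anomaly = False
        # Only score once the window is full, so early values are never flagged.
        if len(self.values) == self.values.maxlen:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        self.values.append(value)
        return is_anomaly


if __name__ == "__main__":
    detector = RollingZScoreDetector(window=5, threshold=3.0)
    for reading in [10, 11, 10, 9, 10, 50, 10]:
        print(reading, detector.update(reading))
```

Because the detector keeps only a fixed-size window, its memory use stays constant however long the stream runs. Note that a flagged value still enters the window, so a large spike temporarily widens the spread and makes the next few readings harder to flag.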
✅ Delivery Mode: Live sessions with hands-on labs
✅ Outcome: Build end-to-end real-time data pipelines using Kafka for data science