CV Mantra

Live Online Apache Spark Course for Data Science

Original price was: ₹45,000.00. Current price is: ₹30,000.00.

Duration: 6 Weeks | Total Time: 36 Hours

Format: Live online sessions on Google Meet or MS Teams, led by an industry expert, with hands-on coding, mini-projects, and a capstone project.
Target Audience: College Students, Professionals in Finance, HR, Marketing, Operations, Analysts, and Entrepreneurs
Tools Required: Laptop with an internet connection
Trainer: Industry professional with hands-on expertise

Apache Spark for Data Science – Live Course Module



Week 1: Introduction & Foundations (6 hrs)

  1. Introduction to Big Data & Spark (2 hrs)

    • Evolution from Hadoop to Spark

    • Why Spark for Data Science?

    • Spark ecosystem overview (Spark Core, SQL, MLlib, Streaming, GraphX)

    • Real-world use cases

  2. Spark Architecture & Setup (2 hrs)

    • Spark architecture (Driver, Executors, Cluster Manager)

    • RDD vs DataFrames vs Datasets

    • Installing & running Spark (Standalone, YARN, Databricks, Google Colab, Jupyter)

  3. Hands-on with Spark Shell & PySpark (2 hrs)

    • Spark Shell (Scala/Python) basics

    • Using PySpark with Jupyter Notebook

    • Simple Spark applications


Week 2: Spark Core – RDD Operations (6 hrs)

  1. RDD Basics (2 hrs)

    • Creating RDDs

    • Transformations & Actions

    • Lazy evaluation & DAG

  2. Advanced RDD Operations (2 hrs)

    • map, flatMap, filter, reduceByKey, groupByKey

    • Joins & Aggregations

    • Persisting & caching RDDs

  3. Hands-on RDD Case Study (2 hrs)

    • Word Count Example

    • Log File Analysis

    • Performance tuning with RDDs


Week 3: DataFrames & Spark SQL (6 hrs)

  1. Introduction to DataFrames (2 hrs)

    • Creating DataFrames from files (CSV, JSON, Parquet)

    • Schema & Data types

    • DataFrame operations (select, filter, groupBy, join, agg)

  2. Spark SQL (2 hrs)

    • Registering DataFrames as SQL tables

    • Writing SQL queries in Spark

    • Integration with BI tools

  3. Hands-on Data Analysis with Spark SQL (2 hrs)

    • Case study: Analyzing a large dataset with DataFrames & SQL

    • Optimization techniques (Catalyst Optimizer, Tungsten)


Week 4: Machine Learning with MLlib (6 hrs)

  1. Introduction to Spark MLlib (2 hrs)

    • Machine Learning in Spark

    • MLlib vs Scikit-learn

    • Pipelines & Transformers

  2. Supervised Learning with MLlib (2 hrs)

    • Regression & Classification (Linear Regression, Logistic Regression, Decision Trees, Random Forest)

    • Model training & evaluation

  3. Unsupervised Learning with MLlib (2 hrs)

    • Clustering (K-Means, Gaussian Mixture)

    • Dimensionality Reduction (PCA)

    • Hands-on project with MLlib


Week 5: Spark Streaming & Real-Time Analytics (6 hrs)

  1. Introduction to Spark Streaming (2 hrs)

    • Batch vs Streaming

    • DStreams & Structured Streaming basics

    • Streaming architecture

  2. Structured Streaming Operations (2 hrs)

    • Reading real-time data (Kafka, Socket, Files)

    • Window operations

    • Aggregations & checkpoints

  3. Hands-on Streaming Project (2 hrs)

    • Real-time Twitter sentiment analysis / Log monitoring

    • Building a streaming pipeline


Week 6: Capstone Project & Deployment (6 hrs)

  1. GraphX & Advanced Topics (2 hrs)

    • Basics of GraphX

    • Graph analysis use cases in Data Science

  2. Capstone Project Work (2 hrs)

    • End-to-end project (e.g., Movie Recommendation, Customer Churn Prediction, Real-time Fraud Detection)

    • Data ingestion → Processing → ML pipeline → Results

  3. Deployment & Wrap-up (2 hrs)

    • Deploying Spark jobs (Standalone / Cluster)

    • Integrating with Hadoop, AWS EMR, Databricks

    • Best practices & course recap


Outcome:
By the end of this course, learners will be able to:

  • Build and optimize Spark applications

  • Perform large-scale data analysis using Spark SQL

  • Train ML models using Spark MLlib

  • Work with streaming data in real time

  • Deploy Spark solutions in production
