Duration: ~4.5 Months (16–20 Weeks)
Total Hours: ~90 hours
Format: 4–6 hours per week, including lectures, hands-on labs, assignments, and project work
Module 1: Foundations of Python Programming (15 Hours)
Objective: Build strong Python basics tailored for data tasks.
Topics:
- Introduction to Python & IDEs (PyCharm, Jupyter, VS Code, Colab)
- Variables, Data Types, Operators
- Control Structures: if, for, while
- Functions & Scope
- Data Structures: Lists, Tuples, Dictionaries, Sets
- String Manipulation
- File Handling
- Error Handling & Debugging
- Working with datetime, os, sys
Lab: Mini-project – Expense Tracker using core Python
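One possible starting point for the Expense Tracker lab is sketched below, using only the standard library. The `Expense` record, the CSV column names, and the function names are illustrative assumptions rather than a required design; the sample data is made up.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Expense:
    day: date
    category: str
    amount: float


def load_expenses(csv_file):
    """Read expenses from an open CSV file with columns date, category, amount."""
    expenses = []
    for row in csv.DictReader(csv_file):
        try:
            expenses.append(
                Expense(
                    day=date.fromisoformat(row["date"]),
                    category=row["category"].strip().lower(),
                    amount=float(row["amount"]),
                )
            )
        except (KeyError, ValueError) as err:
            # Skip malformed rows instead of aborting the whole import.
            print(f"Skipping bad row {row!r}: {err}")
    return expenses


def total_by_category(expenses):
    """Sum the amounts per (lower-cased) category."""
    totals = defaultdict(float)
    for expense in expenses:
        totals[expense.category] += expense.amount
    return dict(totals)


# Made-up sample data standing in for a real file on disk.
SAMPLE = """date,category,amount
2024-01-03,Food,12.50
2024-01-04,Transport,3.20
2024-01-05,food,7.50
2024-01-06,Rent,not-a-number
"""

expenses = load_expenses(io.StringIO(SAMPLE))
totals = total_by_category(expenses)
print(totals)
```

The sketch touches most of the module's topics in one place: data structures, string cleanup, file-style input, `datetime`, and error handling. Swapping `io.StringIO(SAMPLE)` for `open("expenses.csv", newline="")` would read from a real file (the filename is hypothetical).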
Module 2: SQL for Data Science (10 Hours)
Objective: Learn how to query, filter, and join datasets using SQL.
Topics:
- Introduction to Databases & Relational Models
- Basic SQL Queries: SELECT, WHERE, ORDER BY
- Filtering & Pattern Matching (LIKE, IN, BETWEEN)
- Aggregate Functions: COUNT, SUM, AVG, etc.
- GROUP BY & HAVING
- JOINs: INNER, LEFT, RIGHT, FULL
- Subqueries, CTEs, and Window Functions
- Creating and Populating Tables
Lab: SQL Project – Analyzing a mock retail database
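To give a feel for the kind of queries the retail lab might involve, here is a small sketch that runs SQL through Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns, and the rows are all invented for illustration; the actual lab database may look quite different.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES
        (1, 'Asha', 'Pune'), (2, 'Ben', 'Delhi'), (3, 'Chen', 'Pune');
    INSERT INTO orders VALUES
        (1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.0);
    """
)

# LEFT JOIN keeps customers with no orders; COALESCE turns their NULL sum into 0.
per_customer = conn.execute(
    """
    SELECT c.name, COUNT(o.id) AS n_orders, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total DESC, c.name
    """
).fetchall()

# HAVING filters whole groups after aggregation (WHERE filters rows before it).
big_cities = conn.execute(
    """
    SELECT c.city, SUM(o.amount) AS revenue
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.city
    HAVING SUM(o.amount) > 100
    """
).fetchall()

print(per_customer)
print(big_cities)
```

The first query shows why the join type matters: with an INNER JOIN, the customer without orders would disappear from the result entirely.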
Module 3: Statistics & Probability for Data Science (15 Hours)
Objective: Grasp essential statistical methods for data analysis and ML.
Topics:
- Descriptive Statistics: Mean, Median, Mode, Variance, Std Dev
- Probability Theory Basics
- Combinatorics: Permutations & Combinations
- Probability Distributions: Binomial, Normal, Poisson
- Inferential Statistics
  - Confidence Intervals
  - Hypothesis Testing: t-test, chi-square test
- Correlation vs. Causation
- Central Limit Theorem
- ANOVA and Regression Basics
Lab: Exploratory Data Analysis using pandas, matplotlib, and statistical methods
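Before moving to pandas, the descriptive statistics and confidence interval topics can be checked by hand with the standard library's `statistics` module. The sketch below uses a made-up sample and, as a simplifying assumption, a normal approximation for the interval; for a sample this small a t-distribution would usually be the better choice.

```python
import math
import statistics
from statistics import NormalDist

# Made-up sample: daily order counts over ten days.
sample = [12, 15, 11, 14, 13, 15, 18, 12, 16, 14]

mean = statistics.mean(sample)
median = statistics.median(sample)
modes = statistics.multimode(sample)   # every value tied for most frequent
stdev = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)

# 95% confidence interval for the mean, using a normal approximation.
z = NormalDist().inv_cdf(0.975)        # roughly 1.96
margin = z * stdev / math.sqrt(len(sample))
ci = (mean - margin, mean + margin)

print(f"mean={mean}, median={median}, modes={modes}, stdev={stdev:.3f}")
print(f"95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The margin shrinks with the square root of the sample size, which is the practical consequence of the Central Limit Theorem listed above.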
Module 4: Advanced Python for Data Science (15 Hours)
Objective: Master libraries, tools, and efficient coding practices.
Topics:
- Working with NumPy: arrays, slicing, broadcasting
- Data Analysis with pandas: Series, DataFrames, indexing, groupby
- Data Visualization:
  - Matplotlib and Seaborn
  - Plot types, customization
- List Comprehensions & Lambda Functions
- Iterators, Generators
- Working with APIs & JSON
- Introduction to Web Scraping (with requests & BeautifulSoup)
- Virtual Environments & Packages (venv, pip, conda)
Lab: Real-world EDA Project – Cleaning, transforming, and visualizing a dataset
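The language-level topics in this module (comprehensions, lambdas, generators, JSON) can be seen working together in a small cleaning pipeline. This is a plain-Python sketch rather than the pandas workflow the lab itself would use, and the JSON payload and field names are invented.

```python
import json

# Invented JSON payload, shaped like a typical API response.
RAW = """
[
  {"city": " pune ", "temp_c": "31.5"},
  {"city": "Delhi", "temp_c": null},
  {"city": "MUMBAI", "temp_c": "29"},
  {"city": "", "temp_c": "27.0"}
]
"""


def clean_records(records):
    """Generator: yield cleaned records one at a time, dropping incomplete rows."""
    for rec in records:
        city = (rec.get("city") or "").strip().title()
        temp = rec.get("temp_c")
        if not city or temp is None:
            continue  # missing city or missing temperature
        yield {"city": city, "temp_c": float(temp)}


records = json.loads(RAW)
cleaned = list(clean_records(records))

# List comprehension: derive a new value from each cleaned record.
temps_f = [round(r["temp_c"] * 9 / 5 + 32, 1) for r in cleaned]

# Lambda as a sort key: hottest city first.
hottest_first = sorted(cleaned, key=lambda r: r["temp_c"], reverse=True)

print(cleaned)
print(temps_f)
```

Because `clean_records` is a generator, it processes one record at a time and never holds a second full copy of the input, which matters once the data no longer fits comfortably in memory.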
Module 5: Machine Learning (30 Hours)
Objective: Learn core ML concepts and apply them on real datasets.
Part 1: Introduction & Supervised Learning
- ML Workflow Overview
- Types of ML: Supervised, Unsupervised, Reinforcement
- Data Preprocessing (scaling, encoding, missing values)
- Splitting Data & Evaluation Metrics
- Regression
  - Linear Regression
  - Ridge, Lasso
- Classification
  - Logistic Regression
  - Decision Trees
  - Random Forests
  - K-Nearest Neighbors
  - Support Vector Machines (SVM)
Mini Project: House Price Prediction using Regression
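As a rough illustration of the workflow behind the house price mini project (split the data, fit a model, evaluate on held-out rows), here is a one-feature least-squares sketch in plain Python. The data is synthetic and the single `area` feature is an assumption; a real project would typically use several features and a library such as scikit-learn.

```python
import math
import random

# Synthetic data: price = 50 + 0.8 * area + noise. All numbers are invented.
rng = random.Random(42)
areas = [rng.uniform(40, 200) for _ in range(200)]
prices = [50 + 0.8 * a + rng.gauss(0, 5) for a in areas]

# Shuffle the row indices and hold out 20% of the rows for testing.
indices = list(range(len(areas)))
rng.shuffle(indices)
split = int(0.8 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Fit on the training rows only, then score on rows the model has not seen.
slope, intercept = fit_simple_linear(
    [areas[i] for i in train_idx], [prices[i] for i in train_idx]
)
test_pred = [intercept + slope * areas[i] for i in test_idx]
test_rmse = rmse([prices[i] for i in test_idx], test_pred)

print(f"slope={slope:.3f} intercept={intercept:.2f} test RMSE={test_rmse:.2f}")
```

Evaluating on the held-out rows rather than the training rows is the point of the split: it estimates how the model behaves on data it was not fitted to.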
Part 2: Unsupervised Learning & Model Optimization
- Clustering
  - K-Means
  - Hierarchical Clustering
- Dimensionality Reduction
  - PCA
- Introduction to Recommender Systems
- Feature Engineering Techniques
- Model Validation & Cross-Validation
- Hyperparameter Tuning (Grid Search, Random Search)
Mini Project: Customer Segmentation using K-Means
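To make the K-Means idea behind the segmentation mini project concrete, the sketch below implements the basic assign-then-update loop (Lloyd's algorithm) in plain Python. The two customer features and the synthetic groups are assumptions for illustration; in practice one would normally scale the features first and use a library implementation.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Basic Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Synthetic customers: (monthly spend, visits per month). All values are invented.
data_rng = random.Random(1)
low = [(data_rng.gauss(20, 3), data_rng.gauss(2, 0.5)) for _ in range(50)]
high = [(data_rng.gauss(80, 3), data_rng.gauss(10, 0.5)) for _ in range(50)]
customers = low + high

centroids, labels = kmeans(customers, k=2)
print(centroids)
```

The cluster numbers themselves are arbitrary (which group ends up as 0 or 1 depends on the random start), so segments should be interpreted from the centroid values, not from the label.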
Part 3: Capstone Project
- Guided end-to-end Data Science Project:
  - Problem Statement
  - Data Collection
  - EDA & Feature Engineering
  - Model Building & Evaluation
  - Business Insights
- Presentation Preparation
Examples:
- Loan Default Prediction
- Fraud Detection
- Movie Recommendation System