100 Days of Data Science Code
The goal of this 100 Days of Code challenge is to learn Data Science and Machine Learning from scratch by covering:
- Learning Fundamentals of Python
- Python Libraries for Data Science
- Data Manipulation and Preprocessing
- Machine Learning Basics
- Advanced Machine Learning Techniques
- Deep Learning and Neural Networks
- Model Evaluation and Deployment
- Data Science Project and Wrap-Up
Articles Published on LinkedIn
Calendar Progress
July 2023
August 2023
September 2023
October 2023
Sun | Mon | Tues | Wed | Thurs | Fri | Sat |
1 ✅ | 2 ✅ | 3 | 4 | 5 | 6 | 7 |
8 | 9 | 10 | 11 | 12 | 13 | 14 |
15 | 16 | 17 | 18 | 19 | 20 | 21 |
22 | 23 | 24 | 25 | 26 | 27 | 28 |
29 | 30 | 31 | - | - | - | - |
100 Days of Data Science Code Day-to-Day Progress
DAY 1 (18 July 2023):
Goal: Python Basics
- Control flow statements like if-else conditions and loops.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 2 (19 July 2023):
Goal: Functions and Modules
- Concept of modules.
- How to import and use built-in modules as well as create your own.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 3 (20 July 2023):
Goal: Data Structures
- Python’s built-in data structures such as lists, tuples, dictionaries, and sets.
- Also, learn about indexing, slicing, and manipulating these data structures.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 4 (21 July 2023):
Goal: File Handling and Exception Handling
- Read from and write to files in Python.
- Learn about exception handling and how to handle errors using try-except blocks.
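
As a minimal sketch of the two topics above (not the code from the linked repository), the snippet below writes a file, reads it back, and uses a try-except block to handle a missing file. The helper names `write_lines` and `read_lines` and the file names are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write each string in `lines` to `path`, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


def read_lines(path):
    """Return the lines of `path`, or an empty list if the file is missing."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            return [line.rstrip("\n") for line in handle]
    except FileNotFoundError:
        # Handle the expected failure instead of letting the program crash.
        return []


# A temporary directory keeps the sketch from leaving files behind.
with tempfile.TemporaryDirectory() as folder:
    notes_path = os.path.join(folder, "notes.txt")  # hypothetical file name
    write_lines(notes_path, ["day 4", "file handling"])
    saved = read_lines(notes_path)
    missing = read_lines(os.path.join(folder, "absent.txt"))

print(saved)
print(missing)
```

The `with` statement closes the file even when an exception is raised, which is why no explicit `close()` call appears.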
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 5 (22 July 2023):
Goal: Python Classes and Objects
- Class Declaration
- Object Instantiation
- Constructor and Destructor
- Built-in Class Attributes and Functions
- Instance, Class and Static Variables and Functions.
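
The terms in this list can be seen side by side in one small class. This is an illustrative sketch only; the `Account` class, its attributes, and the sample values are assumptions and are not taken from the linked source code.

```python
class Account:
    """Hypothetical class used only to illustrate the terms above."""

    interest_rate = 0.04  # class variable, shared by all instances

    def __init__(self, owner, balance=0.0):
        # Constructor: runs at instantiation and sets instance variables.
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        # Instance method: works on one object's state through `self`.
        self.balance += amount
        return self.balance

    @classmethod
    def set_rate(cls, rate):
        # Class method: changes state shared by every instance.
        cls.interest_rate = rate

    @staticmethod
    def is_valid_amount(amount):
        # Static method: a helper that needs neither `self` nor `cls`.
        return amount > 0


account = Account("Asha", 100.0)  # object instantiation
account.deposit(50.0)
print(Account.__name__, account.balance)  # __name__ is a built-in class attribute
```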
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 6 (23 July 2023):
Goal: OOP Concepts and Their Implementation in Python
- Data Abstraction
- Encapsulation
- Inheritance
- Polymorphism.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 7 (24 July 2023):
Goal: Advanced Python Concepts
- Higher Order Functions
- List Comprehensions
- Regular Expressions (RegEx)
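
A short sketch of the three topics above, using only the standard library. The sample list and sentence are made up for illustration.

```python
import re
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# Higher-order functions take other functions as arguments.
squares = list(map(lambda n: n * n, numbers))
evens = list(filter(lambda n: n % 2 == 0, numbers))
total = reduce(lambda acc, n: acc + n, numbers, 0)

# A list comprehension expresses the same map-and-filter in one expression.
even_squares = [n * n for n in numbers if n % 2 == 0]

# A regular expression that pulls ISO-style dates out of free text.
text = "Started on 2023-07-18, reached day 7 on 2023-07-24."
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

print(squares, evens, total, even_squares, dates)
```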
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 8 (25 July 2023):
Goal: Python Connectivity with MySQL Database
- Setting Up MySQL Connection
- Executing SQL Queries.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 9 (26 July 2023):
Goal: Day 1 of Bank Management System
- Database Setup
- Python Environment Setup
- Database Connectivity
- Create Basic Classes
- Customer Management.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 10 (27 July 2023):
Goal: Day 2 of Bank Management System
- Account Management (Create Account, List Account Details)
- Basic Error Handling (Apply Validations on Input Values)
- Testing and Debugging (Checking Input Value Validations).
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 11 (28 July 2023):
Goal: Final Day of Project (Transfer Operations and Final Testing)
- Transfer Operation
- Final Testing and Documentation
- Clean Up and Deployment.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 12 (29 July 2023):
Goal: NumPy Basics and Array Manipulation
- Introduction to NumPy
- Installing NumPy
- Creating NumPy arrays
- Array indexing and slicing
- Array reshaping and resizing
- Stacking and splitting arrays.
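
The array operations listed above can be sketched in a few lines, assuming NumPy is installed. The array contents are arbitrary and only serve to make the shapes easy to follow.

```python
import numpy as np

# Create a 1-D array and reshape it into 3 rows by 4 columns.
flat = np.arange(12)
grid = flat.reshape(3, 4)

# Indexing and slicing: one element, one row, and a 2x2 sub-block.
element = grid[1, 2]
first_row = grid[0]
block = grid[:2, :2]

# Stacking joins arrays; splitting cuts one array into equal parts.
stacked = np.vstack([grid, grid])   # shape (6, 4)
left, right = np.hsplit(grid, 2)    # two arrays of shape (3, 2)

print(element, first_row, stacked.shape, left.shape)
```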
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 13 (30 July 2023):
Goal: Mathematical Operations with NumPy
- Element-wise Operations
- Aggregation Functions
- Linear Algebra with NumPy.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 14 (31 July 2023):
Goal: Statistics Functions with NumPy
- Descriptive statistics
- Random number generation
- Sorting and searching arrays
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 15 (1 Aug. 2023):
Goal: Introduction to Pandas and Data Structures in Pandas
- Introduction to Pandas
- Install Pandas
- Types of Data Structures : Series, DataFrames
- Importing and Exporting DataFrames
- DataFrame Functions
- Accessing DataFrames : Indexing, Slicing, loc[], iloc[].
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 16 (2 Aug. 2023):
Goal: Data Manipulation and Data Aggregation using Pandas
- Advanced Indexing and Selection - (Label-based indexing, boolean indexing, and advanced slicing)
- Combining DataFrames - (Concatenation, merging, and joining techniques)
- Data Manipulation
- Advanced Data Manipulation - (reshaping data, pivoting, and melting)
- Data Aggregation and Grouping - (groupby() and other aggregation Functions)
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 17 (3 Aug. 2023):
Goal: Data Cleaning
- Basic Data Cleaning and Pre-Processing:
- Removing Duplicates
- Fixing Wrong Data
- Cleaning Data of Wrong Format
- Cleaning Empty Cells
- dropna(), fillna()
- drop_duplicates()
- Data Transformation - ( apply() and map() )
- Working with Text Data - Functions of str attribute
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 18 (4 Aug. 2023):
Goal: Feature Engineering and Time Series Analysis
- Feature Engineering:
- Data Normalization
- Data Scaling
- Data Standardization
- Time Series Analysis and Resampling:
- Working with datetime data
- Date offsets
- Resampling time series data
- Datetime index
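
For the feature-engineering half of this day, the sketch below shows min-max scaling and standardization in plain Python. The function names and the `ages` column are hypothetical, and both functions assume the column is not constant (otherwise they would divide by zero).

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = [20, 30, 40, 50, 60]  # made-up sample column
scaled = min_max_scale(ages)
z_scores = standardize(ages)

print(scaled)
print(z_scores)
```

Min-max scaling preserves the shape of the distribution but is sensitive to outliers, while standardization centers the data and is the usual choice before distance-based or gradient-based models.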
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 19 (5 Aug. 2023):
Goal: Matplotlib Introduction and Line Plots
- Matplotlib:
- Installation of Matplotlib library
- Import Matplotlib library
- Matplotlib Pyplot:
- Plotting x and y points
- Plotting without line
- Matplotlib Markers (Types, Color, Size)
- Matplotlib Line (LineStyle, Line colors, line width)
- Single Plot with multiple lines
- Matplotlib Labels and Title (Create Label, Create Title, Set font properties to Title and Label, Title Position)
- Adding Grid Lines (Line Properties of grid)
- Matplotlib Bars:
- Vertical Bars
- Horizontal Bars
- Bar colors
- Bar width
- Bar height
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 20 (6 Aug. 2023):
Goal: Matplotlib Scatter Plot and Histogram
- Subplots:
- subplot() function
- Title for each subplot
- Super title of Plot
- Matplotlib Scatter Plot:
- Create Scatter Plots
- Compare Plots
- Color each dot
- ColorMap for dots
- Combine Color, Size and Alpha values
- Matplotlib Histograms:
- Matplotlib Pie Charts:
- Create Pie Chart
- Labels
- Start Angle (startangle)
- Explode
- Shadow
- Colors
- Legend
- Header
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 21 (7 Aug. 2023):
Goal: Seaborn Introduction
- Seaborn:
- Installation of Seaborn
- Import Seaborn library
- Different types of plots:
- Relational Plots
- Categorical Plots
- Distribution Plots
- Regression Plots
- Categorical Plots:
- Bar Plot
- Count Plot
- Box Plot
- Violinplot
- Stripplot
- Swarmplot
- Factorplot (renamed catplot in Seaborn 0.9)
- Distribution Plots:
- Histogram
- Distplot
- Jointplot
- Pairplot
- Rugplot
- KDE Plot
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 22 (8 Aug. 2023):
Goal: Seaborn Visualization Plots - Relational and Regression Plots
- Customizing Seaborn Plots:
- Changing Figure Aesthetics
- Removal of Spines
- Changing the Figure size
- Scaling the plots
- Setting the Style Temporarily
- Color Palette - (Diverging, Sequential, Default color palette)
- Multiple Plots with Seaborn:
- Using Matplotlib - (add_axes(), subplot(), subplot2grid() functions)
- Using Seaborn - (FacetGrid() method, PairGrid() method)
- Relational Plot Types:
- relplot()
- Scatter Plot
- Line Plot
- Regression Plot Types:
- Matrix Plots:
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 23 (9 Aug. 2023):
Goal: Python Fundamentals Notes
- Introduction
- Identifiers:
- Keywords
- Variables and Constants
- Operators in python
- Data types in python
- String data type and operations
- List data type and operations
- Tuple data type and operations
- Set data type and operations
- Dictionary data type and operations
- Control Statements in python:
- Decision making
- Looping statements
- Loop control statements
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 24 (10 Aug. 2023):
Goal: NumPy Fundamentals Notes
- Introduction
- Create arrays in python
- Array creation using NumPy Functions
- zeros
- ones
- arange
- linspace
- eye
- identity
- fromiter
- Accessing array elements
- Random number Generation
- rand()
- random()
- ranf()
- randint()
- randn()
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 25 (11 Aug 2023):
Goal: Pandas Revision
- Introduction - Install, Import
- Data Structures:
- DataFrames
- Importing and Exporting
- Functions - columns, describe(), info(), head(), tail(), isna()
- Accessing DataFrames - loc[], iloc[]
- Basic Data Cleaning:
- Empty Cells
- Wrong Format Data
- Fixing Wrong Data
- Removing Duplicates
- Apply filters
- apply()
- map() - Using Dictionary, Series, Function for mapping
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 26 (12 Aug 2023):
Goal: Introduction to Artificial Intelligence and Machine Learning Fundamentals
- Artificial Intelligence:
- Machine Learning:
- Difference between Artificial Intelligence and Machine Learning
- Applications of Machine Learning
- Limitations of Machine Learning
- Types of Machine Learning
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
- Comparisons between all types
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 27 (13 Aug 2023):
Goal: Understanding Machine Learning Workflow
- 1. Data Preprocessing:
- Data Cleaning
- Feature Selection/Extraction
- Normalization/Scaling
- Encoding Categorical Variables
- Splitting Data
- 2. Model Training:
- Selecting a Model
- Initializing Parameters
- Training Loop
- Gradient Descent (for Optimization)
- Hyperparameter Tuning
- 3. Model Evaluation:
- Metrics
- Cross-Validation
- Confusion Matrix
- ROC and AUC
- Overfitting and Underfitting
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 28 (14 Aug 2023):
Goal: Model Evaluation Techniques in Machine Learning
- Cross-Validation
- Evaluation Metrics:
- Accuracy
- Precision
- Recall
- F1-Score
- Area Under Curve (AUC) and Receiver Operating Characteristic (ROC)
- Confusion Matrix
- Overfitting and Underfitting Detection:
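
The evaluation metrics listed above all derive from the four cells of a binary confusion matrix. The following is a minimal plain-Python sketch; the function names and the sample labels are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up ground truth
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up predictions
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```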
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 29 (15 Aug 2023):
Goal: Diagnosing and Addressing Underfitting and Overfitting
- Underfitting:
- Choosing a more complex model
- Adding more features
- Fine-tuning hyperparameters
- Overfitting:
- Collect more data
- Feature selection
- Cross-validation
- Regularization techniques
- Early stopping
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 30 (16 Aug 2023):
Goal: Simple Linear Regression Implementation
- Linear Regression Introduction
- Simple Linear Regression:
- Assumptions of Simple LR
- Equation of Simple LR
- Applications of Linear Regression
- Working of Linear Regression
- Finding goodness of fit
- Examples of Linear Regression
- Implementation of Simple Linear Regression
- Real-world Application: Salary Prediction
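
The equation y = slope * x + intercept and the goodness-of-fit measure can be sketched with ordinary least squares in plain Python. This is not the notebook's implementation; the function names and the experience/salary figures are invented for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) that minimise the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Goodness of fit: share of the variance in y explained by the line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data: years of experience vs. salary in thousands.
years = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]
slope, intercept = fit_simple_linear_regression(years, salary)
fit_quality = r_squared(years, salary, slope, intercept)
print(slope, intercept, fit_quality)
```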
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 31 (17 Aug 2023):
Goal: Multiple Linear Regression Implementation
- Multiple Linear Regression (MLR):
- Key points of MLR
- Equation of MLR
- Assumptions of MLR
- Implementation of MLR using Python
- Real-world Application: Student Performance Analysis
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 32 (18 Aug 2023):
Goal: Classification in Machine Learning
- Classification
- Types of Learners:
- Lazy Learners: Store the training dataset and defer all work until the test dataset arrives.
- Eager Learners: Build a classification model from the training dataset before receiving the test dataset.
- Types of Classification Algorithms:
- Logistic Regression
- Decision Trees
- Random Forest
- Support Vector Machines (SVM)
- K-Nearest Neighbors (KNN)
- Naive Bayes
- Neural Networks
- Terminologies in Classification:
- Features and Labels
- Training and Testing Data
- Confusion Matrix
- Precision, Recall, F1-Score
- ROC and AUC Curve
- Types of Classification:
- Binary Classification: Two classes (e.g., Yes/No)
- Multiclass Classification: Multiple distinct classes (e.g., Cat/Dog/Horse)
- Model Evaluation Techniques for Classification (used to assess how well a model fits):
- Accuracy
- Precision and Recall
- F1-Score
- ROC Curve and AUC
- Confusion Matrix
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 33 (19 Aug 2023):
Goal: Logistic Regression Implementation
- Logistic Regression:
- Logistic Function (Sigmoid Function)
- Assumptions of Logistic Regression
- Types of Logistic Regression:
- Binary / Binomial
- Multinomial
- Ordinal
- Terminologies involved in Logistic Regression
- Implementation of Logistic Regression
- Difference between Linear Regression and Logistic Regression
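
The logistic (sigmoid) function named above turns a linear score into a probability, which is the main difference from linear regression. Below is a small sketch; `predict_label`, its weights, and the 0.5 threshold are hypothetical choices, and no training step is shown.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval [0, 1]."""
    # Branching on the sign avoids overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_label(features, weights, bias, threshold=0.5):
    """Classify one example using already-known (hypothetical) weights."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(z)
    return (1 if probability >= threshold else 0), probability


label, probability = predict_label([2.0], [1.0], -1.0)
print(label, probability)
```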
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 34 (20 Aug 2023):
Goal: Decision Tree Concepts
- Decision Tree:
- Components of a Decision Tree
- Root Node
- Internal Nodes
- Leaf Nodes
- Attribute Selection Measures(ASM):
- Entropy
- Information Gain
- Gini Index
- How Decision Trees Work
- Advantages of Decision Trees
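
The three attribute selection measures above can be computed directly from class counts. The sketch below is a plain-Python illustration with made-up labels; the function names are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def gini(labels):
    """Gini index: chance of mislabelling a randomly drawn item."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2
                     for count in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes"] * 5 + ["no"] * 5
split = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]
print(entropy(parent), gini(parent), information_gain(parent, split))
```

A decision tree evaluates candidate splits this way and picks the attribute with the highest information gain (or the lowest weighted Gini index).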
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 35 (21 Aug 2023):
Goal: Decision Tree Implementation
- Decision Tree Implementation Setup:
- Data Pre-processing
- Model Training
- Predicting the Results
- Model Evaluation Techniques
- Examples for Decision Tree Implementation:
- IRIS Flower Classification
- Red Wine Quality Prediction
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 36 (22 Aug 2023):
Goal: Ensemble Methods
- Ensemble Methods:
- Bagging
- Boosting
- Stacking
- Advantages of Ensemble Methods
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 37 (23 Aug 2023):
Goal: Gradient Boosting in Machine Learning
- Gradient Boosting in Machine Learning:
- What is Gradient Boosting
- Key Components of Gradient Boosting
- How Gradient Boosting Works
- Benefits of Gradient Boosting
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 38 (24 Aug 2023):
Goal: AdaBoost and XGBoost
- AdaBoost and XGBoost:
- AdaBoost (Adaptive Boosting)
- XGBoost (Extreme Gradient Boosting)
- Advantages of AdaBoost and XGBoost
- Applications
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 39 (25 Aug 2023):
Goal: Random Forests Introduction
- Random Forests:
- What are Random Forests
- Key Components of Random Forests
- How Random Forests Work
- Benefits of Random Forests
- Real-world Applications of Random Forests
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 40 (26 Aug 2023):
Goal: Random Forest Implementation and Hyperparameter Tuning
- Random Forest Implementation:
- Step-by-Step Approach
- IRIS Flower Prediction
- Red Wine Quality Prediction
- Hyperparameter Tuning:
- Unlocking Model Potential
- GridSearchCV
- RandomizedSearchCV
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 41 (27 Aug 2023):
Goal: Decision Tree and Random Forest Example
- Decision Tree in Action
- Enchantment of Random Forests
- Social Media Ads prediction
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 42 (28 Aug 2023):
Goal: Support Vector Machine (SVM) Introduction
- Introduction to SVM
- Terminologies used in SVM
- Advantages of SVM
- Limitations of SVM
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 43 (29 Aug 2023):
Goal: SVM Implementation
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 44 (30 Aug 2023):
Goal: SVM Regression Implementation
- SVM Regression Implementation:
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 45 (31 Aug 2023):
Goal: Introduction to KNN
- KNN Introduction
- Distance Metrics:
- Euclidean Distance
- Manhattan Distance
- Minkowski Distance
- How KNN works
- How to choose value of ‘K’
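
The distance metrics above are special cases of one formula, and KNN itself is a majority vote among the nearest points. This is a minimal sketch with invented data, not an optimized implementation; `minkowski` and `knn_predict` are hypothetical helper names.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def knn_predict(train_points, train_labels, query, k=3, p=2):
    """Label `query` by majority vote among its k nearest training points."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: minkowski(pair[0], query, p),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]  # made-up data
labels = ["A", "A", "A", "B", "B", "B"]

print(minkowski((0, 0), (3, 4), 2))         # Euclidean
print(minkowski((0, 0), (3, 4), 1))         # Manhattan
print(knn_predict(points, labels, (2, 2)))
```

An odd K avoids ties in two-class problems; a small K follows noise, while a large K blurs class boundaries.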
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 46 (1 Sept 2023):
Goal: KNN Implementation
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 47 (2 Sept 2023):
Goal: KNN Hyperparameter Tuning
- KNN Regression:
- KNN Classification:
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 48 (3 Sept 2023):
Goal: ML Fundamentals Revision
- What is AI
- What is ML
- Machine Learning
- Model Evaluation Techniques in ML
- Classification: Accuracy Score, Confusion Matrix, Classification Report
- Regression: Mean Absolute Error, Mean Squared Error, Root Mean Squared Error
- Exploratory Data Analysis (EDA)
- Handling Outliers
- Removing Outliers
- Transforming Values
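
The three regression metrics named in this revision differ only in how they aggregate the residuals. A plain-Python sketch follows, with made-up values and a hypothetical function name.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MAE, MSE and RMSE for paired actual and predicted values."""
    residuals = [actual - predicted
                 for actual, predicted in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    mse = sum(r * r for r in residuals) / len(residuals)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


errors = regression_errors([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 9.0, 9.0])
print(errors)
```

Because MSE squares each residual, a single large error (such as one caused by an outlier) raises it far more than it raises MAE; RMSE brings the result back to the units of the target.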
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 49 (4 Sept 2023):
Goal: 5G Resource Allocation Capstone Project - MLR, SVR and KNN Regression Models
- Resource Allocation in 5G Network Service Project:
- Data Pre-Processing
- Implementation:
- Polynomial Regression
- SVM Regression
- KNN Regression
- Model Evaluation:
- Mean Absolute Error
- Mean Squared Error
- Root Mean Squared Error
- Kaggle Notebook : Link to Notebook
- Comparison of Model Performances (Multiple Bar Charts)
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 50 (5 Sept 2023):
Goal: Capstone Project - Gender Classification - LR, DT, RF, SVM and KNN
- Gender Classification Project:
- Data Pre-Processing
- Implementation:
- Logistic Regression
- Decision Tree
- Random Forest
- SVM Classification
- KNN Classification
- Model Evaluation:
- Accuracy Score
- Confusion Matrix
- Classification Report
- Kaggle Notebook : Link to Notebook
- Comparison of Model Performances (Bar Chart)
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 51 (6 Sept 2023):
Goal: Introduction to Cross-Validation
- Introduction to Cross-Validation
- What is Cross Validation
- Why is Cross Validation Important
- Advantages of Cross Validation
- Limitations of Cross Validation
- Types of Cross-Validation:
- Leave-One-Out Cross-Validation (LOOCV)
- Leave-P-Out Cross Validation (LPOCV)
- K-Fold Cross-Validation
- Stratified K-Fold Cross-Validation
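
To make the K-Fold idea concrete, the sketch below generates the train and test index sets for each fold in plain Python. It assumes contiguous folds with no shuffling and no stratification; `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Setting `k` equal to the number of samples turns this into Leave-One-Out Cross-Validation, since every fold then holds exactly one test index.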
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 52 (7 Sept 2023):
Goal: Cross-Validation Implementation
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 53 (8 Sept 2023):
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 54 (9 Sept 2023):
Goal: Introduction to Dimensionality Reduction
- The Curse of Dimensionality
- The Importance of Dimensionality Reduction
- Dimensionality Reduction Techniques:
- Feature Selection
- Feature Extraction
- Dimension Reduction
- Advantages of Dimensionality Reduction
- Limitations of Dimensionality Reduction
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 55 (10 Sept 2023):
Goal: Introduction to Principal Component Analysis (PCA)
- Some common terms used in PCA algorithm
- Uses of PCA
- Advantages of Principal Component Analysis
- Limitations of Principal Component Analysis
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 56 (11 Sept 2023):
Goal: Steps in PCA (Principal Component Analysis)
- Step 1 : Covariance Matrix Computation
- Step 2 : Compute Eigenvalues and Eigenvectors of Covariance Matrix to Identify Principal Components
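
The two steps above can be sketched with NumPy as follows. The data matrix is made up, and the centering, sorting, and projection lines are added here only to show where the two steps sit in a full PCA; they are assumptions about the surrounding workflow.

```python
import numpy as np

# Made-up 2-D data, one row per observation.
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])

# Centre each feature so the covariance is measured around the mean.
centered = data - data.mean(axis=0)

# Step 1: covariance matrix (rowvar=False treats columns as variables).
cov = np.cov(centered, rowvar=False)

# Step 2: eigen-decomposition; eigh is meant for symmetric matrices.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# eigh returns eigenvalues in ascending order, so put the largest first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
components = eigenvectors[:, order]

# Project the data onto the first principal component.
projected = centered @ components[:, :1]
explained_ratio = eigenvalues / eigenvalues.sum()
print(explained_ratio)
```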
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 57 (12 Sept 2023):
Goal: Solve Example of PCA
- Pre-processed Data
- Calculated Covariance Matrix
- Eigenvalues and Eigenvectors
- Sorted Eigenvalues
- Select Principal Components
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 58 (13 Sept 2023):
Goal: PCA Implementation using Scikit-Learn
- Data Preparation
- Importing Scikit-learn
- Standardization
- PCA Implementation
- Explained Variance
- Dimensionality Reduction
- Visualization
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 59 (14 Sept 2023):
Goal: Introduction to Feature Selection
- What is Feature Selection?
- Why is Feature Selection Necessary?
- Techniques in Feature Selection
- Univariate feature selection
- Feature importance from tree-based models
- Recursive Feature Elimination (RFE)
- L1-based feature selection
- Correlation-based feature selection
- Steps in Feature Selection:
- Data Pre-Processing
- Feature Scoring
- Feature Selection
- Advantages of Feature Selection:
- Improved model performance
- Faster training and prediction
- Enhanced model interpretability
- Reduced risk of overfitting
- Easier visualization of data
- Limitations of Feature Selection:
- It may result in information loss.
- It can be challenging to decide which features to select.
- Some methods might not work well for all types of data.
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 60 (15 Sept 2023):
Goal: Feature Selection : Filter Methods
- Introduction to Filter Methods
- Steps in Filter Methods:
- Data Pre-Processing
- Feature Scoring
- Feature Selection
- Common Techniques in Filter Methods:
- Correlation-based Feature Selection
- Information Gain
- Chi-square Test
- Fisher’s Score
- Missing Value Ratio
- Advantages of Filter Methods:
- Simplicity
- Speed
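- Independence (no dependence on a particular model)
- Limitations of Filter Methods:
- Independence (features are scored one at a time, so interactions are ignored)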
- Suboptimal Results
- Kaggle Notebook
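
The score-then-select pattern of filter methods can be sketched with correlation-based scoring in plain Python. The feature names and values below are invented, and `rank_features` is a hypothetical helper; the linked notebook may use a different technique or library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_features(features, target, top_k):
    """Score each feature by |correlation with target| and keep the best top_k."""
    scores = {name: abs(pearson(column, target))
              for name, column in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


features = {  # made-up columns
    "hours_studied": [1, 2, 3, 4, 5],
    "shoe_size": [7, 9, 6, 9, 7],
    "classes_missed": [5, 4, 4, 2, 1],
}
target = [52, 58, 65, 71, 80]
selected, scores = rank_features(features, target, top_k=2)
print(selected)
```

No model is trained at any point, which is what makes filter methods fast; the cost is that each feature is judged in isolation.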
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 61 (16 Sept 2023):
Goal: Feature Selection : Wrapper Methods
- Introduction to Wrapper Methods
- Steps in Wrapper Methods:
- Subset Selection
- Model Building
- Model Evaluation
- Common Techniques in Wrapper Methods:
- Forward Selection Method
- Backward Elimination Method
- Exhaustive Feature Selection Method
- Recursive Feature Elimination Method
- Advantages of Wrapper Methods:
- Optimal Features
- Model-Specific
- Limitations of Wrapper Methods:
- Computationally Intensive
- Model Dependency
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 62 (17 Sept 2023):
Goal: Feature Selection : Embedded Methods
- Introduction to Embedded Methods
- Steps in Embedded Methods:
- Feature Selection While Building
- Model Training
- Feature Importance Assessment
- Common Techniques in Embedded Methods:
- Random Forest Importance
- Lasso (L1 Regularization)
- Ridge (L2 Regularization)
- Elastic Net (L1 and L2 Regularization)
- Advantages of Embedded Methods:
- Feature Relevance
- Model Compatibility
- Limitations of Embedded Methods:
- Model Dependency
- May Miss Correlations
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 63 (18 Sept 2023):
Goal: Exploratory Data Analysis (EDA) on IPL All Time Best Batsman Trending Dataset
- Key EDA Operations Performed:
- Data Loading
- Data Exploration
- Data Visualization
- Statistical Insights
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 64 (19 Sept 2023):
Goal: Support Vector Regression (SVR) on Used Car Price Prediction
- Key SVR Operations Performed:
- Data Loading
- Data Pre-processing
- Feature Selection
- Splitting Data
- SVR Model Building
- Model Training
- Model Evaluation
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 65 (20 Sept 2023):
Goal: Movie Recommendations Using Collaborative Filtering
- Key Operations Performed:
- Data Loading
- Data Pre-processing
- Collaborative Filtering
- Movie Recommendations
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 66 (21 Sept 2023):
Goal: Simple Linear Regression for Insurance Predictions
- Key Operations Performed:
- Data Loading
- Data Exploration
- Linear Regression Implementation
- Model Evaluation
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 67 (22 Sept 2023):
Goal: Simple Linear Regression for Salary Predictions
- Key Operations Performed:
- Data Loading
- Data Exploration
- Linear Regression Implementation
- Model Evaluation
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 68 (23 Sept 2023):
Goal: Exploratory Data Analysis (EDA) for Gym Exercises Data
- Key Operations Performed:
- Data Loading
- Data Exploration
- Data Visualization
- Insights Extraction
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 69 (24 Sept 2023):
Goal: Exploratory Data Analysis (EDA) for Life Expectancy Data
- Key Operations Performed:
- Data Loading
- Data Exploration
- Data Visualization
- Insights Extraction
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 70 (25 Sept 2023):
Goal: Exploratory Data Analysis (EDA) on Predicting Student Dropouts
- Key Operations Performed:
- Data Loading
- Data Exploration
- Data Visualization
- Insights Extraction
- Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 71 (26 Sept 2023):
Goal: Introduction to Clustering in ML
- Intro to Clustering
- Types of Clustering:
- Partitioning Clustering
- Density-Based Clustering
- Distribution Model-Based Clustering
- Hierarchical Clustering
- Fuzzy Clustering
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 72 (27 Sept 2023):
Goal: Clustering Algorithms in Machine Learning
- Commonly Used Clustering Algorithms:
- K-means Algorithm
- Hierarchical Clustering
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
- Agglomerative Clustering
- Gaussian Mixture Model (GMM)
- Applications of Clustering:
- Customer Segmentation
- Image Compression
- Anomaly Detection
- Document Classification
- Advantages of Clustering:
- Pattern Discovery
- Data Reduction
- Scalability
- Interpretability
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 73 (28 Sept 2023):
Goal: Implementing K-means Clustering
- K-means Clustering:
- Initialization
- Assignment
- Update Centroids
- Repeat
- Customer Clustering : Kaggle Notebook
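
The four K-means steps above map directly onto a short loop. This plain-Python sketch assumes the starting centroids are passed in by the caller (real implementations usually pick them randomly or with k-means++), and the sample points are made up.

```python
import math


def k_means(points, centroids, max_iterations=100):
    """Run K-means from the given starting centroids (the initialization)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iterations):
        # Assignment: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroid  # an empty cluster keeps its old centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
        # Repeat until the centroids stop moving.
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
print(centroids)
```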
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 74 (29 Sept 2023):
Goal: K-means Clustering Implementation
- K-means Clustering:
- Initialization
- Assignment
- Update Centroids
- Repeat
- Credit Card Clustering : Kaggle Notebook
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 75 (30 Sept 2023):
Goal: Visualizing Cluster Distributions for 30 Datasets
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 76 (1 Oct 2023):
Goal: Hierarchical Clustering Implementation
GitHub Repository: Source Code
LinkedIn post: Daily Update
DAY 77 (2 Oct 2023):
Goal: Hierarchical Clustering Concepts
- What Can We Achieve with Hierarchical Clustering:
- Hierarchical Insights
- Data Exploration
- Decision Support
GitHub Repository: Source Code
LinkedIn post: Daily Update