Data pre-processing for Machine Learning in Python

IT & Software
Dec 05, 2024

Data pre-processing for Machine Learning in Python is available at $79.99. It has 48 lectures, 1835 subscribers, and an average rating of 4.3 based on 96 reviews.

You will learn how to fill missing values in numerical and categorical variables, how to encode categorical variables, how to transform and scale numerical variables, what Principal Component Analysis is and how to use it, how to apply oversampling using SMOTE, and how to use several useful objects in the scikit-learn library. This course is ideal for Python developers, aspiring data scientists, and people interested in machine learning and artificial intelligence.

Summary

Title: Data pre-processing for Machine Learning in Python

Price: $79.99

Average Rating: 4.3

Number of Lectures: 48

Number of Published Lectures: 48

Number of Curriculum Items: 48

Number of Published Curriculum Objects: 48

Original Price: $29.99

Quality Status: approved

Status: Live

What You Will Learn

How to fill missing values in numerical and categorical variables

How to encode the categorical variables

How to transform the numerical variables

How to scale the numerical variables

Principal Component Analysis and how to use it

How to apply oversampling using SMOTE

How to use several useful objects in the scikit-learn library

Who Should Attend

Python developers

Aspiring data scientists

People interested in machine learning and artificial intelligence

In this course, we are going to focus on pre-processing techniques for machine learning.

Pre-processing is the set of manipulations that transform a raw dataset so that it can be used by a machine learning model. It is necessary to make our data suitable for some machine learning models, to reduce dimensionality, to better identify the relevant data, and to increase model performance. It’s the most important part of a machine learning pipeline and it can strongly affect the success of a project. In fact, if we don’t feed a machine learning model correctly shaped data, it won’t work at all.

Sometimes, aspiring data scientists start studying neural networks and other complex models and forget to study how to manipulate a dataset so that it can be used by their algorithms. So, they fail to create good models, and only at the end do they realize that good pre-processing would have saved them a lot of time and increased the performance of their algorithms. So, handling pre-processing techniques is a very important skill. That’s why I have created an entire course that focuses only on data pre-processing.

With this course, you are going to learn:

Data cleaning
Encoding of the categorical variables
Transformation of the numerical features
Scikit-learn Pipeline and ColumnTransformer objects
Scaling of the numerical features
Principal Component Analysis
Filter-based feature selection
Oversampling using SMOTE

All the examples will be given using the Python programming language and its powerful scikit-learn library. The environment that will be used is Jupyter, which is a standard in the data science industry. All the sections of this course end with some practical exercises, and the Jupyter notebooks are all downloadable.
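To give a rough idea of how a few of the topics listed above fit together, here is a minimal sketch that combines blank filling, scaling, and one-hot encoding with scikit-learn's Pipeline and ColumnTransformer objects. It is not taken from the course material: the toy DataFrame, its column names, and the choice of median and most-frequent imputation are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy dataset with missing values in both kinds of columns.
df = pd.DataFrame({
    "age": [25.0, np.nan, 47.0, 35.0],
    "income": [30000.0, 52000.0, np.nan, 41000.0],
    "city": ["Rome", "Milan", np.nan, "Rome"],
})

# Numerical columns: fill blanks with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill blanks with the most frequent value,
# then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# Route each group of columns to its own sub-pipeline.
# sparse_threshold=0 forces a dense NumPy array as output.
preprocessor = ColumnTransformer(
    [
        ("num", numeric_steps, ["age", "income"]),
        ("cat", categorical_steps, ["city"]),
    ],
    sparse_threshold=0,
)

X = preprocessor.fit_transform(df)
print(X.shape)
```

In a real project the preprocessor would typically be fitted on the training split only and then applied to the test split with `transform`, so that statistics such as the median and the mean are not learned from held-out data.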

Course Curriculum

Chapter 1: Introduction

Lecture 1: Introduction to the course

Lecture 2: Numerical and categorical variables

Lecture 3: The dataset

Lecture 4: Required Python packages

Lecture 5: Jupyter notebooks

Chapter 2: Data cleaning

Lecture 1: Introduction to data cleaning

Lecture 2: Selecting numerical and categorical variables

Lecture 3: Cleaning the numerical features

Lecture 4: Cleaning the categorical features

Lecture 5: KNN blank filling

Lecture 6: ColumnTransformer and make_column_selector

Lecture 7: Exercises

Chapter 3: Encoding of the categorical features

Lecture 1: Introduction to the encoding of categorical variables

Lecture 2: One-hot encoding

Lecture 3: Ordinal encoding

Lecture 4: Label encoding of the target variable

Lecture 5: Exercise

Chapter 4: Transformations of the numerical features

Lecture 1: Introduction to transformations

Lecture 2: Power Transformation

Lecture 3: Binning

Lecture 4: Binarizing

Lecture 5: Applying an arbitrary transformation

Lecture 6: Exercise

Lecture 7: About power transformations

Chapter 5: Pipelines

Lecture 1: Define a transformation pipeline

Lecture 2: Pipelines and ColumnTransformer together

Lecture 3: Exercises

Chapter 6: Scaling

Lecture 1: Introduction to scaling

Lecture 2: Normalization, Standardization, Robust scaling

Lecture 3: Exercise

Chapter 7: Principal Component Analysis

Lecture 1: Introduction to PCA

Lecture 2: How to perform PCA

Lecture 3: Exercise

Lecture 4: A comment about scaling before PCA

Chapter 8: Filter-based feature selection

Lecture 1: Introduction to feature selection

Lecture 2: Numerical features, numerical target

Lecture 3: Numerical features, categorical target

Lecture 4: Categorical features, numerical target

Lecture 5: Categorical features, categorical target

Lecture 6: Feature importance according to a model

Lecture 7: A comment on mutual information

Lecture 8: A comment on feature selection with categorical variables

Lecture 9: Exercises

Chapter 9: A complete pipeline

Lecture 1: An example of a complete pipeline

Chapter 10: Oversampling

Lecture 1: Introduction to SMOTE

Lecture 2: How to perform SMOTE

Lecture 3: Exercise

Chapter 11: General guidelines

Lecture 1: Practical suggestions

Instructors

Gianluca Malato
Your Data Teacher

Rating Distribution

1 star: 1 vote

2 stars: 1 vote

3 stars: 8 votes

4 stars: 28 votes

5 stars: 58 votes

Frequently Asked Questions

How long do I have access to the course materials?

You can view and review the lecture materials indefinitely, like an on-demand channel.

Can I take my courses with me wherever I go?

Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!
