Data Processing Labs
In this section, you're going to apply your knowledge of data processing to an actual dataset, and prepare the data for future machine learning training. You'll use the California Housing Dataset, which is famous for its use in many machine learning courses. But the dataset has several issues that you'll need to discover and mitigate before you can use it for machine learning training.
You will have to detect and deal with outliers, scale and normalize columns to a sensible numeric range, bin- and one-hot encode categorical data columns, and cross latitude and longitude columns.
3 Lessons