Difference between revisions of "Data Preprocessing"
<youtube>0xVqLJe9_CY</youtube>
Revision as of 08:21, 16 February 2019
- Data Cleaning Challenge: .json, .txt and .xls | Rachael Tatman
- The 1st place solution to the Passenger Screening Kaggle challenge won in part due to data preparation/generation.
- Notes on Feature Preprocessing: The What, the Why, and the How | Matthew Mayo - KDnuggets
- Datasets
- Batch Norm(alization) & Standardization
- Feature Exploration/Learning
- Hyperparameters
- Data Augmentation
- Visualization
- Master Data Management (MDM) / Feature Store / Data Lineage / Data Catalog
Splitting Data - training and testing sets
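As a rough illustration of splitting data into training and testing sets, the sketch below shuffles a list of samples and holds out a fraction for testing. The helper name `train_test_split`, the 80/20 ratio, and the seed are assumptions chosen for illustration, not something this page prescribes; libraries such as scikit-learn provide a more complete function of the same name.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists.

    Hypothetical helper for illustration; test_fraction and seed are
    assumed defaults, not values taken from the page.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(10))
train, test = train_test_split(samples, test_fraction=0.2, seed=42)
print(len(train), len(test))
```

Shuffling before the split avoids a test set made up only of the last rows of an ordered file, and fixing the seed keeps the split repeatable between runs.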