Data Preprocessing
Revision as of 18:19, 24 April 2019
- Data Cleaning Challenge: .json, .txt and .xls | Rachael Tatman
- sklearn.preprocessing
- The Passenger Screening Kaggle challenge 1st place solution was won in part due to data preparation/generation.
- Data Pre Processing Techniques You Should Know | Maneesha Rajaratne - Towards Data Science
- Machine Learning(ML) — Data Preprocessing | Raji Adam Bifola
- Most Influential Data Preprocessing Algorithms | S. García, J. Luengo, F. Herrera
- Datasets
- Batch Norm(alization) & Standardization
- Feature Exploration/Learning
- Hyperparameters
- Data Augmentation
- Visualization
- Python
- Master Data Management (MDM) / Feature Store / Data Lineage / Data Catalog
Splitting Data - training and testing sets
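A minimal sketch of a random train/test split, using only the Python standard library. The function name `split_train_test`, the 80/20 ratio, and the toy `rows` data are illustrative assumptions, not something this page prescribes; in practice scikit-learn's `train_test_split` in `sklearn.model_selection` is a common choice for the same job.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists.

    Hypothetical helper for illustration; test_fraction and seed are assumed defaults.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # toy stand-in for dataset records
train, test = split_train_test(rows, test_fraction=0.2, seed=42)
print(len(train), len(test))  # 8 2
```

Fixing the seed makes the split repeatable, and keeping the two lists disjoint ensures no record is used for both fitting and evaluation. For time-ordered data, a chronological cut is usually preferred over shuffling.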
Time-Series Data
- Time-based Algorithms
- A Comparison of Time Series Databases and Netsil’s Use of Druid | Netsil
- Microsoft announces the general availability of Azure Time Series Insights | Ryan Waite - Microsoft
- Top 10 Time Series Databases | Outlyer
SQL Database Optimization