Difference between revisions of "Data Preprocessing"

Line 5: Line 5:
 
|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools  
 
 
}}
 
}}
[https://www.youtube.com/results?search_query=Data+Preprocessing+machine+learning+ML YouTube search...]
+
[https://www.youtube.com/results?search_query=ai+Data+Preprocessing YouTube]
[https://www.google.com/search?q=Data+Preprocessing+machine+learning+ML ...Google search]
+
[https://www.quora.com/search?q=ai%20Data%20Preprocessing ... Quora]
 +
[https://www.google.com/search?q=ai+Data+Preprocessing ...Google search]
 +
[https://news.google.com/search?q=ai+Data+Preprocessing ...Google News]
 +
[https://www.bing.com/news/search?q=ai+Data+Preprocessing&qft=interval%3d%228%22 ...Bing News]
  
 +
* [[Data Science]] ... [[Data Governance|Governance]] ... [[Data Preprocessing|Preprocessing]] ... [[Feature Exploration/Learning|Exploration]] ... [[Data Interoperability|Interoperability]] ... [[Algorithm Administration#Master Data Management (MDM)|Master Data Management (MDM)]] ... [[Bias and Variances]] ... [[Benchmarks]] ... [[Datasets]]
 +
* [[Data Quality]] ...[[AI Verification and Validation|validity]], [[Evaluation - Measures#Accuracy|accuracy]], [[Data Quality#Data Cleaning|cleaning]], [[Data Quality#Data Completeness|completeness]], [[Data Quality#Data Consistency|consistency]], [[Data Quality#Data Encoding|encoding]], [[Data Quality#Zero Padding|padding]], [[Data Quality#Data Augmentation, Data Labeling, and Auto-Tagging|augmentation, labeling, auto-tagging]], [[Data Quality#Batch Norm(alization) & Standardization| normalization, standardization]], and [[Data Quality#Imbalanced Data|imbalanced data]]
 
* [[AI Governance]] / [[Algorithm Administration]]
 
** [[Data Science]] / [[Data Governance]]
+
* [[Natural Language Processing (NLP)#Managed Vocabularies |Managed Vocabularies]]
*** [[Benchmarks]]
+
* [[Excel - Data Analysis]]
*** Data Preprocessing
 
**** [[Feature Exploration/Learning]]
 
**** [[Data Quality]] ...[[AI Verification and Validation|validity]], [[Evaluation - Measures#Accuracy|accuracy]], [[Data Quality#Data Cleaning|cleaning]], [[Data Quality#Data Completeness|completeness]], [[Data Quality#Data Consistency|consistency]], [[Data Quality#Data Encoding|encoding]], [[Data Quality#Zero Padding|padding]], [[Data Quality#Data Augmentation, Data Labeling, and Auto-Tagging|augmentation, labeling, auto-tagging]], [[Data Quality#Batch Norm(alization) & Standardization| normalization, standardization]], and [[Data Quality#Imbalanced Data|imbalanced data]]
 
*** [[Bias and Variances]]
 
*** [[Algorithm Administration#Master Data Management (MDM)|Master Data Management (MDM)]]
 
**** [[Natural Language Processing (NLP)#Managed Vocabularies |Managed Vocabularies]]
 
**** [[Datasets]]
 
*** [[Privacy]] in Data Science
 
*** [[Data Interoperability]]
 
*** [[Excel - Data Analysis]]
 
 
* [[Development]]  ...[[Development#AI Pair Programming Tools|AI Pair Programming Tools]] ... [[Analytics]]  ... [[Visualization]]  ... [[Diagrams for Business Analysis]]
 
 
* [[Algorithm Administration#Hyperparameter|Hyperparameter]]s
 

Revision as of 19:57, 1 May 2023

YouTube ... Quora ... Google search ... Google News ... Bing News


Figure: Overview of the data preprocessing pipeline (Article)

Splitting Data - training and testing sets
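
A minimal sketch of a random train/test split, written with the standard library only. The helper name `split_train_test`, the 20% test fraction, and the fixed seed are illustrative assumptions, not something this page prescribes.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists.

    Hypothetical helper for illustration; the fraction and seed are assumptions.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_train_test(range(10), test_fraction=0.2, seed=42)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible between runs, which makes results easier to compare.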

Time-Series Data
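
For time-ordered data, a common approach is to split by time rather than at random, so that the test set contains only observations later than everything in the training set. The sketch below assumes records are `(timestamp, value)` pairs with ISO-formatted date strings; the helper name `chronological_split` and the sample values are made up for illustration.

```python
def chronological_split(records, test_fraction=0.2, key=None):
    """Sort records by time and hold out the most recent ones as the test set.

    Hypothetical helper for illustration; no shuffling is performed.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    ordered = sorted(records, key=key)
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Made-up sample data: (ISO date, value). ISO dates sort correctly as strings.
records = [
    ("2023-01-03", 12.0),
    ("2023-01-01", 10.0),
    ("2023-01-05", 15.0),
    ("2023-01-02", 11.0),
    ("2023-01-04", 13.0),
]
ts_train, ts_test = chronological_split(
    records, test_fraction=0.4, key=lambda r: r[0]
)
print(ts_train)
print(ts_test)
```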


Categorical Variables

Categorical variables require special attention in regression analysis because, unlike dichotomous or continuous variables, they cannot be entered into the regression equation just as they are. Instead, they need to be recoded into a series of variables, which can then be entered into the regression model. A variety of coding systems can be used when recoding categorical variables. Coding Systems for Categorical Variables in Regression Analysis | UCLA Institute for Digital Research & Education Statistical Consulting
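
As one illustration of such recoding, the sketch below applies dummy (indicator) coding, which is only one of several possible coding systems. A variable with k levels becomes k - 1 columns of 0/1 values, and the omitted level serves as the reference category. The helper name `dummy_code` and the colour data are hypothetical.

```python
def dummy_code(values, reference=None):
    """Recode a categorical variable into k - 1 indicator (0/1) columns.

    Hypothetical helper for illustration. The reference level gets no column
    of its own and is represented by a row of all zeros.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]   # assumption: default to the first level in sort order
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in values")
    kept = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


columns, coded = dummy_code(["red", "green", "blue", "green"], reference="blue")
print(columns)  # ['green', 'red']
print(coded)    # [[0, 1], [1, 0], [0, 0], [1, 0]]
```

Dropping one level avoids perfect collinearity with the intercept in a regression model; each remaining coefficient is then read as a difference from the reference category.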


SQL Database Optimization