|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools
}}
[https://www.youtube.com/results?search_query=Feature+Exploration+machine+learning+ML YouTube search...]
[https://www.google.com/search?q=Feature+Exploration+machine+learning+ML ...Google search]

* [https://en.wikipedia.org/wiki/Feature_selection Feature selection | Wikipedia]
* [https://www.kdnuggets.com/2018/10/notes-feature-preprocessing-what-why-how.html Notes on Feature Preprocessing: The What, the Why, and the How | Matthew Mayo - KDnuggets]
* [[Evaluating Machine Learning Models]]
* [[Algorithm Administration#Automated Learning|Automated Learning]]
* [[Recursive Feature Elimination (RFE)]]
* [[Principal Component Analysis (PCA)]]
* [[Representation Learning]]
* [https://bookdown.org/max/FES/ Feature Engineering and Selection: A Practical Approach for Predictive Models | Max Kuhn and Kjell Johnson]
* [https://github.com/jontupitza Jon Tupitza's Famous Jupyter Notebooks:]
** [https://github.com/JonTupitza/Data-Science-On-Ramp/blob/master/01-Parametric-Tests.ipynb Parametric Tests: Tests Designed for Normally-Distributed Data]
*** [https://github.com/JonTupitza/Data-Science-Process/blob/master/02-EDA-Univariate-Analysis.ipynb Exploratory Data Analysis - Univariate]
** [https://github.com/JonTupitza/Data-Science-On-Ramp/blob/master/02-Non-Parametric-Tests.ipynb Non-Parametric Tests: Tests Designed for Data That's Not Normally-Distributed]
*** [https://github.com/JonTupitza/Data-Science-Process/blob/master/03-EDA-Bivariate-Analysis.ipynb Exploratory Data Analysis - Bivariate]
** [https://github.com/JonTupitza/Data-Science-Process/blob/master/04-EDA-Correlation-Analysis.ipynb Exploratory Data Analysis - Correlation]
** [https://github.com/JonTupitza/Data-Science-Process/blob/master/05-Feature-Selection.ipynb Feature Selection Techniques]
* [[AI Governance]] / [[Algorithm Administration]]
** [[Data Science]] / [[Data Governance]]
* [[Visualization]]
* Tools:
** [https://www.qubole.com/solutions/by-project/ What’s Your Project? | Qubole]
** [https://www.trifacta.com/ From Messy Files To Automated Analytics | Trifacta]
** [https://databricks.com/product/automl-on-databricks Accelerate discovery with a collaborative platform | Databricks]
** [https://www.paxata.com/ The Data Prep for AI Toolkit: Smarter ML Models Through Faster, More Accurate Data Prep | Paxata]
** [https://www.alteryx.com/e-book/age-badass-analyst The Age of The Badass Analyst | Alteryx]
A feature is an individual measurable property or characteristic of a phenomenon being observed. The concept of a “feature” is related to that of an explanatory variable, which is used in statistical techniques such as linear regression. Feature vectors combine all of the features for a single row into a numerical vector. Part of the art of choosing features is to pick a minimum set of independent variables that explain the problem. If two variables are highly correlated, either they need to be combined into a single feature, or one should be dropped. Sometimes people perform principal component analysis to convert correlated variables into a set of linearly uncorrelated variables. Some of the transformations that people use to construct new features or reduce the dimensionality of feature vectors are simple. For example, subtract Year of Birth from Year of Death and you construct Age at Death, which is a prime independent variable for lifetime and mortality analysis. In other cases, feature construction may not be so obvious. [https://www.infoworld.com/article/3394399/machine-learning-algorithms-explained.html Machine learning algorithms explained | Martin Heller - InfoWorld]
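The two moves just described are easy to sketch in code. Below is a minimal, hypothetical [[Python]] example (the column names and the 0.8 correlation threshold are illustrative assumptions, not taken from the article above): it constructs Age at Death from raw year columns, then collapses a highly correlated pair of variables into a single principal component.
<pre>
# Minimal sketch: feature construction plus PCA on a correlated pair.
# Column names and the 0.8 threshold are hypothetical, for illustration only.
import pandas as pd
from sklearn.decomposition import PCA

df = pd.DataFrame({
    "year_of_birth": [1890, 1901, 1915, 1923],
    "year_of_death": [1965, 1988, 1979, 2001],
    "height_cm":     [170.0, 182.0, 165.0, 175.0],
    "weight_kg":     [68.0, 85.0, 60.0, 77.0],  # moves almost in lockstep with height_cm
})

# Simple constructed feature: Age at Death = Year of Death - Year of Birth
df["age_at_death"] = df["year_of_death"] - df["year_of_birth"]

# If two variables are highly correlated, replace the pair with a single
# linearly uncorrelated component rather than keeping both.
if abs(df["height_cm"].corr(df["weight_kg"])) > 0.8:
    pca = PCA(n_components=1)
    df["body_size"] = pca.fit_transform(df[["height_cm", "weight_kg"]])[:, 0]
    df = df.drop(columns=["height_cm", "weight_kg"])

print(df)
</pre>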
{|<!-- T -->
<youtube>WVclIFyCCOo</youtube>
<b>Visualize your Data with Facets
</b><br>In this episode of AI Adventures, Yufeng explains how to use Facets, a project from Google Research, to visualize your dataset, find interesting relationships, and clean your data for machine learning. Learn more through our hands-on labs → https://goo.gle/38ZUlTD Associated Medium post "Visualize your data with Facets": https://goo.gl/7FDWwk Get Facets on GitHub: https://goo.gl/Xi8dTu
Play with Facets in the browser: https://goo.gl/fFLCEV Watch more AI Adventures on the playlist: https://goo.gl/UC5usG Subscribe to get all the episodes as they come out: https://goo.gl/S0AS51 #AIAdventures
|}
|}<!-- B -->
<youtube>KvZ2KSxlWBY</youtube>
<b>Stephen Elston - Data Visualization and Exploration with [[Python]]
</b><br>Visualization is an essential method in any data scientist’s toolbox: it is a key data exploration technique and a powerful tool for presenting results and for understanding problems with analytics. Attendees are introduced to the [[Python]] visualization packages Matplotlib, Pandas, and Seaborn. [https://github.com/StephenElston/ExploringDataWithPython The Jupyter notebook] Visualization of complex real-world datasets presents a number of challenges to data scientists. By developing skills in data visualization, data scientists can confidently explore and understand the relationships in complex data sets. Using the [[Python]] matplotlib, pandas plotting, and seaborn packages, attendees will learn to: • Explore complex data sets with visualization to develop an understanding of the inherent relationships. • Create multiple views of the data to highlight different aspects of the inherent relationships, using different graph types. • Use plot aesthetics to project multiple dimensions. • Apply conditioning or faceting methods to project multiple dimensions. www.pydata.org
|}
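The faceting and aesthetics ideas from this talk can be sketched in a few lines of seaborn. The snippet below is an illustrative example, not material from the talk; it uses seaborn's bundled "tips" demo dataset, which load_dataset fetches over the network.
<pre>
# Minimal faceting sketch with seaborn (illustrative, not from the talk).
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")  # demo dataset; requires network access

# Condition (facet) the same scatter plot on meal time and smoker status,
# and use the hue aesthetic to project one more dimension.
g = sns.FacetGrid(tips, col="time", row="smoker", hue="sex")
g.map(plt.scatter, "total_bill", "tip")
g.add_legend()
plt.show()
</pre>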
|<!-- M -->
= <span id="Feature Selection"></span>Feature Selection = | = <span id="Feature Selection"></span>Feature Selection = | ||
| − | [ | + | [https://www.youtube.com/results?search_query=Feature+Selection+machine+learning+ML YouTube search...] |
| − | [ | + | [https://www.google.com/search?q=Feature+Selection+machine+learning+ML ...Google search] |
| − | * [ | + | * [https://www.datacamp.com/community/tutorials/feature-selection-python Beginner's Guide to Feature Selection in Python | Sayak Paul] ...Learn about the basics of feature selection and how to implement and investigate various feature selection techniques in [[Python]] |
| − | * [ | + | * [https://machinelearningmastery.com/feature-selection-machine-learning-python/ Feature Selection For Machine Learning in Python | Jason Brownlee] |
| − | * [ | + | * [https://machinelearningmastery.com/feature-selection-with-categorical-data/ How to Perform Feature Selection with Categorical Data | Jason Brownlee] |
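As a concrete counterpart to the tutorials above, here is a minimal sketch of filter-style feature selection with scikit-learn (the synthetic dataset and the choice of k=3 are illustrative assumptions, not taken from the linked articles):
<pre>
# Minimal sketch: univariate (filter-style) feature selection with scikit-learn.
# The synthetic data and k=3 are hypothetical choices for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# 10 candidate features, only 3 of which actually carry signal
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, random_state=0)

# Score each feature against the target with the ANOVA F-test; keep the top 3
selector = SelectKBest(score_func=f_classif, k=3)
X_selected = selector.fit_transform(X, y)

print("F-scores:", np.round(selector.scores_, 1))
print("kept feature indices:", selector.get_support(indices=True))
print("reduced shape:", X_selected.shape)  # (200, 3)
</pre>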
{|<!-- T -->
= <span id="Sparse Coding - Feature Extraction"></span>Sparse Coding - Feature Extraction =