- Feature selection | Wikipedia
- Notes on Feature Preprocessing: The What, the Why, and the How | Matthew Mayo - KDnuggets
- Evaluating Machine Learning Models
- Automated Machine Learning (AML) - AutoML
- Recursive Feature Elimination (RFE)
- Principal Component Analysis (PCA)
- Representation Learning
- Feature Engineering and Selection: A Practical Approach for Predictive Models | Max Kuhn and Kjell Johnson
- Jon Tupitza's Famous Jupyter Notebooks:
- AI Governance
- Data Science / Data Governance
- Benchmarks
- Data Preprocessing
- Feature Exploration/Learning
- Data Quality ...validity, accuracy, cleaning, completeness, consistency, encoding, padding, augmentation, labeling, auto-tagging, normalization, standardization, and imbalanced data
- Bias and Variances
- Master Data Management (MDM) / Feature Store / Data Lineage / Data Catalog
- Privacy in Data Science
- Data Interoperability
- Excel - Data Analysis
- Visualization
- Tools: Paxata, Trifacta, alteryx, databricks, Qubole
A feature is an individual measurable property or characteristic of a phenomenon being observed. The concept of a “feature” is related to that of an explanatory variable, which is used in statistical techniques such as linear regression. Feature vectors combine all of the features for a single row into a numerical vector. Part of the art of choosing features is to pick a minimum set of independent variables that explain the problem. If two variables are highly correlated, either they need to be combined into a single feature, or one should be dropped. Sometimes people perform principal component analysis to convert correlated variables into a set of linearly uncorrelated variables.

Some of the transformations that people use to construct new features or reduce the dimensionality of feature vectors are simple. For example, subtract Year of Birth from Year of Death and you construct Age at Death, which is a prime independent variable for lifetime and mortality analysis. In other cases, feature construction may not be so obvious.

Machine learning algorithms explained | Martin Heller - InfoWorld
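
The ideas in the passage above (constructing Age at Death, dropping one of two highly correlated variables, and using principal component analysis to decorrelate) can be sketched with NumPy. This is a minimal illustration, not a reference implementation: the toy data, the 0.95 correlation threshold, and the helper names `drop_correlated` and `pca_transform` are assumptions made for the example and do not come from the quoted article.

```python
import numpy as np

# Hypothetical toy records (illustrative values only).
year_of_birth = np.array([1900, 1920, 1935, 1950, 1912])
year_of_death = np.array([1975, 1990, 2010, 2020, 1980])

# Feature construction: Age at Death = Year of Death - Year of Birth.
age_at_death = year_of_death - year_of_birth


def drop_correlated(X, threshold=0.95):
    """Return indices of columns to keep.

    A column is dropped when its absolute Pearson correlation with an
    already-kept column reaches the (assumed) threshold.
    """
    corr = np.corrcoef(X, rowvar=False)
    keep = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) < threshold for k in keep):
            keep.append(j)
    return keep


def pca_transform(X):
    """Project centered data onto its principal axes (via SVD).

    The resulting columns are linearly uncorrelated.
    """
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt.T


# Synthetic features: b is almost a scaled copy of a, c is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 2.0 * a + rng.normal(scale=0.01, size=200)
c = rng.normal(size=200)
X = np.column_stack([a, b, c])

kept = drop_correlated(X)   # column 1 (b) is dropped as redundant with a
Z = pca_transform(X)        # alternative: keep all columns, decorrelate them
```

Dropping a column keeps the remaining features interpretable, while the PCA route keeps all of the variance but replaces the original variables with linear combinations of them; which is preferable depends on the problem.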
AI Explained: Feature Importance
Fiddler Labs: Learn more about feature importance, the different techniques, and the pros and cons of each. #ExplainableAI
Visualize your Data with Facets
In this episode of AI Adventures, Yufeng explains how to use Facets, a project from Google Research, to visualize your dataset, find interesting relationships, and clean your data for machine learning. Learn more through our hands-on labs → http://goo.gle/38ZUlTD Associated Medium post "Visualize your data with Facets": http://goo.gl/7FDWwk Get Facets on GitHub: http://goo.gl/Xi8dTu
Play with Facets in the browser: http://goo.gl/fFLCEV Watch more AI Adventures on the playlist: http://goo.gl/UC5usG Subscribe to get all the episodes as they come out: http://goo.gl/S0AS51 #AIAdventures
Stephen Elston - Data Visualization and Exploration with Python
Visualization is an essential method in any data scientist’s toolbox: it is a key data exploration method and a powerful tool for presenting results and understanding problems with analytics. Attendees are introduced to the Python visualization packages Matplotlib, pandas, and Seaborn, working in a Jupyter notebook. Visualization of complex real-world datasets presents a number of challenges to data scientists. By developing skills in data visualization, data scientists can confidently explore and understand the relationships in complex data sets. Using the Python matplotlib, pandas plotting, and seaborn packages, attendees will learn to:
• Explore complex data sets with visualization, to develop understanding of the inherent relationships.
• Create multiple views of data to highlight different aspects of the inherent relationships, with different graph types.
• Use plot aesthetics to project multiple dimensions.
• Apply conditioning or faceting methods to project multiple dimensions.
www.pydata.org
The Best Way to Visualize a Dataset Easily
Siraj Raval: In this video, we'll visualize a dataset of body metrics collected by giving people a fitness tracking device. We'll go over the steps necessary to preprocess the data, then use a technique called t-SNE to reduce the dimensionality of our data so we can visualize it.
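
The preprocess-then-embed workflow described for this video can be sketched roughly as follows. This is not the code from the video: the body-metric columns are synthetic stand-ins, and the choice of scikit-learn's `StandardScaler` and `TSNE` with `perplexity=10` is an assumption made for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical body metrics (4 made-up columns) for two synthetic groups
# of 30 people each; the real dataset from the video is not used here.
group_a = rng.normal(loc=[70, 170, 60, 8000], scale=[5, 6, 4, 900], size=(30, 4))
group_b = rng.normal(loc=[90, 180, 75, 4000], scale=[6, 5, 5, 700], size=(30, 4))
X = np.vstack([group_a, group_b])

# Preprocessing: put every column on a comparable scale so that no single
# metric dominates the distance computations inside t-SNE.
X_scaled = StandardScaler().fit_transform(X)

# Dimensionality reduction: embed the 4-D points in 2-D for plotting.
# Perplexity must be smaller than the number of samples.
embedding = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X_scaled)
```

Each row of `embedding` is a 2-D point that can be passed to a scatter plot. t-SNE aims to preserve local neighborhoods, so distances between far-apart clusters in the plot should not be over-interpreted.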
Feature Selection