Evaluation - Measures


YouTube search... ...Google search

Confusion Matrix, Precision, Recall, F Score, ROC Curves, trade off between True Positive Rate and False Positive Rate.

Metrics - Ep. 23 (Deep Learning Simplified)
Data scientists use a variety of metrics in order to objectively determine the performance of a model. This clip will provide an overview of some of the most common metrics such as error, precision, and recall.

Arun Chaganty: Debiasing natural language evaluation with humans in the loop
A significant challenge in developing systems for tasks such as knowledge base population, text summarization or question answering is simply evaluating their performance: existing fully-automatic evaluation techniques rely on an incomplete set of “gold” annotations that cannot adequately cover the range of possible outputs of such systems and lead to systematic biases against many genuinely useful system improvements. In this talk, I’ll present our work on how we can eliminate this bias by incorporating on-demand human feedback without incurring the full cost of human evaluation. Our key technical innovation is the design of good statistical estimators that are able to trade off cost for variance reduction. We hope that our work will enable the development of better NLP systems by making unbiased natural language evaluation practical and easy to use.

Model Evaluation : ROC Curve, Confusion Matrix, Accuracy Ratio | Data Science
In this video you will learn about the different performance metrics used for model evaluation, such as the Receiver Operating Characteristic, Confusion Matrix, and Accuracy. These are widely used in evaluating classification models like Decision Trees, Logistic Regression, and SVM.

Machine Learning: Testing and Error Metrics
Announcement: New Book by Luis Serrano! Grokking Machine Learning. bit.ly/grokkingML A friendly journey into the process of evaluating and improving machine learning models. - Training, Testing - Evaluation Metrics: Accuracy, Precision, Recall, F1 Score - Types of Errors: Overfitting and Underfitting - Cross Validation and K-fold Cross Validation - Model Evaluation Graphs - Grid Search

Applied Machine Learning 2019 - Lecture 10 - Model Evaluation
Metrics for binary classification, multiclass and regression. ROC curves, precision-recall curves.

Which Machine Learning Error Metric to Use?? RMSE, MSE, AUC, Lift, F1 & more
There are many ways to measure error in a machine learning model. Some techniques favor classification over regression. There are a number of important considerations. This video discussed RMSE, MSE, AUC, Lift, F1, Precision, & Recall.

Error Metric

YouTube search...

Predictive Modeling works on a constructive feedback principle: you build a model, get feedback from metrics, make improvements, and continue until you achieve the desired accuracy. Evaluation metrics explain the performance of a model. An important aspect of evaluation metrics is their capability to discriminate among model results. 7 Important Model Evaluation Error Metrics Everyone should know | Tavish Srivastava

Machine Learning #48 Evaluation Measures
Machine Learning Complete Tutorial/Lectures/Course from IIT (nptel) @ https://goo.gl/AurRXm Discrete Mathematics for Computer Science @ http://goo.gl/YJnA4B (IIT Lectures for GATE)

Evaluation Metrics of Machine Learning Algorithms - Confusion Matrix
Discusses in detail one of the key evaluation metrics, i.e. the Confusion Matrix, in layman's terms with an example.


Confusion Matrix

YouTube search... ...Google search

A performance measurement for machine learning classification: the Confusion Matrix is one of the fundamental concepts in machine learning. Combined with Cross Validation, it's how one decides which machine learning method would be best for a particular dataset. Understanding Confusion Matrix | Sarang Narkhede - Medium

1*7EYylA6XlXSGBCF77j_rOA.png
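A minimal sketch of computing a binary confusion matrix, assuming scikit-learn is available; the cancer/no-cancer labels below are made up purely for illustration.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = cancer, 0 = no cancer (illustration only)
y_true = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

# For 0/1 labels the matrix is [[TN, FP], [FN, TP]]; ravel() unpacks the four counts
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}  FP={fp}  FN={fn}  TN={tn}")
```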

Confusion Matrix & Model Validation
In this video you will learn what a confusion matrix is and how it can be used to validate models and come up with an optimal cut-off score. Watch all our videos on : http://www.analyticuniversity.com/

10 Confusion Matrix Solved
Confusion Matrix solved for 2 classes and 3 classes, generalizing to n classes.

Machine Learning Fundamentals: The Confusion Matrix
One of the fundamental concepts in machine learning is the Confusion Matrix. Combined with Cross Validation, it's how we decide which machine learning method would be best for our dataset. Check out the video to find out how!

Making sense of the confusion matrix
How do you interpret a confusion matrix? How can it help you to evaluate your machine learning model? What rates can you calculate from a confusion matrix, and what do they actually mean?

In this video, I'll start by explaining how to interpret a confusion matrix for a binary classifier: 0:49 What is a confusion matrix? 2:14 An example confusion matrix 5:13 Basic terminology

Then, I'll walk through the calculations for some common rates: 11:20 Accuracy 11:56 Misclassification Rate / Error Rate 13:20 True Positive Rate / Sensitivity / Recall 14:19 False Positive Rate 14:54 True Negative Rate / Specificity 15:58 Precision

Finally, I'll conclude with more advanced topics: 19:10 How to calculate precision and recall for multi-class problems 24:17 How to analyze a 10-class confusion matrix 28:26 How to choose the right evaluation metric for your problem 31:31 Why accuracy is often a misleading metric


Accuracy

YouTube search... ...Google search

The proportion of correct predictions made by the model out of all predictions made.

1*5XuZ_86Rfce3qyLt7XMlhw.png
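A small sketch of the definition above, assuming scikit-learn and the same kind of made-up cancer labels:

```python
from sklearn.metrics import accuracy_score

# Hypothetical labels: 1 = cancer, 0 = no cancer (illustration only)
y_true = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

# Accuracy = (TP + TN) / (TP + TN + FP + FN), i.e. the fraction of correct predictions
print(accuracy_score(y_true, y_pred))  # 0.8 for this toy example
```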

Accuracy Review
This video is part of an online course, Intro to Machine Learning. Check out the course here: http://www.udacity.com/course/ud120. This course was designed as part of a program to help you and others become a Data Analyst. You can check out the full details of the program here: http://www.udacity.com/course/nd002.

Evaluation Metrics of Machine Learning Algorithms - Accuracy
Discusses in detail the most frequently used classification metric, i.e. accuracy, and also covers some interesting conclusions about it.

Precision & Recall (Sensitivity)

YouTube search... ...Google search

Precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while recall (also known as sensitivity) is the fraction of relevant instances that have been retrieved out of the total amount of relevant instances. Both precision and recall are therefore based on an understanding and measure of relevance. Precision and recall | Wikipedia

  • Precision: a measure that tells us what proportion of the patients we diagnosed as having cancer actually had cancer. The predicted positives (people predicted as cancerous) are TP and FP, and of those, the ones who actually have cancer are TP.

1*KhlD7Js9leo0B0zfsIfAIA.png

  • Recall or Sensitivity: a measure that tells us what proportion of the patients who actually had cancer were diagnosed by the algorithm as having cancer. The actual positives (people who have cancer) are TP and FN, and of those, the ones the model diagnosed as having cancer are TP. (Note: FN is included because the person actually had cancer even though the model predicted otherwise.)

1*a8hkMGVHg3fl4kDmSIDY_A.png
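A short sketch of both measures, again assuming scikit-learn and hypothetical cancer labels:

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 1 = cancer, 0 = no cancer (illustration only)
y_true = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

# Precision = TP / (TP + FP): of the cases flagged as cancer, how many truly are
# Recall    = TP / (TP + FN): of the true cancer cases, how many were flagged
print(precision_score(y_true, y_pred))  # 0.75
print(recall_score(y_true, y_pred))     # 0.75
```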


Precision, Recall & F-Measure
In this video, we discuss performance measures for Classification problems in Machine Learning: Simple Accuracy Measure, Precision, Recall, and the F (beta)-Measure. We explain the concepts in detail, highlighting differences between the terms, introducing Confusion Matrices, and analyzing real world examples. If you have any questions, feel free to contact me. Email: ask.ajhalthor@gmail.com

Specificity

YouTube search... ...Google search

A measure that tells us what proportion of the patients who did NOT have cancer were predicted by the model as non-cancerous. The actual negatives (people who do NOT actually have cancer) are FP and TN, and of those, the ones we diagnosed as not having cancer are TN. (Note: FP is included because the person did NOT actually have cancer even though the model predicted otherwise.)

1*deegiX75imQsVXYVpG_SDQ.png
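scikit-learn has no dedicated specificity function, so one sketch (under the same hypothetical-label assumption) is to derive it from the confusion-matrix counts:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = cancer, 0 = no cancer (illustration only)
y_true = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
# Specificity = TN / (TN + FP): of the people without cancer,
# how many did the model correctly call non-cancerous
specificity = tn / (tn + fp)
print(specificity)  # 5 / 6 ≈ 0.83
```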

Sensitivity, Specificity, PPV, and NPV
Jim Cropper, DPT, MS

F1 Score (F-Measure)

YouTube search...

F1 Score = 2 * Precision * Recall / (Precision + Recall)

The F1 Score is the harmonic mean of precision and recall. A harmonic mean equals the arithmetic mean when x and y are equal, but when x and y differ it sits closer to the smaller number. So if either precision or recall is very small, the F1 Score stays close to that smaller number and raises a flag, giving the model a more appropriate score than a simple arithmetic mean would.

1*W2CxvU7m8R6cB_oz2U3ouA.png
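A minimal sketch comparing the harmonic-mean formula with scikit-learn's f1_score, using the precision and recall values from the hypothetical example above:

```python
from sklearn.metrics import f1_score

# Hypothetical labels: 1 = cancer, 0 = no cancer (illustration only)
y_true = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

precision, recall = 0.75, 0.75                      # from the precision/recall sketch above
f1_manual = 2 * precision * recall / (precision + recall)
print(f1_manual)                                    # 0.75
print(f1_score(y_true, y_pred))                     # same value via scikit-learn
```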

Receiver Operating Characteristic (ROC)

YouTube search...

In a ROC curve the true positive rate (Sensitivity) is plotted as a function of the false positive rate (100 − Specificity) for different cut-off points of a parameter. It is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.

The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The true-positive rate is also known as sensitivity, recall or probability of detection[1] in machine learning. The false-positive rate is also known as probability of false alarm[1] and can be calculated as (1 − specificity). Wikipedia

330px-Roccurves.png
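A minimal sketch of tracing out the curve with scikit-learn's roc_curve, using made-up true labels and predicted probabilities:

```python
from sklearn.metrics import roc_curve

# Hypothetical true labels and predicted probabilities (illustration only)
y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

# For each candidate threshold, the classifier's FPR (1 - specificity) and TPR (sensitivity)
fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```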


Area Under the Curve (AUC)

The area under the ROC curve (AUC) is a measure of how well a parameter can distinguish between two diagnostic groups (diseased/normal).
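A short illustration with scikit-learn's roc_auc_score, on the same kind of made-up scores; an AUC of 1.0 means perfect separation of the two groups, while 0.5 is no better than chance.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical true labels and predicted probabilities (illustration only)
y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

print(roc_auc_score(y_true, y_score))  # ≈ 0.875 for this toy data
```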

Correlation Coefficient

A correlation is about how two things change with each other. Knowing how two things change together is the first step to prediction. The "r value" is a common way to indicate a correlation value. More specifically, it refers to the (sample) Pearson correlation, or Pearson's r. There is more than one way to calculate a correlation. Here we have touched on the case where both variables change in the same way. There are other cases where one variable may change at a different rate but still have a clear relationship. This gives rise to what are called non-linear relationships. What is a Correlation Coefficient? The r Value in Statistics Explained | Eric Leung - freeCodeCamp ...Code, Data, Microbiome blog
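A small sketch of Pearson's r with NumPy, on hypothetical paired measurements:

```python
import numpy as np

# Hypothetical paired measurements (illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson's r: +1 = perfect positive linear relationship, -1 = perfect negative, 0 = none
r = np.corrcoef(x, y)[0, 1]
print(r)  # close to 1.0 because y rises almost linearly with x
```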


Rationality

YouTube search...

Tradeoffs

'Precision' & 'Recall'

It is clear that recall gives us information about a classifier’s performance with respect to false negatives (how many positives we missed), while precision gives us information about its performance with respect to false positives (how many of the cases we caught were wrong).

  • Precision is about being precise. So even if we managed to capture only one cancer case, and we captured it correctly, then we are 100% precise.
  • Recall is not so much about capturing cases correctly as about capturing all cases that have “cancer” with the answer “cancer”. So if we simply label every case as “cancer”, we have 100% recall.

So basically, if we want to focus more on:

  • minimising False Negatives, we would want Recall to be as close to 100% as possible without Precision being too bad;
  • minimising False Positives, then our focus should be on making Precision as close to 100% as possible (a threshold sketch illustrating this trade-off follows the figure below).

525px-Precisionrecall.svg.png
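A minimal sketch of the trade-off, assuming scikit-learn and made-up predicted probabilities: sweeping the decision threshold shows recall rising as precision falls, and vice versa.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical true labels and predicted probabilities (illustration only)
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7])

# A lower threshold catches more true positives (higher recall) but lets in
# more false positives (lower precision); a higher threshold does the reverse
for threshold in (0.3, 0.5, 0.7):
    y_pred = (y_score >= threshold).astype(int)
    p = precision_score(y_true, y_pred, zero_division=0)
    r = recall_score(y_true, y_pred, zero_division=0)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```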

'Sensitivity' & 'Specificity'

'True Positive Rate' & 'False Positive Rate'

Accuracy is not the best measure for Machine Learning
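A hedged illustration of why, assuming scikit-learn and a deliberately imbalanced, made-up dataset: a model that never predicts cancer still scores 95% accuracy while catching no actual cases, which is exactly the failure that recall and the other measures above expose.

```python
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced data: 95 healthy people, 5 cancer cases (illustration only)
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100          # a useless model that always predicts "no cancer"

print(accuracy_score(y_true, y_pred))  # 0.95 -- looks impressive
print(recall_score(y_true, y_pred))    # 0.0  -- misses every cancer case
```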