Discriminative vs. Generative
<youtube>XtYMRq7f7KA</youtube>
Revision as of 01:25, 6 January 2019
YouTube search... ...Google search: http://www.google.com/search?q=Generative+Discriminative+Modeling
Application-specific details ultimately dictate the suitability of selecting a discriminative versus generative model.

Model Contrasts...
- Discriminative
- learn the (hard or soft) boundary between classes
- provide classification splits (in a probabilistic or non-probabilistic manner)
- allow you to classify points, without providing a model of how the points are actually generated
- don't have generative properties
- make few assumptions of the model structure
- less tied to a particular structure
- perform better with large amounts of labeled example data, which generally yields higher accuracy
- can yield superior performance, in part because they have fewer parameters to estimate
- require less computation
- can outperform generative if assumptions are not satisfied (real world is messy and assumptions are rarely perfectly satisfied)
- not designed to use unlabeled data; inherently supervised, so they do not easily support unsupervised learning
- do not generally function for outlier detection
- offer less clear representations of the relations between features and classes in the dataset
- yield richer representations of decision boundaries than generative models do
- do not allow one to generate samples from the joint distribution of observed and target variables
- achieve lower asymptotic error as the amount of training data grows
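As a minimal sketch of the discriminative side (one possible illustration, not the only discriminative model), logistic regression below learns the boundary p(y|x) directly from labeled points, without ever modeling how the points themselves are generated. The toy data, seed, learning rate, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two labeled blobs; the model never assumes or uses this generating structure
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(+2.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit the boundary parameters w, b by gradient descent on the logistic loss
w, b = np.zeros(2), 0.0
lr = 0.1
for _ in range(500):
    p = sigmoid(X @ w + b)           # predicted p(y=1 | x)
    grad_w = X.T @ (p - y) / len(y)  # gradient of the mean log-loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

preds = (sigmoid(X @ w + b) >= 0.5).astype(int)
accuracy = np.mean(preds == y)
print(f"training accuracy: {accuracy:.2f}")
```

Note that the fitted model can classify a new point but cannot generate one: it has no representation of p(x), only of the boundary between the classes.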
- Generative
- require fewer training samples
- model the distribution of individual classes
- provides a model of how the data is actually generated
- learn the underlying structure of the data
- have discriminative properties
- make some kind of structure assumptions on your model
- decision boundary: where one model becomes more likely
- often outperform discriminative models on smaller datasets because their generative assumptions place some structure on the model that prevents overfitting
- natural use of unlabeled data
- model all of the data (the full joint distribution), which can make training and inference slower
- generally function for outlier detection
- typically specified as probabilistic graphical models, which offer rich representations of the independence relations in the dataset
- more straightforward to detect distribution changes and update a generative model
- model the joint probability and predict the most likely label via Bayes' rule
- typically more flexible in expressing dependencies in complex learning tasks
- provide a flexible framework that can accommodate other needs of the application
- approach their (higher) asymptotic error faster, i.e. with fewer training examples
- training usually requires numerical optimization techniques (e.g. for models with latent variables)
- solving a complex real-world problem often requires combining multiple subtasks
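As a minimal sketch of the generative side (again one illustrative choice), Gaussian naive Bayes below models each class's distribution p(x|y) plus the prior p(y), classifies via Bayes' rule, and, because it models the joint distribution, can also generate new samples. The toy data and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(+2.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

classes = np.unique(y)
# Per-class parameters of p(x|y): feature-wise mean and variance
means = np.array([X[y == c].mean(axis=0) for c in classes])
vars_ = np.array([X[y == c].var(axis=0) for c in classes])
priors = np.array([np.mean(y == c) for c in classes])

def log_joint(x):
    """log p(x, y=c) for each class c, under the naive independent-feature assumption."""
    ll = -0.5 * np.sum(np.log(2 * np.pi * vars_) + (x - means) ** 2 / vars_, axis=1)
    return ll + np.log(priors)

# Classification = pick the class with the highest joint probability (Bayes' rule)
preds = np.array([classes[np.argmax(log_joint(x))] for x in X])
acc = np.mean(preds == y)
print(f"training accuracy: {acc:.2f}")

# Because the model is generative, we can also sample new (x, y) pairs:
c = rng.choice(classes, p=priors)
x_new = rng.normal(means[c], np.sqrt(vars_[c]))
print("sampled class", c, "point", x_new)
```

The same per-class densities that drive classification also support the other list items above: a point with low p(x) under every class is a candidate outlier, and a shift in the fitted means or variances signals a distribution change.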