Naive Bayes classifier

In statistics, Naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naïve) independence assumptions between the features. They are among the simplest Bayesian network models,[1] but coupled with Kernel density estimation, they can achieve higher accuracy levels.

Machine learningBayesian methodProbability



Initial contribute: 2020-12-18


Method-focused categoriesData-perspectiveIntelligent computation analysis

Detailed Description

English {{currentDetailLanguage}} English

Quoted from:

Naive Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set. There is not a single algorithm for training such classifiers, but a family of algorithms based on a common principle: all naive Bayes classifiers assume that the value of a particular feature is independent of the value of any other feature, given the class variable. For example, a fruit may be considered to be an apple if it is red, round, and about 10 cm in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this fruit is an apple, regardless of any possible correlations between the color, roundness, and diameter features.

For some types of probability models, naive Bayes classifiers can be trained very efficiently in a supervised learning setting. In many practical applications, parameter estimation for naive Bayes models uses the method of maximum likelihood; in other words, one can work with the naive Bayes model without accepting Bayesian probability or using any Bayesian methods.

Despite their naive design and apparently oversimplified assumptions, naive Bayes classifiers have worked quite well in many complex real-world situations. In 2004, an analysis of the Bayesian classification problem showed that there are sound theoretical reasons for the apparently implausible efficacy of naive Bayes classifiers. Still, a comprehensive comparison with other classification algorithms in 2006 showed that Bayes classification is outperformed by other approaches, such as boosted trees or random forests.

An advantage of naive Bayes is that it only requires a small number of training data to estimate the parameters necessary for classification.

Abstractly, naïve Bayes is a conditional probability model: given a problem instance to be classified, represented by a vector  representing some n features (independent variables), it assigns to this instance probabilities

for each of K possible outcomes or classes .

The problem with the above formulation is that if the number of features n is large or if a feature can take on a large number of values, then basing such a model on probability tables is infeasible. We therefore reformulate the model to make it more tractable. Using Bayes' theorem, the conditional probability can be decomposed as

In plain English, using Bayesian probability terminology, the above equation can be written as

In practice, there is interest only in the numerator of that fraction, because the denominator does not depend on  and the values of the features  are given, so that the denominator is effectively constant. The numerator is equivalent to the joint probability model

which can be rewritten as follows, using the chain rule for repeated applications of the definition of conditional probability:

Now the "naïve" conditional independence assumptions come into play: assume that all features in  are mutually independent, conditional on the category . Under this assumption,


Thus, the joint model can be expressed as

where  denotes proportionality.

This means that under the above independence assumptions, the conditional distribution over the class variable  is:

where the evidence  is a scaling factor dependent only on , that is, a constant if the values of the feature variables are known.

The discussion so far has derived the independent feature model, that is, the naïve Bayes probability model. The naïve Bayes classifier combines this model with a decision rule. One common rule is to pick the hypothesis that is most probable; this is known as the maximum a posteriori or MAP decision rule. The corresponding classifier, a Bayes classifier, is the function that assigns a class label  for some k as follows:


Zhen Qian (2020). Naive Bayes classifier, Model Item, OpenGMS,


Initial contribute : 2020-12-18


QR Code


{{'; ')}}



Drop the file here, orclick to upload.
Select From My Space
+ add


Cancel Submit
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
{{htmlJSON.Localizations}} + {{htmlJSON.Add}}
{{ item.label }} {{ item.value }}
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
Model Type:
Model Domain:
Incorporated models:

Model part of

larger framework

Hardware Requirements:
Software Requirements:
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
Title Author Date Journal Volume(Issue) Pages Links Doi Operation
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
{{htmlJSON.Add}} {{htmlJSON.Cancel}}


Authors:  {{articleUploading.authors[0]}}, {{articleUploading.authors[1]}}, {{articleUploading.authors[2]}}, et al.

Journal:   {{articleUploading.journal}}

Date:   {{}}

Page range:   {{articleUploading.pageRange}}

Link:   {{}}

DOI:   {{articleUploading.doi}}

Yes, this is it Cancel

The article {{articleUploading.title}} has been uploaded yet.

{{htmlJSON.Cancel}} {{htmlJSON.Confirm}}