Mixture model

In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs.

Machine learningStatistical inferences



Initial contribute: 2020-12-18


Method-focused categoriesData-perspectiveIntelligent computation analysis

Detailed Description

English {{currentDetailLanguage}} English

Quoted from: https://en.wikipedia.org/wiki/Mixture_model

A typical finite-dimensional mixture model is a hierarchical model consisting of the following components:

  • N random variables that are observed, each distributed according to a mixture of K components, with the components belonging to the same parametric family of distributions (e.g., all normal, all Zipfian, etc.) but with different parameters
  • N random latent variables specifying the identity of the mixture component of each observation, each distributed according to a K-dimensional categorical distribution
  • A set of K mixture weights, which are probabilities that sum to 1.
  • A set of K parameters, each specifying the parameter of the corresponding mixture component. In many cases, each "parameter" is actually a set of parameters. For example, if the mixture components are Gaussian distributions, there will be a mean and variance for each component. If the mixture components are categorical distributions (e.g., when each observation is a token from a finite alphabet of size V), there will be a vector of V probabilities summing to 1.

In addition, in a Bayesian setting, the mixture weights and parameters will themselves be random variables, and prior distributions will be placed over the variables. In such a case, the weights are typically viewed as a K-dimensional random vector drawn from a Dirichlet distribution (the conjugate prior of the categorical distribution), and the parameters will be distributed according to their respective conjugate priors.

Mathematically, a basic parametric mixture model can be described as follows:

In a Bayesian setting, all parameters are associated with random variables, as follows:



House prices

Assume that we observe the prices of N different houses. Different types of houses in different neighborhoods will have vastly different prices, but the price of a particular type of house in a particular neighborhood (e.g., three-bedroom house in moderately upscale neighborhood) will tend to cluster fairly closely around the mean. One possible model of such prices would be to assume that the prices are accurately described by a mixture model with K different components, each distributed as a normal distribution with unknown mean and variance, with each component specifying a particular combination of house type/neighborhood. Fitting this model to observed prices, e.g., using the expectation-maximization algorithm, would tend to cluster the prices according to house type/neighborhood and reveal the spread of prices in each type/neighborhood. (Note that for values such as prices or incomes that are guaranteed to be positive and which tend to grow exponentially, a log-normal distribution might actually be a better model than a normal distribution.)

Topics in a document

Assume that a document is composed of N different words from a total vocabulary of size V, where each word corresponds to one of K possible topics. The distribution of such words could be modelled as a mixture of K different V-dimensional categorical distributions. A model of this sort is commonly termed a topic model. Note that expectation maximization applied to such a model will typically fail to produce realistic results, due (among other things) to the excessive number of parameters. Some sorts of additional assumptions are typically necessary to get good results. Typically two sorts of additional components are added to the model:

  1. A prior distribution is placed over the parameters describing the topic distributions, using a Dirichlet distribution with a concentration parameter that is set significantly below 1, so as to encourage sparse distributions (where only a small number of words have significantly non-zero probabilities).
  2. Some sort of additional constraint is placed over the topic identities of words, to take advantage of natural clustering.
  • For example, a Markov chain could be placed on the topic identities (i.e., the latent variables specifying the mixture component of each observation), corresponding to the fact that nearby words belong to similar topics. (This results in a hidden Markov model, specifically one where a prior distribution is placed over state transitions that favors transitions that stay in the same state.)
  • Another possibility is the latent Dirichlet allocation model, which divides up the words into D different documents and assumes that in each document only a small number of topics occur with any frequency.

Handwriting recognition

The following example is based on an example in Christopher M. Bishop, Pattern Recognition and Machine Learning.

Imagine that we are given an N×N black-and-white image that is known to be a scan of a hand-written digit between 0 and 9, but we don't know which digit is written. We can create a mixture model with  different components, where each component is a vector of size  of Bernoulli distributions (one per pixel). Such a model can be trained with the expectation-maximization algorithm on an unlabeled set of hand-written digits, and will effectively cluster the images according to the digit being written. The same model could then be used to recognize the digit of another image simply by holding the parameters constant, computing the probability of the new image for each possible digit (a trivial calculation), and returning the digit that generated the highest probability.



Zhen Qian (2020). Mixture model, Model Item, OpenGMS, https://geomodeling.njnu.edu.cn/modelItem/5f6dceab-ce38-4992-ab9f-ebf87ec94ddc


Initial contribute : 2020-12-18


QR Code


{{curRelation.author.join('; ')}}



Drop the file here, orclick to upload.
Select From My Space
+ add


Cancel Submit
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
{{htmlJSON.Localizations}} + {{htmlJSON.Add}}
{{ item.label }} {{ item.value }}
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
名称 别名 {{tag}} +
系列名 版本号 目的 修改内容 创建/修改日期 作者
摘要 详细描述
{{tag}} + 添加关键字
* 时间参考系
* 空间参考系类型 * 空间参考系名称

起始日期 终止日期 进展 开发者
* 是否开源 * 访问方式 * 使用方式 开源协议 * 传输方式 * 获取地址 * 发布日期 * 发布者

编号 目的 修改内容 创建/修改日期 作者

时间分辨率 时间尺度 时间步长 时间范围 空间维度 格网类型 空间分辨率 空间尺度 空间范围
{{tag}} +
* 类型

* 名称 * 描述
示例描述 * 名称 * 类型 * 值/链接 上传

{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
Title Author Date Journal Volume(Issue) Pages Links Doi Operation
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
{{htmlJSON.Add}} {{htmlJSON.Cancel}}


Authors:  {{articleUploading.authors[0]}}, {{articleUploading.authors[1]}}, {{articleUploading.authors[2]}}, et al.

Journal:   {{articleUploading.journal}}

Date:   {{articleUploading.date}}

Page range:   {{articleUploading.pageRange}}

Link:   {{articleUploading.link}}

DOI:   {{articleUploading.doi}}

Yes, this is it Cancel

The article {{articleUploading.title}} has been uploaded yet.

{{htmlJSON.Cancel}} {{htmlJSON.Confirm}}