k-nearest neighbors algorithm

The k-nearest neighbors algorithm (k-NN) is a non-parametric method proposed by Thomas Cover used for classification and regression.

Machine learninginstance-based learning

Contributor(s)

Initial contribute: 2020-12-16

Classification(s)

Method-focused categoriesData-perspectiveIntelligent computation analysis

Detailed Description

English {{currentDetailLanguage}} English

Quoted from: https://ecommons.cornell.edu/bitstream/1813/31637/1/BU-1065-MA.pdf

k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until function evaluation. Since this algorithm relies on distance for classification, normalizing the training data can improve its accuracy dramatically.

Both for classification and regression, a useful technique can be to assign weights to the contributions of the neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. For example, a common weighting scheme consists in giving each neighbor a weight of 1/d, where d is the distance to the neighbor.

The neighbors are taken from a set of objects for which the class (for k-NN classification) or the object property value (for k-NN regression) is known. This can be thought of as the training set for the algorithm, though no explicit training step is required.

A peculiarity of the k-NN algorithm is that it is sensitive to the local structure of the data.

Suppose we have pairs  taking values in , where Y is the class label of X, so that  for  (and probability distributions ). Given some norm  on  and a point , let  be a reordering of the training data such that .

The best choice of k depends upon the data; generally, larger values of k reduces effect of the noise on the classification, but make boundaries between classes less distinct. A good k can be selected by various heuristic techniques (see hyperparameter optimization). The special case where the class is predicted to be the class of the closest training sample (i.e. when k = 1) is called the nearest neighbor algorithm.

The accuracy of the k-NN algorithm can be severely degraded by the presence of noisy or irrelevant features, or if the feature scales are not consistent with their importance. Much research effort has been put into selecting or scaling features to improve classification. A particularly popular approach is the use of evolutionary algorithms to optimize feature scaling. Another popular approach is to scale features by the mutual information of the training data with the training classes.

In binary (two class) classification problems, it is helpful to choose k to be an odd number as this avoids tied votes. One popular way of choosing the empirically optimal k in this setting is via bootstrap method.

模型元数据

{{htmlJSON.HowtoCite}}

Zhen Qian (2020). k-nearest neighbors algorithm, Model Item, OpenGMS, https://geomodeling.njnu.edu.cn/modelItem/a0f23e6f-7481-41ba-b224-eab16750b172
{{htmlJSON.Copy}}

Contributor(s)

Initial contribute : 2020-12-16

{{htmlJSON.CoContributor}}

QR Code

×

{{curRelation.overview}}
{{curRelation.author.join('; ')}}
{{curRelation.journal}}









{{htmlJSON.RelatedItems}}

{{htmlJSON.LinkResourceFromRepositoryOrCreate}}{{htmlJSON.create}}.

Drop the file here, orclick to upload.
Select From My Space
+ add

{{htmlJSON.authorshipSubmitted}}

Cancel Submit
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
{{htmlJSON.Localizations}} + {{htmlJSON.Add}}
{{ item.label }} {{ item.value }}
{{htmlJSON.ModelName}}:
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
名称 别名 {{tag}} +
系列名 版本号 目的 修改内容 创建/修改日期 作者
摘要 详细描述
{{tag}} + 添加关键字
* 时间参考系
* 空间参考系类型 * 空间参考系名称

起始日期 终止日期 进展 开发者
* 是否开源 * 访问方式 * 使用方式 开源协议 * 传输方式 * 获取地址 * 发布日期 * 发布者



编号 目的 修改内容 创建/修改日期 作者





时间分辨率 时间尺度 时间步长 时间范围 空间维度 格网类型 空间分辨率 空间尺度 空间范围
{{tag}} +
* 类型
图例


* 名称 * 描述
示例描述 * 名称 * 类型 * 值/链接 上传


{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
Title Author Date Journal Volume(Issue) Pages Links Doi Operation
{{htmlJSON.Cancel}} {{htmlJSON.Submit}}
{{htmlJSON.Add}} {{htmlJSON.Cancel}}

{{articleUploading.title}}

Authors:  {{articleUploading.authors[0]}}, {{articleUploading.authors[1]}}, {{articleUploading.authors[2]}}, et al.

Journal:   {{articleUploading.journal}}

Date:   {{articleUploading.date}}

Page range:   {{articleUploading.pageRange}}

Link:   {{articleUploading.link}}

DOI:   {{articleUploading.doi}}

Yes, this is it Cancel

The article {{articleUploading.title}} has been uploaded yet.

OK
{{htmlJSON.Cancel}} {{htmlJSON.Confirm}}
OVMRjTLNXyNh