rDenStream

For mining new patterns from evolving data streams, most algorithms inherit from the DenStream framework, which is realized via a sliding window. As a result, in the early stage of a pattern's emergence, its knowledge points can easily be mistaken for outliers and dropped. In most cases these points can be ignored, but in some special applications that need to capture the emergence rule of a pattern quickly and precisely, these points must play their role. Based on DenStream, this paper proposes a three-step clustering algorithm, rDenStream, which introduces the concept of outlier retrospect. In rDenStream clustering, dropped micro-clusters are stored temporarily in external storage and are given a new chance to take part in clustering, which improves clustering accuracy. The experiments modeled the arrival of the data stream as a Poisson process, and the results on a standard data set showed its advantage over other methods in the early phase of new pattern discovery.

DenStream, Clustering

Contributor(s)

Initial contribution: 2021-01-09

Classification(s)

Method-focused categories / Data-perspective / Geoinformation analysis

Detailed Description


The following description is taken from Liu, Li-xiong, et al. "rDenStream, a clustering algorithm over an evolving data stream." 2009 International Conference on Information Engineering and Computer Science. IEEE, 2009.


The three phases are implemented as follows:

  1. Micro-clustering phase: a time window of suitable granularity is selected, and the points of the data stream are divided into disjoint subsets according to their arrival time. Points in the same time window are clustered into two kinds of micro-clusters: potential-micro-clusters (p-micro-clusters) and outlier-micro-clusters. Only the p-micro-clusters are passed to the next phase; the other micro-clusters are stored in a historical micro-cluster buffer in external storage such as a magnetic disk.

  2. Macro-clustering phase: the results from all time windows are put into a new set. This set contains everything learned from the subsets but has far fewer points than the original data.

  3. Retrospect phase: the new clusters are used to form a classifier, which relearns the historical micro-clusters according to the emergence rule of the incoming data. During the retrospect, pseudo outlier-micro-clusters that were misjudged in the first two phases can be corrected, which improves clustering accuracy. In view of the “long tail” phenomenon, the algorithm gives misjudged points a chance to be learned again, which makes the clustering more robust.
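The three phases above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the greedy distance-based clustering, the parameters `eps` and `min_weight`, the in-memory `history` list (standing in for the on-disk buffer), and all function names are assumptions made for illustration. The fading weights and density-based macro-clustering of DenStream are deliberately left out.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class MicroCluster:
    center: tuple
    weight: int = 1


def merge(target, other):
    """Fold `other` into `target` using a weighted mean of the centers."""
    total = target.weight + other.weight
    target.center = tuple(
        (a * target.weight + b * other.weight) / total
        for a, b in zip(target.center, other.center)
    )
    target.weight = total


def nearest_within(clusters, center, eps):
    """Return the closest cluster whose center lies within eps, else None."""
    best = min(clusters, key=lambda mc: dist(mc.center, center), default=None)
    if best is not None and dist(best.center, center) <= eps:
        return best
    return None


def phase1(window, eps, min_weight, history):
    """Micro-cluster one time window; park light clusters in `history`."""
    clusters = []
    for p in window:
        point = MicroCluster(center=tuple(p))
        home = nearest_within(clusters, point.center, eps)
        if home is None:
            clusters.append(point)
        else:
            merge(home, point)
    potential = [mc for mc in clusters if mc.weight >= min_weight]
    history.extend(mc for mc in clusters if mc.weight < min_weight)
    return potential


def phase2(potentials, eps):
    """Macro-clustering: pool p-micro-clusters from all windows, merging close ones."""
    macro = []
    for mc in potentials:
        home = nearest_within(macro, mc.center, eps)
        if home is None:
            macro.append(MicroCluster(mc.center, mc.weight))
        else:
            merge(home, mc)
    return macro


def retrospect(macro, history, eps):
    """Phase 3: re-test stored outlier-micro-clusters against the macro clusters."""
    remaining = []
    for mc in history:
        home = nearest_within(macro, mc.center, eps)
        if home is None:
            remaining.append(mc)   # still an outlier
        else:
            merge(home, mc)        # pseudo outlier, recovered
    return remaining


# Two time windows; the point (5, 5) arrives before its pattern has emerged.
windows = [
    [(0, 0), (0.1, 0), (0, 0.1), (5, 5)],
    [(5.1, 5), (5, 5.1), (4.9, 5), (0, 0.05)],
]
history = []
potentials = []
for w in windows:
    potentials.extend(phase1(w, eps=0.5, min_weight=3, history=history))
macro = phase2(potentials, eps=0.5)
remaining = retrospect(macro, history, eps=0.5)
```

In this toy run, each window produces one p-micro-cluster of weight 3 and one single-point outlier-micro-cluster. Without the retrospect step, both single points would be lost; with it, each is absorbed into the cluster that emerged in the other window, so both macro clusters end with weight 4 and no outliers remain.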

How to Cite

Jie Song (2021). rDenStream, Model Item, OpenGMS, https://geomodeling.njnu.edu.cn/modelItem/2d0b1a45-2103-428e-b490-854bb78aaa32