The following is quoted from: Cao, Feng, Martin Ester, Weining Qian, and Aoying Zhou. "Density-based clustering over an evolving data stream with noise." In Proceedings of the 2006 SIAM International Conference on Data Mining, pp. 328-339. Society for Industrial and Applied Mathematics, 2006. https://doi.org/10.1137/1.9781611972764.29
Clustering is an important task in mining evolving data streams. Beside the limited memory and one-pass constraints, the nature of evolving data streams implies the following requirements for stream clustering: no assumption on the number of clusters, discovery of clusters with arbitrary shape and ability to handle outliers. While a lot of clustering algorithms for data streams have been proposed, they offer no solution to the combination of these requirements. In this paper, we present DenStream, a new approach for discovering clusters in an evolving data stream. The “dense” micro-cluster (named core-micro-cluster) is introduced to summarize the clusters with arbitrary shape, while the potential core-micro-cluster and outlier micro-cluster structures are proposed to maintain and distinguish the potential clusters and outliers. A novel pruning strategy is designed based on these concepts, which guarantees the precision of the weights of the micro-clusters with limited memory. Our performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method.
The following is quoted from: Li, Manqi, Arie Croitoru, and Songshan Yue. "GeoDenStream: An improved DenStream clustering method for managing entity data within geographical data streams." Computers & Geosciences 144 (2020): 104563. https://doi.org/10.1016/j.cageo.2020.104563
In order to conceptualize the DenStream in the context of entity stream data in geographical space consider a data stream in which each record is comprised of a data “point”, i.e., a geographic location (for example, in the form of geographic coordinates), a timestamp, and a set of related attributes that describe an entity. The DenStream clustering method applies the core-micro-cluster approach to detect arbitrary-shaped clusters (Cao et al., 2006). In this approach, a core-micro-cluster is constructed by points that are sufficiently dense according to a density threshold, and such cluster evolves over time as data points are received. In addition, each core-micro-cluster is assigned a weight that decreases exponentially with time. Based on their weights, core-micro-clusters with higher weights (i.e., potential-clusters) are acquired for building clusters, and core-micro-clusters with lower weights (i.e., outlier-clusters) are removed from the final clustering results.
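Illustrative note (not part of the quoted text): the exponentially decreasing weight described above can be sketched in a few lines of Python. The sketch assumes the fading function f(t) = 2^(-lambda * t) and the potential-cluster threshold beta * mu as they are commonly stated for DenStream (Cao et al., 2006). The function names `faded_weight` and `is_potential` are hypothetical and are not taken from either paper or from any library.

```python
def faded_weight(timestamps, now, lam):
    """Weight of a micro-cluster: each point contributes 2^(-lam * age).

    Assumption: the fading function is f(t) = 2^(-lam * t), where the age
    is the time elapsed since the point arrived.
    """
    return sum(2.0 ** (-lam * (now - t)) for t in timestamps)


def is_potential(weight, beta, mu):
    """Assumed rule: a micro-cluster counts as a potential-cluster when its
    weight is at least beta * mu; otherwise it is treated as an outlier-cluster."""
    return weight >= beta * mu


# Two points that arrived at time 0, observed immediately and 4 time units later.
w_now = faded_weight([0, 0], now=0, lam=0.25)
w_later = faded_weight([0, 0], now=4, lam=0.25)
print(w_now, w_later)
print(is_potential(w_now, beta=0.5, mu=4), is_potential(w_later, beta=0.5, mu=4))
```

With lam = 0.25 each point loses half of its weight every four time units, so a micro-cluster that stops receiving points eventually falls below the beta * mu threshold.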
There are four phases in the original DenStream clustering method, as shown in Fig. 1: an Initializing phase in which the potential-cluster and outlier-cluster lists are constructed; an Online phase in which newly arrived data points are either merged into a potential-cluster or form a new outlier-cluster; a Pruning phase in which potential- and outlier-clusters with lower weights are removed from the corresponding lists; and finally, an Offline phase in which DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering (Ester et al., 1996) is used for generating offline-clusters based on the potential-cluster list. In this process, a set of parameters are used, including initial_points and min_points in the Initializing phase, epsilon, lambda, beta, and mu in the Online phase, tp in the Pruning phase, and offline in the Offline phase. A more detailed description of these parameters is provided in Appendix A.
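Illustrative note (not part of the quoted text): the Online phase described above can be sketched as follows. This is a simplified reading under stated assumptions, not the authors' implementation. It omits weight decay and pruning, uses the point count as the weight, tries the nearest outlier-cluster before opening a new one, and promotes an outlier-cluster once its weight reaches beta * mu. The names `MicroCluster`, `nearest`, and `online_step` are hypothetical.

```python
import math


class MicroCluster:
    """Summary of a group of points: count, per-dimension sum and squared sum."""

    def __init__(self, point):
        self.n = 1.0                          # weight (decay omitted in this sketch)
        self.ls = list(point)                 # linear sum per dimension
        self.ss = [x * x for x in point]      # squared sum per dimension

    def center(self):
        return [s / self.n for s in self.ls]

    def radius_if_added(self, point):
        """Radius (root of summed per-dimension variance) if `point` were merged."""
        n = self.n + 1.0
        var = 0.0
        for s, q, x in zip(self.ls, self.ss, point):
            mean = (s + x) / n
            var += (q + x * x) / n - mean * mean
        return math.sqrt(max(var, 0.0))

    def add(self, point):
        self.n += 1.0
        for i, x in enumerate(point):
            self.ls[i] += x
            self.ss[i] += x * x


def nearest(clusters, point):
    """Micro-cluster whose center is closest to `point`, or None if the list is empty."""
    if not clusters:
        return None
    return min(clusters, key=lambda c: math.dist(c.center(), point))


def online_step(point, potential, outlier, epsilon, beta, mu):
    """Merge `point` into a micro-cluster or start a new outlier-cluster."""
    for group in (potential, outlier):
        c = nearest(group, point)
        if c is not None and c.radius_if_added(point) <= epsilon:
            c.add(point)
            if group is outlier and c.n >= beta * mu:
                outlier.remove(c)             # assumed promotion rule
                potential.append(c)
                return "promoted"
            return "potential" if group is potential else "outlier"
    outlier.append(MicroCluster(point))
    return "new_outlier"


potential, outlier = [], []
for p in [(0.0, 0.0), (0.2, 0.0), (10.0, 10.0)]:
    print(p, online_step(p, potential, outlier, epsilon=1.0, beta=0.5, mu=4))
```

In a fuller version, the Pruning phase would periodically drop micro-clusters whose decayed weight is too low, and the Offline phase would run DBSCAN over the centers of the potential-clusters.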
Contributor(s)
Initial contribution: 2020-01-09
Authorship
Feng Cao, Martin Ester, Weining Qian, and Aoying Zhou