Result: RFIMiner : A regression-based algorithm for recently frequent patterns in multiple time granularity data streams

Title:
RFIMiner : A regression-based algorithm for recently frequent patterns in multiple time granularity data streams
Source:
Special issue on intelligent computing theory and methodology. Applied mathematics and computation. 185(2):769-783
Publisher Information:
New York, NY: Elsevier, 2007.
Publication Year:
2007
Physical Description:
print, 22 ref
Original Material:
INIST-CNRS
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
College of Computer Science, Jilin University, Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, Changchun 130012, China
ISSN:
0096-3003
Rights:
Copyright 2007 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Mathematics
Accession Number:
edscal.18637788
Database:
PASCAL Archive

Further Information

In this paper, we propose an algorithm for computing and maintaining recently frequent patterns, which are more stable and smaller than the data stream itself, and for dynamically updating them as transactions arrive. Our study makes two main contributions. First, a regression-based data stream model is proposed to differentiate new transactions from old ones. The model maps transactions onto multiple time granularities and can automatically adjust the transactional fading rate by means of a fading factor, which defines the desired lifetime of transaction information in the data stream. Second, we develop RFIMiner, a single-scan algorithm for mining recently frequent patterns from data streams. Our algorithm exploits a special property of suffix-trees, so it is unnecessary to traverse the suffix-trees when patterns are discovered. To cater to suffix-trees, we also adopt a new method called Depth-first and Bottom-up Inside Itemset Growth to find more recently frequent patterns from known frequent ones; this method also avoids redundant computation and the generation of candidate patterns. We conduct detailed experiments to evaluate the performance of the algorithm in several respects. The results confirm that the new method scales well and meets the quality and efficiency requirements of mining recently frequent itemsets in a data stream.
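
To illustrate the general idea of a fading factor tied to a desired lifetime, the following is a minimal sketch of time-decayed support counting. It is not the regression-based model or the suffix-tree structure of RFIMiner, whose details this record does not give. It assumes a plain exponential decay, and every name in it (fading_factor, FadingCounter, residual, max_len) is hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import combinations


def fading_factor(lifetime: float, residual: float = 0.01) -> float:
    """Per-time-unit decay f chosen so that f ** lifetime == residual.

    Assumption: a transaction's information counts as expired once its
    weight has dropped to `residual` after `lifetime` time units.
    """
    if lifetime <= 0 or not 0.0 < residual < 1.0:
        raise ValueError("lifetime must be > 0 and residual in (0, 1)")
    return residual ** (1.0 / lifetime)


class FadingCounter:
    """Decayed support counts for itemsets of up to `max_len` items.

    Timestamps passed to `add` must be non-decreasing.
    """

    def __init__(self, factor: float, max_len: int = 2):
        self.factor = factor
        self.max_len = max_len
        self.counts = defaultdict(float)  # itemset -> decayed count
        self.stamp = {}                   # itemset -> time of last update
        self.total = 0.0                  # decayed number of transactions
        self.last_time = 0

    def _decay(self, value: float, since: float, now: float) -> float:
        # Older contributions shrink by `factor` for every elapsed time unit.
        return value * self.factor ** (now - since)

    def add(self, transaction, now: float) -> None:
        """Fade the stored counts to time `now`, then add the transaction."""
        self.total = self._decay(self.total, self.last_time, now) + 1.0
        self.last_time = now
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for subset in combinations(items, k):
                since = self.stamp.get(subset, now)
                faded = self._decay(self.counts[subset], since, now)
                self.counts[subset] = faded + 1.0
                self.stamp[subset] = now

    def support(self, itemset, now: float) -> float:
        """Decayed count of `itemset` divided by decayed transaction total."""
        key = tuple(sorted(set(itemset)))
        if key not in self.stamp:
            return 0.0
        num = self._decay(self.counts[key], self.stamp[key], now)
        den = self._decay(self.total, self.last_time, now)
        return num / den if den > 0 else 0.0
```

With a decay of 0.5 per time unit, a transaction seen one time unit ago contributes half as much as one seen now, so an itemset counts as recently frequent only if it keeps appearing. Enumerating all subsets of each transaction is done here for brevity and would not scale to long transactions.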