The research of sampling for mining frequent itemsets

X Hu, H Yu - Rough Sets and Knowledge Technology: First …, 2006 - Springer
X Hu, H Yu
Rough Sets and Knowledge Technology: First International Conference, RSKT 2006 …, 2006 - Springer
Abstract
Efficiently mining frequent itemsets is the key step in extracting association rules from large-scale databases. Taking the min_support constraint of association rule mining into account, the paper proposes a weighted sampling algorithm for mining frequent itemsets. First, each transaction is assigned a weight. Then a sample is drawn according to these weights, with its size set to the statistically optimal sample size for the database. Under this algorithm, the sample contains a large number of transactions made up of frequent itemsets with many items, so the frequent itemsets mined from the sample are similar to those obtained from the original data. Furthermore, the algorithm can shrink the sample size while still guaranteeing sample quality. The experiment verifies its validity.
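
The two steps named in the abstract (weight each transaction, then draw a weighted sample) can be illustrated with a minimal Python sketch. The abstract does not give the paper's weight formula or its sample-size formula, so everything specific here is an assumption: the weight is taken to be the number of distinct items in a transaction, the sample size is passed in by the caller, and the draw uses Efraimidis-Spirakis keys, a standard weighted-sampling-without-replacement technique swapped in for illustration. The names weighted_sample and frequent_itemsets are hypothetical, and this is not the authors' algorithm.

```python
import random
from collections import Counter
from itertools import combinations


def weighted_sample(transactions, sample_size, seed=0):
    """Draw up to sample_size transactions without replacement, favouring long ones."""
    rng = random.Random(seed)
    keyed = []
    for t in transactions:
        # Assumed weight (not from the paper): number of distinct items.
        w = len(set(t))
        if w == 0:
            continue  # an empty transaction supports no itemset
        # Efraimidis-Spirakis key: larger weights tend to produce larger keys.
        keyed.append((rng.random() ** (1.0 / w), t))
    keyed.sort(key=lambda kt: kt[0], reverse=True)
    return [t for _, t in keyed[:sample_size]]


def frequent_itemsets(transactions, min_support, max_len=2):
    """Brute-force support counting for itemsets of size 1..max_len."""
    n = len(transactions)
    if n == 0:
        return {}
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_len + 1):
            counts.update(combinations(items, k))
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a"], ["b", "c"], ["a", "b", "c", "d"]]
    sample = weighted_sample(db, sample_size=3, seed=1)
    print(frequent_itemsets(db, min_support=0.6))
    print(frequent_itemsets(sample, min_support=0.6))
```

Because the draw deliberately over-represents long transactions, supports computed on the sample are biased estimates of the supports in the full database. Comparing the two printed dictionaries is one simple way to see how close the mined itemsets are.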