Abstract

Ordered sequence data is common in the real world, and sequence analysis plays an important role in a wide range of applications, such as market basket analysis. The weight concept helps to find more interesting sequences that might otherwise be treated as meaningless patterns in sequential pattern mining. Effectively discovering these high-weighted sequences from a quantitative sequential database is therefore an urgent task. Based on the remaining weight concept, we propose a novel algorithm called Fast Weighted Sequential Pattern Mining (FWSPM), which uses an upper bound called the remaining sequence maximum weight. Based on this upper bound, an effective pruning strategy is designed to reduce the search space and save memory. Experimental results on both real and synthetic datasets show that FWSPM is more efficient than existing algorithms and scales well to large datasets.

Notes

  1. http://www.philippe-fournier-viger.com/spmf/.

  2. https://recsys.acm.org/recsys15/challenge/.

Acknowledgment

This research was supported in part by the National Natural Science Foundation of China (Grant Nos. 61902079 and 62002136), Guangzhou Basic and Applied Basic Research Foundation (Grant Nos. 202102020928 and 202102020277), and the Young Scholar Program of Pazhou Lab (Grant No. PZL2021KF0023).

Author information

Corresponding author

Correspondence to Wensheng Gan.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Ye, Z., Li, Z., Guo, W., Gan, W., Wan, S., Chen, J. (2022). Fast Weighted Sequential Pattern Mining. In: Fujita, H., Fournier-Viger, P., Ali, M., Wang, Y. (eds) Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence. IEA/AIE 2022. Lecture Notes in Computer Science, vol 13343. Springer, Cham. https://doi.org/10.1007/978-3-031-08530-7_68

  • DOI: https://doi.org/10.1007/978-3-031-08530-7_68

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08529-1

  • Online ISBN: 978-3-031-08530-7

  • eBook Packages: Computer Science, Computer Science (R0)
