Skip to main content
Log in

Layered convolutional dictionary learning for sparse coding itemsets

  • Published:
World Wide Web Aims and scope Submit manuscript

Abstract

Dictionary learning for sparse coding has been successfully used in different domains, however, has never been employed for the interesting itemset mining. In this paper, we formulate an optimization problem for extracting a sparse representation of itemsets and show that the discrete nature of itemsets makes it NP-hard. An efficient approximation algorithm is presented which greedily solves maximum set cover to reduce overall compression loss. Furthermore, we incorporate our sparse representation algorithm into a layered convolutional model to learn nonredundant dictionary items. Following the intuition of deep learning, our convolutional dictionary learning approach convolves learned dictionary items and discovers statistically dependent patterns using chi-square in a hierarchical fashion; each layer having more abstract and compressed dictionary than the previous. An extensive empirical validation is performed on thirteen datasets, showing better interpretability and semantic coherence of our approach than two existing state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

Notes

  1. http://jmlr.csail.mit.edu/

References

  1. Aggarwal, C.C., Han, J.: Frequent Pattern Mining. Springer, Berlin (2014)

    Book  MATH  Google Scholar 

  2. Agrawal, R., Srikant, R., et al.: Fast Algorithms for Mining Association Rules. Morgan Kaufmann, San Mateo (1994)

    Google Scholar 

  3. Boureau, Y.-l., Cun, Y. L., et al.: Sparse feature learning for deep belief networks. In: Proceedings of Advances in Neural Information Processing Systems, pp. 1185–1192 (2008)

  4. Calders, T., Goethals, B.: Non-derivable itemset mining. Data Min. Knowl. Discov. 14(1), 171–206 (2007)

    Article  MathSciNet  Google Scholar 

  5. Chandola, V., Kumar, V.: Summarization—compressing data into an informative representation. Knowl. Inf. Syst. 12(3), 355–378 (2007)

    Article  Google Scholar 

  6. Coenen, F.: The lucs-kdd discretised/normalised arm and carm data library. http://www.csc.liv.ac.uk/frans/KDD/Software/LUCS-KDD-DN/DataSets/dataSets.html

  7. Fowkes, J., Sutton, C.: A bayesian network model for interesting itemsets. In: Proceeding of European Conference Machine Learning and Knowledge Discovery in Databases, pp. 410–425. Springer, Berlin (2016)

  8. Fowkes, J., Sutton, C.: A subsequence interleaving model for sequential pattern mining. In: Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, pp. 835–844. KDD, USA (2016)

  9. Geerts, F., Goethals, B., Mielikäinen, T.: Tiling Databases, pp. 278–289. Springer, Berlin (2004)

    MATH  Google Scholar 

  10. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. ACM SIGKDD Explor. Newsl. 11 (1), 10–18 (2009)

    Article  Google Scholar 

  11. Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. SIGMOD Rec. 29(2), 1–12 (2000)

    Article  Google Scholar 

  12. Kavukcuoglu, K., Sermanet, P., Boureau, Y.-L., Gregor, K., Mathieu, M., Cun, Y.L.: Learning convolutional feature hierarchies for visual recognition. In: Proceedings of Advances in Neural Information Processing Systems, pp. 1090–1098 (2010)

  13. Lam, H.T., Mörchen, F., Fradkin, D., Calders, T.: Mining compressing sequential patterns. Stat. Anal. Data Min. 7(1), 34–52 (2014)

    Article  MathSciNet  Google Scholar 

  14. LeCun, Y., et al: Lenet-5, convolutional neural networks. http://yann.lecun.com/exdb/lenet (2015)

  15. Lee, H., Battle, A., Raina, R., Ng, A.Y.: Efficient sparse coding algorithms. In: Proceedings of Advances in Neural Information Processing Systems, pp. 801–808 (2006)

  16. Mairal, J., Bach, F., Ponce, J., Sapiro, G.: Online dictionary learning for sparse coding. In: Proceedings of International Conference on Machine Learning, pp. 689–696 (2009)

  17. Mampaey, M., Tatti, N., Vreeken, J.: Tell me what i need to know: succinctly summarizing data with itemsets. In: Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, pp. 573–581 (2011)

  18. Mampaey, M., Vreeken, J., Tatti, N.: Summarizing data succinctly with the most informative itemsets. ACM Trans. Knowl. Discov. Data 6(4), 16:1–16:42 (2012)

    Article  Google Scholar 

  19. Mansha, S., Babar, Z., Kamiran, F., Karim, A.: Neural network based association rule mining from uncertain data. In: Proceedings of Neural Information Processing, pp. 129–136. Springer, Berlin (2016)

  20. Mansha, S., Kamiran, F., Karim, A., Anwar, A.: A self-organizing map for identifying influentialcommunities in speech-based networks. In: Proceedings of ACM International on Conference on Information and Knowledge Management, pp. 1965–1968. CIKM (2016)

  21. Mörchen, F., Fradkin, D.: Robust mining of time intervals with semi-interval partial order patterns. In: Proceedings of Society for Industrial and Applied Mathematics (2010)

  22. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient mining of association rules using closed itemset lattices. Inf. Syst. 24(1), 25–46 (1999)

    Article  MATH  Google Scholar 

  23. Shang, S., Ding, R., Yuan, B., Xie, K., Zheng, K., Kalnis, P.: User oriented trajectory search for trip recommendation. In: Proceedings of the 15th International Conference on Extending Database Technology, pp. 156–167. ACM (2012)

  24. Shang, S., Chen, L., Jensen, C.S., Wen, J.-R., Kalnis, P.: Searching trajectories by regions of interest. IEEE Trans. Knowl. Data Eng. 29(7), 1549–1562 (2017)

    Article  Google Scholar 

  25. Smets, K., Vreeken, J.: Slim: directly mining descriptive patterns. In: Proceedings of SIAM International Conference on Data Mining, pp. 236–247 (2012)

  26. Tatti, N., Vreeken, J.: The long and the short of it: summarising event sequences with serial episodes. In: Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, pp. 462–470. KDD (2012)

  27. Vazirani, V.V.: Approximation Algorithms. Springer, Berlin (2013)

    Google Scholar 

  28. Vreeken, J., Van Leeuwen, M., Siebes, A.: Krimp: mining itemsets that compress. Data Min. Knowl. Discov. 23(1), 169–214 (2011)

    Article  MathSciNet  MATH  Google Scholar 

  29. Wang, W., Yin, H., Sadiq, S., Chen, L., Xie, M., Spore, X. Zhou.: A sequential personalized spatial item recommender system. In: Proceedings of International Conference on Data Engineering, pp. 954–965 (2016)

  30. Webb, G.I.: Self-sufficient itemsets: an approach to screening potentially interesting associations between items. ACM Trans. Knowl. Data Discov. 4(1), 3:1–3:20 (2010)

    Article  MathSciNet  Google Scholar 

  31. Webb, G.I., Vreeken, J.: Efficient discovery of the most interesting associations. ACM Trans. Knowl. Discov. Data 8(3), 15:1–15:31 (2013)

    Article  Google Scholar 

  32. Weisstein, E.W.: Chi-squared test. From MathWorld–A Wolfram Web Resource (1999)

  33. Wright, J., Yang, A.Y., Ganesh, A., Sastry, S.S., Ma, Y.: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009)

    Article  Google Scholar 

  34. Xie, K., Deng, K., Shang, S., Zhou, X., Zheng, K.: Finding alternative shortest paths in spatial networks. ACM Trans. Database Syst. 37(4), 29:1–29:31 (2012)

    Article  Google Scholar 

  35. Xie, Q., Shang, S., Yuan, B., Pang, C., Zhang, X.: Local correlation detection with linearity enhancement in streaming data. In: Proceedings of ACM International on Conference on Information and Knowledge Management, pp. 309–318. CIKM (2013)

  36. Xie, M., Yin, H., Wang, H., Xu, F., Chen, W., Wang, S.: Learning graph-based poi embedding for location-based recommendation. In: Proceedings of ACM International on Conference on Information and Knowledge Management, pp. 15–24 (2016)

  37. Yang, B., Guo, C., Jensen, C.S., Kaul, M., Shang, S.: Stochastic skyline route planning under time-varying uncertainty. In: Proceedings of IEEE International Conference on Data Engineering, pp. 136–147 (2014)

  38. Yin, H., Sun, Y., Cui, B., Hu, Z., Lcars, L. Chen.: A location-content-aware recommender system. In: Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining, pp. 221–229. KDD (2013)

  39. Yiu, M.L., Mamoulis, N., Papadias, D.: Aggregate nearest neighbor queries in road networks. IEEE Trans. Knowl. Data Eng. 17(6), 820–833 (2005)

    Article  Google Scholar 

  40. Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)

    Article  Google Scholar 

  41. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Proceedings of European Conference on Computer Vision, pp. 818–833. Springer, Berlin (2014)

  42. Zeiler, M.D., Krishnan, D., Taylor, G.W., Fergus, R.: Deconvolutional networks. In: Proceeding of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2528–2535 (2010)

  43. Zhang, A., Shi, W., Webb, G.I.: Mining significant association rules from uncertain data. Data Min. Knowl. Discov. 30(4), 928–963 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  44. Zheng, K., Shang, S., Yuan, N.J., Yang, Y.: Towards efficient search for activity trajectories. In: Proceedings of IEEE International Conference on Data Engineering, pp. 230–241 (2013)

  45. Zheng, K., Zheng, Y., Yuan, N.J., Shang, S.: On discovery of gathering patterns from trajectories. In: Proceedings of IEEE International Conference on Data Engineering, pp. 242–253 (2013)

  46. Zheng, K., Su, H., Zheng, B., Shang, S., Xu, J., Liu, J., Zhou, X.: Interactive top-k spatial keyword queries. In: Proceedings of IEEE International Conference on Data Engineering, pp. 423–434 (2015)

  47. Zhu, S., Wang, Y., Shang, S., Zhao, G., Wang, J.: Probabilistic routing using multimodal data. Neurocomputing 253(C), 49–55 (2017)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sameen Mansha.

Additional information

This article belongs to the Topical Collection: Special Issue on Big Data Management and Intelligent Analytics

Guest Editors: Junping Du, Panos Kalnis, Wenling Li, and Shuo Shang

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Mansha, S., Lam, H.T., Yin, H. et al. Layered convolutional dictionary learning for sparse coding itemsets. World Wide Web 22, 2225–2239 (2019). https://doi.org/10.1007/s11280-018-0565-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11280-018-0565-2

Keywords

Navigation