Clustered federated learning based on nonconvex pairwise fusion. (English) Zbl 07880032

Summary: This study investigates clustered federated learning (FL), a formulation of FL that handles non-i.i.d. data by partitioning devices into clusters, each of which optimizes a localized model for its own data. We propose a clustered FL framework that applies a nonconvex penalty to pairwise differences of parameters. Without a priori knowledge of the number of clusters or of which devices belong to each cluster, the framework can estimate the cluster structure autonomously. To implement it, we introduce a novel clustered FL method called Fusion Penalized Federated Clustering (FPFC). Building on the standard alternating direction method of multipliers (ADMM), FPFC performs partial updates at each communication round and allows parallel computation with variable workloads. These strategies significantly reduce communication cost while preserving privacy, enhancing the practicality of FL. We also propose a new warmup strategy for hyperparameter tuning in FL settings and explore an asynchronous variant of FPFC (asyncFPFC). Theoretical analysis provides convergence guarantees for FPFC with general losses and establishes the statistical convergence rate under a linear model with squared loss. Extensive experiments demonstrate the superiority of FPFC over current methods in terms of robustness and generalization capability.
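The core idea above, penalizing pairwise differences of device parameters with a concave (nonconvex) penalty so that devices with similar parameters are fused into the same cluster, can be illustrated with a minimal sketch. The paper's exact penalty and the FPFC algorithm are not reproduced here; the sketch below assumes the minimax concave penalty (MCP) as a representative concave fusion penalty, and the function names (`mcp`, `fusion_objective`) are illustrative, not from the paper.

```python
import numpy as np

def mcp(t, lam, gamma=3.0):
    """Minimax concave penalty (MCP) evaluated at |t|.

    Grows like lam*|t| near zero but flattens to the constant
    0.5*gamma*lam**2 once |t| > gamma*lam, so large differences
    (devices in different clusters) are not over-penalized.
    """
    t = np.abs(t)
    return np.where(
        t <= gamma * lam,
        lam * t - t ** 2 / (2.0 * gamma),
        0.5 * gamma * lam ** 2,
    )

def fusion_objective(thetas, losses, lam, gamma=3.0):
    """Clustered-FL objective: sum of local losses plus a concave
    penalty on the pairwise differences of device parameters."""
    m = len(thetas)
    total = sum(losses[i](thetas[i]) for i in range(m))
    for i in range(m):
        for j in range(i + 1, m):
            diff = np.linalg.norm(thetas[i] - thetas[j])
            total += float(mcp(diff, lam, gamma))
    return total

# Toy usage: two devices with quadratic local losses.
# Identical parameters incur zero fusion penalty.
thetas = [np.array([1.0, 1.0]), np.array([1.0, 1.0])]
losses = [lambda th: float(np.sum(th ** 2))] * 2
value = fusion_objective(thetas, losses, lam=1.0)  # 2 + 2 + 0 = 4.0
```

Because the penalty saturates at a constant, minimizing such an objective drives genuinely similar devices to share parameters (forming clusters) while leaving dissimilar devices essentially unconstrained, which is what allows the cluster structure to emerge without specifying the number of clusters in advance.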

MSC:

68-XX Computer science
62-XX Statistics
