-
Federated Learning in Practice: Reflections and Projections
Authors:
Katharine Daly,
Hubert Eichner,
Peter Kairouz,
H. Brendan McMahan,
Daniel Ramage,
Zheng Xu
Abstract:
Federated Learning (FL) is a machine learning technique that enables multiple entities to collaboratively learn a shared model without exchanging their local data. Over the past decade, FL systems have achieved substantial progress, scaling to millions of devices across various learning domains while offering meaningful differential privacy (DP) guarantees. Production systems from organizations like Google, Apple, and Meta demonstrate the real-world applicability of FL. However, key challenges remain, including verifying server-side DP guarantees and coordinating training across heterogeneous devices, limiting broader adoption. Additionally, emerging trends such as large (multi-modal) models and blurred lines between training, inference, and personalization challenge traditional FL frameworks. In response, we propose a redefined FL framework that prioritizes privacy principles rather than rigid definitions. We also chart a path forward by leveraging trusted execution environments and open-source ecosystems to address these challenges and facilitate future advancements in FL.
Submitted 11 October, 2024;
originally announced October 2024.
-
Confidential Federated Computations
Authors:
Hubert Eichner,
Daniel Ramage,
Kallista Bonawitz,
Dzmitry Huba,
Tiziano Santoro,
Brett McLarnon,
Timon Van Overveldt,
Nova Fallen,
Peter Kairouz,
Albert Cheu,
Katharine Daly,
Adria Gascon,
Marco Gruteser,
Brendan McMahan
Abstract:
Federated Learning and Analytics (FLA) have seen widespread adoption by technology platforms for processing sensitive on-device data. However, basic FLA systems have privacy limitations: they do not necessarily require anonymization mechanisms like differential privacy (DP), and provide limited protections against a potentially malicious service provider. Adding DP to a basic FLA system currently requires either adding excessive noise to each device's updates, or assuming an honest service provider that correctly implements the mechanism and only uses the privatized outputs. Secure multiparty computation (SMPC)-based oblivious aggregations can limit the service provider's access to individual user updates and improve DP tradeoffs, but the tradeoffs are still suboptimal, and they suffer from scalability challenges and susceptibility to Sybil attacks. This paper introduces a novel system architecture that leverages trusted execution environments (TEEs) and open-sourcing to both ensure confidentiality of server-side computations and provide externally verifiable privacy properties, bolstering the robustness and trustworthiness of private federated computations.
Submitted 16 April, 2024;
originally announced April 2024.
-
U-Net-and-a-half: Convolutional network for biomedical image segmentation using multiple expert-driven annotations
Authors:
Yichi Zhang,
Jesper Kers,
Clarissa A. Cassol,
Joris J. Roelofs,
Najia Idrees,
Alik Farber,
Samir Haroon,
Kevin P. Daly,
Suvranu Ganguli,
Vipul C. Chitalia,
Vijaya B. Kolachalama
Abstract:
Development of deep learning systems for biomedical segmentation often requires access to expert-driven, manually annotated datasets. If more than a single expert is involved in the annotation of the same images, then the inter-expert agreement is not necessarily perfect, and no single expert annotation can precisely capture the so-called ground truth of the regions of interest on all images. Also, it is not trivial to generate a reference estimate using annotations from multiple experts. Here we present a deep neural network, defined as U-Net-and-a-half, which can simultaneously learn from annotations performed by multiple experts on the same set of images. U-Net-and-a-half contains a convolutional encoder to generate features from the input images, multiple decoders that allow simultaneous learning from image masks obtained from annotations that were independently generated by multiple experts, and a shared low-dimensional feature space. To demonstrate the applicability of our framework, we used two distinct datasets from digital pathology and radiology, respectively. Specifically, we trained two separate models using pathologist-driven annotations of glomeruli on whole slide images of human kidney biopsies (10 patients), and radiologist-driven annotations of lumen cross-sections of human arteriovenous fistulae obtained from intravascular ultrasound images (10 patients), respectively. The models based on U-Net-and-a-half exceeded the performance of the traditional U-Net models trained on single expert annotations alone, thus expanding the scope of multitask learning in the context of biomedical image segmentation.
Submitted 10 August, 2021;
originally announced August 2021.
-
A Field Guide to Federated Optimization
Authors:
Jianyu Wang,
Zachary Charles,
Zheng Xu,
Gauri Joshi,
H. Brendan McMahan,
Blaise Aguera y Arcas,
Maruan Al-Shedivat,
Galen Andrew,
Salman Avestimehr,
Katharine Daly,
Deepesh Data,
Suhas Diggavi,
Hubert Eichner,
Advait Gadhikar,
Zachary Garrett,
Antonious M. Girgis,
Filip Hanzely,
Andrew Hard,
Chaoyang He,
Samuel Horvath,
Zhouyuan Huo,
Alex Ingerman,
Martin Jaggi,
Tara Javidi,
Peter Kairouz, et al. (28 additional authors not shown)
Abstract:
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and other constraints that are not primary considerations in other problem settings. This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms through concrete examples and practical implementation, with a focus on conducting effective simulations to infer real-world performance. The goal of this work is not to survey the current literature, but to inspire researchers and practitioners to design federated learning algorithms that can be used in various practical applications.
Submitted 14 July, 2021;
originally announced July 2021.
-
Predicting Injectable Medication Adherence via a Smart Sharps Bin and Machine Learning
Authors:
Yingqi Gu,
Akshay Zalkikar,
Lara Kelly,
Kieran Daly,
Tomas E. Ward
Abstract:
Medication non-adherence is a widespread problem affecting over 50% of people who have chronic illness and need chronic treatment. Non-adherence exacerbates health risks and drives significant increases in treatment costs. In order to address these challenges, the importance of predicting patients' adherence has been recognised. In other words, it is important to improve the efficiency of interventions of the current healthcare system by prioritizing resources to the patients who are most likely to be non-adherent. Our objective in this work is to make predictions regarding individual patients' behaviour in terms of taking their medication on time during their next scheduled medication opportunity. We do this by leveraging a number of machine learning models. In particular, we demonstrate the use of a connected IoT device, the "Smart Sharps Bin" invented by HealthBeacon Ltd., to monitor and track injection disposal of patients in their home environment. Using extensive data collected from these devices, five machine learning models, namely Extra Trees Classifier, Random Forest, XGBoost, Gradient Boosting and Multilayer Perceptron, were trained and evaluated on a large dataset comprising 165,223 historic injection disposal records collected from 5,915 HealthBeacon units over the course of 3 years. The testing work was conducted on real-time data generated by the smart device over a time period after the model training was complete, i.e. true future data. The proposed machine learning approach demonstrated very good predictive performance, exhibiting an Area Under the Receiver Operating Characteristic Curve (ROC AUC) of 0.86.
Submitted 2 April, 2020;
originally announced April 2020.
-
Orbigraphs: a graph theoretic analog to Riemannian orbifolds
Authors:
Kathleen Daly,
Colin Gavin,
Gabriel Montes de Oca,
Diana Ochoa,
Elizabeth Stanhope,
Sam Stewart
Abstract:
A Riemannian orbifold is a mildly singular generalization of a Riemannian manifold that is locally modeled on $\mathbb{R}^n$ modulo the action of a finite group. Orbifolds have proven interesting in a variety of settings. Spectral geometers have examined the link between the Laplace spectrum of an orbifold and the singularities of the orbifold. One open question in this field is whether or not a singular orbifold and a manifold can be Laplace isospectral. Motivated by the connection between spectral geometry and spectral graph theory, we define a graph theoretic analogue of an orbifold called an orbigraph. We obtain results about the relationship between an orbigraph and the spectrum of its adjacency matrix. We prove that the number of singular vertices present in an orbigraph is bounded above and below by spectrally determined quantities, and show that an orbigraph with a singular point and a regular graph cannot be cospectral. We also provide a lower bound on the Cheeger constant of an orbigraph.
Submitted 11 January, 2019;
originally announced January 2019.