Introduction to Federated Learning

Kuan Hoong, Ph.D
4 min readFeb 5, 2023

--

Federated Learning (Source)

Overview

In a nutshell, Federated Learning (FL) is a way to train AI models without anyone seeing or touching your data, offering a way to unlock information to feed new AI applications. Generally, federated learning operates in a decentralized machine learning method (ML) where instead of training a model on a central server with all data, the model is trained on many devices such as smartphones, laptops, or IoT devices. Each device trains the model locally on its own data and sends the updates to a central server, where the updates are aggregated to improve the overall model. This process continues until a satisfactory model is reached.

How Does Federated Learning Work?

The key idea behind FL is to allow the devices to learn from their own data while keeping the data on the devices themselves. This approach has several benefits, including improved privacy, reduced data transfer costs, and improved model performance due to the diverse data sources.

The first step in FL is to select a model architecture and define a loss function. The central server then sends the model parameters to each device, where the model is trained on the local data. The devices then send the updated model parameters back to the central server, where the updates are aggregated. This process continues until the model has been trained to a satisfactory level.

Types of Federated Learning

There are mainly 2 types of federated learning:

Horizontal vs Vertical Federated Learning
  1. Horizontal Learning: The feature space of all the users consuming the machine learning model will be the same. The data used by the ML model in all of the individual servers will have the same feature variables.
  2. Vertical Learning: 2 or more users who are part of FL will be part of different business models but might be sharing the same customers. Here, the features of the data from the 2 users might be different.

Applications of Federated Learning

FL has several potential applications in various industries, including healthcare, finance, and retail. In healthcare, FL can be used to develop models that can predict patient outcomes without compromising patient privacy. In finance, FL can be used to develop models that can identify fraud without compromising sensitive financial information. In retail, FL can be used to develop models that can recommend products without compromising customer data.

A centralized-server approach to federated learning (Source)

Benefits of Federated Learning

One of the biggest benefits of FL is improved data privacy and data secrecy. Since the data is kept on the devices themselves, there is no need to transfer sensitive information to a central server. With FL, only machine learning parameters are exchanged. This makes FL an attractive solution for organizations that need to protect sensitive information.

Another benefit of FL is improved model performance. Since the model is trained on diverse data sources, it can learn from a wider range of experiences, leading to improved model performance.

Finally, FL can also reduce data transfer costs. Since the data does not need to be transferred to a central server, the cost of data transfer can be reduced, making federated learning a more cost-effective solution for organizations.

Federated learning frameworks

As with other ML projects, the FL paradigm requires frameworks and libraries for training algorithms and enabling the working data pipeline between a server and clients. Here are some of the most widely-used FL opensource frameworks at the moment.

  1. TensorFlow Federated (TFF)
    TensorFlow is pioneering the experimentation with federated learning as an approach. TensorFlow Federated (TFF) performs in two main API layers:
  • Federated Learning (FL) API offers high-level interfaces that enable developers to plug existing machine learning models to TFF without the need to dive deeply into how federated learning works.
  • Federated Core (FC) API offers low-level interfaces that provide opportunities to build novel federated algorithms.

2. OpenFL

Intel® Open Federated Learning (OpenFL) is a Python 3 open-source project developed by Intel to implement FL on sensitive data. OpenFL has deployment scripts in bash and leverages certificates for securing communication but requires the user of the framework to handle most of this by himself.

3. IBM Federated Learning

IBM Federated Learning provides a basic fabric for FL on which advanced features can be added. It is not dependent on any specific machine learning framework and supports different learning topologies, e.g., a shared aggregator, and protocols.

4. Nvidia Clara

Nvidia Clara is an application framework designed for healthcare use cases. It includes full-stack GPU-accelerated libraries, SDKs, and reference applications for developers, data scientists, and researchers to create real-time, secure, and scalable federated learning solutions.

Final Thought

FL is a promising technique in machine learning that allows data to be trained on individual devices while keeping sensitive information secure. From a practical point of view, there is still a gap between the current federated learning technology and real industry usage. However, with its potential applications in various industries and its benefits of improved privacy, improved model performance, and reduced data transfer costs, federated learning is an exciting area to watch in the coming years.

Resources

  1. Federated Learning: Collaborative Machine Learning without Centralized Training Data
  2. What Is Federated Learning?
  3. Federated Learning: Challenges, Methods, and Future Directions
  4. Top 7 Open-Source Frameworks for Federated Learning

Thanks for reading! If you like the article make sure to clap (up to 50!) and let’s connect on LinkedIn and follow me on Medium to stay updated with my new articles.

--

--

Kuan Hoong, Ph.D

Google Developer Expert (GDE) in Machine Learning, Lead Data Scientist, Malaysia TensorFlow User Group, Malaysia R User Group