Wonderland AI: How can federated learning be done differently?


Belgrade played host recently to the annual Wonderland AI summit, a meeting place for some of the world’s largest and most innovative companies in the artificial intelligence space, as well as leading experts in the field. One speaker who made a particular impression on us at DisruptionBanking was Nicholas Lane. Nicholas is an Associate Professor in the Department of Computer Science and Technology at the University of Cambridge, where he heads up the Machine Learning Systems Lab. He is also a Director at the Samsung AI Centre in Cambridge. Following his speech at Wonderland AI, we caught up with him to discuss how the concept of “Flower” is changing the way federated learning is done.

As Nicholas explained at Wonderland AI, federated learning is a machine learning technique that trains an algorithm across numerous devices or servers without exchanging the data held on those devices. Whereas traditional machine learning involves local datasets being uploaded to one central server, federated learning allows different parties to build a common machine learning model without the need to share confidential or commercially sensitive information. Google’s AI blog describes it in the following way:

“Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud. This goes beyond the use of local models that make predictions on mobile devices (like the Mobile Vision API and On-Device Smart Reply) by bringing model training to the device as well.

“It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarises the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. All the training data remains on your device, and no individual updates are stored in the cloud.”
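
The loop Google describes can be sketched in a few lines of Python. This is a toy illustration rather than Google's or Flower's actual code: the tiny linear model, the function names (`local_update`, `federated_round`, `train`) and the synthetic data are all assumptions made for the example, the updates are weighted by how much data each device holds (a common choice, not something the quote specifies), and the encrypted communication is left out entirely.

```python
import random


def local_update(weights, data, lr=0.1, steps=5):
    """Improve the model on one device's private data; return only the change."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        # Gradient of mean squared error for the model y = w * x + b.
        grad_w = sum(2 * ((w * x + b) - y) * x for x, y in data) / n
        grad_b = sum(2 * ((w * x + b) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # Only this small update leaves the device, never the data itself.
    return (w - weights[0], b - weights[1])


def federated_round(weights, client_datasets):
    """One round: every client trains locally, the server averages the updates."""
    total = sum(len(d) for d in client_datasets)
    avg_dw = avg_db = 0.0
    for data in client_datasets:
        dw, db = local_update(weights, data)
        share = len(data) / total  # assumption: weight updates by data volume
        avg_dw += share * dw
        avg_db += share * db
    return (weights[0] + avg_dw, weights[1] + avg_db)


def train(rounds=200, num_clients=10, seed=0):
    """Simulate several devices whose private data follows y = 3x + 1."""
    rng = random.Random(seed)
    clients = []
    for _ in range(num_clients):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(rng.randint(5, 20))]
        clients.append([(x, 3.0 * x + 1.0) for x in xs])
    weights = (0.0, 0.0)  # the shared model every device downloads
    for _ in range(rounds):
        weights = federated_round(weights, clients)
    return weights
```

Calling `train()` returns the shared model's two parameters, which should settle close to 3 and 1 even though the simulated server never sees a single data point.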

This technique can be used to build many different models which are based on user behaviour: predictive text, face detection technology and voice recognition are just a few. It allows industry to build more accurate models, based on a larger amount of data, because competing companies can train a shared model on data which they previously would not have pooled for legal or strategic reasons, without that data ever leaving their hands. While the benefits are demonstrable, it is very difficult for companies outside an elite few to build the infrastructure required. As it stands, the technology involved is just too complex for the vast majority to grapple with:

“Only 0.5 percent of companies have the internal tools to do these things [federated learning]. So, it’s like the Googles, Facebooks and Apples of the world – a very small number of companies – who we know do have the internal infrastructure that lets them do these sort of things. But very few others have this infrastructure.”

Nicholas, however, has been part of a team that has developed a concept called “Flower.” Nicholas believes that Flower can resolve these issues by simplifying the technology on which federated learning is based. Flower is an open source community that has produced a framework for federated learning simulation, development and deployment that is accessible to organisations that have as few as a handful of Graphics Processing Units (GPUs). The result is that the benefits of federated learning have been opened up to a greater range of companies. The “natural consequence” is improved competition and better results for consumers.

In aiming to democratise AI through open source, Flower is, Nicholas suggests, “analogous to what Hugging Face is doing, in the sense that it’s a really strong, very open community that is trying to accelerate the development of language models by widening participation and access.”

“The key problem we see is the difficulty in evaluating real scenarios. Essentially, a federated learning system is a large distributed system, and so is going to be affected by all sorts of system dynamics: scale, nodes being available or not available, communication overhead, all of these kinds of things.

“If you’re a developer, or a researcher, or somewhere in between, as you try to develop systems it can be very hard to measure and understand how your system is going to handle these issues. Flower helps you resolve this dimension of system dynamics, heterogeneity and complexities related to systems issues, communication bottlenecks as well as this other dimension of scale.”
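
To give a flavour of the system dynamics Nicholas describes, here is a rough sketch of how a developer might estimate the effect of nodes dropping in and out before deploying anything. It does not use Flower's API: the function `simulate_participation` and its parameters are hypothetical, and it assumes each device is independently online with a fixed probability in every round, which real fleets will not obey exactly.

```python
import random


def simulate_participation(num_clients, availability, rounds, min_clients, seed=0):
    """Return the fraction of rounds in which enough clients were online.

    availability is the assumed probability that any one client is reachable
    in a given round; min_clients is the smallest cohort the server accepts.
    """
    rng = random.Random(seed)
    usable_rounds = 0
    for _ in range(rounds):
        # Count the clients that happen to be reachable this round.
        online = sum(1 for _ in range(num_clients) if rng.random() < availability)
        if online >= min_clients:
            usable_rounds += 1
    return usable_rounds / rounds


if __name__ == "__main__":
    # Compare how often a round can go ahead as devices become less reliable.
    for availability in (0.9, 0.5, 0.1):
        share = simulate_participation(1000, availability, rounds=200, min_clients=300)
        print(f"availability {availability:.1f}: {share:.0%} of rounds usable")
```

Even a crude model like this makes the point that scale and availability interact: the same fleet can support almost every training round or almost none, depending on how often devices are reachable.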

Currently, companies that are not part of the elite few who have the required infrastructure are stuck using systems built at a very small scale. They are therefore missing out on some of the most important innovations which are happening in the AI space. By resolving technological complexities and simplifying the process, Flower could make the advantages of federated learning available to a wider audience.

But why does Nicholas believe in federated learning? What value can it add to the AI space, particularly if it is opened up to a wider audience? Nicholas believes there is a whole range of benefits that Flower can help more tech firms take advantage of:

“I think the obvious benefit is privacy. There are basically two big types of federated systems that you can build. There are what people call ‘cross-device’ and those are the kind that, say, Google and Apple build. So imagine a scenario where you have lots of data, very small fragments across the world, and you want to improve something like speech recognition. You’ll need fragments of personal data so the technology can be trained using that data. Federated systems can allow those fragments to be shared in an anonymous, private way.

“Another one is cross-organisational [federated systems]. These involve a comparatively small number of organisations working together. With the first model you might have billions of devices; on the second you may have tens or hundreds of organisations. These organisations use federated learning to collaborate towards building the best model possible.

“The benefit is these companies get to collaborate on the model, so they can offer the consumer the best possible model based on the maximum amount of data. But they also get to do it in such a way that they can still compete – they don’t have to share [commercially sensitive] data as they are obviously reluctant to do.

“So you can have, for example, car companies collaborating together; pharmaceuticals collaborating together; lawyers, banks, all kinds of organisations whose consumers can benefit from the sharing of data.”

This certainly appears to be a promising development in the AI and federated learning space. Nicholas and his team are already taking their concept into industry and business. They are even attempting to launch a proof-of-concept initiative with the Bank of England, in what could be a fascinating taste of how federated learning can improve the technology and processes of central banks.

What is clear is that some disruption in this space is required, to break an elite few’s monopoly on an incredibly useful technology. If Nicholas and his team can do that, consumers will be the main winners.

Author: Harry Clynch

#WonderlandAISummit2021 #FederatedLearning #Flower #MachineLearning #Algorithms #Data #UserBehaviour #OpenSource #GPUs #Democratisation #HuggingFace #AI