Blockchain-based Federated Extreme Multi-label Classification
A blockchain-based federated learning framework for extreme multi-label classification models based on the DeepXML Framework
Created on April 16|Last edited on April 25
I have developed a blockchain-based federated learning (BCFL) framework that works with any of the newer models built on the DeepXML framework (both the label-feature-dependent and label-feature-independent variants). I currently use SiameseXML, since the classifier management part of the implementation is not complete yet.
Description
The code of any DeepXML model needs minor modifications to work with the BCFL framework. In addition, wandb logging needs to be added.
Training process
The training process is supposed to be simulated in the following way:
- Different processes are spawned to act as clients, each independently training its local version of the model.
- Additional processes are spawned to act as miners, which verify the data and perform blockchain mining.
- To simulate FuBCFL (fully coupled), each client process is paired with exactly one miner process, and the pair behaves as a single process.
- To simulate LoBCFL (loosely coupled), the number of miners must be specified; a client process can communicate with any miner process to update the global model.
- Client processes must register their models on the chain before training starts. This lets the blockchain track how many clients are training, so that another dedicated process can run the FedAvg algorithm off-chain.
- Every fixed number of epochs, the client processes send gradient updates to the miner processes to update the global and local models.
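The off-chain FedAvg step at the end of the list above can be sketched as follows. This is a minimal illustration assuming each registered client's model is a dict of parameter lists; the function name `fed_avg` and the data layout are illustrative, not the tool's actual API.

```python
from typing import Dict, List

def fed_avg(client_weights: List[Dict[str, List[float]]]) -> Dict[str, List[float]]:
    """Element-wise average of all registered clients' parameter tensors,
    run by a dedicated process off the chain."""
    n = len(client_weights)
    avg = {}
    for key in client_weights[0]:
        avg[key] = [
            sum(w[key][i] for w in client_weights) / n
            for i in range(len(client_weights[0][key]))
        ]
    return avg
```

Because the clients registered on-chain before training, the aggregator knows exactly how many weight sets to expect before averaging.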
Development of the blockchain variant ran into errors, so for now the training process is simulated like this:
- Different processes are spawned to act as clients, each independently training its local version of the model.
- The global model is stored in a global variable to stand in for the blockchain.
This setup is sufficient for recording accuracy correctly, but it is not how the final system works. Adding the blockchain functionality should not change the accuracy values significantly.
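A minimal sketch of this fallback simulation: the blockchain is replaced by a module-level global dict, and "mining" becomes an explicit averaging call. All names here (`register_client`, `sync_global`) are illustrative, not the framework's real API.

```python
# The "chain" is just module-level state shared by the simulated clients.
GLOBAL_MODEL = None
LOCAL_MODELS = {}

def register_client(client_id, weights):
    """Each spawned client process registers its local weights here."""
    LOCAL_MODELS[client_id] = weights

def sync_global():
    """Average all registered local models into the global one
    (this stands in for the on-chain update done by miners)."""
    global GLOBAL_MODEL
    n = len(LOCAL_MODELS)
    keys = next(iter(LOCAL_MODELS.values())).keys()
    GLOBAL_MODEL = {k: sum(m[k] for m in LOCAL_MODELS.values()) / n for k in keys}
    return GLOBAL_MODEL
```

Since the aggregation logic is identical to the on-chain version, the recorded accuracies should carry over once the blockchain is wired in.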
Aggregation process
The aggregation follows https://ourspace.uregina.ca/handle/10294/14336. I use a hyperparameter called num_best_clients: the local models are ranked by their accuracy on the prediction dataset, and only the top num_best_clients models are considered for aggregation.
There are two types of aggregation that I test out:
- Unweighted: The global model is the simple mean of the selected local model weights.
- Weighted: The global model is the mean of the selected local model weights, weighted by each client's loss.
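The two variants can be sketched together. Here each client is a `(loss, weights)` pair with a flat weight vector; the exact weighting scheme (inverse loss, normalized) is an illustrative choice, since lower loss should mean a larger contribution.

```python
def aggregate(clients, num_best_clients, weighted=False):
    """clients: list of (loss, weight_list) pairs; lower loss = better client.
    Keeps only the top num_best_clients models, then averages them."""
    best = sorted(clients, key=lambda c: c[0])[:num_best_clients]
    if weighted:
        # Weight each model by 1/loss, normalized to sum to 1,
        # so lower-loss clients contribute more.
        inv = [1.0 / loss for loss, _ in best]
        total = sum(inv)
        coeffs = [v / total for v in inv]
    else:
        # Plain mean over the selected clients.
        coeffs = [1.0 / len(best)] * len(best)
    dim = len(best[0][1])
    return [sum(c * w[i] for c, (_, w) in zip(coeffs, best)) for i in range(dim)]
```

Note that with num_best_clients set below the client count, a poorly performing client (e.g. one with a bad local data shard) is excluded from the global model entirely.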
Development
The task of development of the BCFL framework was divided into the following sub-tasks:
- Cluster management: The server provided by RIPPLE had no built-in cluster management. To simulate the training process properly, I used Ray, a cluster management system written in Python. Ray provides core primitives (tasks, actors, objects) for building and scaling distributed applications.
- Dataset management: The datasets are all available from Manik Varma's website, but DeepXML-based models are typically run on a server that may or may not have a convenient way of downloading them. Along with convenient downloads, the dataset management system I developed also splits each dataset into a specified number of shards to make it ready for federated learning. The generated FL dataset is currently IID, but non-IID splits can be produced if needed.
- Chain management: For the blockchain, Geth (go-ethereum), a Go implementation of Ethereum, was used. Geth has been a core part of Ethereum since the very beginning and was one of the original Ethereum implementations. The learning curve was steep, and many errors came up during development.
- Classifier management (under construction!): A convenient way to git clone any classifier based on the DeepXML framework and prepare it for simulated training with the BCFL framework.
All the above functionalities have been conveniently packaged into a tool.
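The IID split step from the dataset management sub-task above can be sketched like this. `split_iid` is a hypothetical name; the real tool's API may differ.

```python
import random

def split_iid(indices, num_clients, seed=0):
    """Shuffle sample indices and deal them round-robin into num_clients
    shards, giving each simulated client an IID slice of the dataset."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::num_clients] for i in range(num_clients)]
```

A non-IID variant would replace the uniform shuffle with, for example, label-skewed sampling per client, which is why the tool keeps the split strategy pluggable.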
Results
Summary
The reported P@k scores are means across all clients.
Charts
Charts are shown for the following datasets (interactive plots omitted here):
- LF-AmazonTitles-131K
- LF-Amazon-131K
- LF-WikiSeeAlsoTitles-320K