This article presents a flexible federated learning scheme that lets machine learning researchers train models on heterogeneously labeled data held by multiple institutions, in a secure and privacy-preserving manner. The article also describes two datasets, VinDr-CXR and ChestX-ray14, which contain 18,000 and 112,120 frontal chest X-ray images, respectively. Their labels cover 27 and 14 different diseases, respectively, plus a "no finding" label.
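To make the idea of federated training with heterogeneous labels more concrete, the following is a minimal sketch and not the article's actual implementation. It assumes one common way of handling differing label sets: each site keeps its own classification head locally, and only the shared feature-extractor ("backbone") weights are averaged across sites, weighted by the number of local training examples. The parameter names (`backbone.w`, `head.w`), the helper functions, and the toy weight values and site sizes are all hypothetical.

```python
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This is a simplification for illustration only.
Weights = Dict[str, List[float]]


def average_shared_weights(
    site_weights: List[Weights],
    site_sizes: List[int],
    shared_prefix: str = "backbone.",  # hypothetical naming convention
) -> Weights:
    """Size-weighted average of the parameters whose names start with
    `shared_prefix`. Parameters outside the prefix (for example the
    site-specific heads) are ignored and never leave their site."""
    total = sum(site_sizes)
    shared_names = [n for n in site_weights[0] if n.startswith(shared_prefix)]
    averaged: Weights = {}
    for name in shared_names:
        length = len(site_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(site_weights, site_sizes)) / total
            for i in range(length)
        ]
    return averaged


def apply_shared_weights(local: Weights, averaged: Weights) -> Weights:
    """Return a copy of the local model with the shared parameters
    overwritten by the averaged ones; the local head is left untouched."""
    updated = dict(local)
    updated.update(averaged)
    return updated


# Two hypothetical sites whose heads have different output sizes because
# their label sets differ. All values below are toy numbers.
site_a: Weights = {"backbone.w": [1.0, 2.0], "head.w": [0.1, 0.1, 0.1]}
site_b: Weights = {"backbone.w": [3.0, 4.0], "head.w": [0.2, 0.2]}

averaged = average_shared_weights([site_a, site_b], site_sizes=[1, 3])
site_a = apply_shared_weights(site_a, averaged)
site_b = apply_shared_weights(site_b, averaged)
```

In this sketch, only the backbone parameters are exchanged, so the sites do not need to agree on a common label set, and the raw images stay at each institution. A real deployment would typically add secure aggregation or encryption on top of this exchange, which the sketch omits.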
