The inventors have developed algorithms that enable secure training of deep neural networks using multiple data repositories and a supercomputing resource. In this invention, steps of the neural network training process are distributed between one or more data repositories and a supercomputing resource, which maintains control over the architecture of the neural network. Importantly, raw labeled data are shared neither among the data repositories nor with the supercomputing resource. Furthermore, by distributing the training process between the data repositories and the supercomputing resource, these methods reduce the computational burden on individual data repositories.
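One way such a split of the training process can work is sketched below. This is a minimal illustrative example, not the inventors' specific algorithm: the class names, the single-hidden-layer architecture, and the choice to let labels accompany the activations are all assumptions made for brevity. The data repository keeps its raw inputs and the first layer; the supercomputing resource holds the remaining parameters and drives training; only activations and gradients cross the boundary.

```python
import numpy as np

rng = np.random.default_rng(0)

class DataRepository:
    """Holds private raw data plus the repository-side layer.

    Only hidden activations (and, in this simplified sketch, labels)
    leave the repository -- never the raw inputs.
    """
    def __init__(self, n_in, n_hidden):
        self.W = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.X = rng.normal(size=(64, n_in))             # private raw inputs
        self.y = (self.X.sum(axis=1) > 0).astype(float)  # private labels (toy task)

    def forward(self):
        self.h = np.maximum(0.0, self.X @ self.W)  # ReLU activations
        # Labels are sent alongside activations here for simplicity;
        # label-protecting variants would keep the loss computation local too.
        return self.h, self.y

    def backward(self, grad_h, lr=0.1):
        grad_h = grad_h * (self.h > 0)            # back-propagate through ReLU
        self.W -= lr * self.X.T @ grad_h          # grad_h already averaged over batch

class SupercomputingResource:
    """Holds the server-side layer and completes forward/backward passes."""
    def __init__(self, n_hidden):
        self.v = rng.normal(0.0, 0.1, n_hidden)

    def step(self, h, y, lr=0.1):
        p = 1.0 / (1.0 + np.exp(-(h @ self.v)))   # sigmoid output
        loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
        grad_logit = (p - y) / len(y)             # d(loss)/d(logit), batch-averaged
        grad_h = np.outer(grad_logit, self.v)     # gradient returned to the repository
        self.v -= lr * h.T @ grad_logit
        return loss, grad_h

repo = DataRepository(n_in=10, n_hidden=8)
server = SupercomputingResource(n_hidden=8)

losses = []
for _ in range(200):
    h, y = repo.forward()        # repository computes its local layers
    loss, grad_h = server.step(h, y)  # server finishes forward/backward
    repo.backward(grad_h)        # repository updates its layer from returned gradients
    losses.append(loss)
```

Because the repository only transmits hidden activations and receives gradients with respect to those activations, its compute cost is limited to its own layer, consistent with the reduced per-repository computational burden described above.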
The inventors demonstrate that this distributed algorithm achieves performance comparable to training with all data combined on a single machine. These methods can be modified to incorporate semi-supervised learning when only a small amount of labeled data is available. This invention may be beneficial for applications in which raw data sharing is not possible.