Abstract: With the rapid growth scale of dataset and model, the training of deep neural networks (DNN) tends to be deployed in a distributed manner. In the large-scale distributed training, the ...