Abstract:
Convolutional neural networks (CNNs) are among the most important methods in deep learning and have achieved state-of-the-art performance in challenging areas such as natural language processing and computer vision. However, these powerful networks train slowly because of their massive number of parameters, and the primary challenge is to reduce training time on large volumes of data. CNNs have relatively few parameters in their convolutional layers and many parameters in their fully connected layers, so training time can be reduced by applying a form of parallelism suited to the characteristics of each layer type. This paper presents an optimized parallelization algorithm with a communication strategy for CNN training on distributed graphics processing units (GPUs). We use a butterfly reduction communication strategy and apply data parallelism at convolutional layers and model parallelism at fully connected layers. The model is partitioned across distributed GPUs, and each partition is processed according to the characteristics of its CNN layers. This hybrid parallel approach is preferable to earlier alternatives, such as data parallelism alone or model parallelism alone, that have been applied to modern CNNs. Experimental results show that the hybrid parallel approach with the butterfly communication strategy improves accuracy and reduces training time.
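To make the butterfly reduction concrete, the sketch below simulates a butterfly (recursive-doubling) all-reduce of per-worker gradients in plain NumPy. This is an illustrative single-process simulation under assumed conventions (a power-of-two worker count, sum as the reduction operator), not the paper's actual multi-GPU implementation; the function name `butterfly_allreduce` is hypothetical.

```python
import numpy as np

def butterfly_allreduce(grads):
    """Simulate a butterfly all-reduce over P workers (P a power of two).

    grads: list of equal-shape arrays, one gradient buffer per worker.
    Returns the per-worker buffers after log2(P) exchange stages; each
    worker ends up holding the global sum, as every GPU would after the
    butterfly communication pattern.
    """
    p = len(grads)
    assert p & (p - 1) == 0, "butterfly exchange assumes a power-of-two worker count"
    bufs = [g.copy() for g in grads]
    step = 1
    while step < p:
        # In stage s, worker i exchanges its buffer with partner i XOR 2^s
        # and both keep the sum; after log2(P) stages all buffers agree.
        bufs = [bufs[i] + bufs[i ^ step] for i in range(p)]
        step <<= 1
    return bufs
```

After the reduction, every simulated worker holds the same summed gradient, which is why the butterfly pattern suits data-parallel training: no single root node gathers and redistributes the result.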