Complexity of Deep Convolutional Neural Networks in Mobile Computing

Naeem, Saad; Jamil, Noreen; Khan, Habib Ullah; Nazir, Shah

Date

2020-09-17

Abstract

Neural networks employ massive interconnections of simple computing units, called neurons, to solve problems that are highly nonlinear and cannot be hard-coded into a program. These networks are computation-intensive: training them requires a large amount of training data, and each training example requires heavy computation. We look at different ways to reduce this computational burden and possibly make neural networks work on mobile devices. In this paper, we survey various techniques that can be mixed and matched to improve the training time of neural networks. We also review some additional recommendations for making the process work on mobile devices. Finally, we survey the deep compression technique, which approaches the problem through network pruning, quantization, and encoding of the network weights. Deep compression reduces the time required for training the network in three stages: it first prunes the irrelevant connections (the pruning stage), then quantizes the network weights by choosing centroids for each layer, and finally applies the Huffman encoding algorithm to address the storage of the remaining weights.
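
The three stages named in the abstract (pruning, centroid-based quantization, Huffman encoding) can be illustrated with a minimal sketch in plain Python. It is not the implementation surveyed in the paper. The function names, the magnitude threshold used for pruning, the linear initialisation of the centroids, and the toy weight list are all assumptions made for illustration, and the sketch leaves out any retraining of the network between stages.

```python
import heapq
from collections import Counter

def prune(weights, threshold):
    """Stage 1: zero out weights whose magnitude is below the threshold
    (magnitude-based pruning is an assumption of this sketch)."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def quantize(weights, k, iterations=20):
    """Stage 2: 1-D k-means over the surviving weights of one layer.
    Assumes k >= 2 and at least one nonzero weight.
    Returns (centroids, indices); index -1 marks a pruned weight."""
    kept = [w for w in weights if w != 0.0]
    lo, hi = min(kept), max(kept)
    # Assumed initialisation: centroids spaced evenly over the weight range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def nearest(w):
        return min(range(k), key=lambda j: abs(w - centroids[j]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for w in kept:
            clusters[nearest(w)].append(w)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    indices = [-1 if w == 0.0 else nearest(w) for w in weights]
    return centroids, indices

def huffman_code(symbols):
    """Stage 3: build a Huffman code table {symbol: bitstring}."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie_breaker, partial code table).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)
        n2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

# Toy layer with eight weights (hypothetical values).
weights = [0.9, -0.05, 0.02, 1.1, -0.8, 0.01, -0.95, 1.0]
pruned = prune(weights, threshold=0.1)
centroids, indices = quantize(pruned, k=2)
table = huffman_code(indices)
encoded = "".join(table[i] for i in indices)
```

After these steps the layer is represented by a short list of centroids, one small index per weight, and a variable-length bit string in which the most frequent indices receive the shortest codes.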