Area Efficient Neuromorphic Circuit Based on Stochastic Computation

Kiwon Yoon, Suhyeong Choi, and Youngsoo Shin
School of Electrical Engineering, KAIST, Daejeon 34141, Korea

Abstract: A neuromorphic circuit can be simplified by applying stochastic computing, which represents numbers as bit streams. Using a large number of stochastic number generators (SNGs) yields independent bit streams and hence preserves accuracy, but the resulting circuit area outweighs the area advantage of stochastic computing. An area-efficient SNG design method is proposed, in which a single linear feedback shift register (LFSR) is shared among a number of SNGs; independence of bit streams is made possible through shuffled wiring between the LFSR and the bit stream generators. The proposed design method is applied to a neuromorphic circuit that recognizes handwritten numbers; circuit area is reduced by 86% while prediction accuracy is sacrificed by 11% compared to a reference design in which the LFSR is not shared.

Fig. 1. (a) A neuromorphic circuit with stochastic computing, and (b) typical SNG structure.
[Figure: (a) values and weights pass through SNGs at the input layer and feed neurons in the hidden and output layers; (b) an 8-bit comparator (x > y) fed by a binary number and an LFSR produces the bit stream.]

I. INTRODUCTION

A neuron in a neuromorphic circuit consists of a few multipliers, an adder, and a threshold function. Since a large number of neurons are required in an actual circuit, reducing the area of each neuron is very important. Stochastic computing has been applied to neuron design for this purpose [1]. In stochastic computing, a real number p ∈ [0, 1] is represented by the number of 1s (divided by the total number of bits) in a random bit stream S. Let Sx and Sy be two independent bit streams applied to an AND gate, and let px and py be their corresponding real numbers; the bit stream at the output of the AND gate then represents px py, which implies that an AND gate can function as a multiplier.

A neuromorphic circuit with stochastic computing is shown in Fig. 1(a). At the input layer, each real number p is converted to a bit stream through an SNG, which is then applied to a hidden layer (or hidden layers); the output layer finally makes a decision. A typical structure of an SNG is shown in Fig. 1(b), in which an LFSR is employed to generate the final bit stream.

Computation accuracy is mainly determined by how independent the bit streams are. This is shown intuitively in Fig. 2: correlated inputs yield inaccurate multiplication in (a), while an accurate result is obtained with independent bit streams in (b). Independence of bit streams is achieved by giving each SNG an LFSR with its own unique seed. However, this causes a large area to be occupied by LFSRs, e.g., 80% of the area [2].
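As a rough illustration of the SNG structure described above, the following Python sketch models an 8-bit LFSR feeding a comparator. The function names `lfsr8` and `sng` are hypothetical, the tap positions (8, 6, 5, 4) are one commonly tabulated maximal-length choice rather than anything stated in the paper, and the comparator polarity (output 1 when the binary value exceeds the LFSR state) is an assumption.

```python
def lfsr8(seed=1):
    """Yield successive states of an 8-bit Fibonacci LFSR.

    Assumed taps: 8, 6, 5, 4 (a standard maximal-length choice, not
    taken from the paper), giving a cycle of 255 nonzero states.
    """
    state = seed & 0xFF
    if state == 0:
        raise ValueError("seed must be nonzero, the all-zero state never changes")
    while True:
        yield state
        # XOR of the tapped bits becomes the new most significant bit.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (bit << 7)


def sng(value, length, seed=1):
    """Return `length` bits; a bit is 1 when value > LFSR state (assumed polarity)."""
    source = lfsr8(seed)
    return [1 if value > next(source) else 0 for _ in range(length)]


def estimate(bits):
    """Recover the represented real number: fraction of 1s in the stream."""
    return sum(bits) / len(bits)


if __name__ == "__main__":
    # An 8-bit value of 128 should encode a probability close to 1/2.
    print(estimate(sng(128, 255)))
```

Over one full LFSR period, every nonzero 8-bit state appears once, so the fraction of 1s tracks value/256 closely; this is the property that lets a comparator turn a binary number into a bit stream.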

Fig. 2. Multiplication of two (a) correlated bit streams and (b) uncorrelated bit streams.
(a) Sx = 10100101 (px = 4/8), Sy = 10100101 (py = 4/8), Sz = 10100101 (pz = 4/8).
(b) Sx = 10100101 (px = 4/8), Sy = 00110110 (py = 4/8), Sz = 00100100 (pz = 2/8).
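The two cases of Fig. 2 can be reproduced with a few lines of Python. This is only a sketch of bitwise AND multiplication on the bit streams printed in the figure; the helper names `to_bits`, `value`, and `and_multiply` are hypothetical.

```python
def to_bits(text):
    """Convert a string such as '10100101' into a list of 0/1 integers."""
    return [int(ch) for ch in text]


def value(bits):
    """Real number represented by a bit stream: fraction of 1s."""
    return sum(bits) / len(bits)


def and_multiply(sx, sy):
    """Bitwise AND of two equal-length bit streams (stochastic multiplication)."""
    return [a & b for a, b in zip(sx, sy)]


sx = to_bits("10100101")              # px = 4/8
sy_correlated = to_bits("10100101")   # identical to Sx, fully correlated
sy_independent = to_bits("00110110")  # py = 4/8, uncorrelated with Sx

exact = value(sx) * value(sy_independent)        # 0.25
wrong = value(and_multiply(sx, sy_correlated))   # 0.5, as in Fig. 2(a)
right = value(and_multiply(sx, sy_independent))  # 0.25, as in Fig. 2(b)
print(exact, wrong, right)
```

With identical streams the AND gate simply reproduces its input, so the output encodes px instead of px py.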

II. AREA EFFICIENT SNG DESIGN

Sharing an n-bit LFSR among all SNGs via rotated wiring was proposed in [3], as shown in Fig. 3(a), where n is the bit width of the SNG input. A random number is provided to every SNG, and is rotated by hardwiring in the middle so that each SNG has a different random number. Since no SNG contains an LFSR anymore, area is drastically reduced. However, rotation gives only n different random numbers at a time, and the number of required SNGs (>1k for handwritten number recognition) is far more than n (≤32 [3]). This means that there must be correlated bit streams, because some bit streams are generated from the same seed. Hence, prediction accuracy becomes low.

The proposed design lets an m-bit (m > n) LFSR be shared by all SNGs through shuffled wiring, as described in Fig. 3(b). Among the m bits, n bits are randomly selected and shuffled before they are delivered to each SNG. As a result, the maximum number of different random numbers at a time is mPn = m!/(m − n)!, which is much larger than n. Thus, bit streams are generated with less correlation, and prediction accuracy is preserved as in the unique seed design. Also, area is decreased due to LFSR sharing.

III. EXPERIMENTS

We built an artificial neural network that consists of 196, 4, and 10 nodes at the input, hidden, and output layers, respectively. The network was trained to predict handwritten numbers [4], and achieved 81% prediction accuracy. Implementation was done in Verilog, and the circuit was synthesized with a 28nm industrial library [5]. For prediction tests, l values from 10 to 19 were used, where 2^l is the length of the bit stream.

978-1-5090-3219-8/16/$31.00 ©2016 IEEE


ISOCC 2016

Fig. 3. (a) LFSR sharing with k-bit rotation, and (b) proposed LFSR sharing with shuffled wiring.

Fig. 4. Circuit area with three SNG implementations: unique seed, LFSR sharing with rotation, and proposed LFSR sharing with shuffled wiring; l is 10, 14, and 18.
[Bar chart: y-axis area [mm2], x-axis l = log2(bit stream length); area reductions of 71%, 79%, and 81% are annotated at l = 10, 14, and 18.]
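The difference between the two sharing schemes of Fig. 3 can be sketched in Python by treating each wiring as a list of LFSR bit positions. This is an illustrative model under stated assumptions, not the authors' Verilog: the helper names are hypothetical, and the example sizes (m = 10, n = 9) are just one combination consistent with the experimental settings.

```python
import math
import random


def rotated_taps(n, k):
    """Wiring for a k-bit rotation: output bit i reads LFSR bit (i + k) mod n."""
    return [(i + k) % n for i in range(n)]


def shuffled_taps(m, n, rng):
    """Wiring that selects n of the m LFSR bits in a random order."""
    return rng.sample(range(m), n)


def apply_wiring(state_bits, taps):
    """Random number seen by one SNG, given the shared LFSR state."""
    return [state_bits[t] for t in taps]


n = 9
m = 10

# Rotation offers only n distinct wirings, however many SNGs ask for one.
distinct_rotations = {tuple(rotated_taps(n, k)) for k in range(1000)}
print(len(distinct_rotations))   # n

# Shuffled wiring can draw from mPn = m! / (m - n)! ordered selections.
print(math.perm(m, n))

rng = random.Random(0)           # fixed seed so the wiring is reproducible
lfsr_state = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # example m-bit LFSR state
print(apply_wiring(lfsr_state, shuffled_taps(m, n, rng)))
```

Because the wiring is fixed at design time, it costs no registers; only the number of distinct wirings available limits how many SNGs can receive different random numbers.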

We implemented the network with three designs: unique seed, LFSR sharing with rotation, and the proposed design. In the unique seed design, l-bit LFSRs were used. For the other designs, an (l − 1)-bit LFSR was shared by the SNGs for pixel values, and an l-bit LFSR was shared by the SNGs for weights. In the LFSR sharing with rotation design, n was equal to l − 1 or l, whereas in the proposed design, m was equal to l − 1 or l, and n was fixed to 9.

We measured the areas of the three designs when l is 10, 14, and 18, as reported in Fig. 4. The number of LFSRs is 1,096 when the unique seed design is applied, and it is reduced to 2 by both LFSR sharing with rotation and the proposed design. As a result, both designs reduce area by 71%, 79%, and 81% when l is 10, 14, and 18, respectively. The number of registers in an LFSR increases with l, so the area reduction grows as l rises.

Prediction accuracies were measured on 200 test images while changing l from 10 to 19, as shown in Fig. 5. Because stochastic computing is probabilistic, computation becomes more accurate as the length of the bit stream increases, due to the law of large numbers. Since the unique seed and proposed designs provide uncorrelated random numbers to the SNGs, the prediction accuracies of both designs tend to increase as l increases. When l is extended to 19, the degradations of prediction accuracy compared to the original network are 6% and 11%, respectively. The prediction accuracy of the proposed design is almost the same as that of the unique seed design, which implies that the proposed design can be an alternative. Meanwhile, the prediction accuracy of the LFSR sharing with rotation design is below 30%, which is too low to be useful, even though area is reduced significantly. This is because of correlation between bit streams, which brings inaccurate computation.
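The effect of bit stream length on accuracy can be made concrete with a short derivation. It assumes an idealized stream of N = 2^l independent, identically distributed bits S_i, each equal to 1 with probability p; LFSR-generated streams only approximate this, so the bound below is a guide rather than a guarantee.

```latex
\hat{p} = \frac{1}{N}\sum_{i=1}^{N} S_i, \qquad
\mathbb{E}[\hat{p}] = p, \qquad
\operatorname{Var}[\hat{p}] = \frac{p(1-p)}{N}, \qquad
\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{N}} \le \frac{1}{2\sqrt{N}} = 2^{-(l/2+1)}.
```

Under this assumption, the standard deviation of a single encoded value is at most 1/64 ≈ 0.016 for l = 10 and at most 1/1024 ≈ 0.001 for l = 18, which is in line with the observed trend that accuracy improves as l increases.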

Fig. 5. Prediction accuracy from three SNG implementations while l is varied.
[Line plot: y-axis prediction accuracy [%], x-axis l = log2(bit stream length) from 10 to 19; curves for unique seed, LFSR sharing with rotation, and proposed.]

IV. CONCLUSION

Reducing the area of SNGs is key in neuromorphic circuit design based on stochastic computation. Sharing an LFSR among a number of SNGs has been proposed; independence among bit streams is provided through shuffled wiring between the LFSR and the bit stream generators. The idea has been applied to a neuromorphic circuit that recognizes handwritten numbers; circuit area is reduced by 86% while prediction accuracy is sacrificed by only 11%.

ACKNOWLEDGEMENT

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2015R1A2A2A01008037).

REFERENCES

[1] V. Canals et al., “A new stochastic computing methodology for efficient neural network implementation,” IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 3, pp. 551–564, Mar. 2016.
[2] W. Qian et al., “An architecture for fault-tolerant computation with stochastic logic,” IEEE Trans. Computers, vol. 60, no. 1, pp. 93–105, Jan. 2011.
[3] H. Ichihara et al., “Compact and accurate stochastic circuits with shared random number sources,” in Proc. Int. Conf. on Computer Design, Oct. 2014, pp. 361–366.
[4] The MNIST database of handwritten digits. [Online]. Available: http://yann.lecun.com/exdb/mnist/
[5] Design Compiler User Guide, Synopsys, Mountain View, CA, June 2015.
