1
Neural Computation. Shi Zhongzhi, Institute of Computing Technology, Chinese Academy of Sciences. shizz@ics.ict.ac.cn
2
Outline: 1. Introduction 2. Neural Computation 3. Research Issues 4. Concluding Remarks
3
Introduction
4
Introduction
5
Introduction
Neural system activity, whether sensation, movement, or the brain's higher functions (such as learning, memory, and emotion), shows itself at the level of the whole system, and analyzing the neural basis and mechanisms of that behavior inevitably involves multiple levels. Studies at these different levels inform and drive one another: work at the lower levels (the cellular and molecular level) provides the analytical basis for higher-level observations, while higher-level observations help guide the direction of lower-level work and reveal its functional significance. The field therefore includes both discipline-specific studies (physical, chemical, physiological, psychological) and integrative studies.
6
Neural Computation
Neural computation models and interconnects the brain's basic unit, the neuron, in order to explore models that simulate the functions of the brain's nervous system and to build artificial systems with intelligent information-processing capabilities such as learning, association, memory, and pattern recognition. Two central concerns are the system architecture and the learning algorithm.
7
Neural Computation
(1) It can approximate arbitrarily complex nonlinear relationships.
(2) All quantitative and qualitative information is stored equipotentially, distributed over the neurons of the network, giving strong robustness and fault tolerance.
(3) Parallel distributed processing makes fast, large-scale computation possible.
(4) It can learn and adapt to unknown or uncertain systems.
(5) It can handle quantitative and qualitative knowledge simultaneously.
8
Neural Computation
In the 1940s, the psychologist McCulloch and the mathematician Pitts jointly proposed the excitatory/inhibitory neuron model, and Hebb proposed a rule for modifying the strength of connections between neurons; their results remain the foundation of many neural network models today.
Representative work of the 1950s and 1960s includes Rosenblatt's perceptron and Widrow's adaptive element, the Adaline.
In 1969, Minsky and Papert published the influential book Perceptrons, drawing pessimistic conclusions; together with the digital computer being at its peak and achieving notable success in artificial intelligence, this pushed artificial neural network research into a trough in the 1970s.
After the 1980s, the traditional von Neumann digital computer ran into physically insurmountable limits in simulating visual and auditory intelligence. At the same time, Rumelhart, McClelland, Hopfield, and others made breakthrough progress in neural networks, and enthusiasm for the field surged again.
9
Neural Computation
Representative models: the perceptron, the multilayer BP (backpropagation) algorithm, the Hopfield network model, adaptive resonance theory (ART), and the self-organizing feature map.
More recently, Hinton and colleagues proposed the Helmholtz machine; Xu Lei proposed the Ying-Yang machine theoretical model; and Shun-ichi Amari pioneered and developed methods based on statistical manifolds and applied them to research on artificial neural networks.
10
All neurons contain an activation function which determines whether the signal is strong enough to produce an output. Fig 4 shows several functions that could be used as an activation function.
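The figure referred to above is not reproduced here. As a stand-in, the sketch below implements four functions commonly offered as activation functions (hard threshold, sigmoid, tanh, ReLU); which of these appear in the original Fig. 4 is an assumption.

```python
import numpy as np

def step(x, threshold=0.0):
    """Hard threshold: output 1 if the summed input exceeds the threshold, else 0."""
    return np.where(x > threshold, 1.0, 0.0)

def sigmoid(x):
    """Smooth, differentiable squashing function with outputs in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Squashing function with outputs in (-1, 1)."""
    return np.tanh(x)

def relu(x):
    """Passes positive inputs unchanged and clips negatives to zero."""
    return np.maximum(0.0, x)

# Evaluate each candidate activation on a small range of summed inputs
x = np.linspace(-3.0, 3.0, 7)
for f in (step, sigmoid, tanh, relu):
    print(f.__name__, np.round(f(x), 3))
```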
11
The Perceptron
Rosenblatt's perceptron: a network of processing elements (PEs). [Diagram: inputs x1 ... xn feed weighted connections to units a1 ... am, which produce outputs Y1 ... Yp.]
- The weights determine the amount by which each x affects each a.
- The weights are updated via a learning rule at each iteration over the inputs.
- Networks need not be fully connected as shown here.
12
The Perceptron
The initial proposal of connectionist networks: Rosenblatt, 1950s and 1960s. Essentially a linear discriminant composed of nodes and weights. [Diagram: inputs I1, I2, I3 with weights W1, W2, W3 feeding a single output unit O through a threshold activation function.]
13
The Perceptron
Additional layer(s) can be added. [Diagram: inputs x1 ... xn feed hidden units h1 ... hm, which feed outputs Y1 ... Yp.]
- We can add an arbitrary number of hidden layers.
- Additional hidden layers tend to increase the network's ability to learn complex functions, but also increase the learning time required.
14
The Perceptron
Worked example: inputs (2, 1), weights (0.5, 0.3), bias -1 give 2(0.5) + 1(0.3) + (-1) = 0.3 > 0, so O = 1.
Learning procedure (sketched in the code below):
- Randomly assign weights (between 0 and 1).
- Present inputs from the training data.
- Get the output O and nudge the weights so the result moves toward the desired output T.
- Repeat; stop when there are no errors, or when enough epochs have been completed.
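A minimal sketch of that procedure, assuming a hard-threshold unit with a bias term; the learning rate, epoch limit, and the AND toy data are illustrative choices, not part of the slide.

```python
import numpy as np

def train_perceptron(X, T, lr=0.1, epochs=100, rng=None):
    """Perceptron learning: present inputs, nudge weights toward the target T."""
    if rng is None:
        rng = np.random.default_rng(0)
    w = rng.uniform(0.0, 1.0, X.shape[1])   # randomly assign weights in [0, 1)
    b = -1.0                                # bias, as in the worked example above
    for _ in range(epochs):
        errors = 0
        for x, t in zip(X, T):
            o = 1.0 if x @ w + b > 0 else 0.0     # threshold activation
            if o != t:
                w += lr * (t - o) * x             # nudge weights toward T
                b += lr * (t - o)
                errors += 1
        if errors == 0:                           # stop when no errors remain
            break
    return w, b

# Toy usage: learn the logical AND of two inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 0, 0, 1], dtype=float)
print(train_perceptron(X, T))
```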
15
Least Mean Square Learning
LMS = Least Mean Square learning, more general than the previous perceptron learning rule. The idea is to minimize the total error, measured over all training patterns $P$:
$$E = \frac{1}{2}\sum_{p \in P}(T_p - O_p)^2$$
where $O$ is the raw output computed from the weighted sum of the inputs. For example, with two patterns and $T_1 = 1$, $O_1 = 0.8$, $T_2 = 0$, $O_2 = 0.5$, the error is $E = \frac{1}{2}\,[(1-0.8)^2 + (0-0.5)^2] = 0.145$.
We want to minimize this error by moving the weights down the error gradient with learning rate $C$:
$$W_{\text{new}} = W_{\text{old}} - C\,\frac{\partial E}{\partial W}$$
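A tiny sketch (the function name is my own) that computes this total error and reproduces the slide's two-pattern example numerically:

```python
import numpy as np

def lms_error(T, O):
    """Total squared error over all training patterns: E = 0.5 * sum_p (T_p - O_p)^2."""
    T, O = np.asarray(T, float), np.asarray(O, float)
    return 0.5 * np.sum((T - O) ** 2)

# Two patterns: T1=1, O1=0.8 and T2=0, O2=0.5
print(lms_error([1, 0], [0.8, 0.5]))   # 0.5 * (0.2**2 + 0.5**2) = 0.145
```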
16
Activation Functions
To apply the LMS learning rule, also known as the delta rule, we need a differentiable activation function.
Old: the hard threshold, $f(\text{sum}) = 1$ if $\text{sum} > \theta$, else $0$.
New: the sigmoid, $f(\text{sum}) = \dfrac{1}{1 + e^{-\text{sum}}}$.
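Assuming the sigmoid replacement above, a small sketch of the function and of the derivative that the delta rule differentiates through:

```python
import numpy as np

def sigmoid(x):
    """Differentiable replacement for the hard threshold: f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """f'(x) = f(x) * (1 - f(x)), the factor the delta rule needs."""
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0), sigmoid_derivative(0.0))   # 0.5, 0.25
```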
17
Backpropagation Networks
Attributed to Rumelhart and McClelland, mid-1980s.
To get past the limits of linear classification, we construct multilayer networks, typically fully connected feedforward networks. [Diagram: an input layer (I1, I2, I3), a hidden layer (H1, H2), and an output layer (O1, O2), with weights Wi,j from input to hidden and Wj,k from hidden to output; constant inputs of 1 provide the biases.]
18
Backpropagation Learning
Learning procedure:
- Randomly assign weights (between 0 and 1).
- Present inputs from the training data and propagate them forward to the outputs.
- Compute the outputs O and adjust the weights according to the delta rule, backpropagating the errors; the weights are nudged so that the network learns to give the desired output.
- Repeat; stop when there are no errors, or when enough epochs have been completed.
19
Backpropagation: Weight Changes
We had computed the gradient-descent update $W_{\text{new}} = W_{\text{old}} - C\,\frac{\partial E}{\partial W}$.
For a unit $k$ with sigmoid activation, $f(\text{sum}) = O_k$. For the output units the resulting weight change is
$$\Delta W_{j,k} = C\,O_k(1 - O_k)(T_k - O_k)\,H_j.$$
For the hidden units (skipping some math) it is
$$\Delta W_{i,j} = C\,H_j(1 - H_j)\Big(\sum_k O_k(1 - O_k)(T_k - O_k)\,W_{j,k}\Big)\,I_i.$$
[Diagram: layers I, H, O connected by weights Wi,j and Wj,k.]
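A minimal sketch that implements these two update formulas in a single training step for a one-hidden-layer sigmoid network; the layer sizes, learning rate, and the omission of bias units are simplifying assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, t, W_ij, W_jk, lr=0.5):
    """One forward/backward pass: x is the input vector, t the target vector.

    W_ij: input-to-hidden weights, W_jk: hidden-to-output weights (no bias for brevity).
    """
    # Forward pass
    h = sigmoid(x @ W_ij)          # hidden activations H
    o = sigmoid(h @ W_jk)          # output activations O

    # Deltas from the formulas above (computed before any weights change)
    delta_k = o * (1 - o) * (t - o)            # output units
    delta_j = h * (1 - h) * (W_jk @ delta_k)   # hidden units, error backpropagated

    # Weight changes: learning rate * delta * upstream activation
    W_jk += lr * np.outer(h, delta_k)
    W_ij += lr * np.outer(x, delta_j)
    return o

# Toy usage: 3 inputs, 2 hidden units, 2 outputs
rng = np.random.default_rng(0)
W_ij = rng.uniform(0, 1, (3, 2))
W_jk = rng.uniform(0, 1, (2, 2))
for _ in range(1000):
    backprop_step(np.array([1.0, 0.0, 1.0]), np.array([1.0, 0.0]), W_ij, W_jk)
print(backprop_step(np.array([1.0, 0.0, 1.0]), np.array([1.0, 0.0]), W_ij, W_jk))
```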
20
Hopfield Network
- Every node is connected to every other node.
- Weights are symmetric.
- Recurrent network.
- The state of the net is given by the vector of node outputs (x1, x2, x3).
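A minimal sketch of such a fully connected, symmetric, recurrent net with Hebbian storage and asynchronous recall; the bipolar (+1/-1) coding, the 1/n scaling, and the fixed update schedule are assumptions rather than details given on the slide.

```python
import numpy as np

def hopfield_train(patterns):
    """Hebbian storage: symmetric weight matrix with zero self-connections."""
    patterns = np.asarray(patterns, float)      # bipolar patterns of +1/-1
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)                    # no self-connection
    return W

def hopfield_recall(W, x, steps=10):
    """Asynchronous updates; the state vector (x1, ..., xn) is the network state."""
    x = np.asarray(x, float).copy()
    for _ in range(steps):
        for i in range(len(x)):                 # update one node at a time
            x[i] = 1.0 if W[i] @ x >= 0 else -1.0
    return x

# Store one 3-node pattern and recall it from a corrupted starting state
W = hopfield_train([[1, -1, 1]])
print(hopfield_recall(W, [1, 1, 1]))            # converges to [ 1. -1.  1.]
```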
21
Self-Organizing Feature Map (SOM)
Kohonen (1982, 1984)
- In biological systems, cells tuned to similar orientations tend to be physically located in proximity to one another (microelectrode studies with cats).
- Orientation tuning over the cortical surface forms a kind of map, with similar tunings found close to each other: a topographic feature map.
- Train a network using competitive learning to create such feature maps automatically.
22
Learning Algorithm
Decreasing the neighborhood ensures that progressively finer features are encoded; gradually lowering the learning rate ensures stability.
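A minimal sketch of that schedule for a 1-D Kohonen map; the Gaussian neighborhood function, the linear decay of radius and learning rate, and the node count are illustrative assumptions.

```python
import numpy as np

def train_som(data, n_nodes=10, epochs=50, lr0=0.5, radius0=3.0):
    """1-D Kohonen map: competitive learning with a shrinking neighborhood."""
    rng = np.random.default_rng(0)
    data = np.asarray(data, float)
    W = rng.uniform(data.min(), data.max(), (n_nodes, data.shape[1]))
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)                      # gradually lower the learning rate
        radius = max(radius0 * (1.0 - epoch / epochs), 0.5)    # shrink the neighborhood
        for x in rng.permutation(data):
            winner = np.argmin(np.linalg.norm(W - x, axis=1))  # best-matching node
            d = np.abs(np.arange(n_nodes) - winner)            # distance along the map
            h = np.exp(-(d ** 2) / (2 * radius ** 2))          # neighborhood function
            W += lr * h[:, None] * (x - W)                     # pull winner and neighbors toward x
    return W

# Toy usage: map 1-D inputs in [0, 1] onto 10 topologically ordered nodes
data = np.linspace(0, 1, 100)[:, None]
print(np.round(train_som(data).ravel(), 2))
```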
23
Characteristics
- Kohonen's algorithm creates a vector quantizer by adjusting the weights from common input nodes to M output nodes.
- Continuous-valued input vectors are presented without specifying the desired output.
- After learning, the weights are organized so that topologically close nodes are sensitive to inputs that are physically similar.
- The output nodes are thus ordered in a natural manner.