Neural Networks: Learning
Cost function
(Machine Learning)
Neural Network (Classification)

L = total number of layers in the network
s_l = number of units (not counting the bias unit) in layer l

Binary classification: 1 output unit (K = 1).
Multi-class classification (K classes): K output units, e.g. K = 4 for
pedestrian / car / motorcycle / truck.
(Andrew Ng)
Cost function

Logistic regression:
  J(θ) = -(1/m) Σ_{i=1..m} [ y^(i) log h_θ(x^(i)) + (1 - y^(i)) log(1 - h_θ(x^(i))) ]
         + (λ/2m) Σ_{j=1..n} θ_j²

Neural network (h_Θ(x) ∈ R^K; (h_Θ(x))_k = k-th output):
  J(Θ) = -(1/m) Σ_{i=1..m} Σ_{k=1..K} [ y_k^(i) log (h_Θ(x^(i)))_k + (1 - y_k^(i)) log(1 - (h_Θ(x^(i)))_k) ]
         + (λ/2m) Σ_{l=1..L-1} Σ_{i=1..s_l} Σ_{j=1..s_{l+1}} (Θ_{ji}^(l))²
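The neural-network cost above can be sketched in code. This is a minimal NumPy sketch, assuming a 3-layer network with sigmoid activations; the function name `nn_cost` and the layout of `Theta1`/`Theta2` (one row per unit, first column for the bias weight) are illustrative choices, not part of the lecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_cost(Theta1, Theta2, X, Y, lam):
    """Regularized cost J(Theta) for a 3-layer network (sketch).
    X: (m, n) inputs; Y: (m, K) one-hot labels; lam: lambda."""
    m = X.shape[0]
    # Forward propagation (vectorized over all m examples)
    a1 = np.hstack([np.ones((m, 1)), X])          # add bias unit a_0
    a2 = sigmoid(a1 @ Theta1.T)
    a2 = np.hstack([np.ones((m, 1)), a2])
    h = sigmoid(a2 @ Theta2.T)                    # (m, K) outputs
    # Cross-entropy term, summed over examples i and output units k
    J = -np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h)) / m
    # Regularization: skip the bias-weight columns (j = 0)
    reg = (lam / (2 * m)) * (np.sum(Theta1[:, 1:] ** 2)
                             + np.sum(Theta2[:, 1:] ** 2))
    return J + reg
```

With all weights zero every output is 0.5, so the unregularized cost reduces to K * log 2, which is a handy sanity check.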
Neural Networks: Learning
Backpropagation algorithm
Gradient computation

Goal: need code to compute
  - J(Θ)
  - ∂/∂Θ_{ij}^(l) J(Θ)
Given one training example (x, y), forward propagation:
  a^(1) = x
  z^(2) = Θ^(1) a^(1)
  a^(2) = g(z^(2))   (add a_0^(2))
  z^(3) = Θ^(2) a^(2)
  a^(3) = g(z^(3))   (add a_0^(3))
  z^(4) = Θ^(3) a^(3)
  a^(4) = h_Θ(x) = g(z^(4))
(Layers 1 through 4.)
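The forward pass above can be written as a short loop over the weight matrices. A minimal sketch, assuming sigmoid activations and a list `Thetas` of matrices Θ^(1)..Θ^(L-1); the function name `forward` is an illustrative choice.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, Thetas):
    """Forward propagation through an L-layer network for one example.
    Returns the activations a^(1)..a^(L) and pre-activations z^(2)..z^(L)."""
    a = np.concatenate([[1.0], x])        # a^(1) with bias unit a_0 = 1
    activations, zs = [a], []
    for l, Theta in enumerate(Thetas):
        z = Theta @ a                     # z^(l+1) = Theta^(l) a^(l)
        zs.append(z)
        a = sigmoid(z)                    # a^(l+1) = g(z^(l+1))
        if l < len(Thetas) - 1:           # no bias unit on the output layer
            a = np.concatenate([[1.0], a])
        activations.append(a)
    return activations, zs
```

The final entry of `activations` is h_Θ(x); the intermediate `zs` are kept because the backward pass needs g'(z^(l)).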
Gradient computation: Backpropagation algorithm

Intuition: δ_j^(l) = "error" of node j in layer l.

For each output unit (layer L = 4):
  δ^(4) = a^(4) - y
  δ^(3) = (Θ^(3))^T δ^(4) .* g'(z^(3))
  δ^(2) = (Θ^(2))^T δ^(3) .* g'(z^(2))

(".*" denotes element-wise multiplication. There is no δ^(1): the input layer
is the observed data, so there is no error to compare against.)

The delta values of layer l are calculated by multiplying the delta values of
the next layer by the Theta matrix of layer l, then element-wise multiplying
by g' (g-prime), the derivative of the activation function g evaluated at
z^(l). For the sigmoid, the g-prime terms can be written out as:
  g'(z^(l)) = a^(l) .* (1 - a^(l))
(Layers 1 through 4.)
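The delta recurrence above can be sketched as a backward loop. This is a minimal sketch, assuming sigmoid activations and the list conventions of a forward pass that keeps bias units on the hidden layers; the function name `backward_deltas` is an illustrative choice.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backward_deltas(activations, Thetas, y):
    """Compute delta^(L), ..., delta^(2) for one example (sketch).
    activations: a^(1)..a^(L) from forward propagation, with bias units
    on input and hidden layers; Thetas: matrices Theta^(1)..Theta^(L-1)."""
    L = len(activations)
    deltas = [activations[-1] - y]                 # delta^(L) = a^(L) - y
    for l in range(L - 1, 1, -1):                  # l = L-1, ..., 2
        Theta = Thetas[l - 1]                      # Theta^(l)
        a = activations[l - 1]                     # a^(l), includes bias
        # delta^(l) = (Theta^(l))^T delta^(l+1) .* g'(z^(l)),
        # with g'(z^(l)) = a^(l) .* (1 - a^(l)) for the sigmoid
        d = (Theta.T @ deltas[0]) * a * (1 - a)
        deltas.insert(0, d[1:])                    # drop the bias component
    return deltas                                  # [delta^(2), ..., delta^(L)]
```

Note there is no delta for layer 1, matching the slide: the loop stops at l = 2.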
Backpropagation algorithm

Training set {(x^(1), y^(1)), ..., (x^(m), y^(m))}.

Set Δ_{ij}^(l) = 0 (for all l, i, j).
For i = 1 to m:
  Set a^(1) = x^(i)
  Perform forward propagation to compute a^(l) for l = 2, 3, ..., L
  Using y^(i), compute δ^(L) = a^(L) - y^(i)
  Compute δ^(L-1), δ^(L-2), ..., δ^(2)
  Δ_{ij}^(l) := Δ_{ij}^(l) + a_j^(l) δ_i^(l+1)

D_{ij}^(l) := (1/m) (Δ_{ij}^(l) + λ Θ_{ij}^(l))   if j ≠ 0
D_{ij}^(l) := (1/m) Δ_{ij}^(l)                    if j = 0

(Compute the forward activations first, then the deltas.) The capital-delta
matrix Δ is used as an "accumulator" to add up our values as we go along and
eventually compute the partial derivative. Thus we get
  ∂/∂Θ_{ij}^(l) J(Θ) = D_{ij}^(l)
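Putting the pieces together, the whole algorithm is a loop over the training set that accumulates Δ and then forms D. A minimal sketch, assuming sigmoid activations and the same `Thetas` list convention as above; the function name `backprop_gradients` is an illustrative choice.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_gradients(Thetas, X, Y, lam):
    """One pass of backpropagation over the training set (sketch).
    Returns D^(l) = dJ/dTheta^(l) for each weight matrix.
    X: (m, n) inputs; Y: (m, K) one-hot labels."""
    m = X.shape[0]
    Deltas = [np.zeros_like(T) for T in Thetas]    # accumulators, all zero
    for i in range(m):
        # Forward propagation for example i
        a = np.concatenate([[1.0], X[i]])
        acts = [a]
        for l, T in enumerate(Thetas):
            a = sigmoid(T @ a)
            if l < len(Thetas) - 1:                # bias on hidden layers only
                a = np.concatenate([[1.0], a])
            acts.append(a)
        # Backward pass: delta^(L) = a^(L) - y^(i)
        d = acts[-1] - Y[i]
        for l in range(len(Thetas) - 1, -1, -1):
            Deltas[l] += np.outer(d, acts[l])      # Delta += delta^(l+1) a^(l)^T
            if l > 0:                              # no delta for the input layer
                a = acts[l]
                d = ((Thetas[l].T @ d) * a * (1 - a))[1:]   # drop bias delta
    # D^(l): average over m, regularize every column except the bias (j = 0)
    Ds = []
    for T, Dl in zip(Thetas, Deltas):
        D = Dl / m
        D[:, 1:] += (lam / m) * T[:, 1:]
        Ds.append(D)
    return Ds
```

With a single layer this collapses to the familiar logistic-regression gradient (h - y) x, which makes a convenient check.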
Neural Networks: Learning
Backpropagation intuition
Forward Propagation
What is backpropagation doing?

Focusing on a single example x^(i), y^(i), the case of 1 output unit, and
ignoring regularization (λ = 0):
  cost(i) = -( y^(i) log h_Θ(x^(i)) + (1 - y^(i)) log(1 - h_Θ(x^(i))) )
(Think of cost(i) ≈ (h_Θ(x^(i)) - y^(i))².)
I.e., how well is the network doing on example i? cost(i) measures how close
the output is to the true value y^(i). With multiple output units, cost(i)
would instead sum over k.
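A tiny numeric example makes the "how well is the network doing" reading concrete. A minimal sketch with a single output unit; the function name `cost_i` is an illustrative choice, and the minus sign follows the convention that a good prediction gives a cost near zero.

```python
import math

def cost_i(h, y):
    """Cross-entropy cost of one example with one output unit:
    cost(i) = -( y*log(h) + (1-y)*log(1-h) )."""
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))
```

The closer the prediction h is to the label y, the smaller cost(i): for y = 1, a prediction of 0.9 costs less than 0.5, which costs less than 0.1, mirroring the (h - y)² intuition.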
δ_j^(l) = "error" of cost for a_j^(l) (unit j in layer l).

Formally (for j ≥ 0):
  δ_j^(l) = ∂/∂z_j^(l) cost(i)
where cost(i) = -( y^(i) log h_Θ(x^(i)) + (1 - y^(i)) log(1 - h_Θ(x^(i))) ).

That is, δ_j^(l) is a rate of change: slightly perturbing z_j^(l) changes the
cost, and δ_j^(l) measures by how much. These partial derivatives make it
convenient to understand how changing the weights affects the cost. Whether
the bias units are included in this computation depends on the application
and the method used.
(Andrew Ng)
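The claim that δ_j^(l) = ∂cost(i)/∂z_j^(l) can be checked numerically: compute δ^(2) by the backpropagation recurrence, then compare it against central-difference partial derivatives of the cost with respect to z^(2). A minimal sketch for a tiny 3-layer network with one output unit; the network sizes, the helper name `cost_from_z2`, and the random weights are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_from_z2(z2, Theta2, y):
    """cost(i) of one example as a function of the hidden pre-activations
    z^(2), in a network with 2 hidden units and 1 output unit."""
    a2 = np.concatenate([[1.0], sigmoid(z2)])      # a^(2) with bias unit
    h = sigmoid(Theta2 @ a2)[0]                    # h_Theta(x)
    return -(y * np.log(h) + (1 - y) * np.log(1 - h))

rng = np.random.default_rng(0)
Theta2 = rng.normal(size=(1, 3))                   # arbitrary example weights
z2, y = np.array([0.3, -0.2]), 1.0

# delta^(2) from the backpropagation recurrence
a2 = np.concatenate([[1.0], sigmoid(z2)])
delta3 = sigmoid(Theta2 @ a2) - y                  # delta^(3) = a^(3) - y
delta2 = (Theta2.T @ delta3)[1:] * sigmoid(z2) * (1 - sigmoid(z2))

# Numerical partial derivatives d cost / d z_j^(2) (central differences)
eps = 1e-6
numeric = np.array([(cost_from_z2(z2 + eps * e, Theta2, y)
                     - cost_from_z2(z2 - eps * e, Theta2, y)) / (2 * eps)
                    for e in np.eye(2)])
```

The two vectors agree to numerical precision, confirming that the deltas really are the partial derivatives of the cost with respect to the weighted inputs.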
Thank you for listening.