## #49 Common Knowledge Series: Neural Networks (Part 2)

### Activation function

$\begin{matrix} u_i(t)=\sum_{j}v_{ij}(t)x_j\\ z_i(t)=g(u_i(t))\\ a_i(t)=\sum_{j}w_{ij}(t)z_j(t)\\ y_i(t)=g(a_i(t))\\ e_i(t)=d_i-y_i(t)\\ E(t)=\sum_{i}e_i(t)^2 \end{matrix}$
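As a concrete sketch of the forward pass and error above (tanh is an assumed choice for the unspecified activation $g$; `V` and `W` are NumPy arrays holding $v_{ij}$ and $w_{ij}$):

```python
import numpy as np

def g(x):
    # Activation function; tanh is an assumption, the notes leave g unspecified.
    return np.tanh(x)

def forward(x, V, W):
    """One forward pass: x is the input vector, V the input->hidden
    weights (v_ij), W the hidden->output weights (w_ij)."""
    u = V @ x      # u_i = sum_j v_ij x_j
    z = g(u)       # z_i = g(u_i)
    a = W @ z      # a_i = sum_j w_ij z_j
    y = g(a)       # y_i = g(a_i)
    return u, z, a, y

def loss(d, y):
    e = d - y                # e_i = d_i - y_i
    return np.sum(e ** 2)    # E = sum_i e_i^2
```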

### How to Update the Weight Matrices

$\begin{matrix} \frac{\partial E}{\partial w_{ij}}=\frac{\partial E}{\partial a_{i}}\frac{\partial a_i}{\partial w_{ij}}\\ \frac{\partial E}{\partial v_{ij}}=\frac{\partial E}{\partial u_{i}}\frac{\partial u_i}{\partial v_{ij}} \end{matrix}$
The partials $\partial a_i/\partial w_{ij}$ and $\partial u_i/\partial v_{ij}$ are easy: they are simply $z_j$ and $x_j$. The terms $\partial E/\partial a_i$ and $\partial E/\partial u_i$ are harder, so for convenience we first define $\Delta_i = -\partial E/\partial a_i$ and $\delta_i = -\partial E/\partial u_i$. We then apply the chain rule once more:

$\begin{matrix} \Delta_i=-\frac{\partial E}{\partial a_{i}}=-\frac{\partial E}{\partial y_{i}}\frac{\partial y_i}{\partial a_{i}}=-(-2(d_i-y_i))(g'(a_i))=2g'(a_i)e_i\\ \delta_i=-\frac{\partial E}{\partial u_{i}}=\sum_{j}(-\frac{\partial E}{\partial a_{j}})\frac{\partial a_j}{\partial u_{i}}=\sum_{j}\Delta_j\frac{\partial a_j}{\partial u_{i}}=\sum_{j}\Delta_j\frac{\partial a_j}{\partial z_{i}}\frac{\partial z_i}{\partial u_{i}}=g'(u_i)\sum_{j}w_{ji}\Delta_j \end{matrix}$
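As a quick sanity check (purely illustrative, again assuming tanh for $g$), the closed form $\Delta_i = 2g'(a_i)e_i$ agrees with a numerical derivative of $E$ with respect to $a$:

```python
import numpy as np

g = np.tanh
g_prime = lambda x: 1.0 - np.tanh(x) ** 2   # tanh is an assumed choice of g

a = np.array([0.3, -0.7])   # arbitrary pre-activations
d = np.array([1.0, 0.0])    # arbitrary targets

def E_of_a(a):
    # E = sum_i (d_i - g(a_i))^2, viewed as a function of a alone
    return np.sum((d - g(a)) ** 2)

# Analytic form from the derivation: Delta_i = 2 g'(a_i) e_i
Delta = 2.0 * g_prime(a) * (d - g(a))

# Numerical -dE/da_i via central differences
eps = 1e-6
num = np.array([
    -(E_of_a(a + eps * np.eye(2)[i]) - E_of_a(a - eps * np.eye(2)[i])) / (2 * eps)
    for i in range(2)
])
```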

Collecting everything, one full update step is:

$\begin{matrix} u_i(t)=\sum_{j}v_{ij}(t)x_j\\ z_i(t)=g(u_i(t))\\ a_i(t)=\sum_{j}w_{ij}(t)z_j(t)\\ y_i(t)=g(a_i(t))\\ e_i(t)=d_i-y_i(t)\\ E(t)=\sum_{i}e_i(t)^2\\ \Delta_i=-\frac{\partial E}{\partial a_{i}}=2g'(a_i)e_i\\ \delta_i=-\frac{\partial E}{\partial u_{i}}=g'(u_i)\sum_{j}w_{ji}\Delta_j\\ w_{ij}(t+1)=w_{ij}(t)-\alpha\frac{\partial E}{\partial w_{ij}}=w_{ij}(t)+\alpha\Delta_i(t)z_j(t)\\ v_{ij}(t+1)=v_{ij}(t)-\alpha\frac{\partial E}{\partial v_{ij}}=v_{ij}(t)+\alpha\delta_i(t)x_j(t) \end{matrix}$
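The whole table condenses into one training step. A minimal sketch, assuming tanh for $g$ (so $g'(x)=1-\tanh^2(x)$) and NumPy arrays for the weights:

```python
import numpy as np

def backprop_step(x, d, V, W, alpha=0.01):
    """One gradient-descent step on E = sum_i e_i^2, following the
    update rules above; tanh is an assumed choice for g."""
    g = np.tanh
    g_prime = lambda s: 1.0 - np.tanh(s) ** 2

    # Forward pass
    u = V @ x
    z = g(u)
    a = W @ z
    y = g(a)
    e = d - y

    # Deltas from the derivation
    Delta = 2.0 * g_prime(a) * e           # Delta_i = 2 g'(a_i) e_i
    delta = g_prime(u) * (W.T @ Delta)     # delta_i = g'(u_i) sum_j w_ji Delta_j

    # w_ij += alpha * Delta_i * z_j ;  v_ij += alpha * delta_i * x_j
    W_new = W + alpha * np.outer(Delta, z)
    V_new = V + alpha * np.outer(delta, x)
    return V_new, W_new
```

Each call performs one descent step, so repeated calls on the same `(x, d)` pair should drive $E$ down.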