# DeepLearning.ai Assignment: (2-1) -- Practical Aspects of Deep Learning

1. Don't copy the homework!
2. I have only organized the ideas here, for personal study.
3. Don't copy the homework!

• Parameter initialization
• Regularization (L2, dropout)
• Gradient checking

# Part 1: Initialization

1. Zero initialization: fails to break symmetry, so every neuron in a layer computes the same function and the network cannot learn.

2. Random initialization: breaks symmetry, but weights that are too large can saturate the activations and slow learning.

3. He initialization: scales random weights by $\sqrt{2/n^{[l-1]}}$, which works well with ReLU activations.
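The three schemes above can be sketched in one helper; the function name and `method` flag are my own, not the assignment's exact API:

```python
import numpy as np

def initialize_parameters(layer_dims, method="he"):
    """Sketch of the three initialization schemes.

    layer_dims: list of layer sizes, e.g. [n_x, n_h, n_y]
    method: "zeros", "random", or "he"
    """
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)
    for l in range(1, L):
        if method == "zeros":
            # All-zero weights: symmetry is never broken, so the net can't learn.
            W = np.zeros((layer_dims[l], layer_dims[l - 1]))
        elif method == "random":
            # Large random weights (the *10 scaling mirrors the assignment's demo
            # of what goes wrong): activations saturate and learning slows down.
            W = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 10
        else:
            # He initialization: scale by sqrt(2 / n_prev), suited to ReLU.
            W = np.random.randn(layer_dims[l], layer_dims[l - 1]) \
                * np.sqrt(2.0 / layer_dims[l - 1])
        parameters["W" + str(l)] = W
        # Biases are initialized to zero in all three schemes.
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters
```

Only the weight scale differs between the three branches; zero biases are fine because the weights already break symmetry (except in the "zeros" case).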

# Part 2: Regularization

## L2 Regularization

$$J_{regularized} = \small \underbrace{-\frac{1}{m} \sum\limits_{i = 1}^{m} \large{(}\small y^{(i)}\log\left(a^{[L] (i)}\right) + (1-y^{(i)})\log\left(1- a^{[L] (i)}\right) \large{)} }_\text{cross-entropy cost} + \underbrace{\frac{1}{m} \frac{\lambda}{2} \sum\limits_l\sum\limits_k\sum\limits_j W_{k,j}^{[l]2} }_\text{L2 regularization cost}$$
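The cost above can be computed directly for a 3-layer network; this is a sketch following the assignment's naming conventions (`A3`, `lambd`), not its exact code:

```python
import numpy as np

def compute_cost_with_regularization(A3, Y, parameters, lambd):
    """Cross-entropy cost plus the L2 penalty, for a 3-layer net."""
    m = Y.shape[1]
    W1, W2, W3 = parameters["W1"], parameters["W2"], parameters["W3"]
    # Cross-entropy part of the formula above.
    cross_entropy_cost = -np.sum(Y * np.log(A3) + (1 - Y) * np.log(1 - A3)) / m
    # L2 part: (lambda / 2m) * sum of squared weights over all layers.
    L2_cost = (lambd / (2 * m)) * (np.sum(np.square(W1))
                                   + np.sum(np.square(W2))
                                   + np.sum(np.square(W3)))
    return cross_entropy_cost + L2_cost
```

Remember that the backward pass must add the matching term $\frac{\lambda}{m} W^{[l]}$ to each $dW^{[l]}$, otherwise the gradients no longer match this cost.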

## Dropout

1. Forward propagation with dropout

1. Each layer's $d^{[l]}$ corresponds to that layer's $a^{[l]}$; since there are $m$ examples, $D^{[1]} = [d^{[1](1)} d^{[1](2)} \ldots d^{[1](m)}]$ has the same dimension as $A^{[1]}$. Use np.random.rand(n, m).
2. Threshold $D^{[1]}$ into 0/1: entries $<$ keep_prob become 1, the rest become 0.
3. Set $A^{[1]}$ to $A^{[1]} * D^{[1]}$.
4. Divide $A^{[1]}$ by keep_prob.
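The four steps above can be sketched as one layer-level helper (the function name and the `seed` argument are my own, added for reproducibility):

```python
import numpy as np

def forward_dropout_layer(A, keep_prob, seed=1):
    """Apply inverted dropout to one layer's activations A."""
    np.random.seed(seed)
    # Step 1: random matrix of the same shape as A.
    D = np.random.rand(A.shape[0], A.shape[1])
    # Step 2: threshold into 0/1 (booleans cast to int).
    D = (D < keep_prob).astype(int)
    # Step 3: shut down the neurons where D == 0.
    A = A * D
    # Step 4: scale up so the expected value of A is unchanged
    # (this is why it is called "inverted" dropout).
    A = A / keep_prob
    return A, D
```

The mask `D` must be cached, because the backward pass reuses exactly the same mask.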

2. Backward propagation with dropout

1. Reapply the same mask $D^{[1]}$ to dA1.
2. Divide dA1 by keep_prob.
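The backward pass mirrors the forward one; a minimal sketch, assuming `D` is the cached mask from the forward pass:

```python
import numpy as np

def backward_dropout_layer(dA, D, keep_prob):
    """Backward pass through the dropout mask: same mask, same scaling."""
    dA = dA * D          # gradients of shut-down neurons are zero
    dA = dA / keep_prob  # undo the forward scaling consistently
    return dA
```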

• Dropout is also a form of regularization.
• Use it during training, but not at test time.
• Apply it in both forward and backward propagation.

# Part 3: Gradient Checking

$$difference = \frac {\| grad - gradapprox \|_2}{\| grad \|_2 + \| gradapprox \|_2}$$

1. $\theta^{+} = \theta + \varepsilon$
2. $\theta^{-} = \theta - \varepsilon$
3. $J^{+} = J(\theta^{+})$
4. $J^{-} = J(\theta^{-})$
5. $gradapprox = \frac{J^{+} - J^{-}}{2 \varepsilon}$

J_plus[i] is one element of the vector, i.e. one entry of the flattened W and b parameters.

• To compute J_plus[i]:
1. Set $\theta^{+}$ to np.copy(parameters_values)
2. Set $\theta^{+}_i$ to $\theta^{+}_i + \varepsilon$
3. Calculate $J^{+}_i$ using forward_propagation_n(x, y, vector_to_dictionary($\theta^{+}$ )).
• To compute J_minus[i]: do the same thing with $\theta^{-}$
• Compute $gradapprox[i] = \frac{J^{+}_i - J^{-}_i}{2 \varepsilon}$
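Putting the loop and the difference formula together, here is a self-contained sketch. To keep it runnable I pass the cost as a plain function `J` of the flattened vector, instead of the assignment's forward_propagation_n / vector_to_dictionary pair:

```python
import numpy as np

def gradient_check(parameters_values, grad, J, epsilon=1e-7):
    """Numerical gradient check over a flattened parameter vector.

    parameters_values: column vector of all W, b entries flattened
    grad: the analytic gradient to verify, same shape
    J: callable mapping the flattened vector to a scalar cost
    """
    n = parameters_values.shape[0]
    gradapprox = np.zeros((n, 1))
    for i in range(n):
        # Perturb only the i-th entry, in both directions.
        theta_plus = np.copy(parameters_values)
        theta_plus[i] += epsilon
        theta_minus = np.copy(parameters_values)
        theta_minus[i] -= epsilon
        # Centered difference approximation of dJ/dtheta_i.
        gradapprox[i] = (J(theta_plus) - J(theta_minus)) / (2 * epsilon)
    # Relative difference between analytic and numerical gradients.
    numerator = np.linalg.norm(grad - gradapprox)
    denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)
    return numerator / denominator
```

A difference around $10^{-7}$ or smaller means the backward propagation is almost certainly correct; around $10^{-3}$ or larger means there is likely a bug.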