3.1 Basic Concepts
3.1.1 What are the components of a neural network?
3.1.2 What are the common model structures of neural networks?
3.1.3 How to choose a deep learning development platform?
3.1.4 Why use deep representations?
3.1.5 Why are deep neural networks difficult to train?
3.1.6 What is the difference between deep learning and machine learning?
3.2 Network Operations and Calculations
3.2.1 Forward Propagation and Back Propagation
3.2.2 How to calculate the output of a neural network?
3.2.3 How to calculate the output of a convolutional neural network?
3.2.4 How to calculate the output of a pooling layer?
3.2.5 An example to help understand back propagation
3.3 Hyperparameters
3.3.1 What is a hyperparameter?
3.3.2 How to find the optimal value of a hyperparameter?
3.3.3 General procedure for finding the optimal hyperparameters
3.4 Activation Functions
3.4.1 Why is a nonlinear activation function needed?
3.4.2 Common activation functions and their graphs
3.4.3 Derivatives of common activation functions
3.4.4 What are the desirable properties of an activation function?
3.4.5 How to choose an activation function?
3.4.6 What are the advantages of the ReLU activation function?
3.4.7 When can a linear activation function be used?
3.4.8 Why is ReLU (which outputs 0 for negative inputs) a nonlinear activation function?
3.4.9 How is the Softmax function applied to multi-class classification?
3.5 Batch_Size
3.5.1 Why is Batch_Size needed?
3.5.2 How to choose the value of Batch_Size
3.5.3 What are the benefits of increasing Batch_Size within a reasonable range?
3.5.4 What are the disadvantages of blindly increasing Batch_Size?
3.5.5 What is the impact of Batch_Size on training results?
3.6 Normalization
3.6.1 What is the meaning of normalization?
3.6.2 Why normalize?
3.6.3 How does normalization improve training speed?
3.6.4 3D illustration of a non-normalized training set
3.6.5 What types of normalization are there?
3.6.6 Effect of local response normalization
3.6.7 Understanding the local response normalization formula
3.6.8 What is Batch Normalization?
3.6.9 Advantages of the Batch Normalization (BN) algorithm
3.6.10 Batch Normalization (BN) algorithm flow
3.6.11 Batch normalization and group normalization
3.6.12 Weight normalization and batch normalization
3.7 Pre-training and Fine-tuning
3.7.1 How can unsupervised pre-training help deep learning?
3.7.2 What is model fine-tuning?
3.7.3 Are the network parameters updated during fine-tuning?
3.7.4 Three states of the fine-tuned model
3.8 Weight and Bias Initialization
3.8.1 Initializing all weights to 0
3.8.2 Initializing all weights to the same value
3.8.3 Initializing to small random numbers
3.8.4 Calibrating the variance with 1/sqrt(n)
3.8.5 Sparse Initialization
3.8.6 Initializing the biases
3.9 Softmax
3.9.1 Definition and function of Softmax
3.9.2 Derivation of Softmax
3.10 Understanding the principle and function of One-Hot Encoding
3.11 What are the commonly used optimizers?
3.12 Dropout Series Issues
3.12.1 Choice of dropout rate
3.13 Padding Series Issues