DRL Notes
Unit 1: History of Deep Learning, McCulloch Pitts Neuron, Thresholding Logic, Activation functions,
Gradient Descent (GD), Momentum Based GD, Nesterov Accelerated GD, Stochastic GD, AdaGrad,
RMSProp, Adam, Eigenvalue Decomposition. Recurrent Neural Networks, Backpropagation through
time (BPTT), Vanishing and Exploding Gradients, Truncated BPTT, GRU, LSTMs, Encoder Decoder
Models, Attention Mechanism, Attention over images.
Deep learning: mimics how the human brain learns, using a neural network built from various layers:
input layer → hidden layers → output layer
History:
- 1960 - continuous backpropagation
- 1965 - models with polynomial (non-linear) activation functions
- 1970 - first AI winter
- 2001 - big data described as three-dimensional (volume, velocity, variety)
- 2009 - data drives learning (large labeled datasets become available)
- 2011-12 - GPU speeds increased, making it practical to train deep networks
Thresholding logic - (mathematical logic + a simple algorithm) in an artificial neural network: the McCulloch-Pitts neuron sums its binary inputs and fires (outputs 1) only if the sum reaches a threshold.
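A minimal sketch of a McCulloch-Pitts neuron with thresholding logic (the gates and threshold values are illustrative assumptions):

```python
def mcp_neuron(inputs, threshold):
    """McCulloch-Pitts neuron: fires (returns 1) if the sum of its
    binary inputs reaches the threshold, otherwise returns 0."""
    return 1 if sum(inputs) >= threshold else 0

# Thresholding logic can realize simple Boolean functions:
print(mcp_neuron([1, 1], threshold=2))  # AND of two inputs -> 1
print(mcp_neuron([1, 0], threshold=2))  # AND of two inputs -> 0
print(mcp_neuron([1, 0], threshold=1))  # OR of two inputs  -> 1
```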
Activation functions: decide what a neuron passes on, separating relevant (useful) from irrelevant (non-useful) information, and add non-linearity so the network can model complex (non-linear) data.
output = activation((weight * input) + bias)
forward propagation: the output of the activation function is fed as input to the next hidden layer, and the same process is repeated layer by layer.
backpropagation: the error between the predicted and expected output is calculated; based on this error value, the weights and biases of the neurons are updated.
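A minimal single-neuron sketch of one forward pass and one backpropagation update (the input, target, learning rate, and squared-error loss are illustrative assumptions):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 0.5, 1.0   # input and expected output (assumed values)
w, b = 0.8, 0.1        # weight and bias
lr = 0.1               # learning rate

# Forward propagation: output = activation((weight * input) + bias)
y = sigmoid(w * x + b)

# Backpropagation: error -> gradient (chain rule) -> update weight and bias
error = y - target            # derivative of 0.5 * (y - target)^2 w.r.t. y
dz = error * y * (1 - y)      # chain rule through the sigmoid
w -= lr * dz * x
b -= lr * dz

print(f"prediction {y:.3f}, updated w {w:.3f}, updated b {b:.3f}")
```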
types of activation functions:
- binary step function
- linear function
- sigmoid function
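Minimal sketches of these three activation functions (the threshold and test values are illustrative assumptions):

```python
import math

def binary_step(z, threshold=0.0):
    # binary (step) function: outputs 1 or 0 around a threshold
    return 1 if z >= threshold else 0

def linear(z, a=1.0):
    # linear function: output proportional to the input, no non-linearity
    return a * z

def sigmoid(z):
    # sigmoid: squashes any real value into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

for z in (-2.0, 0.0, 2.0):
    print(z, binary_step(z), linear(z), round(sigmoid(z), 3))
```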
Gradient Descent: an optimization algorithm used to train models; it iteratively takes small steps downhill on the loss surface (like descending a mountain).
Repeat until convergence:
- Given the gradient, calculate the change in the parameters using the learning rate.
- Update the parameters and re-calculate the gradient at the new parameter values.
- Repeat from step 1.
- Update formula: θ = θ − η * ∇J(θ), where θ are the parameters, η is the learning rate, and ∇J(θ) is the gradient of the loss.
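A minimal sketch of this loop on a one-parameter loss J(θ) = (θ − 3)² (the loss function, starting point, learning rate, and stopping tolerance are illustrative assumptions):

```python
def loss(theta):
    return (theta - 3.0) ** 2       # J(theta), minimized at theta = 3

def gradient(theta):
    return 2.0 * (theta - 3.0)      # dJ/dtheta

theta = 0.0   # starting parameter value
lr = 0.1      # learning rate (eta)

# Repeat until convergence: step against the gradient, scaled by the learning rate
for step in range(100):
    grad = gradient(theta)
    theta = theta - lr * grad       # theta <- theta - eta * grad J(theta)
    if abs(grad) < 1e-6:            # converged: the slope is (almost) zero
        break

print(f"converged near theta = {theta:.4f}, loss = {loss(theta):.6f}")
```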
Unit 2: Autoencoders and relation to PCA, Regularization in autoencoders, Denoising autoencoders,
Sparse autoencoders, Contractive autoencoders, Regularization: Bias Variance Tradeoff, L2
regularization, Early stopping, Dataset augmentation, Parameter sharing and tying, Injecting noise at
input, Ensemble methods, Dropout, Batch Normalization, Instance Normalization, Group Normalization.
Unit 3: Greedy Layerwise Pre-training, Better activation functions, Better weight initialization methods,
Learning Vectorial Representations Of Words, Convolutional Neural Networks, LeNet, AlexNet, ZF-Net,
VGGNet, GoogLeNet, ResNet, Visualizing Convolutional Neural Networks, Guided Backpropagation,
Deep Dream, Deep Art, Recent Trends in Deep Learning Architectures.
Unit 4: Introduction to reinforcement learning (RL), Bandit algorithms – UCB, PAC, Median Elimination,
Policy Gradient, Full RL & MDPs, Bellman Optimality, Dynamic Programming - Value iteration, Policy
iteration, and Q-learning & Temporal Difference Methods, Temporal-Difference Learning, Eligibility
Traces, Function Approximation, Least Squares Methods
Unit 5: Fitted Q, Deep Q-Learning, Advanced Q-learning algorithms, Learning policies by imitating
optimal controllers, DQN & Policy Gradient, Policy Gradient Algorithms for Full RL, Hierarchical
RL, POMDPs, Actor-Critic Method, Inverse reinforcement learning, Maximum Entropy Deep Inverse
Reinforcement Learning, Generative Adversarial Imitation Learning, Recent Trends in RL Architectures