DRL Notes

 Unit 1: History of Deep Learning, McCulloch Pitts Neuron, Thresholding Logic, Activation functions, Gradient Descent (GD), Momentum Based GD, Nesterov Accelerated GD, Stochastic GD, AdaGrad, RMSProp, Adam, Eigenvalue Decomposition. Recurrent Neural Networks, Backpropagation through time (BPTT), Vanishing and Exploding Gradients, Truncated BPTT, GRU, LSTMs, Encoder Decoder Models, Attention Mechanism, Attention over images.

Deep learning: an approach that mimics how the human brain learns, using neural networks built from various layers:
                          input layer ----- hidden layers ----- output layer


History:
  • 1960 - continuous backpropagation model
  • 1965 - models with polynomial (non-linear) activation functions
  • 1970s - first AI winter
  • 2001 - big data described along three dimensions (volume, velocity, variety)
  • 2009 - data drives learning (large labelled datasets become central)
  • 2011 & 12 - GPU speeds increase sharply, making training much faster
Threshold logic - mathematical logic + algorithms applied in an artificial neural network; a thresholding unit fires only when the weighted sum of its inputs reaches a set threshold.
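A minimal Python sketch (an assumed illustration, not from the original notes) of a McCulloch-Pitts style unit: it fires (outputs 1) only when the sum of its binary inputs reaches the threshold theta; the thresholds below implement AND and OR gates.

def mcculloch_pitts(inputs, theta):
    # fire (1) if the sum of binary inputs reaches the threshold, else stay off (0)
    return 1 if sum(inputs) >= theta else 0

# AND gate: fires only when both inputs are 1 (threshold = 2)
print(mcculloch_pitts([1, 1], theta=2))   # 1
print(mcculloch_pitts([1, 0], theta=2))   # 0

# OR gate: fires when at least one input is 1 (threshold = 1)
print(mcculloch_pitts([0, 1], theta=1))   # 1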

Activation functions: they separate relevant (useful) information from irrelevant (non-useful) information in a large input, and increase the network's capacity to model complexity (non-linear data).

x = activation ((weight * input) + bias)
forward propagation: the output from the activation function moves to the next hidden layer, and the same process is repeated layer by layer.
backpropagation: the error at the output is calculated; based on this error value, the weights and biases of the neurons are updated.
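A small sketch (assumed values, not from the notes) of one neuron doing a forward pass and a backpropagation-style update; the input, target, weight, bias, and learning rate are made up.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 0.5, 1.0            # input and desired output (illustrative)
w, b, lr = 0.2, 0.0, 0.1        # weight, bias, learning rate (illustrative)

for step in range(100):
    # forward propagation: weighted input, then activation
    z = w * x + b
    y = sigmoid(z)

    # backpropagation: error at the output, chain rule through the sigmoid,
    # then update the weight and bias using the gradient
    error = y - target                  # dL/dy for L = 0.5 * (y - target)^2
    dz = error * y * (1.0 - y)          # dL/dz
    w -= lr * dz * x                    # dL/dw
    b -= lr * dz                        # dL/db

print(round(sigmoid(w * x + b), 3))     # output has moved towards the target of 1.0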

types of activation functions (illustrated below):
  • binary step function
  • linear function
  • sigmoid function
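Illustrative implementations of these three activation types, applied to the formula x = activation((weight * input) + bias) from above; the numbers are made up.

import math

def binary_step(z):
    return 1.0 if z >= 0 else 0.0        # binary step function

def linear(z):
    return z                             # linear function

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))    # sigmoid function

weight, inp, bias = 0.8, 2.0, -1.0       # illustrative values
z = weight * inp + bias                  # z = 0.6

for fn in (binary_step, linear, sigmoid):
    print(fn.__name__, round(fn(z), 4))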

Gradient Descent: an optimization algorithm used to train the model; it iteratively takes steps towards the minimum of the loss (similar to walking down a mountain).

Repeat until convergence is reached:

  1. Given the gradient, calculate the change in the parameters using the learning rate.
  2. Re-calculate the gradient with the new value of the parameters.
  3. Repeat step 1.
  4. Update formula: new_parameter = old_parameter - learning_rate * gradient (a small sketch follows).
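A minimal gradient descent sketch (an assumed example): minimize L(w) = (w - 3)^2, whose gradient is 2*(w - 3), using the update formula in step 4.

def gradient(w):
    return 2.0 * (w - 3.0)               # dL/dw for L(w) = (w - 3)^2

w = 0.0                                  # starting point
learning_rate = 0.1

for step in range(100):                  # repeat until (approximately) converged
    w = w - learning_rate * gradient(w)  # w := w - learning_rate * dL/dw

print(round(w, 4))                       # ~3.0, the minimum of the loss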



Unit 2: Autoencoders and relation to PCA, Regularization in autoencoders, Denoising autoencoders, Sparse autoencoders, Contractive autoencoders, Regularization: Bias Variance Tradeoff, L2 regularization, Early stopping, Dataset augmentation, Parameter sharing and tying, Injecting noise at input, Ensemble methods, Dropout, Batch Normalization, Instance Normalization, Group Normalization.

Unit 3: Greedy Layerwise Pre-training, Better activation functions, Better weight initialization methods, Learning Vectorial Representations Of Words, Convolutional Neural Networks, LeNet, AlexNet, ZF-Net, VGGNet, GoogLeNet, ResNet, Visualizing Convolutional Neural Networks, Guided Backpropagation, Deep Dream, Deep Art, Recent Trends in Deep Learning Architectures. 

Unit 4: Introduction to reinforcement learning (RL), Bandit algorithms – UCB, PAC, Median Elimination, Policy Gradient, Full RL & MDPs, Bellman Optimality, Dynamic Programming - Value iteration, Policy iteration, and Q-learning & Temporal Difference Methods, Temporal-Difference Learning, Eligibility Traces, Function Approximation, Least Squares Methods

Unit 5: Fitted Q, Deep Q-Learning, Advanced Q-learning algorithms, Learning policies by imitating optimal controllers, DQN & Policy Gradient, Policy Gradient Algorithms for Full RL, Hierarchical RL, POMDPs, Actor-Critic Method, Inverse reinforcement learning, Maximum Entropy Deep Inverse Reinforcement Learning, Generative Adversarial Imitation Learning, Recent Trends in RL Architectures
