00001 Introduction to Course and Instructor.mp4
04:17
00002 What is Reinforcement Learning.mp4
08:46
00003 What is Reinforcement Learning Hiders and Seekers by OpenAI.mp4
06:09
00004 RL Versus Other ML Frameworks.mp4
07:43
00005 Why Reinforcement Learning.mp4
03:48
00006 Examples of Reinforcement Learning.mp4
05:02
00007 Limitations of Reinforcement Learning.mp4
08:10
00008 Exercises.mp4
02:16
00009 What is Environment.mp4
03:30
00010 What is Environment 2.mp4
05:55
00011 What is Agent.mp4
05:35
00012 What is State.mp4
05:59
00013 State Belongs to Environment and not to Agent.mp4
05:02
00014 What is Action.mp4
05:41
00015 What is Reward.mp4
09:44
00016 Goal.mp4
04:04
00017 Policy.mp4
04:16
00018 Summary.mp4
08:35
00019 Setup 1.mp4
03:14
00020 Setup 2.mp4
05:03
00021 Setup 3.mp4
07:09
00022 Policy Comparison.mp4
08:11
00023 Deterministic Environment.mp4
07:25
00024 Stochastic Environment.mp4
08:04
00025 Stochastic Environment 2.mp4
04:59
00026 Stochastic Environment 3.mp4
09:55
00027 Non-Stationary Environment.mp4
08:36
00028 GridWorld Summary.mp4
05:52
00029 Activity.mp4
01:16
00030 Probability.mp4
03:32
00031 Probability 2.mp4
04:58
00032 Probability 3.mp4
04:07
00033 Conditional Probability.mp4
05:14
00034 Conditional Probability Fun Example.mp4
06:04
00035 Joint Probability.mp4
03:28
00036 Joint Probability 2.mp4
03:45
00037 Joint Probability 3.mp4
02:52
00038 Expected Value.mp4
06:08
00039 Conditional Expectation.mp4
02:33
00040 Modeling Uncertainty of Environment.mp4
04:58
00041 Modeling Uncertainty of Environment 2.mp4
04:02
00042 Modeling Uncertainty of Environment 3.mp4
03:07
00043 Modeling Uncertainty of Environment Stochastic Policy.mp4
03:19
00044 Modeling Uncertainty of Environment Stochastic Policy 2.mp4
02:56
00045 Modeling Uncertainty of Environment Value Functions.mp4
07:59
00046 Running Averages.mp4
01:30
00047 Running Averages 2.mp4
04:52
00048 Running Averages as Temporal Difference.mp4
04:23
00049 Activity.mp4
01:37
00050 Markov Property.mp4
04:03
00051 State Space.mp4
04:20
00052 Action Space.mp4
03:31
00053 Transition Probabilities.mp4
03:52
00054 Reward Function.mp4
04:20
00055 Discount Factor.mp4
03:52
00056 Summary.mp4
04:07
00057 Activity.mp4
01:09
00058 MOR Quiz 1.mp4
03:18
00059 MOR Quiz Solution 1.mp4
06:29
00060 MOR Quiz 2.mp4
03:09
00061 MOR Quiz Solution 2.mp4
04:35
00062 MOR Reward Scaling.mp4
04:37
00063 MOR Infinite Horizons.mp4
06:17
00064 MOR Quiz 3.mp4
03:01
00065 MOR Quiz Solution 3.mp4
04:55
00066 MDP Recap.mp4
02:15
00067 Value Functions.mp4
05:18
00068 Optimal Value Function.mp4
04:56
00069 Optimal Policy.mp4
05:21
00070 Bellman Equation.mp4
05:29
00071 Value Iteration.mp4
03:53
00072 Value Iteration Quiz.mp4
02:52
00073 Value Iteration Quiz Gamma Missing.mp4
00:56
00074 Value Iteration Solution.mp4
10:08
00075 Problems of Value Iteration.mp4
05:50
00076 Policy Evaluation.mp4
06:59
00077 Policy Evaluation 2.mp4
05:00
00078 Policy Evaluation 3.mp4
05:35
00079 Policy Evaluation Closed Form Solution.mp4
04:03
00080 Policy Iteration.mp4
07:30
00081 State Action Values.mp4
06:58
00082 V and Q Comparisons.mp4
04:51
00083 What Does it Mean that MDP is Unknown.mp4
02:45
00084 Why Transition Probabilities are Important.mp4
03:48
00085 Model-Based Solutions.mp4
04:32
00086 Model-Free Solutions.mp4
03:09
00087 Monte-Carlo Learning.mp4
04:23
00088 Monte-Carlo Learning Example.mp4
09:55
00089 Monte-Carlo Learning Limitations.mp4
02:59
00090 Running Average.mp4
05:09
00091 Learning Rate.mp4
07:05
00092 Learning Equation.mp4
03:52
00093 TD Algorithm.mp4
05:11
00094 Exploration Versus Exploitation.mp4
02:40
00095 Epsilon Greedy Policy.mp4
03:11
00096 SARSA.mp4
02:48
00097 Q-Learning.mp4
06:35
00098 Q-Learning Implementation for MAPRover Clipped.mp4
22:55
00099 N-Step Look Ahead.mp4
04:11
00100 Formulation.mp4
04:03
00101 Values.mp4
03:05
00102 TD Q-Learning TD Lambda.mp4
06:19
00103 TD Q-Learning TD Lambda TD Lambda MAPRover Activity.mp4
03:54
00104 Frozenlake 1.mp4
02:02
00105 Frozenlake Implementation.mp4
22:49