reinforcement learning

October 12, 2023

ChatGPT conversation #

ChatGPT conversation

Frozen Lake (Q-learning) #

Frozen Lake (Q-learning) implementation

Frozen Lake (policy gradient) #

implementation

heuristic aspect of RL #

definition #

There is a goal. Progress toward the goal can be measured numerically in intermediate states (the reward).

Find a way to reach the goal using the reward as a hint. One often has to react to intermediate states without perfect information. This is often modeled as a decision rule (the policy) acting on the perceived state.
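The loop described above (perceive state, let the policy decide, receive a reward) can be sketched roughly as follows. This is only an illustration: the `Corridor` environment, `random_policy`, and `run_episode` are hypothetical names made up for this note, not part of any library or of the linked implementations.

```python
import random


class Corridor:
    """Hypothetical 1-D world: start at cell 0, the goal is the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action: 0 = move left, 1 = move right (clamped at the walls)
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length - 1, self.position + move))
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # numeric signal: 1 only when the goal is reached
        return self.position, reward, done


def random_policy(state):
    """A policy maps the perceived state to an action (this one ignores the state)."""
    return random.choice([0, 1])


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the total reward collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # decision based on the perceived state
        state, reward, done = env.step(action)  # environment reacts
        total_reward += reward
        if done:
            break
    return total_reward
```

Calling `run_episode(Corridor(), random_policy)` runs one episode with a purely random policy; a better policy would be swapped in as the second argument.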

property #

The environment is not perfectly modeled (understood). Yet we try to come up with a model that reacts to the environment so that we can reach the goal.

The guiding data is different from the data used in supervised learning:

- there is no annotated data
- creating the guiding data is part of the learning process
- the data creation can be random
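To make the point about guiding data concrete, here is a minimal tabular Q-learning sketch with epsilon-greedy exploration. It assumes a hypothetical 5-cell corridor (not Frozen Lake itself), and the names `step`, `train`, and all hyperparameter values are choices made for illustration only. No transition is ever labeled with the "correct" action; the agent produces its own training data by acting, partly at random.

```python
import random

N_STATES = 5      # hypothetical corridor: cells 0..4, goal at cell 4
ACTIONS = [0, 1]  # 0 = left, 1 = right


def step(state, action):
    """Deterministic transition; the reward is 1 only when the goal is reached."""
    next_state = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Data creation is part of learning: act randomly with
            # probability epsilon, or when both actions look equally good.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # No annotation exists; the target is built from the observed
            # reward plus the current estimate of the next state's value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy action in each non-goal cell (0 = left, 1 = right)
greedy_policy = [0 if q[0] > q[1] else 1 for q in q_table[:-1]]
```

In this setup the greedy policy read off the table should prefer moving right in every non-goal cell, since that is the shortest way to the only rewarded state.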

next learning objective #

https://developer.ibm.com/learningpaths/get-started-automated-ai-for-decision-making-api/what-is-automated-ai-for-decision-making/