Lecture 9, Reinforcement Learning [Örjan]

Time: Monday 7 October 2013, 17:00 - 19:00

KTH Royal Institute of Technology
Autumn term 2013

Location: D1

Activity: Lecture

Student groups: TCSCM1-AS, TCSCM1-BER, TCSCM1-PRS, TCSCM1-SPR, TITMM2, TIVNM1, TKOMK3, TMAIM1, TMAIM1-BIO, TMAIM1-IR, TMAIM1-PC, TSCRM1, TSCRM2

Info:

Reinforcement Learning

Readings: Marsland: chapter 13

  • Is it possible to learn when nobody tells you the correct answer?
  • Central terms: State, Action, Reward
  • More terms: Value function (cumulative value), Policy
  • How can we judge the consequences of our actions?
  • What do we mean by an optimal behavior?
  • Is it possible to learn what is the best thing to do in each state?
  • Is it possible to learn faster by planning ahead?
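
As a rough illustration of the terms above, the following Python sketch applies tabular Q-learning to a hypothetical five-state corridor. It is not taken from the lecture or from Marsland: the environment, the reward of 1 at the rightmost state, and the values of GAMMA and ALPHA are all assumptions chosen for illustration. The agent is never told the correct action; it only sees states, actions and rewards, and builds up a value estimate from which a policy can be read off.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal). Assumed toy setup.
ACTIONS = (-1, +1)    # step left or step right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Hypothetical environment: return (next_state, reward) for one move."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, seed=0):
    """Estimate Q(state, action) from reward alone, acting at random."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action)
            # Value of the best action available in the next state.
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted future value.
            target = reward + GAMMA * best_next
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Policy: in each non-terminal state, pick the action with the highest Q."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

Under these assumptions, greedy_policy(q_learning()) should choose +1 (move right) in every non-terminal state, since that is the shortest route to the only reward. Random exploration is used here only to keep the sketch short; other exploration schemes would work as well.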

Slides on Reinforcement Learning

Schedule administrator created the event on 13 March 2013
Schedule administrator edited on 15 August 2013

Teacher Örjan Ekeberg edited on 16 August 2013

Schedule administrator edited on 31 August 2013

Schedule administrator edited on 14 September 2013

Teacher Örjan Ekeberg edited on 4 October 2013

Schedule administrator cancelled the event on 14 December 2013

Last modified 2013-12-14 00:59