Ch. 6 Reinforcement Learning
 

Q-Learning (2)

R a1 a2 a3 a4 a5 a6
s1 +1
s2 +1
s3 0 0
s4 -1 0
s5 -1
s6 0 0 -1
 
Q     a1    a2    a3    a4    a5    a6
s1     0   +.8     0     0   +.7     0
s2     0   +.4     0     0     0     0
s3     0     0   +.3     0     0     0
s4     0     0    -1     0   +.8     0
s5     0     0   -.9     0     0     0
s6   +.1     0     0     0     0   -.7
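
As a minimal sketch of how such a Q-table is typically used, the snippet below stores the Q values from the slide and applies the standard one-step Q-learning update, Q(s,a) ← Q(s,a) + α [r + γ max Q(s',·) − Q(s,a)]. The learning rate `alpha`, the discount `gamma`, the function names, and the sample transition are assumptions made for illustration; the slide does not give them.

```python
# Q-table from the slide: rows are states s1..s6, columns are actions a1..a6.
Q = [
    [0.0, 0.8, 0.0, 0.0, 0.7, 0.0],    # s1
    [0.0, 0.4, 0.0, 0.0, 0.0, 0.0],    # s2
    [0.0, 0.0, 0.3, 0.0, 0.0, 0.0],    # s3
    [0.0, 0.0, -1.0, 0.0, 0.8, 0.0],   # s4
    [0.0, 0.0, -0.9, 0.0, 0.0, 0.0],   # s5
    [0.1, 0.0, 0.0, 0.0, 0.0, -0.7],   # s6
]


def greedy_action(q, s):
    """Return the index of the highest-valued action in state s (first one on ties)."""
    return max(range(len(q[s])), key=lambda a: q[s][a])


def q_update(q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q(s, a).

    alpha and gamma are assumed values, not taken from the slide.
    """
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


if __name__ == "__main__":
    # Greedy action in s1 (index 0): a2 (index 1), since Q(s1, a2) = 0.8.
    print("greedy action in s1: a%d" % (greedy_action(Q, 0) + 1))

    # Hypothetical transition: in s2 take a1, receive reward +1, land in s1.
    new_value = q_update(Q, s=1, a=0, r=1.0, s_next=0)
    print("updated Q(s2, a1) = %.2f" % new_value)
```

With the assumed alpha = 0.5 and gamma = 0.9, the hypothetical transition moves Q(s2, a1) from 0 toward the target 1 + 0.9 × 0.8 = 1.72, giving 0.86.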