The IROS paper was submitted 10 minutes before the deadline on March 1st. The paper was okay, but I’m not thrilled about it because it covers only the beginning stages of the work, so it doesn’t really use the bounty hunting stuff to its fullest. Future papers will hopefully do that. We also had to… Continue reading Research and Math 🙂
Category: reinforcement learning
Natural Language LfD & RL
So, I’m working with Ermo on applying reinforcement learning to text-based games. I was wondering whether, if our method works, we could eventually do text-based learning from demonstration with reinforcement learning. Basically, instead of pressing buttons, the user would describe what they wanted the system to do using English sentences.… Continue reading Natural Language LfD & RL
Directed Reading
I’m planning my directed reading class this coming semester. So, basically I have/get to come up with an entire semester’s worth of material. I might be able to make a class out of it by the time I’m done 🙂 haha. My subjects are focusing on the areas I want to explore with the bounty hunting… Continue reading Directed Reading
Q-Learning with delayed updates
I’m sure someone has thought of this, but I didn’t look. What if we delayed the update of the Q-value and the policy for x time steps? We would keep track of the history and compute an average reward. Initialize the policy for each action to be 1/|A| so that we can… Continue reading Q-Learning with delayed updates
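Here is a minimal sketch of one way the delayed-update idea could look in code. It is only a guess at the details, since the excerpt is cut off: the class name `DelayedQLearner`, the parameters `delay`, `alpha`, `gamma`, and `epsilon`, and the epsilon-greedy policy update are all assumptions made for illustration, not something taken from the post. The parts that do come from the post are the buffered history, the averaged reward, and the uniform 1/|A| starting policy.

```python
import random
from collections import defaultdict


class DelayedQLearner:
    """Tabular Q-learner that applies its updates only every `delay` steps."""

    def __init__(self, actions, delay=4, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.actions = list(actions)
        self.delay = delay            # the "x" time steps between updates
        self.alpha = alpha            # learning rate (assumed)
        self.gamma = gamma            # discount factor (assumed)
        self.epsilon = epsilon        # exploration mass (assumed)
        self.q = defaultdict(float)   # (state, action) -> Q-value
        self.policy = {}              # state -> {action: probability}
        self.history = []             # buffered (s, a, r, s_next) tuples

    def action_probs(self, state):
        # Unseen states start with the uniform policy 1/|A|.
        if state not in self.policy:
            n = len(self.actions)
            self.policy[state] = {a: 1.0 / n for a in self.actions}
        return self.policy[state]

    def act(self, state, rng=random):
        probs = self.action_probs(state)
        weights = [probs[a] for a in self.actions]
        return rng.choices(self.actions, weights=weights)[0]

    def observe(self, s, a, r, s_next):
        # Only record the transition; learning waits until the buffer is full.
        self.history.append((s, a, r, s_next))
        if len(self.history) >= self.delay:
            self._flush()

    def _flush(self):
        # Group buffered rewards by (state, action), keeping the latest successor.
        rewards = defaultdict(list)
        successor = {}
        for s, a, r, s_next in self.history:
            rewards[(s, a)].append(r)
            successor[(s, a)] = s_next
        # One Q-update per pair, using the average reward over the window.
        for (s, a), rs in rewards.items():
            avg_r = sum(rs) / len(rs)
            best_next = max(self.q[(successor[(s, a)], b)] for b in self.actions)
            target = avg_r + self.gamma * best_next
            self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])
        # Assumed policy update: epsilon-greedy on the refreshed Q-values.
        n = len(self.actions)
        for s in {s for s, _ in rewards}:
            best = max(self.actions, key=lambda b: self.q[(s, b)])
            self.policy[s] = {
                b: self.epsilon / n + (1.0 - self.epsilon) * (b == best)
                for b in self.actions
            }
        self.history.clear()
```

With `delay=2`, two calls to `observe` for the same state and action with rewards 1 and 3 produce a single update toward the average reward of 2, rather than two separate updates.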
MGS Markov Game Simulator
I found a really cool paper that basically benchmarks a lot of MARL algorithms using MGS, a stochastic game simulator written in Java. This simulator is like my MALSIM, except that I only implemented features for repeated games, not stochastic games.
Reinforcement Learning
Here is a nice, informative PhD thesis on reinforcement learning. Section 2.2.3 does a nice job of explaining the use of the Boltzmann distribution with Q-values. http://www.compapp.dcu.ie/~humphrys/PhD/ch2.html
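For a quick feel of how the Boltzmann distribution turns Q-values into action probabilities, here is a small sketch. It uses the standard softmax form, where the probability of action a is proportional to exp(Q(a)/T) for a temperature T. The function names `boltzmann_probs` and `boltzmann_action` are made up for this example and are not taken from the thesis.

```python
import math
import random


def boltzmann_probs(q_values, temperature=1.0):
    """Return action probabilities proportional to exp(Q / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtracting the max keeps exp() from overflowing; the ratios are unchanged.
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs)[0]
```

A high temperature makes the choice close to uniform, while a low temperature makes it close to greedy on the largest Q-value.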