Hi, I'm a Reinforcement Learning Researcher focused on building independent agents that continuously learn from their subjective experiences. During my PhD, I was supervised by Rich Sutton and Patrick Pilarski at the University of Alberta in the Reinforcement Learning & Artificial Intelligence Lab.
My research addresses how artificial intelligence systems can construct knowledge by deciding both what to learn and how to learn it, independent of designer instruction. I predominantly use Reinforcement Learning methods.
2944 0D1A 9484 963F 6E1E 8ECC 6452 2A83 434F 2D65
Reinforcement Learning Research Scientist & life-long hacker.
PhD in CS supervised by Rich Sutton and Patrick Pilarski at the University of Alberta
@ Reinforcement Learning & Artificial Intelligence Lab
More about my research on Google Scholar
What I'm currently working on ... "What's a Good Prediction? Issues in Evaluating General Value Functions Through Error"
What's new ... "Examining the Use of Temporal-Difference Incremental Delta-Bar-Delta for Real-World Predictive Knowledge Architectures"
Agents tackling complex problems often benefit from the ability to construct knowledge.
Learning to independently solve sub-tasks and form models of the world can help agents
progress in solving challenging problems.
My work focuses on how agents construct knowledge of their world independent of designer instruction, and apply their knowledge flexibly to many challenges.
The design of user interfaces determines how effectively people can interact with and use
machines. By adapting user interfaces to individuals, we can improve the joint performance
of people and machines, especially in biomedical technology and rehabilitation medicine.
In my bionic limb research, we use reinforcement learning (RL) to support human decision making by anticipating user commands.
A web-app to teach folks the basics of
General Value Functions.
Learning new algorithms is easiest when you can see the world from an agent's perspective. Rory Dawson and I are developing a visual primer to explain what General Value Functions are and how they can be learned using Temporal-difference methods.
GVF predictions are made as a simulated robot interacts with its environment. The prediction parameters can be changed, so users can experience how differences in γ, α, and the signal of interest influence what an agent learns. The equations in the underlying learning algorithm update live, so you can watch how incoming data is processed by the agent, and both the weights and eligibility traces are depicted as heatmaps.
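The learning rule the primer visualizes can be sketched in a few lines. This is a minimal linear TD(λ) update for a single GVF; the function name and the synthetic constant-signal example are my own illustration, not code from the web-app:

```python
import numpy as np

def td_lambda_gvf(cumulants, features, gamma=0.9, alpha=0.1, lam=0.9):
    """Learn one GVF with linear TD(lambda).

    cumulants: scalar signals of interest, one per time step
    features:  feature vectors x_t, one more than len(cumulants)
    Returns the learned weight vector w; the prediction at state x is w @ x.
    """
    n = len(features[0])
    w = np.zeros(n)  # weights (shown as a heatmap in the primer)
    z = np.zeros(n)  # eligibility trace (likewise)
    for t in range(len(cumulants)):
        x, x_next = features[t], features[t + 1]
        delta = cumulants[t] + gamma * (w @ x_next) - (w @ x)  # TD error
        z = gamma * lam * z + x                                # accumulating trace
        w = w + alpha * delta * z
    return w

# With a constant cumulant of 1 and a single always-on feature,
# the prediction converges to 1 / (1 - gamma) = 10 for gamma = 0.9.
T = 5000
w = td_lambda_gvf([1.0] * T, [np.ones(1)] * (T + 1))
```

Varying `gamma`, `alpha`, or the cumulant here gives the same intuition the interactive primer aims for, just without the visuals.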
Feedback welcome and appreciated 🎉
I built a tiny indoor
greenhouse to grow strawberries year-round. Even when it's -40°C!
Using household materials and an LED light, I built a tiny greenhouse that can operate year-round. Using a Raspberry Pi, a dusty old web-cam, and some sensors, I rigged up a dashboard that gives me a live view of my indoor garden and tells me how thirsty the 🍓 are.
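The "how thirsty" readout boils down to calibrating a raw sensor value against known dry and wet soil. Here's a minimal sketch of that conversion; the calibration constants and threshold are hypothetical placeholders, not my actual sensor's values:

```python
def moisture_percent(raw, dry=800, wet=350):
    """Map a raw soil-moisture reading to a 0-100% estimate.

    `dry` and `wet` are hypothetical calibration readings taken in
    fully dry and fully saturated soil; results are clamped to [0, 100].
    """
    pct = 100.0 * (dry - raw) / (dry - wet)
    return max(0.0, min(100.0, pct))

def thirsty(raw, threshold=30.0):
    """True when estimated moisture falls below the watering threshold."""
    return moisture_percent(raw) < threshold
```

The dashboard then just polls the sensor on a timer and displays `moisture_percent`, flagging the plants whenever `thirsty` comes back true.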
Most recently, I gave an invited talk at the ICML 2020 workshop on large and open artificial worlds.
Agents tackling complex problems in open environments often benefit from the ability to construct knowledge. Learning to independently solve sub-tasks and form models of the world can help agents progress in solving challenging problems. In this talk, we draw attention to challenges that arise when evaluating an agent’s knowledge, specifically focusing on methods that express an agent’s knowledge as predictions. Using the General Value Function framework we highlight the distinction between useful knowledge and strict measures of accuracy. Having identified challenges in assessing an agent’s knowledge, we propose a possible evaluation approach that is compatible with large and open worlds.