Doshi-Velez F. Nonparametric Bayesian Approaches for Reinforcement Learning in Partially Observable Domains, in Conference on Artificial Intelligence; 2010.
Doshi-Velez F, Wingate D, Roy N, Tenenbaum JB. Nonparametric Bayesian Policy Priors for Reinforcement Learning, in Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada; 2010:532–540.
Doshi-Velez F, Ghahramani Z. Accelerated Sampling for the Indian Buffet Process. ICML. 2009. Abstract:

We often seek to identify co-occurring hidden features in a set of observations. The Indian Buffet Process (IBP) provides a nonparametric prior on the features present in each observation, but current inference techniques for the IBP often scale poorly. The collapsed Gibbs sampler for the IBP has a running time cubic in the number of observations, and the uncollapsed Gibbs sampler, while linear, is often slow to mix. We present a new linear-time collapsed Gibbs sampler for conjugate likelihood models and demonstrate its efficacy on large real-world datasets.

Doshi-Velez F, Ghahramani Z. Correlated Non-Parametric Latent Feature Models, in UAI 2009, Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, June 18-21, 2009; 2009:143–150.
Doshi-Velez F. The Infinite Partially Observable Markov Decision Process, in Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, Vancouver, British Columbia, Canada; 2009:477–485.
Doshi-Velez F, Knowles DA, Mohamed S, Ghahramani Z. Large Scale Nonparametric Bayesian Inference: Data Parallelisation in the Indian Buffet Process, in Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, Vancouver, British Columbia, Canada; 2009:1294–1302.
Doshi F, Miller K, Van Gael J, Teh YW. Variational Inference for the Indian Buffet Process, in Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, AISTATS 2009, Clearwater Beach, Florida, USA, April 16-18, 2009; 2009:137–144.
Doshi F, Roy N. Spoken Language Interaction with Model Uncertainty: An Adaptive Human-Robot Interaction System. Connection Science. 2008;00(00):1-21. Abstract:

Spoken language is one of the most intuitive forms of interaction between humans and agents. Unfortunately, agents that interact with people using natural language often experience communication errors and do not correctly understand the user’s intentions. Recent systems have successfully used probabilistic models of speech, language, and user behavior to generate robust dialog performance in the presence of noisy speech recognition and ambiguous language choices, but decisions made using these probabilistic models are still prone to errors due to the complexity of acquiring and maintaining a complete model of human language and behavior. In this paper, we describe a decision-theoretic model for human-robot interaction using natural language. Our algorithm is based on the Partially Observable Markov Decision Process (POMDP), which allows agents to choose actions that are robust not only to uncertainty from noisy or ambiguous speech recognition but also to unknown user models. Like most dialog systems, a POMDP is defined by a large number of parameters that may be difficult to specify a priori from domain knowledge, and learning these parameters from the user may require an unacceptably long training period. We describe an extension to the POMDP model that allows the agent to acquire a linguistic model of the user online, including new vocabulary and word choice preferences. Our approach not only avoids a training period of constant questioning as the agent learns, but also allows the agent to actively query for additional information when its uncertainty suggests a high risk of mistakes. We demonstrate our approach both in simulation and on a natural language interaction system for a robotic wheelchair application.

Keywords: dialog management, human-computer interface, adaptive systems, online learning, partially observable Markov decision processes

Doshi F, Roy N. The permutable POMDP: fast solutions to POMDPs for preference elicitation, in 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), Estoril, Portugal, May 12-16, 2008, Volume 1; 2008:493–500.
Doshi F, Pineau J, Roy N. Reinforcement learning with limited reinforcement: using Bayes risk for active learning in POMDPs, in Machine Learning, Proceedings of the Twenty-Fifth International Conference 2008, Helsinki, Finland, June 5-9, 2008; 2008:256–263.
Doshi F, Roy N. Efficient model learning for dialog management. 2007.
Doshi F, Brunskill E, Shkolnik AC, Kollar T, Rohanimanesh K, Tedrake R, Roy N. Collision detection in legged locomotion using supervised learning. 2007.