Doshi-Velez F. Bayesian nonparametric approaches for reinforcement learning in partially observable domains. Thesis; 2012.
Doshi-Velez F, Li W, Battat Y, Charrow B, Curtis D, Park J-G, Hemachandra S, Velez J, Walsh C, Fredette D, et al. Improving safety and operational efficiency in residential care settings with WiFi-based localization. Journal of the American Medical Directors Association. 2012;13:558–563.
Doshi-Velez F, Konidaris G. Transfer Learning by Discovering Latent Task Parametrizations. NIPS 2012 Workshop on Bayesian Nonparametric Models for Reliable Planning and Decision-Making Under Uncertainty. 2012.
Joseph JM, Doshi-Velez F, Roy N. A Bayesian nonparametric approach to modeling battery health. IEEE International Conference on Robotics and Automation. 2012:1876–1882. Abstract:

Abstract—Making intelligent decisions from incomplete information is critical in many applications: for example, robots must choose actions based on imperfect sensors, and speech-based interfaces must infer a user’s needs from noisy microphone inputs. What makes these tasks hard is that often we do not have a natural representation with which to model the domain and use for choosing actions; we must learn about the domain’s properties while simultaneously performing the task. Learning a representation also involves trade-offs between modeling the data that we have seen previously and being able to make predictions about new data. This article explores learning representations of stochastic systems using Bayesian nonparametric statistics. Bayesian nonparametric methods allow the sophistication of a representation to scale gracefully with the complexity in the data. Our main contribution is a careful empirical evaluation of how representations learned using Bayesian nonparametric methods compare to other standard learning approaches, especially in support of planning and control. We show that the Bayesian aspects of the methods result in achieving state-of-the-art performance in decision making with relatively few samples, while the nonparametric aspects often result in fewer computations. These results hold across a variety of different techniques for choosing actions given a representation.

Index Terms—Artificial intelligence, machine learning, reinforcement learning, partially observable Markov decision process, hierarchical Dirichlet process hidden Markov model.
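The idea that "the sophistication of a representation scales gracefully with the complexity in the data" is easiest to see in the Dirichlet process, a building block of models like the HDP-HMM mentioned above. A minimal sketch of its truncated stick-breaking construction (parameter values here are illustrative, not from the thesis):

```python
import numpy as np

def stick_breaking(alpha, num_atoms, rng=None):
    """Truncated stick-breaking weights for a Dirichlet process.

    The prior puts mass on unboundedly many components (here truncated
    to num_atoms), but only a few carry most of the weight for any
    finite dataset -- the "complexity grows with data" property.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    betas = rng.beta(1.0, alpha, size=num_atoms)
    # Weight k is beta_k times the stick length remaining after the
    # first k-1 breaks: prod_{j<k} (1 - beta_j).
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

w = stick_breaking(alpha=1.0, num_atoms=50)
print(w.sum())  # close to 1; most mass falls on the first few sticks
```

Smaller `alpha` concentrates mass on fewer components; larger `alpha` spreads it out, so the effective number of states used adapts to the data rather than being fixed in advance.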

Doshi-Velez F, Ghahramani Z. A Comparison of Human and Agent Reinforcement Learning in Partially Observable Domains. 33rd Annual Meeting of the Cognitive Science Society (CogSci). 2011.
Doshi-Velez F, Roy N. An Analysis of Activity Changes in MS Patients: A Case Study in the Use of Bayesian Nonparametrics. Neural Information Processing Systems (NIPS) Workshop: Bayesian Nonparametrics, Hope or Hype? 2011.
Joseph JM, Doshi-Velez F, Huang AS, Roy N. A Bayesian nonparametric approach to modeling motion patterns. Autonomous Robots. 2011;31:383–400.
Doshi F, Wingate D, Tenenbaum JB, Roy N. Infinite Dynamic Bayesian Networks. Proceedings of the 28th International Conference on Machine Learning. 2011:913–920.
Geramifard A, Doshi F, Redding J, Roy N. Online Discovery of Feature Dependencies. Proceedings of the 28th International Conference on Machine Learning. 2011:881–888.
Joseph JM, Doshi-Velez F, Roy N. A Bayesian Nonparametric Approach to Modeling Mobility Patterns. Proceedings of the Twenty-Fourth Conference on Artificial Intelligence. 2010.
Doshi-Velez F. Nonparametric Bayesian Approaches for Reinforcement Learning in Partially Observable Domains. Conference on Artificial Intelligence. 2010.
Doshi-Velez F, Wingate D, Roy N, Tenenbaum JB. Nonparametric Bayesian Policy Priors for Reinforcement Learning. Advances in Neural Information Processing Systems 23 (NIPS 2010); Vancouver, British Columbia, Canada; 6–9 December 2010:532–540.
Doshi-Velez F. The Infinite Partially Observable Markov Decision Process. Advances in Neural Information Processing Systems (NIPS). 2009.
Doshi-Velez F, Ghahramani Z. Accelerated Sampling for the Indian Buffet Process. Proceedings of the 26th International Conference on Machine Learning; Montreal, Canada; 2009. Abstract:

We often seek to identify co-occurring hidden features in a set of observations. The Indian Buffet Process (IBP) provides a nonparametric prior on the features present in each observation, but current inference techniques for the IBP often scale poorly. The collapsed Gibbs sampler for the IBP has a running time cubic in the number of observations, and the uncollapsed Gibbs sampler, while linear, is often slow to mix. We present a new linear-time collapsed Gibbs sampler for conjugate likelihood models and demonstrate its efficacy on large real-world datasets.

Doshi-Velez F, Ghahramani Z. Correlated Non-Parametric Latent Feature Models. Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI 2009); Montreal, QC, Canada; June 18–21, 2009:143–150.
Doshi-Velez F, Knowles DA, Mohamed S, Ghahramani Z. Large Scale Nonparametric Bayesian Inference: Data Parallelisation in the Indian Buffet Process. Conference on Neural Information Processing Systems (NIPS). 2009.
Doshi F, Miller K, Gael JV, Teh YW. Variational Inference for the Indian Buffet Process. Artificial Intelligence and Statistics (AISTATS), Best Paper Nominee. 2009.
Doshi F, Roy N. Spoken Language Interaction with Model Uncertainty: An Adaptive Human-Robot Interaction System. Connection Science. 2008;20(4):299–318. Abstract:

Spoken language is one of the most intuitive forms of interaction between humans and agents. Unfortunately, agents that interact with people using natural language often experience communication errors and do not correctly understand the user’s intentions. Recent systems have successfully used probabilistic models of speech, language, and user behavior to generate robust dialog performance in the presence of noisy speech recognition and ambiguous language choices, but decisions made using these probabilistic models are still prone to errors due to the complexity of acquiring and maintaining a complete model of human language and behavior. In this paper, we describe a decision-theoretic model for human-robot interaction using natural language. Our algorithm is based on the Partially Observable Markov Decision Process (POMDP), which allows agents to choose actions that are robust not only to uncertainty from noisy or ambiguous speech recognition but also to unknown user models. Like most dialog systems, a POMDP is defined by a large number of parameters that may be difficult to specify a priori from domain knowledge, and learning these parameters from the user may require an unacceptably long training period. We describe an extension to the POMDP model that allows the agent to acquire a linguistic model of the user online, including new vocabulary and word choice preferences. Our approach not only avoids a training period of constant questioning as the agent learns, but also allows the agent to actively query for additional information when its uncertainty suggests a high risk of mistakes. We demonstrate our approach both in simulation and on a natural language interaction system for a robotic wheelchair application.

Keywords: dialog management, human-computer interface, adaptive systems, online learning, partially observable Markov decision processes
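The belief tracking such a POMDP dialog manager rests on is a discrete Bayes filter: predict the next state under the chosen action, then reweight by the likelihood of the noisy observation. A minimal sketch with made-up numbers (the two-state "coffee vs. tea" domain and all probabilities below are illustrative, not from the paper):

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes-filter belief update for a discrete POMDP.

    b    -- current belief over states, shape (S,)
    T[a] -- transition matrix, T[a][s, s'] = P(s' | s, a)
    O[a] -- observation matrix, O[a][s', o] = P(o | s', a)
    Returns the posterior belief after taking action a and observing o.
    """
    predicted = b @ T[a]             # predict: P(s' | b, a)
    unnorm = predicted * O[a][:, o]  # correct: weight by observation likelihood
    return unnorm / unnorm.sum()

# Toy domain: does the user want state 0 ("coffee") or state 1 ("tea")?
T = {0: np.eye(2)}                          # "ask" action leaves the state unchanged
O = {0: np.array([[0.8, 0.2],               # noisy speech-recognition model
                  [0.3, 0.7]])}
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, a=0, o=0, T=T, O=O)
print(b1)  # belief shifts toward state 0 after an observation favoring it
```

Active querying of the kind the paper describes falls out of this representation: when the belief stays spread across states, the entropy of `b` is high and a clarification question is worth its cost.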

Doshi F, Roy N. The Permutable POMDP: Fast Solutions to POMDPs for Preference Elicitation. Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Best Paper Nominee. 2008.
Doshi F, Pineau J, Roy N. Reinforcement Learning with Limited Reinforcement: Using Bayes Risk for Active Learning in POMDPs. International Conference on Machine Learning (ICML). 2008.