Defining Admissible Rewards for High-Confidence Policy Evaluation in Batch Reinforcement Learning