Optimizing for Interpretability in Deep Neural Networks with Tree Regularization