Contextual explanation networks. (English) Zbl 1529.68250

Summary: Modern learning algorithms excel at producing accurate but complex models of the data. However, deploying such models in the real world requires extra care: we must ensure their reliability, robustness, and freedom from undesired biases. This motivates the development of models that are equally accurate but can also be easily inspected and assessed beyond their predictive performance. To this end, we introduce contextual explanation networks (CENs), a class of architectures that learn to predict by generating and using intermediate, simplified probabilistic models. Specifically, CENs generate parameters for intermediate graphical models, which are then used for prediction and play the role of explanations. In contrast to existing post-hoc model-explanation tools, CENs learn to predict and to explain simultaneously. Our approach offers two major advantages: (i) for each prediction, a valid, instance-specific explanation is generated with no computational overhead, and (ii) prediction via explanation acts as a regularizer and boosts performance in data-scarce settings. We analyze the proposed framework theoretically and experimentally. Our results on image and text classification and on survival analysis tasks demonstrate that CENs are not only competitive with state-of-the-art methods but also offer additional insight into each prediction, which can be valuable for decision support. We also show that, while post-hoc methods may produce misleading explanations in certain cases, CENs are consistent and allow such cases to be detected systematically.
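The core idea summarized above is that a context encoder produces the parameters of a simple, interpretable model, and that this generated model both makes the prediction and serves as the explanation. A minimal sketch of that pattern, assuming the explanation class is a per-instance logistic (linear) model over interpretable attributes and written in PyTorch, is given below; the class and variable names (ContextualExplanationNet, context, attributes) are illustrative and not taken from the paper.

import torch
import torch.nn as nn

class ContextualExplanationNet(nn.Module):
    # Sketch of a CEN-style model: an encoder maps the raw context
    # (e.g. an image or text embedding) to the weights and bias of an
    # instance-specific linear explanation model, which is then applied
    # to interpretable attributes to produce the prediction.
    def __init__(self, context_dim, attr_dim, hidden_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(context_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, attr_dim + 1),  # attr_dim weights + 1 bias
        )

    def forward(self, context, attributes):
        params = self.encoder(context)            # (batch, attr_dim + 1)
        weights, bias = params[:, :-1], params[:, -1]
        logits = (weights * attributes).sum(dim=1) + bias
        # `weights` is the per-instance explanation; `logits` drives the prediction.
        return torch.sigmoid(logits), weights

# Toy usage: 16 examples, 128-dimensional context, 10 interpretable attributes.
model = ContextualExplanationNet(context_dim=128, attr_dim=10)
ctx = torch.randn(16, 128)
attrs = torch.randn(16, 10)
probs, explanations = model(ctx, attrs)
print(probs.shape, explanations.shape)  # torch.Size([16]) torch.Size([16, 10])

Because the explanation is produced as part of the forward pass, it comes at no extra cost at prediction time, in contrast to post-hoc explanation tools that fit a separate surrogate model around a fixed black box.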

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62H22 Probabilistic graphical models
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62N01 Censored data models
62N05 Reliability and life testing
