Abstract
Neural language representation models such as BERT, pre-trained on large-scale unstructured corpora, lack explicit grounding in real-world commonsense knowledge and are often unable to recall the facts required for reasoning and inference. Natural Language Inference (NLI) is a challenging reasoning task that relies on common human understanding of language and real-world commonsense knowledge. We introduce a new model for NLI, External Knowledge Enhanced BERT (ExBERT), which enriches the contextual representation with real-world commonsense knowledge from external knowledge sources and enhances BERT’s language understanding and reasoning capabilities. ExBERT takes full advantage of the contextual word representations obtained from BERT, using them both to retrieve relevant external knowledge from knowledge graphs and to encode the retrieved knowledge. The model adaptively incorporates the external knowledge context required for reasoning over the inputs. Extensive experiments on the challenging SciTail and SNLI benchmarks demonstrate the effectiveness of ExBERT: compared to the previous state of the art, we obtain an accuracy of \(95.9\%\) on SciTail and \(91.5\%\) on SNLI.
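To make the pipeline described in the abstract concrete, below is a minimal, illustrative PyTorch sketch of an ExBERT-style model, not the authors' implementation: the class `ExBERTSketch`, the toy lookup table `TOY_KG`, the sigmoid gate, and the two-way classifier are all assumptions introduced for illustration; the paper's actual retrieval and fusion mechanisms are specified in the full text.

```python
# Illustrative sketch of an ExBERT-style pipeline (hypothetical names throughout):
# 1) encode a premise-hypothesis pair with BERT,
# 2) look up commonsense triples for input words in a toy knowledge graph,
# 3) encode the retrieved triples and gate them into the BERT representation.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

# Toy stand-in for a ConceptNet-style lookup; a real system would query the graph.
TOY_KG = {"dog": ["dog IsA animal"], "guitar": ["guitar IsA instrument"]}

class ExBERTSketch(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.tokenizer = BertTokenizer.from_pretrained(model_name)
        self.bert = BertModel.from_pretrained(model_name)
        h = self.bert.config.hidden_size
        self.gate = nn.Linear(2 * h, h)    # adaptive gate over knowledge context
        self.classifier = nn.Linear(h, 2)  # entailment vs. neutral (SciTail-style)

    def encode(self, texts):
        inputs = self.tokenizer(texts, return_tensors="pt",
                                padding=True, truncation=True)
        # Use the [CLS] vector as a pooled sentence representation.
        return self.bert(**inputs).last_hidden_state[:, 0]

    def forward(self, premise, hypothesis):
        # Simplified: concatenate the pair rather than using segment embeddings.
        ctx = self.encode([premise + " " + hypothesis])        # (1, h)
        # Retrieve triples whose head word appears in the inputs.
        words = (premise + " " + hypothesis).lower().split()
        triples = [t for w in words for t in TOY_KG.get(w, [])]
        know = self.encode([" . ".join(triples) or "none"])    # (1, h)
        # Gate: let the model decide how much external knowledge to use.
        g = torch.sigmoid(self.gate(torch.cat([ctx, know], dim=-1)))
        fused = g * know + (1 - g) * ctx
        return self.classifier(fused)

model = ExBERTSketch()
logits = model("A dog runs in the park.", "An animal is outside.")
print(logits.shape)  # torch.Size([1, 2])
```

The gate lets the model down-weight retrieved triples when they are irrelevant to the premise-hypothesis pair, mirroring the "adaptively incorporates" behaviour described in the abstract.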
Notes
We expect further improvements in ExBERT’s performance with \(\mathrm {BERT_{LARGE}}\); however, we leave this evaluation to future work due to limited computing resources.
References
Bast, H., Buchhold, B., Haussmann, E.: Semantic search on text and knowledge bases. Found. Trends Inf. Retrieval 10(2–3), 119–271 (2016)
Bowman, S.R., Angeli, G., Potts, C., Manning, C.D.: A large annotated corpus for learning natural language inference. In: EMNLP (2015)
Chen, Q., Zhu, X., Ling, Z.H., Inkpen, D., Wei, S.: Neural natural language inference models enhanced with external knowledge. In: ACL (2018)
Dagan, I., Glickman, O., Magnini, B.: The PASCAL recognising textual entailment challenge. In: Quiñonero-Candela, J., Dagan, I., Magnini, B., d’Alché-Buc, F. (eds.) MLCW 2005. LNCS (LNAI), vol. 3944, pp. 177–190. Springer, Heidelberg (2006). https://doi.org/10.1007/11736790_9
Dalvi, M.B., Tandon, N., Clark, P.: Domain-targeted, high precision knowledge extraction. Trans. Assoc. Comput. Linguist. 5, 233–246 (2017). https://www.transacl.org/ojs/index.php/tacl/article/view/1064
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT 2019 (Long and Short Papers), vol. 1, pp. 4171–4186 (2019)
Gajbhiye, A., Jaf, S., Moubayed, N.A., Bradley, S., McGough, A.S.: CAM: a combined attention model for natural language inference. In: 2018 IEEE International Conference on Big Data (Big Data), pp. 1009–1014, December 2018
Gajbhiye, A., Jaf, S., Moubayed, N.A., McGough, A.S., Bradley, S.: An exploration of dropout with RNNs for natural language inference. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds.) ICANN 2018. LNCS, vol. 11141, pp. 157–167. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01424-7_16
Gajbhiye, A., Winterbottom, T., Al Moubayed, N., Bradley, S.: Bilinear fusion of commonsense knowledge with attention-based NLI models. In: Farkaš, I., Masulli, P., Wermter, S. (eds.) ICANN 2020. LNCS, vol. 12396, pp. 633–646. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-61609-0_50
Kang, D., Khot, T., Sabharwal, A., Hovy, E.: AdvEntuRe: adversarial training for textual entailment with knowledge-guided examples. In: ACL, Melbourne, July 2018
Kapanipathi, P., et al.: Infusing knowledge into the textual entailment task using graph convolutional networks. arXiv preprint arXiv:1911.02060 (2019)
Khot, T., Sabharwal, A., Clark, P.: SciTail: a textual entailment dataset from science question answering. In: AAAI, New Orleans, February 2018
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) ICLR (2015)
Kwon, S., Kang, C., Han, J., Choi, J.: Why do masked neural language models still need common sense knowledge? CoRR abs/1911.03024 (2019)
Li, A.H., Sethy, A.: Knowledge enhanced attention for robust natural language inference. arXiv preprint arXiv:1909.00102 (2019)
Liu, X., He, P., Chen, W., Gao, J.: Multi-task deep neural networks for natural language understanding. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 4487–4496, Florence, July 2019
Logan, R., Liu, N.F., Peters, M.E., Gardner, M., Singh, S.: Barack’s wife hillary: using knowledge graphs for fact-aware language modeling. In: Proceedings of the 57th ACL, pp. 5962–5971, Florence, July 2019
Pang, D., Lin, L.H., Smith, N.A.: Improving natural language inference with a pretrained parser. arXiv preprint arXiv:1909.08217 (2019)
Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre-training (2018)
Speer, R., Chin, J., Havasi, C.: ConceptNet 5.5: an open multilingual graph of general knowledge. In: AAAI (2017)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 6000–6010 (2017)
Wang, Q., Mao, Z., Wang, B., Guo, L.: Knowledge graph embedding: a survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 29(12), 2724–2743 (2017). https://doi.org/10.1109/TKDE.2017.2754499
Wang, X., et al.: Improving natural language inference using external knowledge in the science questions domain. In: Proceedings of the AAAI, vol. 33, pp. 7208–7215 (2019)
Yang, B., Mitchell, T.: Leveraging knowledge bases in LSTMs for improving machine reading. In: ACL, pp. 1436–1446, Vancouver, July 2017
Zhang, Z., et al.: Semantics-aware BERT for language understanding. arXiv preprint arXiv:1909.02209 (2020)
Zhang, Z., Wu, Y., Li, Z., Zhao, H.: Explicit contextual semantics for text comprehension. arXiv preprint arXiv:1809.02794 (2018)
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Gajbhiye, A., Moubayed, N.A., Bradley, S. (2021). ExBERT: An External Knowledge Enhanced BERT for Natural Language Inference. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol. 12895. Springer, Cham. https://doi.org/10.1007/978-3-030-86383-8_37
DOI: https://doi.org/10.1007/978-3-030-86383-8_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86382-1
Online ISBN: 978-3-030-86383-8