Abstract
The discovery of new chemical compounds is a key driver of the chemistry and pharmaceutical industries, and of many other industrial sectors. Patents serve as a critical source of information about new chemical compounds. The ChEMU (Cheminformatics Elsevier Melbourne Universities) lab addresses information extraction over chemical patents and aims to advance the state of the art on this topic. ChEMU 2022, part of the 13th Conference and Labs of the Evaluation Forum (CLEF-2022), will be the third ChEMU lab. The ChEMU 2020 lab provided two information extraction tasks: named entity recognition and event extraction. The ChEMU 2021 lab introduced two more tasks: chemical reaction reference resolution and anaphora resolution. For ChEMU 2022, we plan to re-run all four tasks and add a fifth task on semantic classification of tables. In this paper, we introduce ChEMU 2022, including its motivation, goals, tasks, resources, and evaluation framework.
Notes
- 1. Reaxys® Copyright ©2021 Elsevier Life Sciences IP Limited. Reaxys is a trademark of Elsevier Life Sciences IP Limited, used under license. https://www.reaxys.com.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Y. et al. (2022). The ChEMU 2022 Evaluation Campaign: Information Extraction in Chemical Patents. In: Hagen, M., et al. Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13186. Springer, Cham. https://doi.org/10.1007/978-3-030-99739-7_50
DOI: https://doi.org/10.1007/978-3-030-99739-7_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-99738-0
Online ISBN: 978-3-030-99739-7
eBook Packages: Computer Science, Computer Science (R0)