
Evaluating Research Dataset Recommendations in a Living Lab

  • Conference paper
  • Published in: Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2022)

Abstract

The search for research datasets is as important as it is laborious. Because the choice of research data strongly influences subsequent research, this decision must be made carefully. Moreover, with data volumes growing in almost all fields, research data has become a central artifact in the empirical sciences. Consequently, research dataset recommendations can usefully supplement searches for scientific publications. We formulated the recommendation task as a retrieval problem by focusing on broad similarities between research datasets and scientific publications. In a multistage approach, initial recommendations were retrieved with the BM25 ranking function and dynamically constructed queries; the initial ranking was subsequently re-ranked using click feedback and document embeddings. The proposed system was evaluated live on real user interaction data using the STELLA infrastructure in the LiLAS Lab at CLEF 2021. The experimental system could be fine-tuned efficiently before the live evaluation by pre-testing it on a pseudo test collection built from prior user interaction data from the live system. The results indicate that the experimental system outperforms the other participating systems.
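To make the multistage approach concrete, below is a minimal Python sketch of the pipeline the abstract outlines: BM25 candidate retrieval followed by a re-ranking step that mixes click feedback with embedding similarity. The toy corpus, click counts, random placeholder embeddings, and mixture weights are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of a two-stage pipeline:
# Stage 1 retrieves candidates with BM25; Stage 2 re-ranks them by
# combining the BM25 score with click feedback and embedding similarity.
import numpy as np
from rank_bm25 import BM25Okapi  # pip install rank-bm25

# Toy dataset descriptions standing in for real research-data metadata.
corpus = [
    "survey on social media usage in germany",
    "longitudinal panel study on income and employment",
    "european social survey attitudes and values data",
]
tokenized_corpus = [doc.split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

# Stage 1: BM25 scores for a query built from the seed publication.
query = "social media usage survey".split()
bm25_scores = bm25.get_scores(query)

# Stage 2a: normalized click feedback from prior user interactions.
clicks = np.array([5.0, 0.0, 2.0])
click_prior = clicks / max(clicks.sum(), 1.0)

# Stage 2b: cosine similarity between document and query embeddings.
# Random vectors are placeholders; a real system would use document
# embeddings such as SPECTER vectors.
rng = np.random.default_rng(42)
doc_emb = rng.normal(size=(len(corpus), 8))
query_emb = rng.normal(size=8)
cosine = doc_emb @ query_emb / (
    np.linalg.norm(doc_emb, axis=1) * np.linalg.norm(query_emb)
)

# Assumed linear mixture; in practice the weights would be tuned,
# e.g. against a pseudo test collection of past interactions.
final_scores = 0.6 * bm25_scores + 0.2 * click_prior + 0.2 * cosine
for rank, idx in enumerate(np.argsort(final_scores)[::-1], start=1):
    print(rank, corpus[idx], round(float(final_scores[idx]), 3))
```

In such a setup, the mixture weights could plausibly be tuned offline against a pseudo test collection of prior interactions before deploying the ranker in the living lab, which is the fine-tuning strategy the abstract describes.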





Author information

Correspondence to Jüri Keller.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Keller, J., Munz, L.P.M. (2022). Evaluating Research Dataset Recommendations in a Living Lab. In: Barrón-Cedeño, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2022. Lecture Notes in Computer Science, vol 13390. Springer, Cham. https://doi.org/10.1007/978-3-031-13643-6_11

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-13643-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13642-9

  • Online ISBN: 978-3-031-13643-6

  • eBook Packages: Computer Science, Computer Science (R0)
