Bridging the lexical chasm: statistical approaches to answer-finding

Authors:

Vibhu Mittal

SIGIR '00: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval

Pages 192 - 199

https://doi.org/10.1145/345508.345576

Published: 01 July 2000

Abstract

This paper investigates whether a machine can automatically learn the task of finding, within a large collection of candidate responses, the answers to questions. The learning process consists of inspecting a collection of answered questions and characterizing the relation between question and answer with a statistical model. For the purpose of learning this relation, we propose two sources of data: Usenet FAQ documents and customer service call-center dialogues from a large retail company. We will show that the task of “answer-finding” differs from both document retrieval and traditional question-answering, presenting challenges different from those found in these problems. The central aim of this work is to discover, through theoretical and empirical investigation, those statistical techniques best suited to the answer-finding problem.

Index Terms

1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
2. Mathematics of computing
  1. Probability and statistics

Published In

SIGIR '00: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval

July 2000

396 pages

ISBN:1581132263

DOI:10.1145/345508

Chairmen:
Emmanuel Yannakoudakis (Athens Univ. of Economics and Business, Greece), Nicholas J. Belkin (Rutgers Univ.), Mun-Kew Leong (Kent Ridge Digital Labs), Peter Ingwersen (Royal School of Library and Information Science)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

SIGIR00

Sponsor:

Greek Com Soc
SIGIR
Athens U of Econ & Business

SIGIR00: 23rd ACM International SIGIR Conference on Research and Development in Information Retrieval

July 24 - 28, 2000

Athens, Greece

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
