Computer Science > Computation and Language

arXiv:1606.02638 (cs)

[Submitted on 8 Jun 2016]

Title: Addressing Limited Data for Textual Entailment Across Domains

Authors: Chaitanya Shivade, Preethi Raghavan, Siddharth Patwardhan

Abstract: We seek to address the lack of labeled data (and high cost of annotation) for textual entailment in some domains. To that end, we first create (for experimental purposes) an entailment dataset for the clinical domain, and a highly competitive supervised entailment system, ENT, that is effective (out of the box) on two domains. We then explore self-training and active learning strategies to address the lack of labeled data. With self-training, we successfully exploit unlabeled data to improve over ENT by 15% F-score on the newswire domain, and 13% F-score on clinical data. On the other hand, our active learning experiments demonstrate that we can match (and even beat) ENT using only 6.6% of the training data in the clinical domain, and only 5.8% of the training data in the newswire domain.
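
The self-training strategy named in the abstract can be illustrated with a minimal sketch. This is not the authors' ENT system or their experimental setup: a toy one-dimensional nearest-centroid classifier stands in for the entailment model, and the function names (`fit_centroids`, `predict`, `self_train`), the confidence measure, and the `threshold` value are all assumptions made for illustration.

```python
from statistics import mean


def fit_centroids(examples):
    """Return one centroid per label from (feature, label) pairs."""
    by_label = {}
    for x, y in examples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(centroids, x):
    """Return (label, confidence) for a 1-D feature x.

    Confidence is the runner-up distance divided by the sum of the two
    nearest distances: 0.5 means a tie, 1.0 means x sits on a centroid.
    """
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 1.0
    d_best = abs(x - centroids[ranked[0]])
    d_next = abs(x - centroids[ranked[1]])
    total = d_best + d_next
    conf = 0.5 if total == 0 else d_next / total
    return best, conf


def self_train(labeled, unlabeled, threshold=0.75, max_rounds=10):
    """Grow the labeled set with confident pseudo-labels (hypothetical helper)."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict(centroids, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)
        pool = [x for x, _ in remaining]
    return fit_centroids(labeled), labeled


# Toy usage: two labeled seeds and three unlabeled points.
model, grown = self_train([(0.0, "no"), (1.0, "yes")], [0.1, 0.9, 0.45])
```

With these toy inputs, the points 0.1 and 0.9 receive pseudo-labels in the first round, while 0.45 stays below the threshold and is never added. Active learning would invert the selection: the least confident points would be sent to a human annotator rather than the most confident ones being labeled automatically.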

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1606.02638 [cs.CL]
	(or arXiv:1606.02638v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1606.02638

Submission history

From: Preethi Raghavan
[v1] Wed, 8 Jun 2016 16:56:19 UTC (667 KB)
