Bias-free hypothesis evaluation in multirelational domains

Authors: Christine Körner, Stefan Wrobel

SAC '06: Proceedings of the 2006 ACM symposium on Applied computing

Pages 639 - 640

https://doi.org/10.1145/1141277.1141423

Published: 23 April 2006

Abstract

In machine learning, one typically assumes that the true classification of an object depends only on the object itself and, given the object, is independent of the classification of other objects. In this case, if a sufficiently large and randomly chosen part of the training data is set aside as a test set, the observed sample error on that test set is an unbiased estimator of the true error. However, in many application settings, this mainstream approach to model evaluation might be inappropriate. As pointed out by [2], among others, whenever there is autocorrelation, i.e., whenever the target value of one object depends not only on the object itself but also on other objects' classifications or on information that is shared between objects, the observed error on a randomly chosen test set may no longer be an unbiased estimator. We introduce a sampling technique, generalized subgraph sampling, that avoids a bias in error estimation by establishing the required amount of linked objects in the test set.
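
The bias described in the abstract can be illustrated with a small synthetic experiment. The sketch below is not the authors' generalized subgraph sampling; it only contrasts a plain random split with a split that keeps linked objects together. All names (`make_linked_data`, `fit_memorizer`, `group_disjoint_split`) and the data model are assumptions made for illustration: every object belongs to one hypothetical group (for example, movies sharing a studio), the label is a property of the group, and the "learner" simply memorizes group labels. It further assumes that future objects come from groups not seen during training.

```python
import random


def make_linked_data(n_groups=200, per_group=10, seed=0):
    """Build (group_id, label) pairs; the label is fixed per group,
    so linked objects are perfectly autocorrelated (an assumption)."""
    rng = random.Random(seed)
    data = []
    for g in range(n_groups):
        label = rng.randint(0, 1)
        data.extend((g, label) for _ in range(per_group))
    return data


def fit_memorizer(train):
    """Toy learner: remember the label observed for each group id."""
    return {g: y for g, y in train}


def error_rate(model, test, rng):
    """Fraction of misclassified test objects.
    Objects from groups unseen in training get a random guess."""
    wrong = 0
    for g, y in test:
        pred = model[g] if g in model else rng.randint(0, 1)
        wrong += pred != y
    return wrong / len(test)


def random_split(data, test_frac, rng):
    """Standard evaluation: sample test objects uniformly at random."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * test_frac)
    return shuffled[cut:], shuffled[:cut]


def group_disjoint_split(data, test_frac, rng):
    """Hold out whole groups, so no test object is linked to training data."""
    groups = sorted({g for g, _ in data})
    rng.shuffle(groups)
    test_groups = set(groups[: int(len(groups) * test_frac)])
    train = [(g, y) for g, y in data if g not in test_groups]
    test = [(g, y) for g, y in data if g in test_groups]
    return train, test


data = make_linked_data()
rng = random.Random(1)

train, test = random_split(data, 0.3, rng)
random_err = error_rate(fit_memorizer(train), test, rng)

train, test = group_disjoint_split(data, 0.3, rng)
disjoint_err = error_rate(fit_memorizer(train), test, rng)

print(f"random split error:         {random_err:.3f}")
print(f"group-disjoint split error: {disjoint_err:.3f}")
```

Under these assumptions the random split reports an error close to zero, because almost every test object shares a group with some training object, while the group-disjoint split reports an error near chance level. Which estimate is appropriate depends on how strongly future objects will be linked to the training data, which is the quantity the abstract refers to as the required amount of linked objects in the test set.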

References

[1] http://www.imdb.com.

[2] D. Jensen and J. Neville. Autocorrelation and linkage cause bias in evaluation of relational learners. In Proc. of the 12th International Conference on Inductive Logic Programming. Springer-Verlag, 2002.

[3] J. Neville and D. Jensen. Collective classification with relational dependency networks. In Proc. of the 2nd Multi-Relational Data Mining Workshop, 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.

Published In

SAC '06: Proceedings of the 2006 ACM symposium on Applied computing

April 2006

1967 pages

ISBN: 1595931082

DOI: 10.1145/1141277

Conference Chair:
Hisham M. Haddad
Kennesaw State University, Kennesaw, Georgia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SAC06

Sponsor:

SIGAPP

SAC06: The 2006 ACM Symposium on Applied Computing

April 23 - 27, 2006

Dijon, France

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
