Chinese Term Extraction from Web Pages Based on Compound Term Productivity

H Nakagawa, H Kojima, A Maeda
Proceedings of the Third SIGHAN Workshop on Chinese Language Processing, 2004. aclanthology.org
Abstract
In this paper, we propose an automatic term recognition system for Chinese. Our idea is based on the relation between a compound word and its constituents, which are simple words or individual Chinese characters. More precisely, we focus on how many words or characters adjoin the word or character in question to form compound words. We also take term frequency into account. We evaluated a word-based method and a character-based method on several Chinese Web pages, achieving a precision of 75% for the top ten candidate terms.
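The counting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no scoring formula, so the function name `score_terms`, the add-one smoothing, and the combination of neighbor counts with frequency through a geometric mean are all assumptions made for illustration. Candidates are assumed to be already segmented into constituent units (words or single characters).

```python
from collections import Counter, defaultdict
from math import prod


def score_terms(candidates):
    """Rank candidate terms by constituent productivity and frequency.

    candidates: iterable of tuples of constituent units (words or
    characters), one tuple per occurrence in the corpus.
    The scoring formula is an assumption, not taken from the paper.
    """
    freq = Counter(candidates)

    # For each unit, collect the distinct units seen directly to its
    # left and right inside any candidate compound.
    left = defaultdict(set)
    right = defaultdict(set)
    for term in freq:
        for a, b in zip(term, term[1:]):
            right[a].add(b)
            left[b].add(a)

    scores = {}
    for term, f in freq.items():
        # Add-one smoothing keeps units without neighbors from zeroing the product.
        p = prod(
            (len(left.get(u, ())) + 1) * (len(right.get(u, ())) + 1)
            for u in term
        )
        # Geometric mean over the 2 * len(term) factors, scaled by frequency.
        scores[term] = f * p ** (1 / (2 * len(term)))

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy input: each tuple is one occurrence of a segmented candidate term.
occurrences = [
    ("信息", "检索"),
    ("信息", "检索"),
    ("信息", "抽取"),
    ("文本", "检索"),
    ("信息",),
]
ranked = score_terms(occurrences)
```

Passing tuples of single characters instead of words gives a character-based variant of the same sketch.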