Abstract:
Cross-lingual text classification usually requires task-specific training data in high-resource source languages. However, collecting such data is often infeasible due to labeling costs, task characteristics, and privacy concerns. We focus on improving text classification in low-resource languages when no annotated training data is available in the source language. To transfer resources effectively, we propose a new neural network framework (ATHG) that uses only bilingual lexicons and task-independent word embeddings from high-resource languages. First, through adversarial training, we map the source-language vocabulary into the same space as the target-language vocabulary, optimizing the mapping matrix. Then, considering multiple languages, we integrate information from the different languages through a multi-step aggregation strategy. Our method outperforms pretrained models even without access to large corpora. © 2024 IEEE.
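A minimal sketch of the adversarial mapping step the abstract describes, in the style of MUSE-like adversarial embedding alignment: a linear mapping matrix W acts as the generator, and an MLP discriminator tries to tell mapped source embeddings from target embeddings. The abstract does not specify ATHG's architecture, so every dimension, hyperparameter, and module name below is an illustrative assumption, not the paper's implementation.

import torch
import torch.nn as nn

EMB_DIM = 300  # assumed embedding dimension; the paper does not state one

class Discriminator(nn.Module):
    # Scores an embedding as mapped-source (label 1) or target (label 0).
    def __init__(self, dim=EMB_DIM, hidden=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

# The "generator" is just the linear mapping matrix W being optimized.
W = nn.Linear(EMB_DIM, EMB_DIM, bias=False)
D = Discriminator()
opt_w = torch.optim.SGD(W.parameters(), lr=0.1)
opt_d = torch.optim.SGD(D.parameters(), lr=0.1)
bce = nn.BCELoss()

def adversarial_step(src_batch, tgt_batch):
    # One alternating update: D learns to separate the two embedding
    # distributions, then W is updated to fool D, aligning the spaces.
    # Discriminator update (mapped source detached so W stays frozen here).
    d_loss = bce(D(W(src_batch).detach()), torch.ones(src_batch.size(0))) \
           + bce(D(tgt_batch), torch.zeros(tgt_batch.size(0)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Mapping update: flip the labels so W pushes mapped source
    # embeddings toward the target distribution.
    g_loss = bce(D(W(src_batch)), torch.zeros(src_batch.size(0)))
    opt_w.zero_grad(); g_loss.backward(); opt_w.step()
    return d_loss.item(), g_loss.item()

Here src_batch and tgt_batch would be minibatches of pretrained, task-independent word embeddings for the two vocabularies; the bilingual lexicon the abstract mentions could then supply anchor pairs to refine W (for example with a Procrustes solution) before the multi-language aggregation and classification stages.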
Year: 2024
Page: 735-740
Language: English
ESI Highly Cited Papers on the List: 0