| Mining employment market via text block detection and adaptive cross-domain information extraction |
Source
|
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
SESSION: Information extraction
Pages 283-290
Year of Publication: 2009
ISBN: 978-1-60558-483-6
|
|
Authors
|
|
Tak-Lam Wong
|
The Chinese University of Hong Kong, Hong Kong, Hong Kong
|
|
Wai Lam
|
The Chinese University of Hong Kong, Hong Kong, Hong Kong
|
|
Bo Chen
|
The Chinese University of Hong Kong, Hong Kong, Hong Kong
|
|
ABSTRACT
We have developed an approach for analyzing online job advertisements across different domains (industries) and regions worldwide. The approach extracts precise information from the text content, supporting employment market analysis both locally and globally. A major component of the approach is an information extraction framework composed of two challenging tasks. The first task is to detect unformatted text blocks automatically using an unsupervised learning model. Identifying these text blocks allows the generation of highly effective features for the second task, text fragment extraction learning, which is formulated as a domain adaptation model for text fragment classification. One advantage of our approach is that it can easily adapt to a large number of online job advertisements in different and new domains. Extensive experiments have been conducted to demonstrate the effectiveness and flexibility of our approach.