Discriminative unsupervised learning of structured predictors
Source: ACM International Conference Proceeding Series; Vol. 148
Proceedings of the 23rd international conference on Machine learning
Pittsburgh, Pennsylvania
Pages: 1057 - 1064
Year of Publication: 2006
ISBN: 1-59593-383-2
ABSTRACT
We present a new unsupervised algorithm for training structured predictors that is discriminative, convex, and avoids the use of EM. The idea is to formulate an unsupervised version of structured learning methods, such as maximum margin Markov networks, that can be trained via semidefinite programming. The result is a discriminative training criterion for structured predictors (like hidden Markov models) that remains unsupervised and does not create local minima. To reduce training cost, we reformulate the training procedure to mitigate the dependence on semidefinite programming, and finally propose a heuristic procedure that avoids semidefinite programming entirely. Experimental results show that the convex discriminative procedure can produce better conditional models than conventional Baum-Welch (EM) training.