Feature hashing for large scale multitask learning
Source: ACM International Conference Proceeding Series; Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 1113-1120
Year of Publication: 2009
ISBN: 978-1-60558-516-1
Authors
Kilian Weinberger  Yahoo! Research, Santa Clara, CA
Anirban Dasgupta  Yahoo! Research, Santa Clara, CA
John Langford  Yahoo! Research, Santa Clara, CA
Alex Smola  Yahoo! Research, Santa Clara, CA
Josh Attenberg  Yahoo! Research, Santa Clara, CA
Sponsors
MITACS
NSF
Microsoft Research
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1553374.1553516

ABSTRACT

Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation. In this paper we provide exponential tail bounds for feature hashing and show that the interaction between random subspaces is negligible with high probability. We demonstrate the feasibility of this approach with experimental results for a new use case: multitask learning with hundreds of thousands of tasks.
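
As a rough illustration of what feature hashing for multitask learning can look like, here is a minimal sketch. It is not the authors' implementation. The function names, the use of MD5 as the hash, the sign hash, and the "task_token" key scheme are all assumptions made for the example; the abstract specifies none of them. Each token is hashed to an index in a fixed-size vector, and an optional task-specific copy of each token is hashed into the same vector so that many tasks can share one parameter space.

```python
import hashlib


def _hash(key: str, salt: str) -> int:
    """Deterministic 64-bit hash of a salted key (MD5 is an assumption, chosen for portability)."""
    digest = hashlib.md5((salt + ":" + key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def hashed_features(tokens, dim, task=None):
    """Map a list of string tokens to a dense vector of length `dim`.

    Hypothetical helper: if `task` is given, each token is also added a
    second time under a task-specific key, so shared and per-task features
    land in the same hashed space.
    """
    vec = [0.0] * dim
    keys = list(tokens)
    if task is not None:
        keys += [f"{task}_{t}" for t in tokens]  # task-specific copies (assumed key scheme)
    for key in keys:
        index = _hash(key, "index") % dim  # bucket for this feature
        sign = 1.0 if _hash(key, "sign") % 2 == 0 else -1.0  # sign hash, so collisions tend to cancel
        vec[index] += sign
    return vec


tokens = ["cheap", "offer", "meeting", "tomorrow"]
shared_only = hashed_features(tokens, dim=16)
with_task = hashed_features(tokens, dim=16, task="user42")
```

The output dimension stays fixed no matter how many distinct tokens or tasks appear, which is the property that makes this kind of scheme attractive when the number of tasks is very large.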


