Demographic prediction based on user's browsing behavior
Source
International World Wide Web Conference archive
Proceedings of the 16th international conference on World Wide Web
Banff, Alberta, Canada
SESSION: Predictive modeling of web users
Pages: 151-160
Year of Publication: 2007
ISBN: 978-1-59593-654-7
Authors
Jian Hu  Microsoft Research Asia, Beijing, China
Hua-Jun Zeng  Microsoft Research Asia, Beijing, China
Hua Li  Microsoft Research Asia, Beijing, China
Cheng Niu  Microsoft Research Asia, Beijing, China
Zheng Chen  Microsoft Research Asia, Beijing, China
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1242572.1242594

ABSTRACT

Demographic information plays an important role in personalized web applications. However, personal data such as age and gender is usually not easy to obtain. In this paper, we make a first attempt to predict users' gender and age from their Web browsing behavior, treating Webpage view information as a hidden variable that propagates demographic information between users. Our approach has three main steps. First, learning from Webpage click-through data, Webpages are associated with the age and gender tendency of users whose demographics are known, through a discriminative model. Second, the age and gender of users whose demographics are unknown are predicted from the demographic information of the associated Webpages through a Bayesian framework. Third, because Webpages visited by similar users may be associated with similar demographic tendencies, and users with similar demographic information tend to visit similar Webpages, a smoothing component is employed to overcome the data sparseness of the web click-through log. Experiments on a real web click-through log demonstrate the effectiveness of the proposed approach: the proposed algorithm achieves improvements of up to 30.4% on gender prediction and 50.3% on age prediction in terms of macro F1, compared to baseline algorithms.
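The second step can be illustrated with a minimal sketch. The abstract does not give the exact form of the Bayesian framework, so the code below assumes a naive-Bayes-style combination in which page views are conditionally independent given the class. The function name `predict_user_posterior`, the `page_tendency` table, the example page names, and the `eps` floor are all hypothetical and are not taken from the paper.

```python
import math


def predict_user_posterior(visited_pages, page_tendency, prior, eps=1e-9):
    """Estimate P(class | user) from the pages the user visited.

    Assumption (not from the paper): page views are conditionally
    independent given the class, so each page contributes a factor
    P(class | page) / P(class) to the unnormalized posterior.
    """
    classes = list(prior)
    # Work in log space to avoid underflow on long browsing histories.
    log_score = {c: math.log(prior[c]) for c in classes}
    for page in visited_pages:
        tendency = page_tendency.get(page)
        if tendency is None:
            # Page never seen with labeled users: it carries no evidence.
            continue
        for c in classes:
            p = max(tendency.get(c, 0.0), eps)  # floor to avoid log(0)
            log_score[c] += math.log(p) - math.log(prior[c])
    # Normalize with the max-subtraction trick for numerical stability.
    top = max(log_score.values())
    unnorm = {c: math.exp(s - top) for c, s in log_score.items()}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}


# Hypothetical page-level gender tendencies, as step one might produce them.
page_tendency = {
    "sports.example": {"male": 0.8, "female": 0.2},
    "news.example": {"male": 0.6, "female": 0.4},
}
prior = {"male": 0.5, "female": 0.5}
posterior = predict_user_posterior(
    ["sports.example", "news.example"], page_tendency, prior
)
```

With a uniform prior, the two example pages combine to a posterior of roughly 0.86 for "male". A user who visited only pages absent from `page_tendency` falls back to the prior, which is the sparseness problem that the smoothing component in step three is meant to address.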


