Natural language generation for sponsored-search advertisements
Source
Proceedings of the 9th ACM conference on Electronic commerce
Chicago, IL, USA
SESSION: Sponsored search
Pages 1-9
Year of Publication: 2008
ISBN: 978-1-60558-169-9
Authors
Kevin Bartz  Department of Statistics, Harvard University, Cambridge, MA, USA
Cory Barr  Yahoo!, Inc., Burbank, CA, USA
Adil Aijaz  Yahoo!, Inc., Burbank, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGEcom: ACM Special Interest Group on Electronic Commerce
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1386790.1386792

ABSTRACT

In sponsored search, advertisers bid on phrases representative of offered products or services. For large advertisers, these phrases often come from quasi-algorithmically generated lists of thousands of terms prone to poor linguistic construction. A bidded term by itself is usually unsuitable for direct insertion into an ad copy template; it must be rephrased and capitalized properly to fit the template, possibly with additional language to avoid semantic ambiguity. We develop a natural language generation system to automate these steps, preparing a list of terms for insertion into an ad template. For each input term, our system first finds a proper word ordering by mining a corpus of Web search query logs. Next, it determines whether the term is ambiguous and, if semantics dictate, attaches a clarifying modifier culled from query logs. Finally, it applies proper capitalization by analyzing pages from Web search engine results. Each step yields a plausible set of displayable forms, from which a machine-learned model selects the best. The models are trained and tested on a large set of human-labeled data. The overall system significantly outperforms baseline systems that use simple heuristics.
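
To make the pipeline concrete, the sketch below illustrates two of the steps the abstract describes: reordering a term's words using query-log frequencies, then choosing a capitalization per word. It is only a rough illustration under stated assumptions, not the authors' system. The query log and case counts are invented toy data, the function names (`best_ordering`, `capitalize`, `prepare`) are hypothetical, and a plain frequency comparison stands in for the machine-learned model that the abstract says selects the best candidate. The ambiguity step is omitted.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy query log: query string -> number of times it was issued.
QUERY_LOG = Counter({
    "cheap hotels chicago": 120,
    "chicago cheap hotels": 15,
    "hotels chicago cheap": 2,
})

# Hypothetical counts of surface forms, as might be tallied from result pages.
CASE_COUNTS = Counter({
    "Chicago": 90, "chicago": 10,
    "Hotels": 30, "hotels": 70,
    "Cheap": 20, "cheap": 80,
})


def best_ordering(term, log):
    """Return the permutation of the term's words seen most often in the log.

    Assumption: raw frequency replaces the paper's learned selection model.
    Ties (including all-zero counts) fall back to the original word order.
    """
    words = term.split()
    candidates = [" ".join(p) for p in permutations(words)]
    return max(candidates, key=lambda c: (log[c], c == term))


def capitalize(term, case_counts):
    """Per word, keep whichever of 'word' / 'Word' was observed more often."""
    out = []
    for word in term.split():
        lower, cap = word.lower(), word.capitalize()
        out.append(cap if case_counts[cap] > case_counts[lower] else lower)
    return " ".join(out)


def prepare(term, log=QUERY_LOG, case_counts=CASE_COUNTS):
    """Reorder the term, then capitalize it, for insertion into an ad template."""
    return capitalize(best_ordering(term, log), case_counts)


if __name__ == "__main__":
    print(prepare("hotels cheap chicago"))
```

Enumerating every permutation grows factorially with the number of words, so this toy approach is only practical for short terms.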

