LIS 7610/CSC 7481 Readings
Adapted from Doug Oard's LBSC 796/INFM 718R
Spring 2011.
Last update: August 16, 2012.
The principal text for this course (referred to below as "MRS" for the
authors' initials) is Christopher D. Manning, Prabhakar Raghavan and
Heinrich Schuetze, An
Introduction to Information Retrieval, 2008. This book is also available on the Web.
Some other books on information retrieval:
- Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern
Information Retrieval, Addison Wesley, 1999.
- Ian H. Witten, Alistair Moffat, and Timothy C. Bell,
Managing Gigabytes, Morgan Kaufmann, Second Edition,
1999.
- David A. Grossman and Ophir Frieder, Information Retrieval:
Algorithms and Heuristics, Kluwer Academic, 2004.
- William B. Frakes and Ricardo Baeza-Yates (ed.), Information
Retrieval: Data Structures and Algorithms, Prentice-Hall,
1992.
- Tomek Strzalkowski (ed.), Natural Language Information
Retrieval, Kluwer, 1999.
- Christopher D. Manning and Heinrich Schuetze, Foundations of
Statistical Natural Language Processing, MIT Press, 1999.
- Karen Sparck Jones and Peter Willett (ed.), Readings in
Information Retrieval, Morgan Kaufmann, 1997.
- David C. Blair, Language and Representation in
Information Retrieval, Elsevier Science, 1990.
Downloading readings from the Web may require Microsoft Word or
Acrobat Reader, depending on the format.
Readings for Session 1 (Overview)
Required Readings:
- MRS Chapter 1: IR Using the Boolean Model
- Peter Ingwersen and Kalervo Jarvelin, The Turn: Integration of Information Seeking and Retrieval in Context, Springer, 2005.
Read only Chapter 1.
Recommended Readings:
- Tefko Saracevic. (1999) Information
Science. Journal of the American Society for Information
Science, 50(12)1051-1063.
- David C. Blair, Language and Representation in
Information Retrieval, Elsevier Science, 1990. Chapter 1,
pages 1-10.
Readings for Session 2 (Evidence from Content)
Required Readings:
- MRS Chapter 2: The dictionary and postings lists
- MRS Chapter 3: Tolerant retrieval
Recommended Readings:
- Jimmy Lin and Chris Dyer,
Data-Intensive Text Processing with MapReduce,
Synthesis Lectures on Human Language Technologies. Read only chapter 1.
- Christopher Manning and Heinrich Schuetze, Foundations of
Statistical Natural Language Processing, Chapter 5
(Collocations), MIT Press, 1999. Available from the
book's Web site.
- George A. Miller. (1995) WordNet:
A lexical database for English. Communications of the ACM, 38(11), 39-41. Also available from
the ACM Digital Library.
- Prager, John, Eric Brown, Anni Coden and Dragomir Radev.
"Question-Answering by Predictive Annotation," in
Proceedings of the 23rd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval,
July 24-28, 2000, Athens, Greece, pp. 184-191. Available on
campus from the
ACM Digital Library.
- Donna Harman, "Inverted Files," in William B. Frakes and
Ricardo Baeza-Yates, Information Retrieval: Data Structures and
Algorithms, Prentice Hall, 1992, Chapter 3. Available at LSU Middleton Library.
Readings for Session 3 (Ranked Retrieval)
Required Readings:
Recommended Readings:
- Stephen Robertson, Hugo Zaragoza and Michael Taylor,
"Simple BM25 Extension to Multiple Weighted Fields,"
Proceedings of the Thirteenth ACM International Conference
on Information and Knowledge Management, pp. 42-49, 2004.
- Djoerd Hiemstra and Arjen P. de Vries, "Relating the New
Language Models of Information Retrieval to the Traditional Retrieval
Models," Technical Report TR-CTIT-00-09.
- Donna Harman, "Ranking Algorithms," in William B. Frakes and
Ricardo Baeza-Yates, Information Retrieval: Data Structures and
Algorithms, Prentice Hall, 1992, Chapter 14. Available at LSU Middleton Library.
- S.E. Robertson et al., "Okapi at TREC-3," Proceedings of the
Third Text Retrieval Conference, 1994. Available on the TREC
Web site.
- Amit Singhal, "Pivoted Document Length Normalization," SIGIR
1996. Available through the ACM
Digital Library.
- James Allan, ed. "Challenges in Information Retrieval and
Language Modeling", SIGIR Forum, 37(1)31-47, Spring, 2003.
Available from SIGIR or ACM Digital Library.
- David R. H. Miller, Tim Leek, and Richard M. Schwartz,
"A Hidden Markov Model Information Retrieval System,"
SIGIR 99. Available from the
ACM
Digital Library.
- W. B. Croft and J. Lafferty, ed., Language Modeling for
Information Retrieval, Kluwer, 2003.
Readings for Session 4 (Interaction)
Required Readings:
- Marti Hearst, "User Interfaces for Search," in Modern Information Retrieval, 2nd Edition,
Chapter 2, Addison-Wesley Longman, 2010. http://mir2ed.org/
- Ryen W. White and Resa A. Roth, Exploratory Search: Beyond the Query-Response Paradigm,
Synthesis Lectures on Information Concepts, Retrieval and Services, 2009.
Recommended Readings:
- Peter Pirolli and Stuart Card, "Information Foraging,"
Psychological Review. 106(4), 643-675, 1999.
Available through the LSU Libraries PsycInfo database.
- Robert S. Taylor, "The Process of Asking Questions,"
American Documentation, 13(4)391-396, 1962. (Available from LSU Libraries, search for this journal in "EJOURNALS.")
- Efthimis N. Efthimiadis and Stephen E. Robertson. (1989)
Feedback and Interaction in Information Retrieval. In
Charles Oppenheim, ed., Perspectives in Information
Management. London: Butterworth.
Readings for Session 5 (Evaluation)
Required Readings:
- MRS Chapter 8: Evaluation in Information Retrieval
- Mark Sanderson, "Test Collection Based Evaluation of Information Retrieval Systems",
Foundations and Trends in Information Retrieval, 4(4), 247-375, 2010.
Recommended Readings:
- Filip Radlinski and Nick Craswell, "Comparing the Sensitivity of Information Retrieval Metrics,"
Proceedings of ACM SIGIR, pp. 667-674, 2010.
- Ellen M. Voorhees, "Variations in Relevance Judgments and the
Measurement of Retrieval Effectiveness," Information
Processing and Management, 36(5)697-716. Available on
campus from Science
Direct.
- Chris Buckley and Ellen M. Voorhees, "Evaluating Evaluation
Measure Stability", SIGIR 2000. Available through the ACM
Digital Library.
- Ellen M. Voorhees and Chris Buckley, "The Effect of Topic Set
Size on Retrieval Experiment Error," SIGIR 2002, Available through the ACM
Digital Library.
- R. Manmatha, Ao Feng and James Allan, "A Critical Evaluation of
TDT's Cost Function," SIGIR 2002. Available from the ACM
Digital Library.
- Stefano Mizzaro. (1999) How Many Relevances in Information
Retrieval? Interacting With Computers, 10(3)305-322.
- Andrew H. Turpin and William Hersh, "Why Batch and User
Evaluations Do Not Give the Same Results," SIGIR 2001.
Available from the ACM
Digital Library.
Readings for Session 6 (Web Search)
Required Readings:
- MRS Chapter 19. Web Search Basics
- MRS Chapter 20. Web Crawling and Indexing
- Dennis Fetterly, "Adversarial Information Retrieval: The Manipulation of Web Content,"
ACM Computing Reviews, 2007.
Recommended Readings:
- Larry Page, Sergey Brin, Rajeev Motwani and Terry Winograd,
"The PageRank Citation Ranking: Bringing Order to the Web,"
Stanford Digital Library Working Paper SIDL-WP-1999-0120, 1998.
- Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, and Andrew
Ng. Data-Intensive Question Answering. Proceedings of the Tenth
Text REtrieval Conference (TREC 2001).
Readings for Session 7 (Evidence from Behavior)
Required Readings:
Recommended Readings:
- Yoshiyuki Inagaki, Narayanan Sadagopan, Georges Dupret, Ciya Liao, Anlei Dong, Yi Chang and Zhaohui Zheng,
"Session-Based Click Features for Recency Ranking,"
Proceedings of the 24th AAAI Conference on Artificial Intelligence, pp. 1334-1339, 2010.
- Larry Page, Sergey Brin, Rajeev Motwani and Terry Winograd, "The PageRank
Citation Ranking: Bringing Order to the Web," Stanford Digital Library Working Paper
SIDL-WP-1999-0120, 1998. Available from CiteSeer.
- Jon M. Kleinberg, "Authoritative Sources in a Hyperlinked
Environment," Journal of the ACM, 46(5)604-632. Available from the ACM
Digital Library.
- Douglas W. Oard and Jinmook Kim, "Modeling Information Content
Using Observable Behavior," in Proceedings of the 2001
Annual Meeting of the American Society for Information Science and
Technology, Washington, November, 2001. Available from Doug Oard's Web
site.
Readings for Session 8 (Scanned Documents)
Required Readings:
- David Doermann, "The Indexing and Retrieval of Document Images:
A Survey",
Computer Vision and Image Understanding, 70(3), 287-298,
1998. Available on campus from Science Direct.
- Toni M. Rath, R. Manmatha, and Victor Lavrenko, "A Search
Engine for Historical Manuscripts," SIGIR 2004. Available from
CIIR.
Recommended Readings:
- Tseng, Y.-H. and Oard, D. W., Document Image Retrieval
Techniques for Chinese. In Proceedings of the 2001 Symposium
on Document Image Understanding Technology, Columbia, MD, 2001.
Available from Doug Oard's
Web site.
Readings for Session 9 (Evidence from Metadata)
Required Readings:
- Nigel Shadbolt, Wendy Hall and Tim Berners-Lee, "The
Semantic Web Revisited," IEEE Intelligent Systems, 21(3)96-101, 2006.
- National Information Standards Organization,
Understanding Metadata, 2004.
- Soren Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak and Zachary Ives,
"DBpedia: A Nucleus for a Web of Open Data," 6th International Semantic Web Conference, pp. 503-517, 2007.
Recommended Readings:
Readings for Session 10 (Filtering)
Required Readings:
- MRS Chapter 9: Relevance Feedback and Query Expansion
- Joshua Goodman, Gordon V. Cormack and David Heckerman, Spam and
the Ongoing Battle for the Inbox, Communications of the ACM,
50(2)24-33, 2007. (available on campus from the ACM Digital
Library)
Recommended Readings:
- Robert M. Bell and Yehuda Koren, "Lessons from the Netflix Prize Challenge," SIGKDD Explorations, 9(2)75-79, 2007.
- Douglas W. Oard, "The State of the Art in Text Filtering," User
Modeling and User-Adapted Interaction, 1997.
Readings for Session 11 (Audio: Speech and Music)
Required Readings:
- William Byrne et al, "Automatic Recognition of Spontaneous
Speech for Access to Multilingual Oral History Archives," IEEE
Transactions on Speech and Audio Processing, 2004. Available on
campus from IEEE Xplore.
- Nicola Orio, Music Retrieval: A Tutorial and Review, Foundations and Trends in Information Retrieval, 1(1), 2006.
Recommended Readings:
- Elias Pampalk, Simon Dixon and Gerhard Widmer, "Exploring Music
Collections by Browsing Different Views," in International
Conference on Music Information Retrieval, 2003. Available on
the ISMIR 2003
Web site.
- Jonathan Foote, "An Overview of Audio Information Retrieval,"
ACM-Springer Multimedia Systems, 7(1)2-10,
1999. Available from CiteSeer.
- John S. Garofolo, Cedric G. P. Auzanne and Ellen M. Voorhees,
"The TREC Spoken Document Retrieval Track: A success story,"
in Proceedings of the Eighth Text Retrieval Conference,
1999, pp. 107-130. Available from the TREC
Web site.
- Rodger J. McNab, Lloyd A. Smith, Ian H. Witten, and Clare
L. Henderson, "Tune
Retrieval in the Multimedia Library," Multimedia Tools and
Applications, 10(2-3)113-132, 2000. Available from the
New
Zealand Digital Library Web site.
Readings for Session 12 (CLIR)
Required Readings:
- Jian-Yun Nie, Cross-Language Information Retrieval, Synthesis Lectures on Human Language Technologies, Morgan & Claypool, 2010.
- Douglas W. Oard, Peter Brusilovsky, Daqing He, Judith Klavans, Tomasz Loboda, Leiming Qian, Dagobert Soergel and Pengyi Zhang, "Formative Evaluation for Multilingual Multimedia Search and Sense-Making," Handbook of Natural Language Processing and Machine Translation, Springer, 2011.
Recommended Readings:
- Gina-Anne Levow, Douglas W. Oard, Philip Resnik,
"Dictionary-Based Techniques for Cross-Language Information
Retrieval," Information Processing and Management, 41(3), 2005.
Available from Levow (.ps) or available on campus from LSU Libraries Electronic Journals Service.
- Daqing He, et al., "Making MIRACLEs: Interactive Translingual
Search for Cebuano and Hindi," ACM Transactions on Asian
Language Information Processing, 2(2-3). Available from the
ACM Digital Library.
Readings for Session 13 (Images and Video)
Required Readings:
- Stefan Rueger, Multimedia Information Retrieval, Synthesis Lectures on Information Concepts, Retrieval and Services,
Morgan & Claypool, 2010. Available from the LSU Library Catalog as an online electronic resource.
- Paul Over, George Awad, Jon Fiscus, Martial Michel, Alan Smeaton and Wessel Kraaij,
"TRECVID-2009 Goals, Tasks, Data, Evaluation Mechanisms and Metrics," April 8, 2010.
Available from the TRECVID Web site.
Recommended Readings:
- Venkat N. Gudivada and Vijay V. Raghavan, "Modeling and
Retrieving Images by Content," Information Processing and
Management, 33(4)427-452, 1997. Available on campus from
Science Direct.
- Howard Wactlar et al., "Complementary Audio and Video Analysis
for Broadcast News Archives," Communications of the
ACM, 43(2)42-47, 2000. Available on campus from the ACM
Digital Library.
- Chad Carson, Serge Belongie, Hayit Greenspan and Jitendra
Malik, "Blobworld: Image Segmentation Using
Expectation-Maximization and Its Application to Image Querying,"
IEEE Transactions on Pattern Analysis and Machine Intelligence,
24(8)1026-1038, 2002. Available on campus from IEEE
Xplore.
- Alan Smeaton, Wessel Kraaij and Paul Over, "TRECVID-2003 Video
Retrieval Evaluation Overview," PowerPoint slides, 2003.
Available from the TRECVID
Web site.
Session 14: No Class