LIS 7610/CSC 7481 Readings
Adapted from Doug Oard's INST 734.
Last update: August 19, 2015.
The principal text for this course (referred to below as "CMS" for the
authors' initials) is:
W. Bruce Croft, Donald Metzler and Trevor Strohman, 2015. Search Engines:
Information Retrieval in Practice.
Recommended (secondary) text:
Ricardo Baeza-Yates and Berthier Ribeiro-Neto,
Modern Information Retrieval: The Concepts and Technology behind Search (2nd Edition) (ACM Press Books), 2011.
Some other books on information retrieval:
- Christopher D. Manning, Prabhakar Raghavan and Heinrich Schuetze, An
Introduction to Information Retrieval, 2008. This book is available on the Web for free at this point.
- William B. Frakes and Ricardo Baeza-Yates (ed.), Information
Retrieval: Data Structures and Algorithms, Prentice-Hall,
1992.
- Ian H. Witten, Alistair Moffat, and Timothy C. Bell,
Managing Gigabytes, Morgan Kaufmann, Second Edition,
1999.
- David A. Grossman and Ophir Frieder, Information Retrieval:
Algorithms and Heuristics, Kluwer Academic, 2004.
- Tomek Strzalkowski (ed.), Natural Language Information
Retrieval, Kluwer, 1999.
- Christopher D. Manning and Heinrich Schuetze, Foundations of
Statistical Natural Language Processing, MIT Press, 1999.
- Karen Sparck Jones and Peter Willett (eds.), Readings in
Information Retrieval, Morgan-Kaufmann, 1997.
- David C. Blair, Language and Representation in
Information Retrieval, Elsevier Science, 1990.
Readings for Session 1 (Structure of IR systems)
Required Readings:
- CMS Chapter 1: Search engines and information retrieval.
- CMS Chapter 2: Architecture
- Peter Ingwersen and Kalervo Jarvelin, The Turn: Integration of Information Seeking and Retrieval in Context, Springer, 2005.
Read only Chapter 1. (Available on Moodle.) Slides about the paper.
Recommended Readings:
- Tefko Saracevic, (1999) Information
Science. Journal of the American Society for Information
Science, 50(12)1051-1063.
- David C. Blair, Language and Representation in
Information Retrieval, Elsevier Science, 1990. Chapter 1,
pages 1-10.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 2 (Content-based IR systems)
Required Readings:
Recommended Readings:
- Christopher Manning and Heinrich Schuetze, Foundations of
Statistical Natural Language Processing, Chapter 5
(Collocations), MIT Press, 1999. Available from the
book's Web site.
- George A. Miller. (1995) WordNet:
A lexical database for English. Communications of the ACM, 38(11), 39-41. Also available from
the ACM Digital Library.
- Prager, John, Eric Brown, Anni Coden and Dragomir Radev.
"Question-Answering by Predictive Annotation," in
Proceedings of the 23rd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
July 24-28, 2000, Athens Greece, pp. 184-191. Available on
campus from the
ACM Digital Library.
- Donna Harman "Inverted Files," in William B. Frakes and
Ricardo Baeza-Yates, Information Retrieval: Data Structures and
Algorithms, Prentice Hall, 1992, Chapter 3. Available at LSU Middleton Library.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 3 (Ranked Retrieval)
Required Readings:
- CMS Chapter 5: Ranking with indexes
- CMS Chapter 7: Retrieval models
Recommended Readings:
- Stephen Robertson, Hugo Zaragoza and Michael Taylor,
"Simple BM25 Extension to Multiple Weighted Fields,"
Proceedings of the Thirteenth ACM International Conference
on Information and Knowledge Management, pp. 42-49, 2004.
- Djoerd Hiemstra and Arjen P. de Vries, "Relating the New
Language Models of Information Retrieval to the Traditional Retrieval
Models," Technical Report TR-CTIT-00-09.
Link.
- Donna Harman "Ranking Algorithms," in William B. Frakes and
Ricardo Baeza-Yates, Information Retrieval: Data Structures and
Algorithms, Prentice Hall, 1992, Chapter 14. Available at LSU Middleton Library.
- S.E. Robertson et al, "Okapi at TREC-3," Proceedings of the
Third Text Retrieval Conference, 1994. Available on the TREC
Web site.
- Amit Singhal, "Pivoted Document Length Normalization," SIGIR
1996. Available through the ACM
Digital Library.
- James Allan, ed. "Challenges in Information Retrieval and
Language Modeling", SIGIR Forum, 37(1)31-47, Spring, 2003.
Available from SIGIR or ACM Digital Library.
- David R. H. Miller, Tim Leek, and Richard M. Schwartz,
"A Hidden Markov Model Information Retrieval System,"
SIGIR 99. Available from the
ACM
Digital Library.
- W. B. Croft and J. Lafferty, ed., Language Modeling for
Information Retrieval, Kluwer, 2003.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 4 (Interaction)
Required Readings:
- CMS Chapter 6: Queries and interfaces
- Marti Hearst, Search User
Interfaces, Cambridge University Press, 2009. Read only
Chapter 1.
- Ryen W. White and Resa A. Roth, Exploratory Search: Beyond the Query-Response Paradigm,
Synthesis Lectures on Information Concepts, Retrieval and Services, 2009. Read Chapter 4.
(Available on Moodle.) Slides about the paper.
Recommended Readings:
- Peter Pirolli and Stuart Card, "Information Foraging,"
Psychological Review. 106(4), 643-675, 1999.
Available through LSU Libraries database search PsycInfo.
- Robert S. Taylor, "The Process of Asking Questions,"
American Documentation, 13(4)391-396, 1962. (Available from LSU Libraries, search for this journal in "EJOURNALS.")
- Efthimis N. Efthimiadis and Stephen E. Robertson. (1989)
Feedback and Interaction in Information Retrieval. In
Charles Oppenheim, ed., Perspectives in Information
Management. London: Butterworth.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 5 (Evaluation)
Required Readings:
- CMS Chapter 8: Evaluating search engines
- Diane Kelly, 2009. "Methods
for Evaluating Interactive Information Retrieval Systems with Users,"
Foundations and Trends in Information Retrieval, 3(1-2)1-224. Read only Chapter 4.
- Olivier Chapelle, Thorsten Joachims, Filip Radlinski and Yisong Yue. 2012.
Large-Scale Validation and Analysis of Interleaved Search Evaluation.
ACM Transactions on Information Systems, 30(1)6: 1-40. (Available on Moodle.)
Recommended Readings:
- Filip Radlinski and Nick Craswell, "Comparing the Sensitivity of Information Retrieval Metrics,"
Proceedings of ACM SIGIR, pp. 667-674, 2010.
- Ellen M. Voorhees, "Variations in Relevance Judgments and the
Measurement of Retrieval Effectiveness," Information
Processing and Management, 36(5)697-716. Available on
campus from Science
Direct.
- Chris Buckley and Ellen M. Voorhees, "Evaluating Evaluation
Measure Stability", SIGIR 2000. Available through the ACM
Digital Library.
- Ellen M. Voorhees and Chris Buckley, "The Effect of Topic Set
Size on Retrieval Experiment Error," SIGIR 2002, Available through the ACM
Digital Library.
- R. Manmatha, Ao Feng and James Allan, "A Critical Evaluation of
TDT's Cost Function," SIGIR 2002. Available from the ACM
Digital Library.
- Stefano Mizzaro. (1999) How Many Relevances in Information
Retrieval? Interacting With Computers, 10(3)305-322.
- Andrew H. Turpin and William Hersh, "Why Batch and User
Evaluations Do Not Give the Same Results," SIGIR 2001.
Available from the ACM
Digital Library.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 6 (Web Search)
Required Readings:
- CMS Chapter 3: Crawls and feeds
- Carlos Castillo and Brian D. Davison, 2010. "Adversarial
Web Search," Foundations and Trends in Information
Retrieval, 4(5)377-486. Read only Chapter 2.
- Mounia Lalmas, 2011. "Aggregated
Search," in Advanced Topics in Information Retrieval (European
Summer School on IR), pp. 109-123, Springer.
Recommended Readings:
- Larry Page, Sergey Brin, Rajeev Motwani and Terry Winograd,
"The PageRank Citation Ranking: Bringing Order to the Web,"
Stanford Digital Library Working Paper SIDL-WP-1999-0120, 1998.
Link
- Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, and Andrew
Ng. Data-Intensive Question Answering. Proceedings of the Tenth
Text REtrieval Conference (TREC 2001).
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 7 (Behavior-based IR systems)
Required Readings:
- Christopher D. Manning, Prabhakar Raghavan and Heinrich
Schütze, 2009. An
Introduction to Information Retrieval. Read only
Chapter 21 (Link Analysis)
- Yoshiyuki Inagaki, Narayanan Sadagopan, Georges Dupret, Ciya
Liao, Anlei Dong, Yi Chang and Zhaohui Zheng, 2010. Session-Based
Click Features for Recency Ranking. Proceedings of the 24th
AAAI Conference on Artificial Intelligence, 1334-1339.
- Diane Kelly and Jaime Teevan, 2003. Implicit Feedback for Inferring
User Preference: A Bibliography. SIGIR Forum, 37(2)18-28.
Recommended Readings:
- Emily Steel and Julia Angwin, "On the Web's Cutting Edge, Anonymity in Name Only,"
Wall Street Journal, 2010.
Link
- Yoshiyuki Inagaki, Narayanan Sadagopan, Georges Dupret, Ciya Liao, Anlei Dong, Yi Chang and Zhaohui Zheng,
"Session-Based Click Features for Recency Ranking,"
Proceedings of the 24th AAAI Conference on Artificial Intelligence, pp. 1334-1339, 2010.
Link
- Larry Page, Sergey Brin, Rajeev Motwani and Terry Winograd, "The
PageRank Citation Ranking: Bringing Order to the Web," Stanford Digital Library Working Paper
SIDL-WP-1999-0120, 1998. Available from CiteSeer.
- Jon M. Kleinberg, "Authoritative Sources in a Hyperlinked
Environment," Journal of the ACM, 46(5)604-632. Available from the ACM
Digital Library.
- Douglas W. Oard and Jinmook Kim, "Modeling Information Content
Using Observable Behavior," in Proceedings of the 2001
Annual Meeting of the American Society for Information Science and
Technology, Washington, November, 2001. Available from Doug Oard's Web
site
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 8 (Metadata-based IR systems)
Required Readings:
Recommended Readings:
- Soren Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak and Zachary Ives,
"DBpedia: A Nucleus for a Web of Open Data," 6th International Semantic Web Conference, pp. 503-517, 2007
Link.
- Carl Lagoze and Herbert Van de Sompel, "The Open Archives
Initiative: Building a Low-Barrier Interoperability Framework,"
Proceedings of the First ACM/IEEE-CS Joint Conference on Digital
Libraries, Roanoke, VA, June 2001, pp. 54-62. Available
from the ACM
Digital Library.
- National Science Digital Library (NSDL) Metadata Guidelines:
http://nsdl.org/collection/metadata-guide.php
- Diane Hillman, "National Science Digital Library (NSDL)
Metadata Primer," Web publication, 2003.
http://marinemetadata.org/references/nsdlprimer
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 9 (Filtering, Clustering, and Classification)
Required Readings:
- CMS Chapter 9: Classification and clustering
- CMS Chapter 10: Social search
- Michael D. Ekstrand, John T. Riedl and Joseph A. Konstan, 2011. Collaborative
Filtering Recommender Systems. Foundations and Trends in
Human-Computer Interaction, 4(2):81-173. Read only Chapter 2.
Recommended Readings:
- Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search (2nd Edition)
(ACM Press Books), 2011. Chapter 8. Text classification.
- Joshua Goodman, Gordon V. Cormack and David Heckerman, Spam and
the Ongoing Battle for the Inbox, Communications of the ACM,
50(2)24-33, 2007. (available on campus from the ACM Digital
Library)
- Robert M. Bell and Yehuda Koren, "Lessons from the Netflix Prize Challenge," SIGKDD Explorations, 9(2)75-79, 2007.
Link
- Douglas W. Oard, "The State of the Art in Text Filtering," User
Modeling and User-Adapted Interaction, 1997.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 10 (Scanned Documents)
Required Readings:
- CMS Chapter 11: Beyond bag of words
- Toni M. Rath, R. Manmatha, and Victor Lavrenko, 2004. A Search
Engine for Historical Manuscripts. SIGIR 2004. Available from
CIIR
- Chew Lim Tan, Xi Zhang and Linlin Li, 2014. "Image Based Retrieval and Keyword Spotting in Documents," in Handbook of Document Image Processing and Recognition, 805-842, Springer. (Available on Moodle.)
Recommended Readings:
- Tseng, Y.-H. and Oard, D. W., Document Image Retrieval
Techniques for Chinese. In Proceedings of the 2001 Symposium
on Document Image Understanding Technology, Columbia, MD, 2001.
Available from Doug Oard's
Web site
- David Doermann, "The Indexing and Retrieval of Document Images:
A Survey",
Computer Vision and Image Understanding, 70(3), 287-298,
1998. Available on campus from Science Direct.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 11 (Cross-Language Information Retrieval)
Required Readings:
- Jian-Yun Nie, Cross-Language Information Retrieval, Synthesis Lectures on Human Language Technologies, Morgan & Claypool, 2010. Read only Chapter 3.
Link. (Available on Moodle.)
- Paul McNamee and James Mayfield, 2002. Comparing
cross-language query expansion techniques by degrading
translation resources. Proceedings of the 25th Annual
International ACM SIGIR Conference on Research and Development
in Information Retrieval, 159-166. (Available on Moodle.)
- Daniela Petrelli, Stephen Levin, Micheline Beaulieu and Mark
Sanderson, 2006. Which User Interaction for Cross-Language Information Retrieval?
Design issues and reflections. Journal of the American
Society for Information Science and Technology, 57(5)709-722. (Available on Moodle.)
Recommended Readings:
- Douglas W. Oard, Peter Brusilovsky, Daqing He, Judith Klavans, Tomasz Loboda, Leiming Qian, Dagobert Soergel and Pengyi Zhang, "Formative Evaluation for Multilingual Multimedia Search and Sense-Making," Handbook of Natural Language Processing and Machine Translation, Springer, 2011.
Link
- Gina-Anne Levow, Douglas W. Oard, Philip Resnik,
"Dictionary-Based Techniques for Cross-Language Information
Retrieval," Information Processing and Management, 41(3), 2005.
Available from Levow (.ps) or available on campus from LSU Libraries Electronic Journals Service.
- Daqing He, et al., "Making MIRACLEs: Interactive Translingual
Search for Cebuano and Hindi," ACM Transactions on Asian
Language Information Processing, 2(2-3). Available from the
ACM Digital Library.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 12 (Audio: Speech and Music)
Required Readings:
- Martha Larson and Gareth J.F. Jones, 2012. "Spoken
Content Retrieval: A Survey of Techniques and Technologies,"
Foundations and Trends in Information Retrieval, 5(4-5)235-422. Read only Chapter 2.
- Markus Schedl, Emilia Gomez and Julian Urbano, 2014. Music
Information Retrieval: Recent Developments and
Applications, Foundations and Trends in Information
Retrieval, 8(2-3)127-261. Read only Chapter 1.
- J. Stephen Downie, Andreas F. Ehmann, Mert Bay and M. Cameron
Jones, 2010. "The Music Information Retrieval Evaluation eXchange: Some
Observations and Insights," Advances in Music Information
Retrieval, Vol. 274, pp. 93-115.
Recommended Readings:
- William Byrne et al, "Automatic Recognition of Spontaneous
Speech for Access to Multilingual Oral History Archives," IEEE
Transactions on Speech and Audio Processing, 2004. Available on
campus from IEEE Xplore.
- Nicola Orio, Music Retrieval: A Tutorial and Review, Foundations and Trends in Information Retrieval, 1(1), 2006.
Link
- Elias Pampalk, Simon Dixon and Gerhard Widmer, "Exploring Music
Collections by Browsing Different Views," in International
Conference on Music Information Retrieval, 2003. Available on
the ISMIR 2003
Web site.
- Jonathan Foote, "An Overview of Audio Information Retrieval,"
ACM-Springer Multimedia Systems, 7(1)2-10,
1999. Available from CiteSeer
- John S. Garofolo, Cedric G. P. Auzanne and Ellen M. Voorhees,
"The TREC Spoken Document Retrieval Track: A success story,"
in Proceedings of the Eighth Text Retrieval Conference,
1999, pp. 107-130. Available from the TREC
Web site
- Rodger J. McNab, Lloyd A. Smith, Ian H. Witten, and Clare
L. Henderson, "Tune
Retrieval in the Multimedia Library," Multimedia Tools and
Applications, 10(2-3)113-132, 2000. Available from the
New
Zealand Digital Library Web site.
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 13 (Images and Video)
Required Readings:
- Stefan Rueger, 2010. Multimedia Information Retrieval, Synthesis Lectures on Information Concepts, Retrieval and Services,
Morgan & Claypool. Read only Chapter 3 through the end of Section 3.2. Link.
(Available on Moodle.)
- Paul Over, George Awad, Jon Fiscus, Martial Michel, Alan Smeaton and Wessel Kraaij, 2010.
"TRECVID-2009 Goals, Tasks, Data, Evaluation Mechanisms and Metrics," April 8, 2010.
Available from the TRECVID Web site and
here as PDF.
- Cees G.M. Snoek and Marcel Worring, 2009. Concept-Based
Video Retrieval, Foundations and Trends in Information Retrieval, 2(4). Read only Chapter 3.
Recommended Readings:
- Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search (2nd Edition)
(ACM Press Books), 2011. Chapter 14. Multimedia Information Retrieval
- Venkat N. Gudivada and Vijay V. Raghavan, "Modeling and
Retrieving Images by Content," Information Processing and
Management, 33(4)427-452, 1997. Available on campus from
Science Direct.
- Howard Wactlar et al., "Complementary Audio and Video Analysis
for Broadcast News Archives," Communications of the
ACM, 43(2)42-47, 2000. Available on campus from the ACM
Digital Library.
- Chad Carson, Serge Belongie, Hayit Greenspan and Jitendra
Malik, "Blobworld: Image Segmentation Using
Expectation-Maximization and Its Application to Image Querying,"
IEEE Transactions on Pattern Analysis and Machine Intelligence,
24(8)1026-1038, 2002. Available on campus from IEEE
Xplore.
- Alan Smeaton, Wessel Kraaij and Paul Over, "TRECVID-2003 Video
Retrieval Evaluation Overview," Powerpoint slides, 2003.
Available from the TRECVID
Web site.
-------------------------------------------------------------------------------------------------------------------------------------------------
Session 14: No Class
-------------------------------------------------------------------------------------------------------------------------------------------------
Readings for Session 15 (Future of IR)
Required Readings:
- Cathal Gurrin, Alan F. Smeaton and Aiden R. Doherty, 2014.
Lifelogging: Personal Big Data. Foundations and Trends in Information
Retrieval 8(1):1-107. Read only chapter 4.
- Kira Radinsky and Eric Horvitz, 2013. Mining
the Web to Predict Future Events. In Proceedings of the
Sixth ACM International Conference on Web Search and Data
Mining, 255-264.