LIS 7410 - Digital Libraries
Spring 2009 -- Section 01
Project (Research Track)
For the project in the Research Track, read below.
For the project in the Professional Track, click here instead.
What is the project about?
This is a research project. In computer science (CS), a good research project
must (i) define a problem, (ii) propose a solution, (iii) implement the solution (simulated or real),
and (iv) evaluate it against any applicable existing solutions or related work.
In library and information science (LIS), a good research project must (i) define research
questions (or one research question), (ii) propose a methodology to investigate the research questions,
(iii) collect data, (iv) analyze the data, and (v) draw a conclusion.
Remember, good research always teaches other researchers something new.
Your research project can take one of the following manifestations:
- New research problem/solution or findings
- [CS] Define a new, interesting problem and propose a solution. Your solution does not have to be
very good, since you are pioneering a new area of research.
- [LIS] Define and investigate a new research question, collect and analyze data, then draw conclusions.
As a pilot study, you may collect a small amount of data and draw preliminary conclusions.
- Existing research problem/new solution or findings
- [CS] Investigate an existing, interesting problem, and propose a novel solution that
is better than existing solutions, which can lead to new ways of
looking at or understanding the problem. Your solution does not have to
outperform existing methods in all categories, but it should do so in
at least one particular domain. For example, we are concerned with digital
libraries in this course. It will suffice if your solution is
statistically significantly better than existing ones on typical
digital library documents, even if it is not better in the more general case.
- [LIS] Investigate an existing, interesting problem, propose methodologies different from existing ones, and
collect and analyze new data.
- Existing research problem/compare existing solutions or findings
- [CS] Look at an existing problem and its solutions. Implement the solutions,
compare them, and provide new insights into why one solution is better
than another. Provide public-domain software so that others can share
and use your work.
- [LIS] Look at an existing problem and its findings from different perspectives or communities,
compare them and provide new insights into what is common, what is different, what is missing, etc.
- [CS] Build an innovative system - Build a novel application that no
one, or few, have built before. But most importantly, identify new
issues in your system that no existing solutions can adequately
solve.
- [CS, LIS] Empirical analysis of some collected data - Researchers often
need to build systems that actually solve or improve on real problems.
Papers that analyze the usability of systems or characterize the data
in some way help others understand the problem or the clientele
(our users) for a particular problem. Ethnographic studies of digital library users
and human evaluation of digital libraries fall in this category.
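One of the [CS] options above asks for a solution that is statistically significantly better on typical digital library documents. The following is a minimal sketch of one way to check such a claim, assuming you have a per-document score for both your system and a baseline on the same documents. It uses a paired randomization (sign-flip) test, which is only one of several reasonable tests. The function name and all scores below are hypothetical and for illustration only.

```python
import random


def paired_randomization_test(scores_a, scores_b, trials=10000, seed=0):
    """Two-sided paired randomization (sign-flip) test on the mean difference.

    Returns an approximate p-value for the null hypothesis that the two
    systems perform equally well on the same set of documents.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(trials):
        # Under the null hypothesis the two systems are interchangeable,
        # so each per-document difference is equally likely to have either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate away from an impossible p-value of 0.
    return (at_least_as_extreme + 1) / (trials + 1)


# Hypothetical per-document scores (for example, F1), for illustration only.
new_system = [0.71, 0.65, 0.80, 0.77, 0.69, 0.74, 0.82, 0.66, 0.73, 0.78]
baseline = [0.64, 0.61, 0.74, 0.70, 0.66, 0.69, 0.75, 0.62, 0.70, 0.71]
p_value = paired_randomization_test(new_system, baseline)
print(f"approximate p-value: {p_value:.4f}")
```

A small p-value (conventionally below 0.05) suggests the difference is unlikely to be due to chance on this document sample. Report the test you used and the sample size alongside the result.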
Work individually, in pairs, or in groups of three (maximum).
Note that grading criteria will not differ between
projects based on team size; individuals and teams of two are often
better coordinated than teams of three, especially in short projects.
Some useful software and data:
- NUS NLP/IR software repository.
- WEKA / SVMlight / BoosTexter / etc. - easy-to-use machine
learning utilities. Probably the easiest to use is BoosTexter,
and the most complete one is WEKA. WEKA includes a number of
different machine learners so that you can do a comparative
analysis of different machine learning algorithms on your data.
- I have some software and data; talk to me if you need some
for your project.
Choosing a project
Below you will find a list of possible final projects.
Since you have selected the research track, I expect and demand that each
individual or team achieve some novel research development or
finding that is not a rehashing of the existing literature. The
survey paper is intended to foster this understanding and
encourage you to poke into new territories.
You are welcome and encouraged to propose alternate
projects. Your topic should blend together your strengths from your
background, experience and current coursework, yet be applicable to
digital libraries research. Teams that take on projects that interest them
or that are relevant to their research or jobs seem to
do best. Some possible project ideas include (but are
not limited to):
- Access and Usability Issues
- Critique of current approaches in crosswalking of metadata
- Organizing photo and video content
- Novel querying tools for E-mail, blogs, and IM
- Multi-object summarization
- The use of VR and immersive environments in the DL
- Efficient social network visualization
- User modeling
- Classifying browsing and searching strategies based on information trails
- Differences in retrieval effectiveness in speech queries
as opposed to text/typed queries
- Conceptual Search / Polysemy and synonymy
- Query expansion and restriction from user query logs
- Characterizing known item queries
- Automatic jargon and terminology canonicalization
- Classification and Filtering
- Automatic ACM classification for theses and technical reports
- Home page interest networking
- Automatic ODP categorization for web sites
- Threading and summarizing blog, email or IM searches
- Digital Library Creation
- Inferring useful metadata for genres of web documents
- Dateline and timeline history collection and canonicalization
- GIS: Integration of maps at different scales
- Digital Library Cataloging and Indexing
- Multimedia Metadata Features
- Digital Library Policy
- Cost models for the digital library in specialized domains/forms of media
- Convenience, user rights and usability of linkages in the digital library
- Exploring the integrity of skyreading/skywriting and its effect on scholarship
- Authorship Analysis
- Styles and Genres for authorship identification in web pages
- Linkage styles and classification for webpage creators
- Linking SMS and chat log short forms to long forms
- Social Network Analysis
- Building a better citation parser
- Web hyperlink classification
- Exploring the relationships between prestige, authorities and hubs
- Centrality and density of different genres of websites
- Automatic computation of an area's journal and conference reputations
You may find it helpful to view past projects by previous
students at NUS in a similar course,
Special Topics in Computer Science.
Project write-up, presentation and grading
Here are some slides on how to do your project proposal.
[ .pdf ]
[ .htm ]
One of the skills that you should practice in a project-based
graduate class is how to report your work. Expert researchers will
tell you that half (if not most) of your time on a project will
go into polishing your paper so that it is straightforward and easy to
read. Generally, filling up the page limit is easy, but
deciding what to omit and how to express your idea succinctly is difficult.
Your team's write-up will take the form of a research paper intended
for a conference submission, with a 10-page limit. You should use an
ACM proceedings style (you can follow the instructions for WWW 2004,
for example). You may supplement this with a reference to your
project's website / blog (if one was created) and any number of
appendices that you feel will help determine a grade. Authors of selected final
projects will be asked to submit their work to a relevant conference
or journal, such as the ones listed on this page.
Grading for the project's final report and presentation is likely
to follow weights similar to those of NUS's digital libraries course.
Final Workload Disclaimer
The project is the primary means by which you will be assessed
in this course. The workload throughout the rest of the course is
purposely light to ensure that you have enough time to produce
high-quality research in the project. As such you need to budget your
team's time wisely and ensure that you have appropriately scoped your
project and covered the topic with enough detail and with appropriate
evaluation. Part-time students with other commitments need to be
particularly aware of this, as past cases have shown that this problem
crops up most often with part-time students.
Some students inevitably start the project too late, or mismanage
their time and neglect open-ended courses such as this one in order to advance in
other classes that have more concrete assessment milestones. I warn you now
to budget your time between classes wisely. You are advised to spend at least
9 hours per week on the course, including 3 hours in class. You should invest about
8 weeks * 6 hours/week = 48 hours on your project, excluding the survey assignment.
Information about the project presentation and a complete
listing of projects at NUS can be found here.
Submit your paper either as a hard copy or as a PDF, PS, or MS Word document. Electronic copies can be posted on your
course website or sent as an email attachment. The due dates for the project paper, deliverables, and presentation are indicated
on the syllabus page.
Acknowledgement to Min-Yen Kan.
Yejun Wu