CSc 4402 Introduction to Database Management Systems
Fall 2007
Course: Mon & Wed 3:40-5:00 PM Coates 102
Instructor: Donald H. Kraft Office: 286 Coates Phone: 578-2253
Office Hours: Mon & Wed 2:00-3:30 pm . Tues-Thurs 3:00:-4:30 pm
Email: kraft@csc.lsu.edu Web: http://www.csc.lsu.edu/~kraft
Grader: Anindya Poddar Email:
apodda2@lsu.edu
Text: A. Silberschatz, H.F. Korth, and S. Sudarshan. Database System Concepts, 5th edition, McGraw-Hill, 2007 (http://codex.cs.yale.edu/avi/db-book/)
Reference: L. Welling and L. Thomson. PHP and MySQL Web Development, 3rd edition,
Developers Library, 2004
Additional References: C.J. Date, An Introduction to Database Systems, 8th edition,
Addison-Wesley, 2004
Credits: 3 hours Prerequisites: CSc 3102
Course Catalog Description: Network, hierarchical, and relational, and entity-relationship models; data definition, manipulation languages, and conversion among these models; relational database design theory, efficient query evaluation; elementary query optimization techniques.
Course Outcomes: Master the basic concepts and appreciate the applications of database systems; Master the basics of SQL and construct queries using SQL; Be familiar with a commercial relational database system (Access) by writing SQL using the system; Be familiar with the relational database theory, and be able to write relational algebra expressions for queries; Master sound design principles for logical design of databases, including the E-R method and normalization approaches; Be familiar with basic database storage structures and access techniques – file and page organizations and indexing methods including B-trees and hashing; Master the basics of query evaluation techniques and query optimization; Be familiar with the basic issues of transaction processing and concurrency control; and Master working successfully on a team to design and develop a database application (optional).
Topics:
Basic Concepts Chap. 1
Basic terminology, Notion of a database (DB) system, DBMS, DBA, Data independence (physical and logical), Data abstraction, Advantages of a DB, Data models (e.g., hierarchical, E-R model, relational model), Data storage and query processing, DB system architecture (3-level ANSI/Sparc), Languages (DDL, DML, SQL), Design
Relational Model Chap. 2
Data structures, Domains, Attributes, Relations, Schema, Instances, Primary keys, Candidate keys, Foreign keys, Properties of relations, Formal theory underlying relational database systems, Relational data structure, Relational data manipulation, Relational data integrity (entity, referential), Relational algebra (RA, expressions)
SQL Language Chaps. 3-4
Data definition, Data manipulation, Select statement, Simple queries, Join
queries, Aggregate functions, Subqueries, Set-theoretic operators (group
by, having), Similarity and Differences (views versus base tables), Update, Delete, Insert, Data types, Integrity constraints (check constraints, assertions, authorizations), Embedded SQL, Dynamic SQL
PHP Welling-Thomson, Chaps. 1,2
MySQL Welling-Thomson, Chaps. 9-12
Security Welling-Thomson, Chaps. 15,22
Database Design Chaps. 6-7
Design (approaches, pitfalls, redundancy, incompleteness, choices) E-R modeling for DB design (Is-A hierarchy), Cardinality constraints, E-R diagrams, Functional dependencies (fd, Armstrong’s axioms, logical implications), Normalization (normal forms, algorithms to go from 1NF to 2NF to 3NF to BCNF), Examples, Logical design (properties of a good or bad design, Attribute closure, Canonical covers, LLJ and fd-preserving decomposition, BCNF decomposition
Transaction Processing (Management) and
Concurrency Control Chaps. 15-16
Notion of transactions; ACID properties of transactions, Concurrent
Schedules, Serializability, Conflict serializable and view serializable schedules, Use precedence graph, Concurrency control, Locking Protocols, waiting compatibility matrix, Two-phase locking protocol, Dead-lo0ck detection and recovery, Wait-for graphs
Storage and Query Processing Chaps. 11-14
Storage structures, Storage hierarchy, Raid file organization (fixed-length records), Indexing, B-trees, Dense vs. non-dense indexes, Hashing; Query processing, Measures of query processing costs, Implementation of select and join operators, Query optimization (relational algebra level, RA operator level, implementation level), Transformation of RA expressions
Grading:
Homework (7 homeworks) 15%
Hard copy if possible, if late up to 3 days then lose 10 points/day
Program Assignments (5) 30%
Submit source code and output, can submit electronically, if late
up to 5 days then lose 10 points/day
Paper (databases in society issues (see below) 10%
Midterm examination 20%
Final examination - Monday, December 10 5:30-7:30 PM 20%
Participation 5%
Approximate
Grading Scale:
For undergraduate
students: 85-100 A, 75-84 B, 65-74 C, 55-64 D, below 55 F. For graduate students: 90-100 A, 80-89 B,
70-79 C, 60-69 D, below 60 F.
Paper – all students must write a short
(ten pages or so, but do not take that number literally), research paper on
some aspect of computers and databases in society (e.g., privacy, security,
ethics). This should be along the lines
of a bibliographic essay, describing a concept and relaying the
state-of-the-art in terms of the social informatics aspects of the topic. One should not provide a tutorial, but
should focus on research and what the literature is saying. One should not just concatenate
several abstracts from papers nor rely solely on textbooks or the web; in fact,
the use of journal articles, technical reports, and conference proceedings papers
is encouraged. One should weave a
pattern of what is going on with the topic selected. One should also list all
references consulted in the paper.
Remember that copying entire sections, especially without citing them,
of your consulted articles is plagiarism and is academically and intellectually
dishonest – violators will be prosecuted.
The choice of topics must be
approved by the instructor by the end of the third week of the semester. The paper is due by the end of the twelfth
week of the semester. Students taking
this course for graduate credit must write a second paper on some other technical
aspect of databases (e.g., fuzziness in databases, normalization beyond BCNF)
that focuses upon research issues in database management systems topics.
Assignments - the due dates for all homeworks (http://www.csc.lsu.edu/~kraft/courses/csc4402.homeworks.html,
each graded as 100 points) are considered absolute - however, late submissions
are accepted at the cost of 10 points per day up to 3 days (holiday and weekend
counts). Hard-copies are preferred for
homework submissions, but electronic submissions are accepted. No make-ups will
be given for exams. If a student missed the mid-term exam for a justified
reason (illness, family emergency, etc.), then the student's final exam grade
will account for 80 percent of the student's grade for the entire course. Students are responsible for knowing all
information presented in class and materials assigned for self-reading. The students are strongly encouraged to ask
questions in and out of class. It is
recommended that students keep copies of all the submitted materials. It is important that students observe the
student conduct code in doing works in this course. Each homework assignment and exam must be
done by each individual student independently;
"team work" for the examinations (midterm and final) and homework is not allowed. For the project
(http://www.csc.lsu.edu/~kraft/courses/csc4402.project.html,
program assignments), each team (usually consisting of 3-5 students) has
to complete the project together but
independently of other teams. Self study through reading the material
BEFORE (and after) class and completing all the homeworks and project is very
important for learning in this course.
Good luck!