CS345 – Topics in Data Warehousing Autumn 2004

Assignments
There will be five homework assignments: one problem set and four small programming assignments. The goal of the assignments is for students to gain hands-on experience with the ideas convered in the class.Students who find their course projects to be particularly interesting have the option to expand the scope of their project instead of completing one or two of the assignments. Make arrangements with the instructor if you are interested in electing this option.
Late Policy
Homework assignments will be given out on Thursdays and will be due 12 days later (one week from the following Tuesday).Late assignments will be accepted without penalty, provided that they are submitted no more than three days late (that is, by 11:59 pm on the Friday following the due date). No submissions will be accepted that are more than three days late.Students are strongly encouraged to complete the assignments by the nominal due date, rather than relying on the automatic three-day extension. Requests for additional extensions due to exceptional circumstances will be considered on a case-by-case basis, provided they are received before the nominal due date of the assignment.
Programming Environment
All of the assignments after the first involve writing programs that interact with a database. The database management system we are using for this class is Oracle. Each student in the class will have an Oracle account created. A web page with useful tips about interacting with Oracle is here.You may use the programming language of your choice for the programming assignments.
Assignment #1: Data Modeling Trade-Offs
Due Date: Tuesday, October 19This assignment will be a written problem set. It will consist of various data modeling scenarios. In each scenario, you will be asked to compare alternative schema designs, listing the pros and cons of each.Assignment #1 (Postscript format)
Assignment #2: Extraction, Transformation, and Load
Due Date: Tuesday, November 2In this programming assignment, you will extract data from three different source systems–a web site, a relational database, and a flat file–and load it into a data warehouse. The assignment will include both an initial load and an incremental load.Assignment #2 (Postscript format)Logging into Oracle using sqlplus Also describes how to change your password.CS 145 Oracle Tips Page Lots of helpful info.Using the Oracle bulk loader (sqlldr)Using TOAD / TOra on your Windows or Linux machine“Hello World” in JDBCDifferences between Players source system tables and Players dimension tableUsing the extract.pl scriptTest queries to validate your data warehouse
Assignment #3: Aggregate Selection and Navigation
Due Date: Friday, November 19(Late assignments will be accepted without penalty until Monday, November 22, 11:59 PM.)This programming assignment has three parts. In the first part, you will implement an algorithm for selecting a good set of precomputed aggregate tables (materialized views) to create for a given database. In the second part, you will generate the SQL to create the aggregate tables that were selected in the first part. In the third part, you will implement a query re-writer that takes an aggregation query that is written to run against the fact table, selects the most efficient aggregate table to use for that query, and re-writes the query to execute against the aggregate table.Assignment #3 (Postscript format)Metadata Tables Describes the format of the metadata tables in which configuration information is stored.Scripts that are providedSQL query formatStar schema for testingSpecifics of the aggregate selection algorithmSome tips for the aggregate creation phase of the assignmentResults on test data Results of the aggregate selection algorithm on the two test data sets. Use these to verify that your algorithm is working correctly.
Assignment #4: Decision Tree Learning
Due Date: Thursday, December 2(Late assignments will be accepted without penalty until Sunday, December 5, 11:59 PM.)In this programming assignment, you will implement the ID3 algorithm to learn a decision tree classifier, along with pruning using a holdout set. The data to be classified will be stored in the database, and computation of the statistics needed by the classifier will be performed using SQL.Assignment #4 (Postscript format)Scripts that are providedResults on test data You can compare the decision tree learned by your program with these test results.

What Students Are Saying About Us

.......... Customer ID: 12*** | Rating: ⭐⭐⭐⭐⭐
"Honestly, I was afraid to send my paper to you, but splendidwritings.com proved they are a trustworthy service. My essay was done in less than a day, and I received a brilliant piece. I didn’t even believe it was my essay at first 🙂 Great job, thank you!"

.......... Customer ID: 14***| Rating: ⭐⭐⭐⭐⭐
"The company has some nice prices and good content. I ordered a term paper here and got a very good one. I'll keep ordering from this website."

"Order a Custom Paper on Similar Assignment! No Plagiarism! Enjoy 20% Discount"