University of Utah School of Computing
CS 2420:
Introduction to Computer Science II
Homework 9
Assigned: Friday, Nov 7, 2014
Due: Monday, Nov 17, 2014, at 3pm.

Homework 9

Part I: Implement a Hash Table

Implement a hash table. Your implementation should support linear probing, quadratic probing, and separate chaining (separate chaining can be implemented in a seperate class). Your hash table should have a goal load factor (perhaps passed to the constructor). When the load factor goes beyond this value the tablesize should be increased. You must support insert(), find(), and delete().

Part II: Implement three (3) hash functions

Implement at least three different hash functions. One should be a "bad" hash function. One should be an "okay" hash function. One should be a "good" hash function.

Part III: Use your hash table to compute common words

Download some books as plain text files (see, e.g. Project Gutenberg). Write a program to load the text file. Use the words as keys in your hash table. When a word is encountered for the first time add it to the hash table with a counter initialized to 1. Every subsequent time the word is used increment the counter. After processing the text file your program should output the 25 most common words along with how many times they were seen.

Part IV: Compute collisions and perform timing experiments

For each of your collision handling mechanisms and each of the hash functions, compute the total number of collisions and the time required to process a book. Are the results as you expected?

Part V: Written Homework

What to hand in

Create a web page for the assignment. Post the output of your program and an analysis of your hash functions and collision handling mechanisms. Include numbers and what conclusions you can draw from the experiments. Use the CADE online handin system to handin your source code. Please zip and handin your entire project directory (including an info.txt file). If you work in a team only one of you need to submit the source code.

Grading

Each homework assignment will receive a grade out of 100 points. This week, the available points will be broken down like so:
  • 35 points: Hash Tables
  • 10 points: Hash Functions
  • 10 points: Word frequency calculator
  • 15 points: Written Homework
  • 10 points: Comments describing the class, each method as well as the interesting pieces of the code itself
  • 10 points: Elegence of programming
  • 10 points: Web page, experiements, and analysis