Next: Chapter 6: Priority Queues
Up: CS 3345: Algorithm Analysis
Previous: Chapter 4: Trees
Hashing
- Hash Table:
- an array of fixed size
- Hash Function:
- maps keys into numbers in the range 0 to TableSize - 1
- Goal:
- distribute keys evenly among array elements
- Collision:
- two keys hash to same value
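As a minimal sketch of these definitions, the following Python snippet maps string keys into the range 0 to table_size - 1. The function name `hash_string`, the multiplier 31, and the table size 11 are illustrative assumptions, not part of the notes.

```python
def hash_string(key: str, table_size: int) -> int:
    """Map a string key to an index in 0 .. table_size - 1."""
    h = 0
    for ch in key:
        # Horner's rule: treat the key as a polynomial in 31 (an assumed
        # multiplier), reducing modulo the table size at every step.
        h = (h * 31 + ord(ch)) % table_size
    return h


table_size = 11  # assumed small prime table size
keys = [f"key{i}" for i in range(12)]
indices = [hash_string(k, table_size) for k in keys]

# Twelve keys cannot fit into eleven cells without sharing one,
# so at least one collision is unavoidable here.
has_collision = len(set(indices)) < len(keys)
```

Whatever the hash function, once there are more keys than cells a collision must occur, which is why a collision resolution strategy is always needed.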
Open Hashing (Separate Chaining)
- use a hash function to determine hash value
- keep a list of all elements that hash to the same value
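Separate chaining can be sketched as follows, assuming Python lists stand in for the linked lists and Python's built-in `hash` stands in for the hash function. The class and method names are hypothetical.

```python
class ChainedHashTable:
    """Separate chaining: each cell holds a list of the keys that hash there."""

    def __init__(self, table_size=11):
        self.table_size = table_size
        self.buckets = [[] for _ in range(table_size)]
        self.count = 0

    def _index(self, key):
        # Hash value of the key, reduced to the range 0 .. table_size - 1.
        return hash(key) % self.table_size

    def insert(self, key):
        bucket = self.buckets[self._index(key)]
        if key not in bucket:  # ignore duplicates
            bucket.append(key)
            self.count += 1

    def contains(self, key):
        # Only the one list selected by the hash value is searched.
        return key in self.buckets[self._index(key)]

    def remove(self, key):
        bucket = self.buckets[self._index(key)]
        if key in bucket:
            bucket.remove(key)
            self.count -= 1


table = ChainedHashTable(11)
for k in (0, 11, 22, 5):
    table.insert(k)
# 0, 11 and 22 all hash to cell 0 and share one list; 5 sits alone in cell 5.
```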
Load Factor of Hash Table ($\lambda$)
$\lambda$ = # of elements in hash table / table size
- Successful search:
- about $1 + \lambda/2$ links traversed on average
- Unsuccessful search:
- $\lambda$ links traversed on average
Therefore:
- make table size large (to reduce $\lambda$)
- table size = prime number: helps in good distribution
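The effect of the load factor on chain length can be checked empirically. The sketch below assumes integer keys hashed with `key % table_size`, a prime table size of 10007, and randomly drawn keys; all of these are illustrative choices. It compares the measured averages with the usual estimates for separate chaining: the load factor for an unsuccessful search and about 1 plus half the load factor for a successful one.

```python
import random


def chain_lengths(keys, table_size):
    """Length of each chain when keys are hashed with key % table_size."""
    lengths = [0] * table_size
    for k in keys:
        lengths[k % table_size] += 1
    return lengths


random.seed(1)                       # fixed seed so the run is repeatable
table_size = 10007                   # assumed prime table size
keys = random.sample(range(10**9), 20000)
lengths = chain_lengths(keys, table_size)

lam = len(keys) / table_size         # load factor

# Unsuccessful search: the whole chain of a random cell is traversed.
avg_unsuccessful = sum(lengths) / table_size

# Successful search: the key at position j of its chain costs j links,
# so a chain of length n contributes 1 + 2 + ... + n in total.
avg_successful = sum(n * (n + 1) // 2 for n in lengths) / len(keys)

print(f"load factor      = {lam:.3f}")
print(f"unsuccessful avg = {avg_unsuccessful:.3f}")
print(f"successful avg   = {avg_successful:.3f}")
```

The unsuccessful average equals the load factor exactly, since it is just the mean chain length; the successful average only approximates the estimate and varies slightly with the keys drawn.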
Open Addressing
- try alternative cells on collision
- stop when empty cell found
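- cells tried in succession: $h_i(x) = (\mathrm{hash}(x) + f(i)) \bmod \mathrm{TableSize}$ for $i = 0, 1, 2, \ldots$, where f(0) = 0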
- does not require pointers
- a bigger table needed
- lower load factor ($\lambda$ generally kept below 0.5)
- lazy deletion
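A minimal sketch of open addressing with lazy deletion follows. It assumes the i-th probe goes to cell (hash(x) + f(i)) mod table size, with the function f passed in as a parameter (linear probing by default). The sentinel objects and all class and method names are hypothetical.

```python
EMPTY = object()    # cell never used
DELETED = object()  # cell whose key was lazily deleted


class OpenAddressingTable:
    def __init__(self, table_size=11, f=lambda i: i):
        self.table_size = table_size
        self.f = f  # collision resolution function, with f(0) == 0
        self.cells = [EMPTY] * table_size

    def _probe(self, key):
        # Cells h_0(x), h_1(x), ... tried in succession.
        for i in range(self.table_size):
            yield (hash(key) + self.f(i)) % self.table_size

    def contains(self, key):
        for idx in self._probe(key):
            cell = self.cells[idx]
            if cell is EMPTY:
                return False  # a never-used cell ends the search
            if cell is not DELETED and cell == key:
                return True
        return False

    def insert(self, key):
        if self.contains(key):
            return True
        for idx in self._probe(key):
            if self.cells[idx] is EMPTY or self.cells[idx] is DELETED:
                self.cells[idx] = key
                return True
        return False  # no free cell found along the probe sequence

    def remove(self, key):
        for idx in self._probe(key):
            cell = self.cells[idx]
            if cell is EMPTY:
                return False
            if cell is not DELETED and cell == key:
                # Lazy deletion: mark the cell instead of emptying it, so
                # keys placed further along the probe sequence stay reachable.
                self.cells[idx] = DELETED
                return True
        return False


t = OpenAddressingTable(11)
for k in (0, 11, 22):  # all three keys hash to cell 0
    t.insert(k)
t.remove(11)           # cell 1 becomes DELETED, not EMPTY
```

If `remove` reset the cell to `EMPTY` instead, a later search for 22 would stop at cell 1 and wrongly report the key as absent.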
Collision Resolution Strategies
Linear Probing: f(i) = i
- free cell can be found if table is not full
- primary cluster formation: time consuming
- can be very expensive if table is quite full
Expected number of probes:
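- insertion, unsuccessful search: $\frac{1}{2}\left(1 + \frac{1}{(1-\lambda)^2}\right)$
- successful searches: $\frac{1}{2}\left(1 + \frac{1}{1-\lambda}\right)$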
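To see how quickly linear probing degrades as the table fills, the standard textbook estimates for the expected number of probes can be evaluated at a few load factors. The function names below are illustrative.

```python
def probes_unsuccessful(lam: float) -> float:
    """Expected probes for an insertion or unsuccessful search:
    (1/2) * (1 + 1 / (1 - lam)^2)."""
    return 0.5 * (1 + 1 / (1 - lam) ** 2)


def probes_successful(lam: float) -> float:
    """Expected probes for a successful search:
    (1/2) * (1 + 1 / (1 - lam))."""
    return 0.5 * (1 + 1 / (1 - lam))


for lam in (0.25, 0.5, 0.75, 0.9):
    print(f"lambda = {lam:4.2f}: "
          f"unsuccessful ~ {probes_unsuccessful(lam):6.2f}, "
          f"successful ~ {probes_successful(lam):5.2f}")
```

At a load factor of 0.5 the estimates are 2.5 and 1.5 probes; at 0.9 they rise to about 50.5 and 5.5, which is why the table should not be allowed to become nearly full.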
Collision Resolution (contd.)
Quadratic Probing: f(i) = i^2
- eliminates primary clustering problem
- no guarantee of finding an empty cell once the table is more than half full (especially if table size is not prime)
- at most half the table can be used as alternative locations for conflict resolution
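The limits of quadratic probing can be illustrated by counting how many distinct cells the probe sequence (home + i^2) mod table size can ever reach. The helper name and the table sizes 11 and 16 are chosen only for illustration.

```python
def quadratic_cells(home: int, table_size: int) -> set:
    """All cells reachable from `home` with f(i) = i * i."""
    # i and i + table_size give the same cell, so i < table_size suffices.
    return {(home + i * i) % table_size for i in range(table_size)}


prime_cells = quadratic_cells(0, 11)      # prime table size
nonprime_cells = quadratic_cells(0, 16)   # non-prime table size

print(sorted(prime_cells))      # [0, 1, 3, 4, 5, 9]
print(sorted(nonprime_cells))   # [0, 1, 4, 9]
```

With the prime size 11, six of the eleven cells are reachable (about half the table). With size 16, only four cells are reachable, so an insertion can fail even though most of the table is empty.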
Double Hashing: f(i) = i * hash2(x)
- very efficient for good choices of hash2 function
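A sketch of a double hashing probe sequence, assuming integer keys, a prime table size of 11, and the commonly used second hash function hash2(x) = R - (x mod R) with R = 7, a prime smaller than the table size. These particular choices are assumptions made for the example.

```python
R = 7  # assumed prime smaller than the table size


def hash2(key: int) -> int:
    # Never evaluates to 0, so every probe moves to a different cell.
    return R - (key % R)


def probe_sequence(key: int, table_size: int) -> list:
    """Cells tried for `key`: (hash(x) + i * hash2(x)) mod table_size."""
    step = hash2(key)
    return [(key + i * step) % table_size for i in range(table_size)]


seq = probe_sequence(49, 11)
print(seq)  # [5, 1, 8, 4, 0, 7, 3, 10, 6, 2, 9]
```

Because the table size is prime and the step is never a multiple of it, the sequence visits every cell once. Keys with the same home cell usually have different steps, so they follow different probe sequences.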
Rehashing
- Open addressing may have poor performance when table gets too full.
- Solution:
- build another table about twice as big,
- use a new hash function,
- compute new hash value for each nondeleted element,
- insert elements in new table.
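The rehashing steps can be sketched as follows, assuming a linear probing table stored as a Python list in which `None` marks an empty cell and a sentinel marks a lazily deleted one. The new size is taken as the first prime at least twice the old size, and the "new hash function" is simply the key reduced modulo the new size. All names are hypothetical.

```python
DELETED = object()  # sentinel for lazily deleted cells


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def next_prime(n: int) -> int:
    while not is_prime(n):
        n += 1
    return n


def insert(cells: list, key: int) -> None:
    """Linear probing insert; the hash function depends on len(cells)."""
    size = len(cells)
    for i in range(size):
        idx = (key + i) % size
        if cells[idx] is None or cells[idx] is DELETED:
            cells[idx] = key
            return
    raise RuntimeError("table is full")


def rehash(cells: list) -> list:
    # Build another table about twice as big (rounded up to a prime).
    new_cells = [None] * next_prime(2 * len(cells))
    for cell in cells:
        # Only nondeleted elements are carried over.
        if cell is not None and cell is not DELETED:
            insert(new_cells, cell)  # new hash value: key % new size
    return new_cells


old = [None] * 7
for k in (0, 7, 14, 3):
    insert(old, k)
old[1] = DELETED    # lazily delete key 7, which was stored in cell 1
new = rehash(old)   # 17 cells; keys 0, 3 and 14 land in their home cells
```

Rehashing also discards the deleted markers, so the new table starts without the clutter left by lazy deletion.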
Extendible Hashing
- Maintain a directory.
- Inserts and finds performed with few disk accesses.
- Possible to restructure the directory without actually accessing the data.
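The directory idea can be sketched in memory as follows. This is a simplified illustration, not a disk-based implementation: buckets are Python objects standing in for disk pages, the directory is indexed by the low-order bits of the hash value (textbook presentations often use the leading bits instead), and each bucket holds at most two keys. All names are hypothetical.

```python
class Bucket:
    def __init__(self, depth: int):
        self.depth = depth  # number of hash bits this bucket distinguishes
        self.keys = []


class ExtendibleHash:
    def __init__(self, bucket_capacity: int = 2):
        self.capacity = bucket_capacity
        self.global_depth = 1
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, key) -> int:
        # Directory slot = low-order global_depth bits of the hash value.
        return hash(key) & ((1 << self.global_depth) - 1)

    def contains(self, key) -> bool:
        # One directory lookup, then a single bucket (page) is read.
        return key in self.directory[self._index(key)].keys

    def insert(self, key) -> None:
        # Note: keys with identical hash values beyond the bucket capacity
        # would split forever; this sketch assumes that never happens.
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.keys:
                return
            if len(bucket.keys) < self.capacity:
                bucket.keys.append(key)
                return
            self._split(bucket)

    def _split(self, bucket: Bucket) -> None:
        if bucket.depth == self.global_depth:
            # Double the directory: only references are copied,
            # no bucket contents are read or moved.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        high_bit = 1 << (bucket.depth - 1)
        # Redistribute the keys of the one overflowing bucket.
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:
            (sibling if hash(k) & high_bit else bucket).keys.append(k)
        # Repoint the directory slots that now belong to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling


h = ExtendibleHash(bucket_capacity=2)
for k in range(5):
    h.insert(k)
# Inserting 4 overflows the bucket holding 0 and 2, so the directory
# doubles to four slots; slots 1 and 3 still share one unsplit bucket.
```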
Ravi Prakash
1999-11-17