Hash tables

A hash table (hash map) is a data structure that implements an associative array abstract data type, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

Hashing derives a fixed size result from an input. See Hash table.

Properties of a hashing algorithm

  • Stable: the same input generates the same output every time
  • Uniform: the hash values should be uniformly distributed through the available space
  • Efficient: the cost of generating a hash must be balanced with application need
  • Secure: the cost of finding data that produces a given hash is prohibitive

String hashing

  • Sum ASCII values – not uniform, not secure
  • Fold bytes of every four characters into an integer – not secure
  • CRC32 – not secure
  • MD5 – not efficient, not secure
  • SHA2 – stable, uniform, not efficient, secure

Handling collisions

  • Chaining
  • Open addressing
  • Load/fill factor – the ratio of filled hash table array slots
  • DFS/BFS – depth-first search versus breadth-first
  • Brute force
  • Greedy algo – stall at local maxima
  • Dynamic programming
  • Longest common subsequence
  • Simulated annealing
  • Random solutions
  • Polynomial
  • PTAS – approximation scheme
  • More Hash Function Tests
  • Linear probing
  • Robinhood hashing
  • Cuckoo hashing
  • Preimage (attack)
  • URL shortener