...

A "slice" is the set of pages that belong to a single column within one cluster (the name "slice" might still change). We will most likely need one more level of indirection to get to the cluster object: a footer object points to a list of cluster summaries, and each cluster summary contains the page meta-data for a set of consecutive clusters.
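The indirection described above could be modelled roughly as follows. This is only a sketch: the class names, the fields (`offset`, `size`, `first_cluster`) and the lookup helper `slice_of` are hypothetical and chosen for illustration, not taken from an actual implementation.

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class PageInfo:
    """Meta-data of one page (hypothetical fields)."""
    offset: int  # byte offset of the page in the file
    size: int    # compressed size in bytes


@dataclass
class ClusterSummary:
    """Page meta-data for a set of consecutive clusters."""
    first_cluster: int
    # slices[i][c] is the slice (page list) of column c
    # in cluster number first_cluster + i
    slices: list[list[list[PageInfo]]] = field(default_factory=list)


@dataclass
class Footer:
    """Points to the cluster summaries, assumed sorted by first_cluster."""
    summaries: list[ClusterSummary] = field(default_factory=list)

    def slice_of(self, cluster: int, column: int) -> list[PageInfo]:
        """Return the pages of one column in one cluster."""
        starts = [s.first_cluster for s in self.summaries]
        # Last summary whose first cluster is <= the requested cluster
        idx = bisect_right(starts, cluster) - 1
        if idx < 0:
            raise IndexError(f"cluster {cluster} not covered by any summary")
        summary = self.summaries[idx]
        local = cluster - summary.first_cluster
        if local >= len(summary.slices):
            raise IndexError(f"cluster {cluster} not covered by any summary")
        return summary.slices[local][column]
```

Getting from the footer to a slice then takes two steps: find the cluster summary that covers the cluster, then index into it by cluster and column.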

Scale:

  • A data set can be up to O(10 TB) in size
  • A data set can have O(1000) columns
  • A cluster is 10-100 MB (e.g. a 10 TB data set with 100 MB clusters has 10^5 clusters)
  • A page is O(10 kB) of compressed data (i.e. a 100 MB cluster can have up to 10^4 pages)
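The orders of magnitude above can be checked with a few lines of arithmetic. The sketch below assumes decimal units, the upper cluster size of 100 MB, and pages spread evenly over the columns. The derived figures for pages per slice and total pages follow from those assumptions and are not stated in the list itself.

```python
# Decimal units (assumption: 1 kB = 10^3 bytes, and so on)
KB, MB, TB = 10**3, 10**6, 10**12

dataset_size = 10 * TB    # upper end of the data set size
n_columns = 1000          # O(1000) columns
cluster_size = 100 * MB   # upper end of the 10-100 MB range
page_size = 10 * KB       # O(10 kB) of compressed data

clusters_per_dataset = dataset_size // cluster_size
pages_per_cluster = cluster_size // page_size
# Assumes pages are spread evenly over all columns
pages_per_slice = pages_per_cluster // n_columns
total_pages = clusters_per_dataset * pages_per_cluster

print(f"clusters per data set: {clusters_per_dataset:.0e}")
print(f"pages per cluster:     {pages_per_cluster:.0e}")
print(f"pages per slice:       {pages_per_slice}")
print(f"pages per data set:    {total_pages:.0e}")
```

With 10 MB clusters instead, the cluster count rises to 10^6 while the total number of pages stays the same.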

...