Thumbnail
Access Restriction
Subscribed

Author Sheil, B. A.
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Language English
Subject Keyword Dictionary lookup ♦ Binary search ♦ Heaps ♦ Balanced trees ♦ Zipf's law ♦ Information retrieval ♦ Tree search
Abstract Split trees are a new technique for searching sets of keys with highly skewed frequency distributions. A split tree is a binary search tree each node of which contains two key values—a node value which is a maximally frequent key in that subtree, and a split value which partitions the remaining keys (with respect to their lexical ordering) between the left and right subtrees. A median split tree (MST) uses the lexical median of a node's descendents as its split value to force the search tree to be perfectly balanced, achieving both a space efficient representation of the tree and high search speed. Unlike frequency ordered binary search trees, the cost of a successful search of an MST is log n bounded and very stable around minimal values. Further, an MST can be built for a given key ordering and set of frequencies in time n log n, as opposed to n2 for an optimum binary search tree. A discussion of the application of MST's to dictionary lookup for English is presented, and the performance obtained is contrasted with that of other techniques.
Description Affiliation: Harvard Univ. (Sheil, B. A.)
Age Range 18 to 22 years ♦ above 22 year
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2005-08-01
Publisher Place New York
Journal Communications of the ACM (CACM)
Volume Number 21
Issue Number 11
Page Count 12
Starting Page 947
Ending Page 958


Open content in new tab

   Open content in new tab
Source: ACM Digital Library