Thumbnail
Access Restriction
Subscribed

Author Bookstein, Abraham
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Language English
Subject Keyword Information storage and retrieval ♦ Substring ♦ Hashing ♦ String
Abstract This note comments on a technique by Malcolm Harrison [1] that tests whether a given string of characters, S1, is a substring of another string of characters, S2. This technique first transforms S2 into the set consisting of all the smaller fixed size substrings of length k and then hashes each of these segments into the m positions of a computer “word”; the bit corresponding to a position in this word is turned on if at least one segment is hashed onto it; else it is zero. The test is based on the observation that the same procedure applied to an arbitrary substring of the first string will not turn on any new bits. In the following we shall assume that S1 and S2 are decomposed into l1 and l2 segments respectively.
Description Affiliation: The Univ. of Chicago, Chicago, IL (Bookstein, Abraham)
Age Range 18 to 22 years ♦ above 22 year
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2005-08-01
Publisher Place New York
Journal Communications of the ACM (CACM)
Volume Number 16
Issue Number 3
Page Count 2
Starting Page 180
Ending Page 181


Open content in new tab

   Open content in new tab
Source: ACM Digital Library