### Compactly encoding unstructured inputs with differential compressionCompactly encoding unstructured inputs with differential compression

Access Restriction
Subscribed

 Author Ajtai, Miklos ♦ Burns, Randal ♦ Fagin, Ronald ♦ Long, Darrell D E ♦ Stockmeyer, Larry Source ACM Digital Library Content type Text Publisher Association for Computing Machinery (ACM) File Format PDF Copyright Year ©2002 Language English
 Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science Subject Keyword Delta compression ♦ Differencing ♦ Differential compression Abstract The subject of this article is differential compression, the algorithmic task of finding common strings between versions of data and using them to encode one version compactly by describing it as a set of changes from its companion. A main goal of this work is to present new differencing algorithms that (i) operate at a fine granularity (the atomic unit of change), (ii) make no assumptions about the format or alignment of input data, and (iii) in practice use linear time, use constant space, and give good compression. We present new algorithms, which do not always compress optimally but use considerably less time or space than existing algorithms. One new algorithm runs in $\textit{O}(\textit{n})$ time and $\textit{O}(1)$ space in the worst case (where each unit of space contains [log $\textit{n}]$ bits), as compared to algorithms that run in $\textit{O}(\textit{n})$ time and $\textit{O}(\textit{n})$ space or in $O(n^{2})$ time and $\textit{O}(1)$ space. We introduce two new techniques for differential compression and apply these to give additional algorithms that improve compression and time performance. We experimentally explore the properties of our algorithms by running them on actual versioned data. Finally, we present theoretical results that limit the compression power of differencing algorithms that are restricted to making only a single pass over the data. ISSN 00045411 Age Range 18 to 22 years ♦ above 22 year Educational Use Research Education Level UG and PG Learning Resource Type Article Publisher Date 2002-05-01 Publisher Place New York e-ISSN 1557735X Journal Journal of the ACM (JACM) Volume Number 49 Issue Number 3 Page Count 50 Starting Page 318 Ending Page 367

#### Open content in new tab

Source: ACM Digital Library