### Linear approximation of shortest superstringsLinear approximation of shortest superstrings

Access Restriction
Subscribed

 Author Blum, Avrim ♦ Jiang, Tao ♦ Li, Ming ♦ Tromp, John ♦ Yannakakis, Mihalis Source ACM Digital Library Content type Text Publisher Association for Computing Machinery (ACM) File Format PDF Copyright Year ©1994 Language English
 Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science Subject Keyword Approximation algorithm ♦ Shortest common superstring Abstract We consider the following problem: given a collection of strings $\textit{s1},…,$ $\textit{sm},$ find the shortest string $\textit{s}$ such that each $\textit{si}$ appears as a substring (a consecutive block) of $\textit{s}.$ Although this problem is known to be NP-hard, a simple greedy procedure appears to do quite well and is routinely used in DNA sequencing and data compression practice, namely: repeatedly merge the pair of (distinct) strings with maximum overlap until only one string remains. Let $\textit{n}$ denote the length of the optimal superstring. A common conjecture states that the above greedy procedure produces a superstring of length $\textit{O(n)}$ (in fact, $2\textit{n}),$ yet the only previous nontrivial bound known for any polynomial-time algorithm is a recent $\textit{O(n}$ log $\textit{n})$ result.We show that the greedy algorithm does in fact achieve a constant factor approximation, proving an upper bound of $4\textit{n}.$ Furthermore, we present a simple modified version of the greedy algorithm that we show produces a superstring of length at most $3\textit{n}.$ We also show the superstring problem to be MAXSNP-hard, which implies that a polynomial-time approximation scheme for this problem is unlikely. ISSN 00045411 Age Range 18 to 22 years ♦ above 22 year Educational Use Research Education Level UG and PG Learning Resource Type Article Publisher Date 1994-07-01 Publisher Place New York e-ISSN 1557735X Journal Journal of the ACM (JACM) Volume Number 41 Issue Number 4 Page Count 18 Starting Page 630 Ending Page 647

#### Open content in new tab

Source: ACM Digital Library