Skip to main content
Log in

Approximating Tree Edit Distance through String Edit Distance

  • Published:
Algorithmica Aims and scope Submit manuscript

Abstract

We present an algorithm to approximate edit distance between two ordered and rooted trees of bounded degree. In this algorithm, each input tree is transformed into a string by computing the Euler string, where labels of some edges in the input trees are modified so that structures of small subtrees are reflected to the labels. We show that the edit distance between trees is at least 1/6 and at most O(n 3/4) of the edit distance between the transformed strings, where n is the maximum size of two input trees and we assume unit cost edit operations for both trees and strings. The algorithm works in O(n 2) time since computation of edit distance and reconstruction of tree mapping from string alignment takes O(n 2) time though transformation itself can be done in O(n) time.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Akutsu, T.: A relation between edit distance for ordered trees and edit distance for Euler strings. Inf. Process. Lett. 100, 105–109 (2006)

    Article  MATH  MathSciNet  Google Scholar 

  2. Bar-Yossef, Z., Jayram, T., Krauthgamer, R., Kumar, R.: Approximating edit distance efficiently. In: Proc. 45th IEEE Symp. Foundations on Computer Science, pp. 550–559 (2004)

  3. Batu, T., Ergun, F., Sahinalp, S.C.: Oblivious string embeddings and edit distance approximations. In: Proc. 17th ACM-SIAM Symp. Discrete Algorithms, pp. 792–801 (2006)

  4. Bille, P.: A survey on tree edit distance and related problem. Theor. Comput. Sci. 337, 217–239 (2005)

    Article  MATH  MathSciNet  Google Scholar 

  5. Chen, W.: New algorithm for ordered tree-to-tree correction problem. J. Algorithms 40, 135–158 (2001)

    Article  MATH  MathSciNet  Google Scholar 

  6. Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. In: Proc. 13th ACM-SIAM Symp. Discrete Algorithms, pp. 667–676 (2002)

  7. Cormode, G., Paterson, M., Sahinalp, S.C., Vishkin, U.: Communication complexity of document exchange. In: Proc. 11st ACM-SIAM Symp. Discrete Algorithms, pp. 197–206 (2000)

  8. Czumaj, A., Gasieniec, L.: On the complexity of determining the period of a string. In: Proc. 11th Symp. Combinatorial Pattern Matching, pp. 412–422 (2000)

  9. Demaine, E., Mozes, S., Rossman, B., Weimann, O.: An optimal decomposition algorithm for tree edit distance. In: Proc. 34th International Colloquium on Automata, Languages and Programming, pp. 146–157 (2007)

  10. Garofalakis, M., Kumar, A.: XML stream processing using tree-edit distance embedding. ACM Trans. Database Syst. 30, 279–332 (2005)

    Article  Google Scholar 

  11. Grossi, R.: On finding common subtrees. Theor. Comput. Sci. 108, 345–256 (1993)

    Article  MATH  MathSciNet  Google Scholar 

  12. Guha, S., Jagadish, H.V., Koudas, N., Srivastava, D., Yu, T.: Approximate XML joins. In: Proc. ACM SIGMOD, pp. 287–298 (2002)

  13. Khot, S., Naor, A.: Nonembeddability theorems via Fourier analysis. In: Proc. 46th IEEE Symp. Foundations on Computer Science, pp. 101–110 (2005)

  14. Klein, P.N.: Computing the edit-distance between unrooted ordered trees. In: Proc. 6th European Symp. Algorithms, pp. 91–102 (1998)

  15. Krauthgamer, R., Rabani, Y.: Improved lower bounds for embeddings into L 1. In: Proc. 17th ACM-SIAM Symp. Discrete Algorithms, pp. 1010–1017 (2006)

  16. Muthukrishnan, S., Sahinalp, S.C.: Approximate nearest neighbors and sequence comparison with block operations. In: Proc. 32nd ACM Symp. Theory of Computing, pp. 412–416 (2000)

  17. Ostrovsky, R., Rabani, Y.: Low distortion embeddings for edit distance. In: Proc. 37th ACM Symp. Theory of Computing, pp. 218–224 (2005)

  18. Tai, K.-C.: The tree-to-tree correction problem. J. ACM 26, 422–433 (1979)

    MATH  MathSciNet  Google Scholar 

  19. Valiente, G.: Algorithms on Trees and Graphs. Springer, Berlin (2002)

    MATH  Google Scholar 

  20. Yang, R., Kalnis, P., Tung, A.K.H.: Similarity evaluation on tree-structured data. In: Proc. ACM SIGMOD, pp. 754–765 (2005)

  21. Zhang, K., Shasha, D.: Simple fast algorithms for the editing distance between trees and related problems. SIAM J. Comput. 18, 1245–1262 (1989)

    Article  MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Tatsuya Akutsu.

Additional information

A preliminary version of this paper appeared in the Proceedings of the 17th Annual International Symposium on Algorithms and Computation (ISAAC’06), Lecture Notes in Computer Science, 4288, pp. 90–99, 2006. This work is partially supported by Grants-in-Aid on Scientific Research on Priority Areas “Cyber Infrastructure for the Information-explosion Era” and “Systems Genomics”, and Grant-in-Aid #16300092 from MEXT, Japan.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Akutsu, T., Fukagawa, D. & Takasu, A. Approximating Tree Edit Distance through String Edit Distance. Algorithmica 57, 325–348 (2010). https://doi.org/10.1007/s00453-008-9213-z

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00453-008-9213-z

Keywords

Navigation