Abstract
Edit distance is a measurement of similarity between two sequences such as strings, point sequences, or polygonal curves. Many matching problems from a variety of areas, such as signal analysis, bioinformatics, etc., need to be solved in a geometric space. Therefore, the geometric edit distance (GED) has been studied. In this paper, we describe the first strictly sublinear approximate near-linear time algorithm for computing the GED of two point sequences in constant dimensional Euclidean space. Specifically, we present a randomized \(O(n\log ^2n)\) time \(O(\sqrt{n})\)-approximation algorithm. Then, we generalize our result to give a randomized \(\alpha \)-approximation algorithm for any \(\alpha \in [\sqrt{\log n}, \sqrt{n/\log n}]\), running in time \(O(n^2/\alpha ^2 \log n)\). Both algorithms are Monte Carlo and return approximately optimal solutions with high probability.
Similar content being viewed by others
Notes
The (discrete) Fréchet and DTW distances are defined similarly to GED; however, they use one-to-many correspondences instead of one-to-one matchings, and they disallow the use of gap points. As in GED, DTW aims to minimize the sum of distances between corresponding points, while discrete Fréchet distance aims to minimize the maximum distance over corresponding points.
We say an event occurs with high probability if it occurs with probability at least \(1-\frac{1}{n^c}\) for some constant \(c>0\).
This variant of the string edit distance is really closer to the shortest common supersequence length of the strings rather than the traditional Levenshtein distance, but we stick with “edit distance” for simplicity.
References
Abboud, A., Backurs, A., Williams, V.V.: Tight hardness results for LCS and other sequence similarity measures. In: Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, pp. 59–78 (2015)
Agarwal, P.K., Fox, K., Pan, J., Ying, R.: Approximating dynamic time warping and edit distance for a pair of point sequences. In: Proceedings of the 32nd International Symposium on Computational Geometry, pp. 6:1–6:16 (2016)
Andoni, A., Krauthgamer, R., Onak, K.: Polylogarithmic approximation for edit distance and the asymmetric query complexity. In: Proceedings of the IEEE 51st Annual Symposium on Foundations of Computer Science, pp. 377–386 (2010)
Andoni, A., Nosatzki, N.S.: Edit distance in near-linear time: it’s a constant factor. In: Proceedings of the IEEE 61st Annual Symposium on Foundations of Computer Science (2020, to appear)
Backurs, A., Indyk, P.: Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In: Proceedings of the 47th Annual ACM Symposium on Theory of Computing, pp. 51–58 (2015)
Bringmann, K.: Why walking the dog takes time: Fréchet distance has no strongly subquadratic algorithms unless SETH fails. In: Proceedings of the IEEE 55th Annual Symposium on Foundations of Computer Science, pp. 661–670 (2014)
Bringmann, K., Künnemann, M.: Quadratic conditional lower bounds for string problems and dynamic time warping. In: Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, pp. 79–97 (2015)
Bringmann, K., Mulzer, W.: Approximability of the discrete Fréchet distance. JoCG 7(2), 46–76 (2016)
Chakraborty, D., Das, D., Goldenberg, E., Koucky, M., Saks, M.: Approximating edit distance within constant factor in truly sub-quadratic time. In: Proceedings of the 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pp. 979–990. IEEE (2018)
Chan, T.M., Rahmati, Z.: An improved approximation algorithm for the discrete Fréchet distance. Inf. Process. Lett. 138, 72–74 (2018)
Chen, L., Ng, R.: On the marriage of Lp-norms and edit distance. In: Proceedings of the 30th International Conference on Very Large Databases, pp. 792–803 (2004)
Chen, L., Özsu, M.T., Oria, V.: Robust and fast similarity search for moving object trajectories. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, pp. 491–502 (2005)
Fox, K., Li, X.: Approximating the geometric edit distance. In: Proceedings of the 30th International Symposium on Algorithms and Computation, pp. 26:1–26:16 (2019)
Gold, O., Sharir, M.: Dynamic time warping and geometric edit distance: breaking the quadratic barrier. ACM Trans. Algorithms 14(4), 50 (2018)
Har-Peled, S.: Geometric approximation algorithms, chap. 11, Random Partition via Shifting. American Mathematical Society (2011)
Impagliazzo, R., Paturi, R.: On the complexity of k-SAT. J. Comp. Sys. Sci. 62(2), 367–375 (2001)
Kuszmaul, W.: Dynamic time warping in strongly subquadratic time: Algorithms for the low-distance regime and approximate evaluation. In: Proceedings of the 46th International Colloquium on Automata, Languages and Programming (2019)
Landau, G.M., Myers, E.W., Schmidt, J.P.: Incremental string comparison. SIAM J. Comput. 27(2), 557–582 (1998)
Marteau, P.F.: Time warp edit distance with stiffness adjustment for time series matching. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 306–318 (2009)
Masek, W.J., Paterson, M.S.: A faster algorithm computing string edit distances. J. Comput. Syst. Sci. 20(1), 18–31 (1980)
Sankararaman, S., Agarwal, P.K., Mølhave, T., Pan, J., Boedihardjo, A.P.: Model-driven matching and segmentation of trajectories. In: Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 234–243 (2013)
Stojmirovic, A., Yu, Y.K.: Geometric aspects of biological sequence comparison. J. Comput. Biol. 16(4), 579–611 (2009)
Ukkonen, E.: Algorithms for approximate string matching. Inf. Control 64(1–3), 100–118 (1985)
Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. J. ACM 21(1), 168–173 (1974)
Wang, X., Mueen, A., Ding, H., Trajcevski, G., Scheuermann, P., Keogh, E.: Experimental comparison of representation methods and distance measures for time series data. Data Min. Knowl. Disc. 26(2), 275–309 (2013)
Acknowledgements
The authors would like to thank Anne Driemel and Benjamin Raichel for helpful discussions.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A preliminary version of this work appeared in the Proceedings of the 30th International Symposium on Algorithms and Computation [13]. Most of the work was done while the second author was a student at the University of Texas at Dallas.
Rights and permissions
About this article
Cite this article
Fox, K., Li, X. Approximating the Geometric Edit Distance. Algorithmica 84, 2395–2413 (2022). https://doi.org/10.1007/s00453-022-00966-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00453-022-00966-4