×

On duplication-free codes for disjoint or equal-length errors. (English) Zbl 07926031

Summary: Motivated by applications in DNA storage, we study a setting in which strings are affected by tandem-duplication errors. In particular, we look at two settings: disjoint tandem-duplication errors, and equal-length tandem-duplication errors. We construct codes, with positive asymptotic rate, for the two settings, as well as for their combination. Our constructions are duplication-free codes, comprising codewords that do not contain tandem duplications of specific lengths. Additionally, our codes generalize previous constructions, containing them as special cases.

MSC:

68R15 Combinatorics on words
94B25 Combinatorial codes
94B35 Decoding

References:

[1] Ben-Tolila, E.; Schwartz, M., On the reverse-complement string-duplication system, IEEE Trans. Inform. Theory, 68, 11, 7184-7197, 2022 · Zbl 1534.68272 · doi:10.1109/TIT.2022.3182873
[2] Berstel, J., Growth of repetition-free words—a review, Theor. Comp. Sci., 340, 2, 280-290, 2005 · Zbl 1078.68112 · doi:10.1016/j.tcs.2005.03.039
[3] Church, GM; Gao, Y.; Kosuri, S., Next-generation digital information storage in DNA, Science, 337, 1628, 2012 · doi:10.1126/science.1226355
[4] Elishco O.: On the long-term behavior of \(k\)-tuples frequencies in mutation systems. arXiv preprint arXiv:2401.04020 (2024).
[5] Elishco, O.; Farnoud, F.; Schwartz, M.; Bruck, J., The entropy rate of some Pólya string models, IEEE Trans. Inform. Theory, 65, 12, 8180-8193, 2019 · Zbl 1433.68629 · doi:10.1109/TIT.2019.2936556
[6] Farnoud, F.; Schwartz, M.; Bruck, J., The capacity of string-duplication systems, IEEE Trans. Inform. Theory, 62, 2, 811-824, 2016 · Zbl 1359.94317 · doi:10.1109/TIT.2015.2505735
[7] Farnoud, F.; Schwartz, M.; Bruck, J., Estimation of duplication history under a stochastic model for tandem repeats, BMC Bioinform., 20, 64, 1-11, 2019
[8] Goldman, N.; Bertone, P.; Chen, S.; Dessimoz, C.; LeProust, EM; Sipos, B.; Birney, E., Towards practical, high-capacity, low-maintenance information storage in synthesized DNA, Nature, 494, 7435, 77-80, 2013 · doi:10.1038/nature11875
[9] Goshkoder D., Polyanskii N., Vorobyev I.: Codes correcting a single long duplication error. In: Proceedings of the 2023 IEEE International Symposium on Information Theory (ISIT2023), Taipei, Taiwan, pp. 2708-2713 (2023).
[10] Jain, S.; Farnoud, F.; Bruck, J., Capacity and expressiveness of genomic tandem duplication, IEEE Trans. Inform. Theory, 63, 10, 6129-6138, 2017 · Zbl 1390.92088 · doi:10.1109/TIT.2017.2728079
[11] Jain, S.; Farnoud, F.; Schwartz, M.; Bruck, J., Duplication-correcting codes for data storage in the DNA of living organisms, IEEE Trans. Inform. Theory, 63, 8, 4996-5010, 2017 · Zbl 1372.94471 · doi:10.1109/TIT.2017.2688361
[12] Kovačević, M., Zero-error capacity of duplication channels, IEEE Trans. Commun., 67, 10, 6735-6742, 2019 · doi:10.1109/TCOMM.2019.2931342
[13] Lenz, A.; Wachter-Zeh, A.; Yaakobi, E., Duplication-correcting codes, Des. Codes Cryptogr., 87, 277-298, 2019 · Zbl 1448.94289 · doi:10.1007/s10623-018-0523-0
[14] Lind, D.; Marcus, BH, An Introduction to Symbolic Dynamics and Coding, 1985, Cambridge: Cambridge University Press, Cambridge
[15] Marcus B.H., Roth R.M., Siegel P.H.: An Introduction to Coding for Constrained Systems (2001). Unpublished lecture notes. https://personal.math.ubc.ca/ marcus/Handbook/index.html.
[16] Nguyen T.T., Cai K., Song W., Immink K.A.S.: Optimal single chromosome-inversion correcting codes for data storage in live DNA. In: Proceedings of the 2022 IEEE International Symposium on Information Theory (ISIT2022), Espoo, Finland, pp. 1791-1796 (2022).
[17] Shipman, SL; Nivala, J.; Macklis, JD; Church, GM, CRISPR-Cas encoding of digital movie into the genomes of a population of living bacteria, Nature, 547, 345-349, 2017 · doi:10.1038/nature23017
[18] Tang, Y.; Farnoud, F., Error-correcting codes for short tandem duplication and edit errors, IEEE Trans. Inform. Theory, 68, 2, 871-880, 2021 · Zbl 1489.94067 · doi:10.1109/TIT.2021.3125724
[19] Tang, Y.; Yehezkeally, Y.; Schwartz, M.; Farnoud, F., Single-error detection and correction for duplication and substitution channels, IEEE Trans. Inform. Theory, 66, 11, 6908-6919, 2020 · Zbl 1453.94047 · doi:10.1109/TIT.2020.3006228
[20] Tang, Y.; Wang, S.; Lou, H.; Gabrys, R.; Farnoud, F., Low-redundancy codes for correcting multiple short-duplication and edit errors, IEEE Trans. Inform. Theory, 69, 5, 2940-2954, 2023 · Zbl 1542.94064 · doi:10.1109/TIT.2022.3233733
[21] Yohananov L., Schwartz M.: Optimal reverse-complement-duplication error-correcting codes. arXiv preprint arXiv:2312.00394 (2023).
[22] Zeraatpisheh, M.; Esmaeili, M.; Gulliver, TA, Construction of tandem duplication correcting codes, IET Commun., 13, 15, 2217-2225, 2019 · doi:10.1049/iet-com.2018.6053
[23] Zeraatpisheh, M.; Esmaeili, M.; Gulliver, TA, Construction of duplication correcting codes, IEEE Access, 8, 96150-96161, 2020 · doi:10.1109/ACCESS.2020.2995812
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.