×

Algorithms for data migration. (English) Zbl 1184.68629

Summary: The data migration problem is the problem of computing a plan for moving data objects stored on devices in a network from one configuration to another. Load balancing or changing usage patterns might necessitate such a rearrangement of data. In this paper, we consider the case where the objects are fixed-size and the network is complete. Our results are both theoretical and empirical. Our main theoretical results are (1) a polynomial time algorithm for finding a near-optimal migration plan in the presence of space constraints when a certain number of additional nodes is available as temporary storage; and (2) a 3/2-approximation algorithm for the case where data must be migrated directly to its destination. We also run extensive experiments on several algorithms for various data migration problems and show that empirically, many algorithms perform better in practice than their theoretical bounds suggest. We conclude that many of the algorithms we present are both practical and effective for data migration.

MSC:

68W25 Approximation algorithms
Full Text: DOI

References:

[1] Anderson, E., Hall, J., Hartline, J., Hobbes, M., Karlin, A., Saia, J., Swaminithan, R., Wilkes, J.: An experimental study of data migration algorithms. In: Proceedings of the Workshop on Algorithm Engineering (WAE), pp. 145–158 (2001) · Zbl 1002.68626
[2] Anderson, E., Hobbs, M., Keeton, K., Spence, S., Uysal, M., Veitch, A.: Hippodrome: running circles around storage administration. In: Proceedings of the USENIX Conference on File and Storage Technologies (FAST), pp. 175–188 (2002)
[3] Borowsky, E., Golding, R., Merchant, A., Schreier, L., Shriver, E., Spasojevic, M., Wilkes, J.: Using attribute-managed storage to achieve QoS. In: Proceedings of the International Workshop on Quality of Service (IWQoS), pp. 199–202. Columbia University, New York, June 1997
[4] Coffman, E., Garey, M., Johnson, D., Lapaugh, A.: Scheduling file transfers. SIAM J. Comput. 14(3), 744–780 (1985) · Zbl 0604.68039 · doi:10.1137/0214054
[5] Gavish, B., Sheng, O.: Dynamic file migration in distributed computer systems. Commun. ACM 33(2), 177–189 (1990) · doi:10.1145/75577.75583
[6] Golubchik, I., Khuller, S., Khanna, S., Thurimella, R., Zhu, A.: Approximation algorithms for data placement on parallel disks. In: Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 223–232 (2000) · Zbl 0961.68010
[7] Golubchik, L., Khuller, S., Kim, Y., Shargorodskaya, S., Wan, Y.: Data migration on parallel disks. Algorithmica 45(1), 137–158 (2006) · Zbl 1117.68091 · doi:10.1007/s00453-005-1194-6
[8] Grammatikakis, M., Hsu, D., Kraetzl, M., Sibeyn, J.: Packet routing in fixed-connection networks: A survey. J. Parallel Distrib. Process. 54(2), 77–132 (1998) · Zbl 0911.68018 · doi:10.1006/jpdc.1998.1483
[9] Hall, J., Hartline, J., Karlin, A., Saia, J., Wilkes, J.: On algorithms for efficient data migration. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 620–629 (2001) · Zbl 0987.68006
[10] Kashyap, S., Khuller, S., Wan, Y., Golubehik, L.: Fast reconfiguration of data placement in parallel disks. In: Proceedings of the Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 95–107 (2006)
[11] Khuller, S., Kim, Y., Wan, Y.: Algorithms for data migration with cloning. SIAM J. Comput. 33(2), 448–461 (2004) · Zbl 1101.68984 · doi:10.1137/S009753970342585X
[12] Khuller, S., Kim, Y., Malekian, A.: Improved algorithms for data migration. In: Proceedings of the International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), pp. 164–175 (2006) · Zbl 1155.68580
[13] Kim, Y.: Data migration to minimize the average completion time. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 97–98 (2003) · Zbl 1095.68549
[14] Kim, Y.: Data migration to minimize the average completion time. J. Algorithms 55(1), 42–57 (2005) · Zbl 1066.90032 · doi:10.1016/j.jalgor.2004.07.009
[15] Kubale, M.: Preemptive versus nonpreemptive scheduling of biprocessor tasks on dedicated processors. Eur. J. Oper. Res. 94(2), 242–251 (1996) · Zbl 0949.68507 · doi:10.1016/0377-2217(96)00131-2
[16] Micali, S., Vazirani, V.: An \(o(\sqrt{|V|}|e|)\) algorithm for finding a maximum matching in general graphs. In: Proceedings of the Symposium on Foundations of Computer Science (FOCS), pp. 17–27 (1980)
[17] Nishizeki, T., Kashiwagi, K.: On the 1.1 edge-coloring of multigraphs. SIAM J. Discrete Math. 3(3), 391–410 (1990) · Zbl 0702.05036 · doi:10.1137/0403035
[18] Sanders, P., Solis-Oba, R.: How helpers hasten h-relations. In: Proceedings of the European Symposium on Algorithms (ESA), pp. 392–402 (2000) · Zbl 0974.68502
[19] Shannon, C.: A theorem on colouring lines of a network. J. Math. Phys. 28, 148–151 (1949) · Zbl 0032.43203
[20] Transaction Processing Performance Council. TPC Benchmark D (Decision Support) Standard Specification Revision 2.1. Transaction Processing Performance Council (1996)
[21] Vizing, V.: On an estimate of the chromatic class of a p-graph. Discrete Anal. 3, 23–30 (1964) (in Russian)
[22] Wolf, J.: The placement optimization problem: a practical solution to the disk file assignment problem. In: Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp. 1–10 (1989)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.