×

Performance assessment and reliability analysis of dependable and distributed computing systems based on BDD and recursive merge. (English) Zbl 1214.68075

Summary: System reliability evaluation, sensitivity analysis, failure frequency analysis, importance measures, and optimal design are important issues that have become research topics for distributed dependable computing. Finding all of the Minimal File Spanning Trees (MFST) and avoiding repeatedly computing the redundant MFSTs have been key techniques for evaluating the reliability of a distributed computing system (DCS) in previous works. However, identifying all of the disjointed MFSTs is difficult and time consuming for large-scale networks. Although existing algorithms have been demonstrated to work well on medium-scale networks, they have two inherent drawbacks. First, they do not support efficient manipulation of Boolean algebra. The sum-of-disjoint-products method used by these algorithms is inefficient when dealing with large Boolean functions. Second, the tree-based partitioning algorithm does not merge isomorphic sub-problems, and therefore, redundant computations cannot be avoided. In this paper, we propose a new efficient algorithm for the reliability evaluation of a DCS based on the recursive merge and the binary decision diagram (BDD). Using the BDD substitution method, we can easily apply our algorithm to a network with imperfect nodes. The experimental results show a significant improvement in the execution time compared to previous works.

MSC:

68M14 Distributed systems
68M20 Performance evaluation, queueing, and scheduling in the context of computer systems
68M15 Reliability, testing and fault tolerance of networks and computer systems
Full Text: DOI

References:

[1] Kumar, V. K.P.; Hariri, S.; Raghavendra, C. S., Distributed program reliability analysis, IEEE Transactions on Software Engineering, 12, 42-50 (1986) · Zbl 0575.68028
[2] Raghavendra, C. S.; Kumar, V. K.P.; Hariri, S., Reliability analysis in distributed systems, IEEE Transactions on Computers, 37, 3, 352-358 (1988)
[3] Kumar, A.; Rai, S.; Agrawal, D. P., On computer communication network reliability under program execution constraints, IEEE Journal of Selected Areas in Communications, 6, 1393-1400 (1988)
[4] Hariri, S.; Ragavendra, C. S., SYREL: a symbolic reliability algorithm based on path and cut set methods, IEEE Transactions on Computers, 36, 1224-1232 (1987)
[5] Kumar, A.; Agrawal, D. P., A generalized algorithm for evaluating distributed-program reliability, IEEE Transactions on Reliability, 42, 3, 416-426 (1993) · Zbl 0800.68187
[6] Chen, D. J.; Huang, T. H., Reliability analysis of distributed systems based on a fast reliability algorithm, IEEE Transactions on Parallel and Distributed Systems, 3, 2, 139-154 (1992)
[7] Chen, D. J.; Chen, R. S.; Huang, T. H., A heuristic approach to generating file spanning trees for reliability analysis of distributed computing systems, Computers and Mathematics with Application, 34, 115-131 (1997) · Zbl 0903.68010
[8] Ke, W. J.; Wang, S. D., Reliability evaluation for distributed computing networks with imperfect nodes, IEEE Transactions on Reliability, 46, 3, 342-349 (1997)
[10] Lin, M. S.; Chang, M. S.; Chen, D. J., Efficient algorithms for reliability analysis of distributed computing systems, Information Sciences, 117, 89-106 (1999)
[11] Chang, Y. R.; Amari, S. V.; Kuo, S. Y., OBDD-based evaluation of reliability and importance measures for multistate systems subject to imperfect fault coverage, IEEE Transactions on Dependable and Secure Computing, 2, 4, 336-347 (2005)
[12] Mo, Y., Variable ordering to improve BDD analysis of phased-mission systems with multimode failures, IEEE Transactions on Reliability, 58, 1, 53-57 (2009)
[13] Xing, L., An efficient binary-decision-diagram-based approach for network reliability and sensitivity analysis, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 38, 105-115 (2008)
[14] Keren, O., Reduction of average path length in binary decision diagrams by spectral methods, IEEE Transactions on Computers, 57, 4, 520-531 (2008) · Zbl 1373.68198
[15] Kuo, S. Y.; Yeh, F. M.; Lin, H. Y., Efficient and exact reliability evaluation for networks with imperfect vertices, IEEE Transactions on Reliability, 56, 2, 288-300 (2007)
[16] Chang, Y. R.; Amari, S. V.; Kuo, S. Y., Computing system failure frequencies and reliability importance measures using OBDD, IEEE Transactions on Computers, 53, 1, 54-68 (2004)
[17] Rauzy, A., New algorithms for fault tree analysis, Reliability Engineering and System Safety, 40, 203-211 (1993)
[18] Sinnamon, R. M.; Andrews, J. D., Improved efficiency in qualitative fault tree analysis, Quality and Reliability Engineering International, 13, 293-298 (1997)
[19] Bryant, R. E., Graph-based algorithms for Boolean function manipulation, IEEE Transactions on Computers, 35, 677-691 (1986) · Zbl 0593.94022
[20] Mo, Y., New insights into the BDD-based reliability analysis of phased-mission systems, IEEE Transactions on Reliability, 58, 4, 667-678 (2009)
[21] Rauzy, A., A new methodology to handle Boolean models with loops, IEEE Transactions on Reliability, 52, 1, 96-105 (2003)
[22] Netes, V. A.; Filin, B. P., Consideration of node failures in network-reliability calculation, IEEE Transactions on Reliability, 45, 1, 127-128 (1996)
[23] Aggarwal, K.; Gupta, J. S.; Misra, K. B., A simple method for evaluation of a communication system, IEEE Transactions on Communications, 23, 563-566 (1975) · Zbl 0349.94003
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.