Lecture Notes in Computer Science vol:919 pages:372-377
HPCN 1995 location:Milan, Italy date:3-5 May 1995
The reconfiguration approach presented in this paper addresses the need for fault tolerance in large systems. All of the developed techniques have a data complexity and an execution time complexity that are less than proportional to the number of nodes in the system. Hence the approach is extremely well suited to massively parallel systems. The reconfiguration strategy consists of four subtasks: repartitioning (each application must have sufficient working processors), loading of injured networks, remapping (replacing faulty processors with working ones), and deadlock-free, fault-tolerant compact routing.