IEEE Transactions on Parallel and Distributed Systems vol:accepted
Fault tolerance plays an important role in improving the reliability of multiple earth-observing satellites, especially in emergency scenarios such as photographing battlefields or earthquake areas. Fault tolerance can be implemented through scheduling approaches, yet little attention has been paid to fault-tolerant scheduling on satellites. To address this issue, we propose a novel dynamic fault-tolerant scheduling model for real-time tasks running on multiple observation satellites. In this model, the primary-backup policy is employed to tolerate one satellite’s permanent failure at any one time instant. Based on this fault-tolerant model, we develop a novel fault-tolerant satellite scheduling algorithm named FTSS. To improve resource utilization, we apply overlapping techniques, namely primary-backup copy overlapping (PB overlapping) and backup-backup copy overlapping (BB overlapping). Because satellites are characterized by time windows for observations, we extensively analyze the overlapping mechanism on satellites. We integrate the overlapping mechanism with FTSS, which also employs task merging strategies including primary-backup copy merging (PB merging), backup-backup copy merging (BB merging), and primary-primary copy merging (PP merging). These merging strategies decrease the number of tasks that must be executed, thereby enhancing system schedulability. To demonstrate the superiority of FTSS, we conduct extensive experiments using real-world satellite parameters supplied by the Satellite Tool Kit (STK), and we compare FTSS with three baseline algorithms, namely NMFTSS, NOFTSS, and NMNOFTSS. The experimental results indicate that FTSS efficiently improves scheduling quality over these baselines and is suitable for fault-tolerant satellite scheduling.
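As a rough illustration of the primary-backup policy and BB overlapping mentioned above, the following minimal sketch checks two conditions commonly used in primary-backup scheduling under a single-failure assumption: a task's primary and backup copies must sit on different satellites, and two backup copies may share time on one satellite only if their primaries run on different satellites. This is not the FTSS algorithm itself. The `Copy` type, the function names, and the satellite labels are hypothetical, and satellite-specific constraints such as observation time windows are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One scheduled copy of a task (hypothetical structure for illustration)."""
    task: str
    satellite: str
    start: float
    end: float
    is_backup: bool


def intervals_overlap(a: Copy, b: Copy) -> bool:
    """True if the two copies' execution intervals intersect in time."""
    return a.start < b.end and b.start < a.end


def valid_primary_backup(primary: Copy, backup: Copy) -> bool:
    """A task's primary and backup must be placed on different satellites,
    so that one permanent satellite failure cannot destroy both copies."""
    return (
        primary.task == backup.task
        and not primary.is_backup
        and backup.is_backup
        and primary.satellite != backup.satellite
    )


def bb_overlap_allowed(primary_a: Copy, backup_a: Copy,
                       primary_b: Copy, backup_b: Copy) -> bool:
    """Assumed BB overlapping rule: backups that collide in time on the same
    satellite are acceptable only if their primaries are on different
    satellites, since at most one satellite fails at a time and therefore
    at most one of the two backups will ever need to run."""
    same_satellite = backup_a.satellite == backup_b.satellite
    if not same_satellite or not intervals_overlap(backup_a, backup_b):
        return True  # the backups do not collide, nothing to check
    return primary_a.satellite != primary_b.satellite


# Example: primaries on s1 and s2, both backups overlapping in time on s3.
p1 = Copy("t1", "s1", 0, 10, False)
b1 = Copy("t1", "s3", 20, 30, True)
p2 = Copy("t2", "s2", 5, 15, False)
b2 = Copy("t2", "s3", 25, 35, True)
print(valid_primary_backup(p1, b1), bb_overlap_allowed(p1, b1, p2, b2))
```

In this example the two backups may overlap on `s3` because a single failure can take out `s1` or `s2` but not both. If both primaries were on `s1`, its failure would activate both backups at once and the overlap would have to be rejected.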