Disaster Recovery in Cloud Computing: Site Reliability Engineering Strategies for Resilience and Business Continuity

Chisom Elizabeth Alozie; Joshua Idowu Akerele; Eunice Kamau; Teemu Myllynen

doi:https://doi.org/10.54660/IJMOR.2024.3.1.36-48

Published: 2024

Volume: 3 | Issue: 1 | Pages: 36-48

Country: United States

License: CC BY 4.0

Abstract

In the rapidly evolving landscape of cloud computing, disaster recovery (DR) remains critical to resilience and business continuity. This review explores how Site Reliability Engineering (SRE) strategies can be integrated into disaster recovery frameworks, and how they strengthen the robustness and recovery capabilities of cloud-based systems. Disaster recovery in cloud environments involves more than data backup and system restore; it requires a comprehensive approach spanning preparation, response, and recovery to minimize downtime and data loss. Site Reliability Engineering, with its focus on reliability, performance, and efficiency, provides a structured methodology for managing disaster recovery. Key strategies include robust redundancy mechanisms, such as multi-region deployments and automated failover, which help keep systems operational through significant disruptions. SRE practices also emphasize proactive monitoring and alerting, which enable early detection of potential issues and rapid response to incidents. Another crucial element is the use of chaos engineering principles to test and validate disaster recovery plans. By simulating failure scenarios, organizations can identify weaknesses in their DR strategies and correct them before actual incidents occur, building systems that are better able to withstand real-world disruptions. Effective disaster recovery also requires a well-defined incident response plan with clear protocols for data backup, recovery, and communication. SRE strategies advocate regular testing and updating of these plans so that they remain effective and aligned with evolving business needs.
In summary, integrating Site Reliability Engineering strategies into disaster recovery practices provides a robust framework for enhancing cloud computing resilience and business continuity. By leveraging redundancy, proactive monitoring, and chaos engineering, organizations can better prepare for and respond to disruptions, minimizing the impact on operations and maintaining service reliability.
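
As an illustration only, the interplay of automated failover, health monitoring, and chaos-style testing described in the abstract can be sketched in a few lines of Python. This is a minimal toy model and not the authors' method: the names `Region`, `FailoverRouter`, `chaos_experiment`, the region labels, and the threshold of three consecutive failed probes are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A hypothetical deployment region with a simulated health flag."""
    name: str
    healthy: bool = True


@dataclass
class FailoverRouter:
    """Routes traffic to one active region and fails over after repeated failed probes."""
    regions: list
    failure_threshold: int = 3  # assumed: consecutive failed probes before failover
    active: str = field(default="", init=False)
    _failures: dict = field(default_factory=dict, init=False)

    def __post_init__(self):
        # Start with the first region in the priority list as the active one.
        self.active = self.regions[0].name
        self._failures = {r.name: 0 for r in self.regions}

    def probe(self):
        """Run one monitoring round; switch regions if the active one keeps failing."""
        for r in self.regions:
            self._failures[r.name] = 0 if r.healthy else self._failures[r.name] + 1
        if self._failures[self.active] >= self.failure_threshold:
            for r in self.regions:
                if self._failures[r.name] == 0:  # first healthy region in priority order
                    self.active = r.name
                    break
        return self.active


def chaos_experiment(router, victim, max_rounds=5):
    """Inject a failure into `victim` and count the probe rounds until traffic moves away."""
    target = next(r for r in router.regions if r.name == victim)
    target.healthy = False  # simulated outage
    for rounds in range(1, max_rounds + 1):
        if router.probe() != victim:
            return rounds
    return None  # failover did not happen within max_rounds


router = FailoverRouter([Region("region-a"), Region("region-b")], failure_threshold=3)
rounds = chaos_experiment(router, "region-a")
print(router.active, rounds)
```

In this toy model the experiment returns the number of probe rounds needed before traffic leaves the failed region, which is the kind of measurement a DR test would compare against a recovery time objective. A real system would probe over the network and would also need to handle the case where no healthy region remains.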

How to Cite This Article

Chisom Elizabeth Alozie, Joshua Idowu Akerele, Eunice Kamau, Teemu Myllynen (2024). Disaster Recovery in Cloud Computing: Site Reliability Engineering Strategies for Resilience and Business Continuity. International Journal of Management and Organizational Research (IJMOR), 3(1), 36-48. DOI: https://doi.org/10.54660/IJMOR.2024.3.1.36-48

Publication Information

Journal: International Journal of Management and Organizational Research (IJMOR)

Publisher: Anfo Publication House

ISSN: (Print), 2583-6641 (Online)

Frequency: Bimonthly

Language: English

Open Access: Yes - This article is distributed under the terms of the Creative Commons Attribution 4.0 International License
