Disaster Recovery

6 years

0 Lacs

Posted: 16 hours ago | Platform: SimplyHired

Work Mode

On-site

Job Description

Role 2: Disaster Recovery (2 positions)

- 1 resource with 6 to 10 years of experience

- 1 resource with 10+ years of experience

Technology Resiliency and Recovery (Disaster Recovery) with Automation and AWS Expertise

Description:

We are seeking a highly skilled and motivated Technology Resiliency and Recovery Specialist with deep expertise in disaster recovery, automation, and AWS cloud infrastructure. This role focuses on keeping the organization's IT infrastructure resilient and able to recover quickly from disasters or disruptions. You will use automation tools and AWS services to design, implement, and maintain robust disaster recovery strategies that minimize downtime and ensure business continuity.

Roles and Responsibilities:

1. Disaster Recovery Planning & Implementation:

- Design, implement, and maintain disaster recovery (DR) plans for the organization's IT infrastructure to ensure business continuity.

- Assess and analyze business impact, defining the recovery time objective (RTO) and recovery point objective (RPO) for each system and aligning them with organizational goals.

- Regularly test disaster recovery procedures through simulations and mock drills to ensure operational readiness.

- Work with different teams to identify critical systems and services that need to be included in the disaster recovery plan.

- Evaluate DR tools and solutions, focusing on AWS-based services, to ensure a scalable and cost-effective recovery solution.

2. Technology Resiliency and Business Continuity:

- Ensure that all IT systems are designed for resiliency, with high availability and fault tolerance.

- Implement and maintain cloud-based disaster recovery strategies using AWS services such as Amazon EC2, S3, RDS, Route 53, and more.

- Collaborate with architecture teams to ensure resiliency and continuity measures are embedded into infrastructure design.

- Oversee and optimize backup strategies, ensuring that systems can be quickly restored with minimal data loss.

3. Automation & Infrastructure as Code (IaC):

- Automate disaster recovery processes and workflows using modern DevOps tools such as AWS CloudFormation, Tidal, Terraform, Ansible, or other automation frameworks.

- Implement Infrastructure as Code (IaC) practices to streamline the provisioning and management of recovery environments.

- Use Sumo Logic, Dynatrace, AWS Lambda, Amazon CloudWatch, and other monitoring and automation tools to proactively monitor and respond to system events or failures.

4. Documentation & Reporting:

- Maintain clear and up-to-date documentation of disaster recovery plans, runbooks, and processes.

- Provide detailed post-disaster recovery reports, outlining the effectiveness of the recovery process and any lessons learned.

- Report on resiliency metrics, recovery objectives, and automation progress to senior leadership.

5. Incident Response & Post-Incident Analysis:

- Lead the response during actual disaster recovery events, coordinating with IT and business units to ensure a smooth recovery process.

- Perform post-incident analysis to identify root causes, implement corrective actions, and improve recovery plans.

6. Collaboration & Training:

- Collaborate closely with cross-functional teams including IT operations, security, engineering, and business continuity.

- Provide training and awareness on disaster recovery procedures to staff, helping them understand the importance of disaster recovery and their roles during recovery scenarios.

Skills & Qualifications:

Required:

  • Disaster Recovery & Business Continuity Expertise: proven experience in designing, implementing, and managing disaster recovery plans for both on-premises and cloud-based infrastructure.
  • Experience with automation tools such as Tidal, Terraform, AWS CloudFormation, Ansible, or similar.
  • Proficiency in scripting languages (Python, Shell, etc.) to automate processes and workflows.
  • Excellent verbal and written communication skills for technical and non-technical stakeholders.
  • Ability to lead recovery efforts, coordinate between various teams, and communicate effectively during high-pressure situations.
  • AWS Certified Cloud Practitioner and AWS Certified Solutions Architect certifications.

Preferred:

  • Proficiency in monitoring, alerting, and performance tuning using AWS and third-party monitoring tools such as Sumo Logic and Dynatrace.
  • Strong understanding of IT resilience, high availability architectures, RTO/RPO targets, and best practices for disaster recovery.
  • Knowledge of DevOps principles, Continuous Integration (CI), Continuous Deployment (CD), and configuration management.
  • ITIL Foundation or similar business continuity certifications.
  • Certified Business Continuity Professional (CBCP) or similar DR/BCP certification.
