Senior Software Engineer

8 - 13 years

6 - 10 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

We are from the Technology Operations Platform, and our vision is to improve the user experience of business users and the engineering community by making technology operations simpler and more efficient. Our ultimate goal is to develop a comprehensive suite of tools and solutions that empower Site Reliability Engineering (SRE) teams to seamlessly get started and effectively manage performance and reliability across the organization.

As a Site Reliability Engineer at Maersk, you will play a critical role in ensuring the reliability, scalability, and performance of our global systems. You will work closely with development and operations teams to automate processes, build resilient infrastructure, and drive continuous improvement. This role demands strong expertise in Golang, Python, Ansible, and SRE principles, with a focus on automation and observability. AI/ML knowledge is a valuable plus.

Key Focus Areas:
  • Status Page: Maintain a status page for transparency and incident communication.
  • Zero-Touch Automation: Develop strategies that eliminate manual intervention.
  • Middleware & Microservices: Demonstrate architectural know-how for robust service interactions.
  • Observability: Use observability tools and practices to build scalable SRE and automation solutions for global platforms and users. Collaborate effectively with the Observability team to enhance system insights without overlapping responsibilities.
  • Operational Excellence: Apply operational excellence principles to improve reliability and team efficiency.
Key Responsibilities:
  • Design, implement, and maintain scalable and reliable infrastructure using Golang, Python, and Ansible.
  • Develop automations to eliminate manual, redundant toil.
  • Collaborate with cross-functional teams to define SLIs, SLOs, and error budgets.
  • Monitor system performance and availability using tools like Prometheus and Grafana.
  • Conduct root cause analysis and postmortems for incidents.
  • Drive adoption of SRE best practices across engineering teams.
  • Participate in on-call rotations and proactively prevent incidents.
  • Support AI/ML workloads and infrastructure where applicable.
  • Demonstrate strong expertise in SRE and automation technologies.
  • Act as a problem solver and critical thinker in complex technical scenarios.
Required Qualifications:
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 8+ years of experience in SRE, DevOps, or backend engineering roles.
  • Strong programming skills in Golang, Python, and Ansible.
  • Hands-on experience with at least one cloud platform (AWS, GCP, or Azure).
  • Proficiency in container orchestration (Kubernetes, Docker).
  • Deep understanding of distributed systems and reliability engineering.
  • Experience with monitoring, logging, and alerting systems.
  • Excellent problem-solving and communication skills.
