Senior Staff Site Reliability Engineer

12 - 17 years

14 - 18 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Palo Alto Networks is looking for a talented Senior Site Reliability Engineer for our ever-expanding Infrastructure & Cloud Operations. This position is part of the Infrastructure team. You will partner with our Network, Compute, Security, Database, Applications, and other teams to provide availability, reliability, and observability for our global IT infrastructure environments. You will help build our next-generation IT operations through automation, code, analytics, and continuous improvement. We are looking for analytical, agile, and influential leaders who can quickly deliver meaningful results and solutions, with the flexibility to accommodate evolving business needs and shifting priorities. Are you a motivated, intelligent, creative, and hardworking individual who wants to contribute and make a difference? If yes, this job is for you!

The ideal candidate enjoys working in a fast-paced environment with highly innovative technologies. Our team partners closely with IT and Engineering groups and requires individuals to bring a can-do, positive attitude, with a focus on delivering exceptional customer support.

Your Impact

  • Implement and support, as code, the Linux infrastructure on which our globally distributed customer-facing platform runs.
  • Provision, configure, and support a resilient hybrid cloud deployment architecture using the automation framework, and make it more efficient.
  • Manage the Linux infrastructure CI/CD platform; work with other SREs to deploy and maintain the automation framework and plan capacity; create and review PKI operational runbooks.
  • Manage scalability, capacity planning, redundancy, and resiliency.
  • Maintain service availability and performance SLAs based on business and product requirements.
  • Contribute to documentation related to design, deployment, validation, operations and DR/BCP.
  • Design proactive service monitoring, alerting and trend analysis of underlying infrastructure, and support the operations team in implementation.
  • Build and operate the compute fabric for thousands of VMs and Kubernetes clusters. Develop scripts, build tools, and write code to automate routine tasks.
  • Provide technical support to platform users
  • Respond to security implementation and audits of the environment.
  • Plan maintenance windows, write up change requests, present technical updates.
  • Participate in On-Call support including participating in RCA as required.
  • Design and implement network, compute and application-level monitoring solutions
  • Implement integrated and automated processes that drive operational excellence
  • Advise on industry best practices as they relate to new product selection
  • Drive operational cadences around business planning and performance management to ensure the efficient running of the IT org

Qualifications

Your Experience

  • First-hand experience with Enterprise infrastructure and application monitoring and reporting tools
  • Strong working experience and exposure to containers and orchestration (Docker, Kubernetes)
  • Infrastructure as Code knowledge - Terraform, Ansible, Git, Puppet
  • Fluent scripting skills, preferably in Python, Shell, or Bash
  • Exposure to Public Cloud Platforms - GCP (Google Cloud) or AWS
  • Proficient in CI/CD platforms like Jenkins, CircleCI, etc.
  • Excellent problem-solving skills; ability to multi-task and prioritize
  • Ability to work independently; works well under pressure
  • Possess solid communication skills, and will be comfortable working in a fast-paced technical environment
  • Background knowledge of network and security technologies
  • Strong hands-on Linux experience in managing and supporting Linux server infrastructure in CentOS/RHEL/Ubuntu.
  • Bachelor's/Master's degree in Computer Science, Information Technology, or a technical stream, with the equivalent combination of work experience required.
  • Experience in design and performance tuning for Linux infrastructure and APIs; in-depth knowledge of multi-tier web applications.
  • Experience in developing and managing APIs, understanding of API infrastructure optimization and security.
  • In-depth knowledge of Certificate Lifecycle Management
  • Fluent in Linux security & system hardening, vulnerability management & patching process. Familiarity with CIS compliance levels.
  • Must be comfortable with Ansible, Chef, or a similar configuration management tool to manage infrastructure as code, and with source code control systems such as Git or SVN.
  • Ability to work cross-functionally across multiple business units, such as product development and engineering
  • Must be able to collaborate with a global team spread across multiple time zones.
  • Passion, drive, energy, a sense of humour and a great attitude!
  • 6+ years of relevant experience; Bachelor's or Master's degree in Computer Science or a related technical field.
  • Experience with administration and orchestration of cloud computing (AWS, GCP, etc.) running virtual or container environments.
  • Good user and admin Linux skills (Ubuntu a plus). Experience with virtual networking.
  • Working experience with IaC tools like Terraform and Ansible. Knowledge of Python and shell scripting.
  • Experience with CI/CD development using platforms like Jenkins, Harness, and Artifactory.
  • Solid problem solving, troubleshooting, critical thinking, communication, and teamwork skills.
  • Passion for automation and monitoring instrumentation in the code.
  • Fluency in coding with one or more of Python, Go, or Java. You will have to take coding and design tests as required.
  • Experience in an Infrastructure as Code environment - Terraform, Ansible. You will be asked to write and troubleshoot IaC code during the interview.
  • Proficient in Kubernetes-based deployments and CI/CD platforms like Jenkins, Harness, etc.
  • Takes great care in documenting conceptual work and detailed design specifications, and can present ideas to engineers and engineering leaders.
  • Knowledge of AIOps, Application of Machine Learning/Artificial Intelligence in Cloud Infrastructure or IT Operations.

Additional experience in one or more of the following areas is a big plus

  • Development of self-healing infrastructure and applications.
  • Understanding of Big data, data analytics theory and application.
  • Exposure to Enterprise Business Applications, ITSM frameworks and tools is a big plus.

On an everyday basis bring the following traits to succeed:

  • Self-motivated, decisive, with the ability to work through ambiguity, and adapt to change and competing demands.
  • Excellent problem-solving skills; ability to multitask and prioritize
  • Ability to work independently; works well under pressure
  • Possess solid communication skills, and will be comfortable working in a fast-paced technical environment

Palo Alto Networks

Cybersecurity

Santa Clara

Hyderabad, Telangana, India