3.0 - 7.0 years
0 Lacs
Haryana
On-site
Job Description: You are a skilled and experienced Site Reliability Engineering (SRE) Consultant with over 7 years of experience. As an SRE Consultant, you will be responsible for implementing, maintaining, and enhancing the reliability, scalability, and performance of systems. Your role will involve collaborating closely with development teams to design and deploy robust and scalable solutions. Your responsibilities will include implementing best practices for reliability, scalability, and performance, collaborating with development teams to meet SRE standards, monitoring system performance, troubleshooting issues, implementing automation for process optimization, planning and executing system upgrades and migrations, providing on-call support for critical incidents, documenting processes and procedures, and staying updated on industry trends and best practices. To excel in this role, you should have a Bachelor's degree in Computer Science or a related field, at least 3 years of experience in Site Reliability Engineering or a related field, strong knowledge of cloud technologies and platforms such as AWS, GCP, and Azure, experience with monitoring and alerting tools like Prometheus, Grafana, and Datadog, proficiency in scripting and automation using tools like Python and Bash, strong problem-solving skills, attention to detail, excellent communication and teamwork skills, and the ability to work independently and collaborate effectively with cross-functional teams. Key Skills: SRE Engineer, Site Reliability, Resiliency, Cloud Technologies, AWS, GCP, Azure, Monitoring Tools, Automation, Problem Solving, Communication Skills, Teamwork, Computer Science, Reliability, Performance, Scalability, On-Call Support.
Posted 2 weeks ago
10.0 - 14.0 years
0 Lacs
Karnataka
On-site
As a motivated Senior SRE Engineer at Palo Alto Networks, you will be an integral part of the Cortex DevOps Production group based at our India Development Center (IDC). Collaborating closely with the Cortex Cyber Security Research group, you will play a key role in planning, executing, and reporting on various infrastructure and code projects. Your responsibilities will include managing high-pressure production maintenance tasks and addressing related issues to ensure the smooth functioning of the production environment. In this role, you will take full end-to-end responsibility for the production environment of our SaaS product deployed on GCP. You will be involved in building tools for automatic remediation of known issues, developing Infrastructure-as-Code for orchestrating production and development environments, and designing, building, maintaining, and scaling production services with thousands of Kubernetes clusters. Additionally, you will focus on securing production environments, integrating new security tools and features, and collaborating with development teams to enhance software architecture for improved scalability, service reliability, cost, and performance. You will be responsible for building CI pipelines and automation processes, participating in the on-call rotation to support applications and infrastructure, and researching cutting-edge technologies to deploy them to production. Your experience as a DevOps/SRE Engineer with a passion for technology and a strong sense of responsibility for high reliability and service levels will be crucial in this role. Proficiency in various tools and technologies such as Python, Go, Linux, GCP, Terraform, Kubernetes, Docker, Jenkins, and databases like Cassandra, ScyllaDB, MemSQL, and MySQL will be advantageous.
With excellent communication and interpersonal skills, you will effectively collaborate with internal and external stakeholders, manage high-scale production environments, and adapt quickly to new technologies while handling multiple responsibilities. Joining our dynamic engineering team, you will have the opportunity to drive innovation, challenge industry norms, and contribute to shaping the future of cybersecurity at Palo Alto Networks. At Palo Alto Networks, we value diversity, innovation, and collaboration. We are committed to providing reasonable accommodations for individuals with disabilities and ensuring a supportive work environment where every employee can thrive. If you are excited about the prospect of solving complex cybersecurity challenges and making a meaningful impact, we encourage you to join our team and be part of our mission to protect our digital way of life.
Posted 3 weeks ago
2.0 - 6.0 years
0 Lacs
Hyderabad, Telangana
On-site
We are looking for an SRE Engineer with 2 to 4 years of experience to provide strategic direction and technical expertise to ensure the ongoing success and reliability of the platform and products. Your job responsibilities will include supporting the design, building, and maintenance of highly available, scalable, and reliable SaaS infrastructure. You will be responsible for ensuring resilient systems and solutions that meet stringent SLAs, leading efforts to maintain product reliability, and driving proactive monitoring, alerting, and incident response practices. Additionally, you will develop and implement strategies for fault tolerance, disaster recovery, and capacity planning. You will conduct post-incident reviews and root cause analyses to identify areas for improvement and prevent recurrence. Automation initiatives to streamline operational workflows, reduce manual effort, and improve efficiency will be a key part of your role. You will also champion DevOps best practices and promote infrastructure as code, CI/CD pipelines, and other automation tools and methodologies. Collaborating with other teams to improve observability systems and monitor site stability and performance will be essential. Continuous learning and exploration of new tools, techniques, and methodologies to drive innovation and enhance the DevOps platform will also be part of your responsibilities. Working closely with development teams to optimize application performance and efficiency will be crucial. You will implement tools and techniques to measure and improve service latency, throughput, and resource utilization, as well as identify and implement cost-saving measures to optimize cloud infrastructure spending. You will also proactively identify and address security vulnerabilities in the cloud environment and collaborate closely with engineering, product management, CISO, and other teams to align on reliability goals, prioritize projects, and drive cross-functional initiatives.
Effective communication with stakeholders to provide visibility into reliability initiatives, progress, and challenges will be essential. You will also maintain documentation of processes, configurations, and technical guidelines. At GlobalLogic, we prioritize a culture of caring where you'll have the chance to build meaningful connections with collaborative teammates, supportive managers, and compassionate leaders. We are committed to your continuous learning and development, offering various programs, training curricula, and hands-on opportunities to grow personally and professionally. You'll have the chance to work on projects that matter and engage your curiosity and creative problem-solving skills while bringing new solutions to market. We believe in the importance of balance and flexibility, offering various work arrangements to help you achieve the perfect balance between work and life. GlobalLogic is a high-trust organization where integrity is key, providing a safe, reliable, and ethical global environment. Truthfulness, candor, and integrity are at the core of our values and everything we do. GlobalLogic, a Hitachi Group Company, is a trusted digital engineering partner known for creating innovative and impactful digital products and experiences. We collaborate with clients to transform businesses and redefine industries through intelligent products, platforms, and services.
Posted 3 weeks ago
3.0 - 5.0 years
3 - 8 Lacs
Pune
Work from Office
Role Overview
As an SRE Engineer, you will work on building reliable, scalable, and automated infrastructure using modern IaC and scripting tools. You'll also contribute to end-to-end cloud migration projects, including on-prem to cloud and cloud-to-cloud scenarios, using industry-standard services and tools.
Key Responsibilities
- Develop and maintain Infrastructure as Code (IaC) using Terraform and AWS CloudFormation (CFT) to provision and manage cloud environments.
- Automate infrastructure setup, configuration, and deployment workflows using Python and Ansible.
- Participate in cloud migration projects, assisting in workload planning, execution, and validation for both on-prem to cloud and cloud-to-cloud scenarios.
- Work with migration tools like AWS Application Migration Service (MGN), Application Discovery Service, Elastic Disaster Recovery, etc.
- Implement CI/CD pipelines, monitor system health, and support high-availability and disaster recovery configurations.
- Follow SRE best practices, including monitoring, alerting, incident management, and root cause analysis.
- Contribute to reusable automation modules, the internal knowledge base, and technical documentation.
Required Skills & Experience
- 3–4 years of hands-on experience in cloud infrastructure, automation, or SRE/DevOps roles.
- Proven experience with Terraform and CloudFormation (CFT), Python scripting, and Ansible for configuration automation.
- Involvement in migrating workloads from on-premises to cloud or cloud-to-cloud across AWS and/or Azure.
- Working knowledge of AWS migration tools, such as AWS Application Migration Service (MGN), Application Discovery Service, Elastic Disaster Recovery, AWS Database Migration Service, and AWS Migration Hub.
- Understanding of networking, IAM, VPCs, security groups, firewalls, and DNS in cloud environments.
- Experience with monitoring/logging tools (e.g., CloudWatch, Azure Monitor, Prometheus, Grafana) and troubleshooting.
- Good grasp of SRE concepts like SLAs, SLOs, and incident response, and an automation-first mindset.
Posted 1 month ago
3.0 - 8.0 years
9 - 19 Lacs
Pune, Chennai, Bengaluru
Work from Office
Dear Applicant, We have an exciting opportunity in the field of SRE Engineering (Python Scripting). The successful candidate will resolve SRE incidents and proactively improve observability.
About this position: We are looking for a skilled SRE/DevOps Engineer with expertise in scripting, cloud infrastructure, monitoring, and incident management to ensure the reliability, scalability, and performance of our systems. The ideal candidate will have hands-on experience in Python/Go scripting, GCP, Kubernetes, and CI/CD tools, along with strong troubleshooting skills in Linux and networking.
Impact you will realize (Job Responsibilities):
- Enhances Cloud & DevOps Expertise: Working with GCP, Kubernetes, and CI/CD tools will deepen your cloud infrastructure and automation skills.
- Sharpens Scripting & Debugging Abilities: Developing and optimizing Python/Go scripts will improve your coding efficiency and troubleshooting mindset.
- Builds Strong Observability & Incident Management Skills: Hands-on experience with monitoring tools (Grafana, Datadog) and log analysis will make you adept at maintaining system reliability.
- Boosts Problem-Solving in Real-World Scenarios: Troubleshooting Linux, networking, and cloud security issues will refine your ability to diagnose and resolve production challenges effectively.
Key skills you will require (Primary Skills):
- Strong scripting skills in Python (must) and/or Go (preferred).
- Hands-on experience with GCP (logging, security, resource management).
- Familiarity with monitoring tools (Grafana, Datadog, Prometheus).
- Knowledge of Linux, Kubernetes, and networking fundamentals.
- Experience with CI/CD pipelines (Jenkins, Terraform, Ansible).
- Ability to analyze logs, debug issues, and optimize performance.
Qualifications you will require: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.
Posted 1 month ago
3.0 - 8.0 years
9 - 15 Lacs
Pune, Bengaluru
Hybrid
Dear Applicant, We have an exciting opportunity in the field of SRE Engineering (Python Scripting). The successful candidate will resolve SRE incidents and proactively improve observability.
About this position: We are looking for a skilled SRE/DevOps Engineer with expertise in scripting, cloud infrastructure, monitoring, and incident management to ensure the reliability, scalability, and performance of our systems. The ideal candidate will have hands-on experience in Python/Go scripting, GCP, Kubernetes, and CI/CD tools, along with strong troubleshooting skills in Linux and networking.
Impact you will realize (Job Responsibilities):
- Enhances Cloud & DevOps Expertise: Working with GCP, Kubernetes, and CI/CD tools will deepen your cloud infrastructure and automation skills.
- Sharpens Scripting & Debugging Abilities: Developing and optimizing Python/Go scripts will improve your coding efficiency and troubleshooting mindset.
- Builds Strong Observability & Incident Management Skills: Hands-on experience with monitoring tools (Grafana, Datadog) and log analysis will make you adept at maintaining system reliability.
- Boosts Problem-Solving in Real-World Scenarios: Troubleshooting Linux, networking, and cloud security issues will refine your ability to diagnose and resolve production challenges effectively.
Key skills you will require (Primary Skills):
- Strong scripting skills in Python (must) and/or Go (preferred).
- Hands-on experience with GCP (logging, security, resource management).
- Familiarity with monitoring tools (Grafana, Datadog, Prometheus).
- Knowledge of Linux, Kubernetes, and networking fundamentals.
- Experience with CI/CD pipelines (Jenkins, Terraform, Ansible).
- Ability to analyze logs, debug issues, and optimize performance.
Qualifications you will require: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.
Posted 1 month ago
3.0 - 5.0 years
15 - 18 Lacs
Pune
Work from Office
Experience: 3 to 5 years in cloud infrastructure operations, L1 incident management, automation support, and observability, with team coordination or mentoring experience.
Location: Pune
Shift: 24x7 Support (Rotational Shifts)
Education: BE/B.Tech (relevant certifications preferred: AWS Cloud Practitioner/Associate, Azure Fundamentals, CKA, Terraform Associate)
Job Summary: We are seeking an L1 Lead – Site Reliability Engineer (SRE) to guide and manage the frontline SRE team in ensuring the stability, availability, and efficiency of enterprise-scale cloud infrastructure operations. This role involves supervising incident response, ensuring adherence to runbooks and SOPs, providing technical guidance to L1 engineers, and being the key escalation point for L1 issues. You will be responsible for monitoring cloud services, triaging alerts, validating remediation efforts, mentoring junior engineers, and collaborating with L2/L3 teams for escalations and root cause analysis.
Responsibilities:
- Lead and mentor the L1 SRE team during shifts, ensuring timely response and proper handling of incidents, service requests, and alerts.
- Oversee infrastructure and application monitoring using tools such as Prometheus, Grafana, AWS CloudWatch, and Azure Monitor.
- Validate and guide remediation actions like pod restarts, disk space cleanup, scaling, and alert verification.
- Ensure SOPs, runbooks, and shift handover notes are followed and updated regularly.
- Execute and validate predefined Ansible playbooks, Terraform scripts, and CI/CD pipelines with junior team members.
- Act as the first point of escalation for unresolved L1 issues and coordinate with L2/L3 teams for resolution and RCA.
- Govern and track shift performance, including SLA compliance, FCR (First Call Resolution), and ticket hygiene.
- Coordinate patching, backup checks, standard changes, and validations in AWS/Azure environments.
- Facilitate onboarding of new L1 engineers, and deliver knowledge-sharing and refresher training sessions.
- Support automation initiatives by identifying repetitive tasks and creating/reviewing simple scripts.
- Prepare weekly/monthly shift reports and participate in SRE governance and review calls with operations leadership.
- Monitor the health of Kubernetes clusters and guide the team in basic pod/node/service troubleshooting.
Skills/Expertise:
- 3+ years of experience in cloud infrastructure operations with at least 1 year in a lead or mentoring role.
- Strong troubleshooting, coordination, documentation, and escalation management skills.
- Proven ability to lead shifts in a 24x7 support model.
- Familiarity with ITSM practices and SLA management (ServiceNow or similar).
- Proactive and structured communicator, capable of shift planning, reporting, and stakeholder updates.
Technical Skills:
- Experience monitoring and operating cloud-based environments with basic troubleshooting for system and application-level issues.
- Familiarity with AWS services and concepts such as EC2, S3, IAM, and VPC, and with Azure DevOps services.
- Basic knowledge of container platforms such as Docker and Kubernetes (understanding pod/service basics, logs, etc.).
- Exposure to scripting using Shell, Bash, or Python for automation of routine tasks.
- Basic understanding of version control systems like Git, GitHub, or GitLab.
- Awareness of infrastructure-as-code and automation tools such as Ansible, Terraform, or CloudFormation (execution under guidance).
- Familiarity with CI/CD concepts and tools like Jenkins or GitLab CI (executing builds, monitoring pipelines).
- Understanding of alerting and monitoring tools like Grafana, ELK, Site24x7, CloudWatch, and Prometheus.
- Hands-on experience with ITSM tools such as ServiceNow for incident and ticket tracking.
Posted 2 months ago