8 Opsgenie Jobs

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

5.0 - 9.0 years

0 Lacs

Kolkata, West Bengal

On-site

You are a passionate and customer-focused AWS Solutions Architect seeking to join Workmates, the fastest-growing partner to the world's major cloud provider, AWS. In this role, you will play a crucial part in driving innovation, creating differentiated solutions, and shaping new customer experiences. Collaborating with industry specialists and technology experts, you will help customers maximize the benefits of AWS in their cloud journey. By choosing Workmates and the AWS Practice, you will elevate your AWS expertise to new heights in an innovative and collaborative setting. Embrace the opportunity to lead the way in native cloud transformation with the leading partner in AWS growth worldwide. At Workmates, we value our people as our greatest assets and are committed to fostering a culture of excellence in cloud-native operations. Join us in our mission to drive innovation across Cloud Management, Media, DevOps, Automation, IoT, Security, and more. Be part of a team where independence and ownership are encouraged, allowing you to thrive authentically.

Role Description:
- Build and manage cloud infrastructure environments
- Ensure availability, performance, security, and scalability of production systems
- Collaborate with application teams to implement DevOps practices throughout the development lifecycle
- Develop solution prototypes and conduct proofs of concept for new tools
- Design automated, repeatable, and scalable processes to enhance efficiency and software quality, including managing Infrastructure as Code and developing internal tooling to simplify workflows
- Automate and streamline operations and processes
- Troubleshoot and diagnose issues/outages, providing operational support
- Engage in incident handling, promoting a culture of post-mortem analysis and knowledge sharing

Requirements:
- Minimum of 5 years of hands-on experience in building and supporting large-scale environments
- Strong background in architecting and implementing AWS Cloud solutions
- Proficiency in AWS CloudFormation and Terraform
- Experience with Docker containers, container environment build and deployment
- Proficient in Kubernetes and EKS
- Sysadmin and infrastructure expertise (Linux internals, filesystems, networking)
- Skilled in scripting, particularly Bash scripting
- Experience with code check-in, peer review, and collaboration within distributed teams
- Hands-on experience in CI/CD pipeline setup and release
- Strong familiarity with CI/CD tools such as Jenkins, GitLab, or Travis CI
- Proficient in AWS Developer tools like AWS CodePipeline, CodeBuild, CodeDeploy, AWS Lambda, AWS Step Functions, etc.
- Experience with log management solutions (ELK/EFK or similar)
- Proficiency in Configuration Management tools like Ansible or similar
- Expertise in modern Monitoring and Alerting tools (CloudWatch, Prometheus, Grafana, Opsgenie, etc.)
- Passion for automating tasks and troubleshooting production issues
- Experience in automation testing, script generation, and integration with CI/CD
- Skilled in AWS Security (IAM, Security Groups, KMS, etc.)
- Must have CKA/CKAD/CKS Certifications and knowledge of Python/Go/Bash

Good to have:
- AWS Professional Certifications
- Experience with Service Mesh and Distributed tracing
- Knowledge of Scrum/Agile methodology

Choose Workmates to advance your career and be part of a team dedicated to delivering innovative solutions in a dynamic and supportive environment. Join us in shaping the future of cloud technology and making a meaningful impact on the industry.

Posted 1 day ago

7.0 - 9.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Job Title: SRE Engineer with GCP Cloud
Location: Hyderabad & Ahmedabad
Work Model: Hybrid (3 days from office)
Experience: 7+ years

Job Overview
Dynamic, motivated individuals deliver exceptional solutions for the production resiliency of the systems. The role combines aspects of software engineering, operations, and DevOps skills to find efficient ways of managing and operating applications. It requires a high level of responsibility and accountability for delivering technical solutions.

Summary:
As a Senior SRE, you will ensure platform reliability, incident management, and performance optimization. You'll define SLIs/SLOs, contribute to robust observability practices, and drive proactive reliability engineering across services.

Roles and Responsibilities:
- Define and measure Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and manage error budgets across services.
- Lead incident management for critical production issues; drive root cause analysis (RCA) and post-mortems.
- Create and maintain runbooks and standard operating procedures for high-availability services.
- Design and implement observability frameworks using ELK, Prometheus, and Grafana; drive telemetry adoption.
- Coordinate cross-functional war-room sessions during major incidents and maintain response logs.
- Develop and improve automated system recovery, alert suppression, and escalation logic.
- Use GCP tools like GKE, Cloud Monitoring, and Cloud Armor to improve performance and security posture.
- Collaborate with DevOps and Infrastructure teams to build highly available and scalable systems.
- Analyse performance metrics and conduct regular reliability reviews with engineering leads.
- Participate in capacity planning, failover testing, and resilience architecture reviews.

Mandatory:
- Cloud: GCP (GKE, Load Balancing, VPN, IAM)
- Observability: Prometheus, Grafana, ELK, Datadog
- Containers & Orchestration: Kubernetes, Docker
- Incident Management: On-call, RCA, SLIs/SLOs
- IaC: Terraform, Helm
- Incident Tools: PagerDuty, OpsGenie

Nice to Have: GCP Monitoring, SkyWalking, Service Mesh, API Gateway, GCP Spanner, MongoDB (basic)

Posted 4 days ago

2.0 - 6.0 years

0 Lacs

Kolkata, West Bengal

On-site

You are a passionate and customer-obsessed AWS Solutions Architect looking to join Workmates, the fastest-growing partner to the world's major cloud provider, AWS. Your role will involve driving innovation, building differentiated solutions, and defining new customer experiences to help customers maximize their AWS potential in their cloud journey. Working alongside industry specialist organizations and technology groups, you will play a key role in leading our customers towards native cloud transformation. Choosing Workmates and the AWS Practice will enable you to elevate your AWS experience and skills in an innovative and collaborative environment. At Workmates, you will have the opportunity to pioneer cloud transformation with the world's fastest-growing AWS partner and be at the forefront of cloud advancements. Join Workmates in delivering innovative work as part of your extraordinary career. People are considered the biggest assets at Workmates, and together we aim to achieve best-in-class cloud-native operations. Be part of our mission to drive innovations across Cloud Management, Media, DevOps, Automation, IoT, Security, and more, where independence and ownership are valued, allowing you to thrive and contribute your best.

Responsibilities:
- Building and maintaining cloud infrastructure environments
- Ensuring availability, performance, security, and scalability of production systems
- Collaborating with application teams to implement DevOps practices
- Creating solution prototypes and conducting proofs of concept for new tools
- Designing repeatable, automated, and scalable processes to enhance efficiency
- Automating and streamlining operations and processes
- Troubleshooting and diagnosing issues/outages and providing operational support
- Engaging in incident handling and supporting a culture of post-mortem and knowledge sharing

Requirements:
- 2+ years of hands-on experience in building and supporting large-scale environments
- Strong architecting and implementation experience with AWS Cloud
- Proficiency in AWS CloudFormation and Terraform
- Experience in Docker containers and container environment deployment
- Good understanding and work experience in Kubernetes and EKS
- Sysadmin and infrastructure background (Linux internals, filesystems, networking)
- Proficiency in scripting, particularly writing Bash scripts
- Familiarity with CI/CD pipeline build and release
- Experience with CI/CD tools like Jenkins/GitLab/Travis CI
- Hands-on experience with AWS Developer tools such as AWS CodePipeline, CodeBuild, CodeDeploy, AWS Lambda, AWS Step Functions, etc.
- Experience in log management solutions (ELK/EFK or similar)
- Experience with Configuration Management tools like Ansible or similar
- Proficiency in modern Monitoring and Alerting tools like CloudWatch, Prometheus, Grafana, Opsgenie, etc.
- Strong passion for automating routine tasks and solving production issues
- Experience in automation testing, script generation, and integration with CI/CD
- Familiarity with AWS Security features (IAM, Security Groups, KMS, etc.)
- Good to have experience in database technologies (MongoDB/MySQL, etc.)

Desired Skills:
- AWS Professional Certifications
- CKA/CKAD Certifications
- Knowledge of Python/Go
- Experience with Service Mesh and Distributed tracing
- Familiarity with Scrum/Agile methodology

Join Workmates and be part of a team that values innovation, collaboration, and continuous improvement in the cloud technology landscape. Your expertise and skills will play a crucial role in driving customer success and shaping the future of cloud solutions.

Posted 1 week ago

3.0 - 7.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

You should have good to excellent communication skills and team management skills to handle L1/L2 Monitoring and Incident Management effectively. Your responsibilities will include managing shifts independently, monitoring alerts, initiating bridge calls, engaging stakeholders, being present throughout the bridge call, and preparing problem statements. It is crucial to adhere to SLAs and follow up on issues with the respective application teams. You should also have experience creating tickets in ServiceNow/Jira ticketing tools and monitoring server/application alerts using tools like SolarWinds, Opsgenie, and Splunk. Basic knowledge of Linux, Windows, and networks is preferred; L0-level expertise is adequate. This role requires working in 24x7 rotational shifts with week offs.

Posted 1 week ago

10.0 - 16.0 years

30 - 45 Lacs

Bengaluru

Remote

- AWS & SaaS architecture
- Monitoring tools (Datadog, New Relic, Prometheus, Grafana)
- Incident management (PagerDuty, ServiceNow, Zendesk, Opsgenie)
- Experience running a 24x7 Cloud Ops team
- DevOps processes, CI/CD pipelines, IaC tools (Terraform, Ansible)

Posted 1 week ago

1.0 - 5.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

As a candidate for the position of L1/L2 Monitoring and Incident Management, you should possess excellent communication skills and the ability to manage shifts independently. Your day-to-day responsibilities will include monitoring alerts, initiating bridge calls, engaging with stakeholders, being actively involved throughout the bridge call, and preparing problem statements. It is essential to adhere to SLAs and follow up on issues with the respective application teams. You should have experience creating tickets in ServiceNow/Jira ticketing tools and monitoring server/application alerts using monitoring tools like SolarWinds, Opsgenie, and Splunk. Basic knowledge of Linux, Windows, and networks is preferred (L0 is adequate). This position requires working in 24x7 rotational shifts with week offs included.

Posted 1 week ago

8.0 - 12.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

About Zeta
Zeta is a Next-Gen Banking Tech company that empowers banks and fintechs to launch banking products for the future. It was co-founded by Ramki Gaddipati in 2015. Our flagship processing platform, Zeta Tachyon, is the industry's first modern, cloud-native, and fully API-enabled stack that brings together issuance, processing, lending, core banking, fraud & risk, and many more capabilities as a single-vendor stack. 20M+ cards have been issued on our platform globally. Zeta is actively working with the largest banks and fintechs in multiple global markets, transforming customer experience for multi-million card portfolios. Zeta has over 1,700 employees, with over 70% of roles in R&D, across locations in the US, EMEA, and Asia. We raised $280 million at a $1.5 billion valuation from Softbank, Mastercard, and other investors in 2021.

The Site Delivery Manager is responsible for end-to-end service delivery and operational excellence for a specific site. This role ensures the stability, performance, and continuous improvement of IT services, while managing key performance indicators (KPIs), incident and change management, cost governance, and customer satisfaction. The individual will serve as the primary liaison between business stakeholders, SRE/infra teams, and other technology units to drive operational maturity and service reliability.

Responsibilities:

Service Delivery & Operations Management
- Own and manage site-level SLAs for incidents, problems, and changes
- Ensure adherence to MTTA (Mean Time to Acknowledge) and MTTR (Mean Time to Resolve) metrics for alerts and incidents
- Oversee incident lifecycle and ensure timely Root Cause Analysis (RCA)
- Track problem ticket aging and drive problem resolution
- Manage service delivery reviews, post-incident reviews, and escalations

Change Management
- Lead the Change Advisory Board (CAB) process at the site level
- Review and approve changes; ensure minimal service disruption during deployments
- Validate and document post-deployment summaries and outcomes

Monitoring & Governance
- Oversee handover of SaaS product monitoring responsibilities to Zeta command center (ZCC)
- Monitor alerts, dashboards, and performance trends to proactively prevent incidents
- Maintain high security posture by coordinating with InfoSec and Compliance teams

Customer and Stakeholder Engagement
- Act as the primary point of contact for internal and external stakeholders at the site
- Own customer-facing RCA communication and service quality improvements
- Facilitate cross-functional collaboration across product, SRE, infrastructure, and customer teams

Cost & Resource Management
- Own and manage the site's technology budget; ensure cost adherence
- Conduct monthly/quarterly cost anomaly analysis and optimizations
- Work with platform and finance team for infrastructure/resource planning

People & Process
- Drive process improvements and operational maturity
- Foster a culture of accountability, resilience, and continuous improvement

Skills:
- Strong operational and delivery management
- Excellent communication, stakeholder, and conflict-resolution skills
- Data-driven decision-making and analytical thinking
- Budgeting, cost analysis, and resource planning
- Familiarity with cloud platforms (AWS)

Experience & Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field (master's preferred)
- 8-12 years of experience in IT Service Management, SRE, or infrastructure operations
- Strong understanding of ITIL framework, site reliability principles, and cloud operations
- Experience with monitoring tools (e.g., Datadog, Prometheus, Grafana), incident platforms (e.g., OpsGenie/PagerDuty, Jira Service Management / ServiceNow), and change management tools
- Proven leadership skills in managing cross-functional teams and engaging with senior stakeholders

Posted 1 month ago

5 - 7 years

20 - 27 Lacs

Pune

Hybrid

Role: AppOps Engineer
Location: Pune, Hinjewadi (Hybrid, 3 days a week)
Experience: 5 - 7 years

Responsibilities:
• Designing and implementing infrastructure and systems (such as metrics, monitoring, node management, alerting, deployment, logging)
• Setting up new environments and deploying solutions
• Application migration from EC2 to containers
• Building proactive monitoring and alerting services
• Automation using Ansible, Python, and Perl scripting
• Investigating performance and stability problems, internally and on client sites
• Tuning the Actimize Platform (AIS and RCM), operating system, application servers, and databases for optimal performance and stability
• Identifying performance bottlenecks and assisting in root cause analysis
• Performance-related design reviews
• Creating and setting up deployment scripts for different environments (i.e. Test properties vs Prod properties)
• Configuring and optimizing instances and web servers for optimal performance (e.g. adjusting default connection limits, adjusting request queuing thresholds)
• AWS troubleshooting support
• Support, architect, and implement alongside Technical & Operations teams to meet our customers' individual needs for their infrastructure and application deployments
• Work on critical, highly complex customer problems that span multiple AWS services (dealing daily with high-severity incidents)
• Help build and improve customer operations through scripts to automate and deploy AWS resources seamlessly with as little manual intervention as possible
• Collaborate and help build utilities and tools for internal use that enable you and your fellow AWS Engineers to operate safely at high speed / wide scale
• Drive customer communication during critical events
• Flexible to work over the weekends and in shift environment ( as per
• Good experience in a DevOps environment / Operations team / Infrastructure Operations team
• Excellent troubleshooting skills
• Expertise in performance tuning / investigation / root cause analysis / mitigating bottlenecks
• Excellent hands-on experience in managing Application Support (3 tier/2 tier apps)
• AWS service knowledge for core services (EC2, S3, IAM, ASG, ELB, CFN, VPC, DX, VPN)
• Good exposure to managing containers and Kubernetes, deployment and configuration on containers
• Good hands-on experience in deployment, release management, and migration activities
• Exposure to scripting languages (Ansible, Perl, Python, Ruby, Shell script, PowerShell, etc.)
• Database skills (SQL, Oracle or Postgres / Cassandra)
• Good exposure to ELK, Splunk, Kafka
• Application Server (skills in any of the middleware technologies, e.g. Tomcat, WebLogic, WebSphere)
• Good exposure to application performance monitoring tools like AppDynamics, Dynatrace
• Strong problem-solving, analytical, and communication skills
• Good communication, both written and verbal
• Troubleshooting performance issues and tuning
• Working with the Architecture team on hardware sizing recommendations
• Java performance testing, diagnosis, and tuning of Java applications

Additional Skills Desired:
• Cloud / application-level security experience
• Has worked in an Agile / Sprint development model
• Experience in working with tools like OpsGenie, AlertOps, PagerDuty/OpenDuty
• Troubleshooting Java-related issues
• Performance testing/investigation experience
• Database performance testing, diagnosis, and tuning

Please email your details and resume to chaithra.j@xoriant.com to proceed further.

Posted 2 months ago
