Home
Jobs

257 OpenTelemetry Jobs - Page 4

JobPe aggregates listings for easy access, but you apply directly on the original job portal.

7.0 years

0 Lacs

Chennai, Tamil Nadu, India

Remote


At EY, you’ll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture and technology to become the best version of you. And we’re counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all.

The opportunity

We are looking for a skilled Cloud DevOps Engineer with expertise in both AWS and Azure. The role provides end-to-end DevOps support, infrastructure automation, CI/CD pipeline troubleshooting, and incident resolution across cloud environments. You will handle escalations, lead root cause analysis, and collaborate with engineering and infrastructure teams to deliver high-availability services. You will also enhance runbooks and SOPs and mentor junior engineers.

Your Key Responsibilities

- Act as a primary escalation point for DevOps- and infrastructure-related incidents across AWS and Azure.
- Provide troubleshooting support for CI/CD pipeline issues, infrastructure provisioning, and automation failures.
- Support containerized application environments using Kubernetes (EKS/AKS), Docker, and Helm.
- Create and refine SOPs, automation scripts, and runbooks for efficient issue handling.
- Perform deep-dive analysis and root cause analysis (RCA) for recurring issues and implement long-term solutions.
- Handle access management, IAM policies, VNet/VPC setup, security group configurations, and load balancers.
- Monitor and analyze logs using AWS CloudWatch, Azure Monitor, and other tools to ensure system health.
- Collaborate with engineering, cloud platform, and security teams to maintain stable and secure environments.
- Mentor junior team members and contribute to continuous process improvements.

Skills And Attributes For Success

- Hands-on experience with CI/CD tools such as GitHub Actions, Azure DevOps Pipelines, and AWS CodePipeline.
- Expertise in Infrastructure as Code (IaC) using Terraform; good understanding of CloudFormation and ARM templates.
- Familiarity with scripting languages such as Bash and Python.
- Deep understanding of AWS (EC2, S3, IAM, EKS) and Azure (VMs, Blob Storage, AKS, AAD).
- Container orchestration and management using Kubernetes, Helm, and Docker.
- Experience with configuration management and automation tools such as Ansible.
- Strong understanding of cloud security best practices, IAM policies, and compliance standards.
- Experience with ITSM tools such as ServiceNow for incident and change management.
- Strong documentation and communication skills.

To qualify for the role, you must have

- 7+ years of experience in DevOps, cloud infrastructure operations, and automation.
- Hands-on expertise in AWS and Azure environments.
- Proficiency in Kubernetes, Terraform, CI/CD tooling, and automation scripting.
- Experience in a 24x7 rotational support model.
- Relevant AWS and Azure certifications (e.g., AWS DevOps Engineer, Azure Administrator Associate).

Technologies and Tools

Must have:
- Cloud platforms: AWS, Azure
- CI/CD and deployment: GitHub Actions, Azure DevOps Pipelines, AWS CodePipeline
- Infrastructure as Code: Terraform
- Containerization: Kubernetes (EKS/AKS), Docker, Helm
- Logging and monitoring: AWS CloudWatch, Azure Monitor
- Configuration and automation: Ansible, Bash
- Incident and ITSM: ServiceNow or equivalent
- Certification: relevant AWS and Azure certifications

Good to have:
- Cloud infrastructure: CloudFormation, ARM templates
- Security: IAM policies, role-based access control (RBAC), Security Hub
- Networking: VPC, subnets, load balancers, security groups (AWS/Azure)
- Scripting: Python/Bash
- Observability: OpenTelemetry, Datadog, Splunk
- Compliance: AWS Well-Architected Framework, Azure Security Center

What We Look For

- Enthusiastic learners with a passion for cloud technologies and DevOps practices.
- Problem solvers with a proactive approach to troubleshooting and optimization.
- Team players who can collaborate effectively in a remote or hybrid work environment.
- Detail-oriented professionals with strong documentation skills.

What We Offer

EY Global Delivery Services (GDS) is a dynamic and truly global delivery network. We work across six locations – Argentina, China, India, the Philippines, Poland and the UK – and with teams from all EY service lines, geographies and sectors, playing a vital role in the delivery of the EY growth strategy. From accountants to coders to advisory consultants, we offer a wide variety of fulfilling career opportunities that span all business disciplines. In GDS, you will collaborate with EY teams on exciting projects and work with well-known brands from across the globe. We’ll introduce you to an ever-expanding ecosystem of people, learning, skills and insights that will stay with you throughout your career.

- Continuous learning: You’ll develop the mindset and skills to navigate whatever comes next.
- Success as defined by you: We’ll provide the tools and flexibility, so you can make a meaningful impact, your way.
- Transformative leadership: We’ll give you the insights, coaching and confidence to be the leader the world needs.
- Diverse and inclusive culture: You’ll be embraced for who you are and empowered to use your voice to help others find theirs.

EY | Building a better working world

EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets. Enabled by data and technology, diverse EY teams in over 150 countries provide trust through assurance and help clients grow, transform and operate. Working across assurance, consulting, law, strategy, tax and transactions, EY teams ask better questions to find new answers for the complex issues facing our world today.
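One of the listed responsibilities is monitoring and analyzing logs (AWS CloudWatch, Azure Monitor) to ensure system health. As a rough illustration of what that day-to-day work involves, here is a minimal sketch that summarizes exported log lines and flags an elevated error rate; the log format, level names, and the 5% alert threshold are all hypothetical, not a real CloudWatch export schema.

```python
import re
from collections import Counter

# Hypothetical log-line format: "<timestamp> <LEVEL> <message>".
LOG_PATTERN = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$")

def summarize_logs(lines, alert_threshold=0.05):
    """Count log lines per level and flag an elevated error rate."""
    levels = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            levels[match.group("level")] += 1
    total = sum(levels.values())
    error_rate = levels["ERROR"] / total if total else 0.0
    return {
        "levels": dict(levels),
        "error_rate": error_rate,
        "alert": error_rate > alert_threshold,  # illustrative threshold
    }

sample = [
    "2024-05-01T10:00:00Z INFO request served",
    "2024-05-01T10:00:01Z ERROR upstream timeout",
    "2024-05-01T10:00:02Z INFO request served",
]
summary = summarize_logs(sample)
```

In practice this kind of check would run against CloudWatch Logs Insights or Azure Monitor queries rather than raw exports, but the shape of the analysis is the same.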

Posted 4 days ago

Apply

7.0 years

0 Lacs

Kolkata, West Bengal, India

Remote


Same Cloud DevOps Engineer role and description as the EY listing above.

Posted 4 days ago

Apply

7.0 years

0 Lacs

Hyderabad, Telangana, India

Remote


Same Cloud DevOps Engineer role and description as the EY listing above.

Posted 4 days ago

Apply

7.0 years

0 Lacs

Kanayannur, Kerala, India

Remote


Same Cloud DevOps Engineer role and description as the EY listing above.

Posted 4 days ago

Apply

7.0 years

0 Lacs

Trivandrum, Kerala, India

Remote


Same Cloud DevOps Engineer role and description as the EY listing above.

Posted 4 days ago

Apply

7.0 years

0 Lacs

Noida, Uttar Pradesh, India

Remote


At EY, you’ll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture and technology to become the best version of you. And we’re counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all. The opportunity We are looking for a skilled Cloud DevOps Engineer with expertise in both AWS and Azure platforms. This role is responsible for end-to-end DevOps support, infrastructure automation, CI/CD pipeline troubleshooting, and incident resolution across cloud environments. The role will handle escalations, lead root cause analysis, and collaborate with engineering and infrastructure teams to deliver high-availability services. You will also contribute to enhancing runbooks, SOPs, and mentoring junior engineers Your Key Responsibilities Act as a primary escalation point for DevOps-related and infrastructure-related incidents across AWS and Azure. Provide troubleshooting support for CI/CD pipeline issues, infrastructure provisioning, and automation failures. Support containerized application environments using Kubernetes (EKS/AKS), Docker, and Helm. Create and refine SOPs, automation scripts, and runbooks for efficient issue handling. Perform deep-dive analysis and RCA for recurring issues and implement long-term solutions. Handle access management, IAM policies, VNet/VPC setup, security group configurations, and load balancers. Monitor and analyze logs using AWS CloudWatch, Azure Monitor, and other tools to ensure system health. Collaborate with engineering, cloud platform, and security teams to maintain stable and secure environments. Mentor junior team members and contribute to continuous process improvements. Skills And Attributes For Success Hands-on experience with CI/CD tools like GitHub Actions, Azure DevOps Pipelines, and AWS CodePipeline. 
Expertise in Infrastructure as Code (IaC) using Terraform; good understanding of CloudFormation and ARM Templates. Familiarity with scripting languages such as Bash and Python. Deep understanding of AWS (EC2, S3, IAM, EKS) and Azure (VMs, Blob Storage, AKS, AAD). Container orchestration and management using Kubernetes, Helm, and Docker. Experience with configuration management and automation tools such as Ansible. Strong understanding of cloud security best practices, IAM policies, and compliance standards. Experience with ITSM tools like ServiceNow for incident and change management. Strong documentation and communication skills. To qualify for the role, you must have 7+ years of experience in DevOps, cloud infrastructure operations, and automation. Hands-on expertise in AWS and Azure environments. Proficiency in Kubernetes, Terraform, CI/CD tooling, and automation scripting. Experience in a 24x7 rotational support model. Relevant certifications in AWS and Azure (e.g., AWS DevOps Engineer, Azure Administrator Associate). Technologies and Tools Must haves Cloud Platforms: AWS, Azure CI/CD & Deployment: GitHub Actions, Azure DevOps Pipelines, AWS CodePipeline Infrastructure as Code: Terraform Containerization: Kubernetes (EKS/AKS), Docker, Helm Logging & Monitoring: AWS CloudWatch, Azure Monitor Configuration & Automation: Ansible, Bash Incident & ITSM: ServiceNow or equivalent Certification: AWS and Azure relevant certifications Good to have Cloud Infrastructure: CloudFormation, ARM Templates Security: IAM Policies, Role-Based Access Control (RBAC), Security Hub Networking: VPC, Subnets, Load Balancers, Security Groups (AWS/Azure) Scripting: Python/Bash Observability: OpenTelemetry, Datadog, Splunk Compliance: AWS Well-Architected Framework, Azure Security Center What We Look For Enthusiastic learners with a passion for cloud technologies and DevOps practices. Problem solvers with a proactive approach to troubleshooting and optimization. 
- Team players who can collaborate effectively in a remote or hybrid work environment.
- Detail-oriented professionals with strong documentation skills.

What We Offer

EY Global Delivery Services (GDS) is a dynamic and truly global delivery network. We work across six locations – Argentina, China, India, the Philippines, Poland and the UK – and with teams from all EY service lines, geographies and sectors, playing a vital role in the delivery of the EY growth strategy. From accountants to coders to advisory consultants, we offer a wide variety of fulfilling career opportunities that span all business disciplines. In GDS, you will collaborate with EY teams on exciting projects and work with well-known brands from across the globe. We’ll introduce you to an ever-expanding ecosystem of people, learning, skills and insights that will stay with you throughout your career.

- Continuous learning: You’ll develop the mindset and skills to navigate whatever comes next.
- Success as defined by you: We’ll provide the tools and flexibility, so you can make a meaningful impact, your way.
- Transformative leadership: We’ll give you the insights, coaching and confidence to be the leader the world needs.
- Diverse and inclusive culture: You’ll be embraced for who you are and empowered to use your voice to help others find theirs.

EY | Building a better working world

EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets. Enabled by data and technology, diverse EY teams in over 150 countries provide trust through assurance and help clients grow, transform and operate. Working across assurance, consulting, law, strategy, tax and transactions, EY teams ask better questions to find new answers for the complex issues facing our world today.

Posted 4 days ago

Apply

5.0 years

0 Lacs

India

Remote


eTip

eTip is the leading digital tipping platform for the hospitality and service industry, empowering businesses with tools to attract, retain, and motivate their hardworking staff. Trusted by thousands of leading hotels, restaurants, and management companies, eTip stands out for its commitment to customer security, product customization, dedication to customer service, and its numerous partnerships, including with Visa.

Your Calling

As a Senior DevOps Engineer, you will own and drive the DevOps strategy for our cloud-native tech stack built on top of AWS, Kubernetes, and Karpenter. You’ll design, implement, and optimize scalable, secure, and highly available infrastructure that processes millions of dollars while fostering a culture of automation, observability, and CI/CD excellence.

What You’ll Do

Infrastructure & Cloud Leadership
- Architect, deploy, and manage AWS cloud infrastructure (EKS, EC2, VPC, IAM, RDS, S3, Lambda, etc).
- Lead Kubernetes (EKS) cluster design, scaling, and optimization using Karpenter for cost-efficient autoscaling.
- Optimize cloud costs while ensuring performance and reliability.

CI/CD & Automation
- Develop and maintain GitHub Actions CI/CD pipeline workflows for backend, web frontend, and Android/iOS.

Observability & Reliability
- Develop and maintain logging (Loki), monitoring (Prometheus, Grafana), and alerting to ensure system health.

Security & Compliance
- Harden Kubernetes clusters (RBAC, network policies, OPA/Gatekeeper).
- Ensure compliance with SOC2, ISO 27001, or other security frameworks.

Application Development
- When infrastructure work slows down, develop application features on the backend or frontend, depending on where your strengths and interests fit.

Skills You Bring

- 5+ years of DevOps/SRE experience for SaaS companies.
- Deep expertise in AWS & Kubernetes.
- Proficiency in Karpenter, Helm and other Kubernetes operators.
- Strong development skills (Kotlin, Python, Go, or Bash).
- Experience with observability tools (Prometheus, Grafana, OpenTelemetry).
- Security-first mindset with knowledge of networking and cost optimization.

Why You’ll Love Working Here

- Own and shape DevOps for a cutting-edge cloud-native stack.
- Work alongside very passionate and talented engineers.
- Work on a high-impact product that processes millions of dollars.
- Remote-first, flexible work environment.
- Growth opportunities in a small, collaborative, high-impact team.
- Participate in yearly off-sites that take place all around the world.

Eager to build the future of tipping with us? 💪 Apply today! 🚀

Posted 4 days ago

Apply

0 years

0 Lacs

Hyderabad, Telangana, India

On-site


Description

We are looking for a talented Software Engineer (Go) to join our dynamic team. In this role, you will play a crucial part in developing high-performance back-end services that support financial service applications. Your focus will be on collaborating with cross-functional teams to create innovative solutions for complex problems in the asset management space. This position offers the flexibility of hybrid working, allowing you to balance your work and personal life effectively. We are particularly seeking candidates who are proficient in integrating AI tools into their daily development cycle to improve productivity, code quality, and problem-solving.

Key Responsibilities

- Design and develop highly scalable and reliable services in the Go language.
- Collaborate with cross-functional teams to design, develop, and test software solutions.
- Integrate and implement Kafka with Go services.
- Leverage the corporate AI assistant and other strategic coding tools to enhance development workflows.
- Actively use AI tools to support code generation, debugging, documentation, and testing.
- Ensure that all microservices are highly available and fault tolerant.
- Troubleshoot and debug issues as they arise.
- Keep up to date with emerging trends, AI-assisted development practices, and best practices in back-end development.
- Participate in code reviews and contribute to a positive team culture.
- Ensure all code written has the appropriate level of unit test coverage.

Requirements & Qualifications (Go Developer)

- Bachelor's degree in computer science, software engineering, or a related field.
- Proven experience as a Go developer or in a similar back-end development role.
- Strong proficiency in the Go programming language and its standard library.
- Experience building scalable, high-performance backend services and APIs.
- Familiarity with RESTful and gRPC API design and implementation.
- Understanding of concurrency patterns and goroutine-based architecture in Go.
- Knowledge of modern Go development tools such as go mod, go test, and golangci-lint.
- Experience working with databases (SQL and NoSQL), e.g., PostgreSQL, MySQL, MongoDB.
- Hands-on experience with version control systems such as Git.
- Demonstrated ability to leverage AI tools (e.g., GitHub Copilot, ChatGPT, AI-powered testing/linting tools) to boost development productivity and code quality.
- Excellent problem-solving skills and keen attention to detail.
- Ability to work independently and collaboratively in a fast-paced environment.
- Strong verbal and written communication skills.
- Familiarity with cloud platforms such as AWS, GCP, or Azure, and infrastructure tools like Docker and Kubernetes.
- Experience with CI/CD pipelines and tools like GitHub Actions, CircleCI, or Jenkins.
- Knowledge of observability practices and tools such as Prometheus, Grafana, and OpenTelemetry.
- Understanding of security best practices in backend development.

Posted 5 days ago

Apply

3.0 years

0 Lacs

Noida, Uttar Pradesh, India

On-site


Site Reliability Engineer I

Job Summary

Site Reliability Engineers (SREs) cover the intersection of software engineer and systems administrator. In other words, they can both create code and manage the infrastructure on which the code runs. This is a very wide skillset, but the end goal of an SRE is always the same: to ensure that all SLAs are met, but not exceeded, so as to balance performance and reliability with operational costs. As a Site Reliability Engineer I, you will be learning our systems, improving your craft as an engineer, and taking on tasks that improve the overall reliability of the VP platform.

Key Responsibilities

- Design, implement, and maintain robust monitoring and alerting systems.
- Lead observability initiatives by improving metrics, logging, and tracing across services and infrastructure.
- Collaborate with development and infrastructure teams to instrument applications and ensure visibility into system health and performance.
- Write Python scripts and tools for automation, infrastructure management, and incident response.
- Participate in and improve the incident management and on-call process, driving down Mean Time to Resolution (MTTR).
- Conduct root cause analysis and postmortems following incidents, and champion efforts to prevent recurrence.
- Optimize systems for scalability, performance, and cost-efficiency in cloud and containerized environments.
- Advocate and implement SRE best practices, including SLOs/SLIs, capacity planning, and reliability reviews.

Required Skills & Qualifications

- 3+ years of experience in a Site Reliability Engineer or similar role.
- Proficiency in Python for automation and tooling.
- Hands-on experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, New Relic, OpenTelemetry, etc.
- Experience with log aggregation and analysis tools like ELK Stack (Elasticsearch, Logstash, Kibana) or Fluentd.
- Good understanding of cloud platforms (AWS, GCP, or Azure) and container orchestration (Kubernetes).
- Familiarity with infrastructure-as-code (Terraform, Ansible, or similar).
- Strong debugging and incident response skills.
- Knowledge of CI/CD pipelines and release engineering practices.
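Several of the SRE practices named in this posting (SLOs/SLIs, error budgets, MTTR reduction) reduce to simple arithmetic that the role's Python tooling would automate. A minimal sketch, assuming a request-based availability SLO; the function name and all traffic numbers are hypothetical:

```python
# Hypothetical illustration: error-budget accounting for a request-based SLO.
def error_budget(slo_target: float, total_requests: int, failed_requests: int):
    """Return (allowed_failures, consumed_fraction) for a request-based SLO."""
    allowed = total_requests * (1 - slo_target)  # failures the SLO tolerates
    consumed = failed_requests / allowed if allowed else float("inf")
    return allowed, consumed

# A 99.9% SLO over 1,000,000 requests tolerates ~1,000 failed requests.
allowed, consumed = error_budget(0.999, 1_000_000, 250)
print(f"budget: {allowed:.0f} failures, consumed: {consumed:.0%}")
```

In this sketch, 250 failures against a 1,000-failure budget means a quarter of the error budget is spent; alerting on the rate at which that fraction grows (burn rate) is the usual next step.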

Posted 5 days ago

Apply

10.0 years

0 Lacs

Hyderābād

On-site

Company: Qualcomm India Private Limited

Job Area: Engineering Group, Engineering Group > Software Engineering

General Summary:

Qualcomm is seeking a seasoned Staff Engineer, DevOps to join our central software engineering team. In this role, you will lead the design, development, and deployment of scalable cloud-native and hybrid infrastructure solutions, modernize legacy systems, and drive DevOps best practices across products. This is a hands-on architectural role ideal for someone who thrives in a fast-paced, innovation-driven environment and is passionate about building resilient, secure, and efficient platforms.

Key Responsibilities:

- Architect and implement enterprise-grade AWS cloud solutions for Qualcomm’s software platforms.
- Design and implement CI/CD pipelines using Jenkins, GitHub Actions, and Terraform to enable rapid and reliable software delivery.
- Develop reusable Terraform modules and automation scripts to support scalable infrastructure provisioning.
- Drive observability initiatives using Prometheus, Grafana, Fluentd, OpenTelemetry, and Splunk to ensure system reliability and performance.
- Collaborate with software development teams to embed DevOps practices into the SDLC and ensure seamless deployment and operations.
- Provide mentorship and technical leadership to junior engineers and cross-functional teams.
- Manage hybrid environments, including on-prem infrastructure and Kubernetes workloads supporting both Linux and Windows.
- Lead incident response, root cause analysis, and continuous improvement of SLIs for mission-critical systems.
- Drive toil reduction and automation using scripting or programming languages such as PowerShell, Bash, Python, or Go.
- Independently drive and implement DevOps/cloud initiatives in collaboration with key stakeholders.
- Understand software development designs and compilation/deployment flows for .NET, Angular, and Java-based applications to align infrastructure and CI/CD strategies with application architecture.
Required Qualifications:

- 10+ years of experience in IT or software development, with at least 5 years in cloud architecture and DevOps roles.
- Strong foundational knowledge of infrastructure components such as networking, servers, operating systems, DNS, Active Directory, and LDAP.
- Deep expertise in AWS services including EKS, RDS, MSK, CloudFront, S3, and OpenSearch.
- Hands-on experience with Kubernetes, Docker, containerd, and microservices orchestration.
- Proficiency in Infrastructure as Code using Terraform and configuration management tools like Ansible and Chef.
- Experience with observability tools and telemetry pipelines (Grafana, Prometheus, Fluentd, OpenTelemetry, Splunk).
- Experience with agent-based monitoring tools such as SCOM and Datadog.
- Solid scripting skills in Python, Bash, and PowerShell.
- Familiarity with enterprise-grade web services (IIS, Apache, Nginx) and load balancing solutions.
- Excellent communication and leadership skills with experience mentoring and collaborating across teams.

Preferred Qualifications:

- Experience with API gateway solutions for API security and management.
- Knowledge of RDBMS, preferably MSSQL/PostgreSQL.
- Proficiency in SRE principles including SLIs, SLOs, SLAs, error budgets, chaos engineering, and toil reduction.
- Experience in core software development (e.g., Java, .NET).
- Exposure to Azure cloud and hybrid cloud strategies.
- Bachelor’s degree in Computer Science or a related field.

Minimum Qualifications:

- Bachelor's degree in Engineering, Information Systems, Computer Science, or related field and 4+ years of Software Engineering or related work experience; OR Master's degree in Engineering, Information Systems, Computer Science, or related field and 3+ years of Software Engineering or related work experience; OR PhD in Engineering, Information Systems, Computer Science, or related field and 2+ years of Software Engineering or related work experience.
- 2+ years of work experience with a programming language such as C, C++, Java, Python, etc.

Applicants: Qualcomm is an equal opportunity employer. If you are an individual with a disability and need an accommodation during the application/hiring process, rest assured that Qualcomm is committed to providing an accessible process. You may e-mail disability-accomodations@qualcomm.com or call Qualcomm's toll-free number found here. Upon request, Qualcomm will provide reasonable accommodations to support individuals with disabilities to be able to participate in the hiring process. Qualcomm is also committed to making our workplace accessible for individuals with disabilities. (Keep in mind that this email address is used to provide reasonable accommodations for individuals with disabilities. We will not respond here to requests for updates on applications or resume inquiries.)

Qualcomm expects its employees to abide by all applicable policies and procedures, including but not limited to security and other requirements regarding protection of Company confidential information and other confidential and/or proprietary information, to the extent those requirements are permissible under applicable law.

To all Staffing and Recruiting Agencies: Our Careers Site is only for individuals seeking a job at Qualcomm. Staffing and recruiting agencies and individuals being represented by an agency are not authorized to use this site or to submit profiles, applications or resumes, and any such submissions will be considered unsolicited. Qualcomm does not accept unsolicited resumes or applications from agencies. Please do not forward resumes to our jobs alias, Qualcomm employees or any other company location. Qualcomm is not responsible for any fees related to unsolicited resumes/applications. If you would like more information about this role, please contact Qualcomm Careers.
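One responsibility in the Qualcomm posting is driving toil reduction through scripting (PowerShell, Bash, Python, or Go). A hedged sketch of that kind of automation in Python: tallying error patterns per service so recurring failures surface without manual log reading. The log format, service names, and messages below are entirely hypothetical:

```python
# Hypothetical toil-reduction script: count ERROR lines per service.
import re
from collections import Counter

# Assumed (made-up) log format: "<service> <LEVEL> <message>"
LOG_LINE = re.compile(r"^(?P<service>[\w-]+) (?P<level>INFO|WARN|ERROR) (?P<msg>.*)$")

def error_counts(lines):
    """Count ERROR lines per service; lines that don't parse are skipped."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("level") == "ERROR":
            counts[m.group("service")] += 1
    return counts

sample = [
    "billing ERROR payment gateway timeout",
    "billing INFO request handled",
    "auth ERROR token expired",
    "billing ERROR payment gateway timeout",
    "not a log line",
]
print(error_counts(sample).most_common())  # billing leads with 2 errors
```

A script like this is a starting point only; in the stacks these postings describe, the same aggregation would normally live in Splunk, Grafana, or an OpenTelemetry pipeline rather than ad-hoc code.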

Posted 5 days ago

Apply

0 years

0 Lacs

Hyderabad, Telangana, India

On-site


Matillion is The Data Productivity Cloud.

We are on a mission to power the data productivity of our customers and the world, by helping teams get data business ready, faster. Our technology allows customers to load, transform, sync and orchestrate their data. We are looking for passionate, high-integrity individuals to help us scale up our growing business. Together, we can make a dent in the universe bigger than ourselves.

With offices in the UK, US and Spain, we are now thrilled to announce the opening of our new office in Hyderabad, India. This marks an exciting milestone in our global expansion, and we are now looking for talented professionals to join us as part of our founding team.

About the Role

Are you ready to shape the future of reliability at scale? At Matillion, we’re looking for a Principal Engineer - Reliability to lead our cloud architecture and observability strategy across mission-critical systems. This high-impact role puts you at the heart of our cloud-native engineering team, designing resilient distributed systems that power data workloads across the globe. You’ll work cross-functionally with engineering, product, and leadership, helping to scale our platform as we continue our journey of global growth. We value in-person collaboration here at Matillion, therefore this role will work from our central Hyderabad office.
What you'll be doing

- Leading the design and architecture of scalable, cloud-native systems that prioritise reliability and performance
- Owning observability and infrastructure strategy to ensure global uptime and rapid incident response
- Driving automation, sustainable incident practices, and blameless postmortems across teams
- Collaborating with engineering and product to shape scalable solutions from ideation to delivery
- Coaching and mentoring engineers, fostering a culture of technical excellence and innovation

What we are looking for

- Deep expertise in Kubernetes and modern tooling like Linkerd, ArgoCD, or Traefik
- Pro-level programming skills (Go, Java or Python preferred) and familiarity with the broader ecosystem
- Proven experience building large-scale distributed systems in public cloud (AWS or Azure)
- Hands-on knowledge of observability tools like Prometheus, Grafana, OpenTelemetry, or Datadog
- Experience with messaging systems (e.g., Kafka) and secrets management (Vault, AWS Secrets Manager)
- A collaborative leader with strong communication skills and a passion for scalability, availability, and innovation

Matillion has fostered a culture that is collaborative, fast-paced, ambitious, and transparent, and an environment where people genuinely care about their colleagues and communities. Our 6 core values guide how we work together and with our customers and partners.
We operate a truly flexible and hybrid working culture that promotes work-life balance, and are proud to be able to offer the following benefits:

- Company Equity
- 27 days paid time off
- 12 days of Company Holiday
- 5 days paid volunteering leave
- Group Mediclaim (GMC)
- Enhanced parental leave policies
- MacBook Pro
- Access to various tools to aid your career development

More about Matillion

Thousands of enterprises including Cisco, DocuSign, Slack, and TUI trust Matillion technology to load, transform, sync, and orchestrate their data for a wide range of use cases from insights and operational analytics, to data science, machine learning, and AI. With over $300M raised from top Silicon Valley investors, we are on a mission to power the data productivity of our customers and the world.

We are passionate about doing things in a smart, considerate way. We’re honoured to be named a great place to work for several years running by multiple industry research firms. We are dual headquartered in Manchester, UK and Denver, Colorado.

We are keen to hear from prospective Matillioners, so even if you don’t feel you match all the criteria please apply and a member of our Talent Acquisition team will be in touch. Alternatively, if you are interested in Matillion but don't see a suitable role, please email talent@matillion.com.

Matillion is an equal opportunity employer. We celebrate diversity and we are committed to creating an inclusive environment for all of our team. Matillion prohibits discrimination and harassment of any type. Matillion does not discriminate on the basis of race, colour, religion, age, sex, national origin, disability status, genetics, sexual orientation, gender identity or expression, or any other characteristic protected by law.

Posted 5 days ago

Apply

15.0 years

0 Lacs

Ahmedabad, Gujarat, India

Remote


At BairesDev®, we've been leading the way in technology projects for over 15 years. We deliver cutting-edge solutions to giants like Google and the most innovative startups in Silicon Valley. Our diverse 4,000+ team, composed of the world's Top 1% of tech talent, works remotely on roles that drive significant impact worldwide.

When you apply for this position, you're taking the first step in a process that goes beyond the ordinary. We aim to align your passions and skills with our vacancies, setting you on a path to exceptional career development and success.

DevOps Engineer - AWS at BairesDev

We are looking for a DevOps Engineer with expertise in infrastructure as code using TypeScript with AWS CDK, and experience in deploying and managing cloud-native applications. This role focuses on automating cloud infrastructure, supporting CI/CD workflows, and maintaining observability and data integrity in production systems. The ideal candidate will bring solid skills in DevOps practices, IaC, and experience working with systems built on Amazon Web Services. You will work cross-functionally with developers, data teams, and SREs to support deployments, maintain CI/CD pipelines in Jenkins, and monitor environments using observability tools like Splunk and OpenTelemetry.

What You'll Do:

- Implement infrastructure as code using AWS CDK with TypeScript.
- Deploy and manage cloud-native applications on AWS using Lambda, ECS Tasks, and S3.
- Support and maintain CI/CD pipelines using Jenkins.
- Collaborate with development teams to automate deployment processes.
- Perform database operations and basic SQL tasks with Amazon Aurora and PostgreSQL.
- Monitor production systems using observability tools.
- Work with cross-functional teams to ensure system reliability and performance.

What we are looking for:

- 3+ years of experience in DevOps engineering roles.
- Experience with AWS CDK using TypeScript.
- Knowledge of AWS Cloud services, particularly Lambda, ECS Tasks, and S3.
- Familiarity with database tasks and basic SQL; experience with Amazon Aurora and PostgreSQL.
- Experience with automation and CI/CD using Jenkins.
- Understanding of infrastructure as code principles.
- Experience working with production systems.
- Good communication skills and ability to work in cross-functional teams.
- Advanced level of English.

How we make your work (and your life) easier:

- 100% remote work (from anywhere).
- Excellent compensation in USD, or your local currency if preferred.
- Hardware and software setup for you to work from home.
- Flexible hours: create your own schedule.
- Paid parental leave, vacations, and national holidays.
- Innovative and multicultural work environment: collaborate and learn from the global Top 1% of talent.
- Supportive environment with mentorship, promotions, skill development, and diverse growth opportunities.

Apply now and become part of a global team where your unique talents can truly thrive!

Posted 5 days ago

Apply

15.0 years

0 Lacs

Agra, Uttar Pradesh, India

Remote


At BairesDev®, we've been leading the way in technology projects for over 15 years. We deliver cutting-edge solutions to giants like Google and the most innovative startups in Silicon Valley. Our diverse 4,000+ team, composed of the world's Top 1% of tech talent, works remotely on roles that drive significant impact worldwide.

When you apply for this position, you're taking the first step in a process that goes beyond the ordinary. We aim to align your passions and skills with our vacancies, setting you on a path to exceptional career development and success.

DevOps Engineer - AWS at BairesDev

We are looking for a DevOps Engineer with expertise in infrastructure as code using TypeScript with AWS CDK, and experience in deploying and managing cloud-native applications. This role focuses on automating cloud infrastructure, supporting CI/CD workflows, and maintaining observability and data integrity in production systems. The ideal candidate will bring solid skills in DevOps practices, IaC, and experience working with systems built on Amazon Web Services. You will work cross-functionally with developers, data teams, and SREs to support deployments, maintain CI/CD pipelines in Jenkins, and monitor environments using observability tools like Splunk and OpenTelemetry.

What You'll Do:

- Implement infrastructure as code using AWS CDK with TypeScript.
- Deploy and manage cloud-native applications on AWS using Lambda, ECS Tasks, and S3.
- Support and maintain CI/CD pipelines using Jenkins.
- Collaborate with development teams to automate deployment processes.
- Perform database operations and basic SQL tasks with Amazon Aurora and PostgreSQL.
- Monitor production systems using observability tools.
- Work with cross-functional teams to ensure system reliability and performance.

What we are looking for:

- 3+ years of experience in DevOps engineering roles.
- Experience with AWS CDK using TypeScript.
- Knowledge of AWS Cloud services, particularly Lambda, ECS Tasks, and S3.
- Familiarity with database tasks and basic SQL; experience with Amazon Aurora and PostgreSQL.
- Experience with automation and CI/CD using Jenkins.
- Understanding of infrastructure as code principles.
- Experience working with production systems.
- Good communication skills and ability to work in cross-functional teams.
- Advanced level of English.

How we make your work (and your life) easier:

- 100% remote work (from anywhere).
- Excellent compensation in USD, or your local currency if preferred.
- Hardware and software setup for you to work from home.
- Flexible hours: create your own schedule.
- Paid parental leave, vacations, and national holidays.
- Innovative and multicultural work environment: collaborate and learn from the global Top 1% of talent.
- Supportive environment with mentorship, promotions, skill development, and diverse growth opportunities.

Apply now and become part of a global team where your unique talents can truly thrive!

Posted 5 days ago

Apply

8.0 years

0 Lacs

India

On-site


This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, Government, academia, research and manufacturing.

"DDN's A3I solutions are transforming the landscape of AI infrastructure." – IDC

"The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments." - Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence.

Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project. This is a chance to make a significant impact at a company that is shaping the future of AI and data management. Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage.

As a Lead/Sr. Lead Software Engineer - L4, you’ll be the final escalation point for the most complex and critical issues affecting enterprise and hyperscale environments. This hands-on role is ideal for a deep technical expert who thrives under pressure and has a passion for solving distributed system challenges at scale.
You’ll collaborate with Engineering, Product Management, and Field teams to drive root cause resolutions, define architectural best practices, and continuously improve product resiliency. Leveraging AI tools and automation, you’ll reduce time-to-resolution, streamline diagnostics, and elevate the support experience for strategic customers.

Key Responsibilities

Technical Expertise & Escalation Leadership
- Own critical customer case escalations end-to-end, including deep root cause analysis and mitigation strategies.
- Act as the highest technical escalation point for Infinia support incidents — especially in production-impacting scenarios.
- Lead war rooms, live incident bridges, and cross-functional response efforts with Engineering, QA, and Field teams.
- Utilize AI-powered debugging, log analysis, and system pattern recognition tools to accelerate resolution.

Product Knowledge & Value Creation
- Become a subject-matter expert on Infinia internals: metadata handling, storage fabric interfaces, performance tuning, AI integration, etc.
- Reproduce complex customer issues and propose product improvements or workarounds.
- Author and maintain detailed runbooks, performance tuning guides, and RCA documentation.
- Feed real-world support insights back into the development cycle to improve reliability and diagnostics.

Customer Engagement & Business Enablement
- Partner with Field CTOs, Solutions Architects, and Sales Engineers to ensure customer success.
- Translate technical issues into executive-ready summaries and business impact statements.
- Participate in post-mortems and executive briefings for strategic accounts.
- Drive adoption of observability, automation, and self-healing support mechanisms using AI/ML tools.

Required Qualifications

- 8+ years in enterprise storage, distributed systems, or cloud infrastructure support/engineering.
- Deep understanding of file systems (POSIX, NFS, S3), storage performance, and Linux kernel internals.
- Proven debugging skills at system/protocol/app levels (e.g., strace, tcpdump, perf).
- Hands-on experience with AI/ML data pipelines, container orchestration (Kubernetes), and GPU-based architectures.
- Exposure to RDMA, NVMe-oF, or high-performance networking stacks.
- Exceptional communication and executive reporting skills.
- Experience using AI tools (e.g., log pattern analysis, LLM-based summarization, automated RCA tooling) to accelerate diagnostics and reduce MTTR.

Preferred Qualifications

- Experience with DDN, VAST, Weka, or similar scale-out file systems.
- Strong scripting/coding ability in Python, Bash, or Go.
- Familiarity with observability platforms: Prometheus, Grafana, ELK, OpenTelemetry.
- Knowledge of replication, consistency models, and data integrity mechanisms.
- Exposure to Sovereign AI, LLM model training environments, or autonomous system data architectures.

This position requires participation in an on-call rotation to provide after-hours support as needed.

Posted 5 days ago

Apply

1.0 years

0 Lacs

Chennai, Tamil Nadu, India

Remote


Company Overview

At Zuora, we do Modern Business. We’re helping people subscribe to new ways of doing business that are better for people, companies and ultimately the planet. It’s an approach resulting from the shift to the Subscription Economy that puts customers first by building recurring relationships instead of one-time product sales and focuses on sustainable growth. Through our leading expertise and multi-product suite, we are transforming all industries and working with the world’s most innovative companies to monetize new business models, nurture subscriber relationships and optimize their digital experiences.

The Team & Role

Join Zuora’s high-impact Operations team, where you’ll be instrumental in maintaining the reliability, scalability, and performance of our SaaS platform. This role involves proactive service monitoring, incident response, infrastructure service management, and ownership of internal and external shared services to ensure optimal system availability and performance. You will work alongside a team of skilled engineers dedicated to operational excellence through automation, observability, and continuous improvement.

In this cross-functional role, you’ll collaborate daily with Product Engineering & Management, Customer Support, Deal Desk, Global Services, and Sales teams to ensure a seamless and customer-centric service delivery model. As a core member of the team, you’ll have the opportunity to design and implement operational best practices, contribute to service provisioning strategies, and drive innovations that enhance the overall platform experience. If you’re driven by solving complex problems in a fast-paced environment and are passionate about operational resilience and service reliability, we’d love to hear from you.
Our Tech Stack: Linux Administration, Python, Docker, Kubernetes, MySQL, Kafka, ActiveMQ, Tomcat App & Web, Oracle, Load Balancers, REDIS Cache, Debezium, AWS, WAF, LBs, Jenkins, GitOps, Terraform, Ansible, Puppet, Prometheus, Grafana, OpenTelemetry In this role you’ll get to Architect and implement intelligent automation workflows for infrastructure lifecycle management, including self-healing systems, automated incident remediation, and configuration anomaly detection using Infrastructure as Code (IaC) and AI-driven tooling. Leverage predictive monitoring and anomaly detection techniques powered by AI/ML to proactively assess system health, optimize performance, and preempt service degradation or outages. Lead complex incident response efforts, applying deep root cause analysis (RCA) and postmortem practices to drive long-term stability, while integrating automated detection and remediation capabilities. Partner with development and platform engineering teams to build resilient CI/CD pipelines, enforce infrastructure standards, and embed observability and reliability into application deployments. Identify and eliminate reliability bottlenecks through automated performance tuning, dynamic scaling policies, and advanced telemetry instrumentation. Maintain and continuously evolve operational runbooks by incorporating machine learning insights, updating playbooks with AI-suggested resolutions, and identifying automation opportunities for manual steps. Stay abreast of emerging trends in AI for IT operations (AIOps), distributed systems, and cloud-native technologies to influence strategic reliability engineering decisions and tool adoption. Who we’re looking for Hands-on experience with Linux Servers Administration and Python Programming. Deep experience with containerization and orchestration using Docker and Kubernetes, managing highly available services at scale.
Working with messaging systems like Kafka and ActiveMQ, databases like MySQL and Oracle, and caching solutions like REDIS. Understands and applies AI/ML techniques in operations, including anomaly detection, predictive monitoring, and self-healing systems. Has a solid track record in incident management, root cause analysis, and building systems that prevent recurrence through automation. Is proficient in developing and maintaining CI/CD pipelines with a strong emphasis on observability, performance, and reliability. Monitoring and observability using Prometheus, Grafana, and OpenTelemetry, with a focus on real-time anomaly detection and proactive alerting. Is comfortable writing and maintaining runbooks and enjoys enhancing them with automation and machine learning insights. Keeps up-to-date with industry trends such as AIOps, distributed systems, SRE best practices, and emerging cloud technologies. Brings a collaborative mindset, working cross-functionally with engineering, product, and operations teams to align system design with business objectives. 1+ years of experience working in a SaaS environment. Nice To Have Red Hat Certified System Administrator (RHCSA) – Red Hat AWS Certification Certified Associate in Python Programming (PCAP) – Python Institute Docker Certified Associate (DCA) or Certified Kubernetes Administrator (CKA) Good knowledge of Jenkins Advanced certifications in SRE or related fields #ZEOLife at Zuora As an industry pioneer, our work is constantly evolving and challenging us in new ways that require us to think differently, iterate often and learn constantly—it’s exciting. Our people, whom we refer to as “ZEOs” are empowered to take on a mindset of ownership and make a bigger impact here. Our teams collaborate deeply, exchange different ideas openly and together we’re making what’s next possible for our customers, community and the world.
As part of our commitment to building an inclusive, high-performance culture where ZEOs feel inspired, connected and valued, we support ZEOs with: Competitive compensation, corporate bonus program, performance rewards and retirement programs Medical insurance Generous, flexible time off Paid holidays, “wellness” days and company-wide end of year break 6 months fully paid parental leave Learning & Development stipend Opportunities to volunteer and give back, including charitable donation match Free resources and support for your mental wellbeing Specific benefits offerings may vary by country and can be viewed in more detail during your interview process. Location & Work Arrangements Organizations and teams at Zuora are empowered to design efficient and flexible ways of working, being intentional about scheduling, communication, and collaboration strategies that help us achieve our best results. In our dynamic, globally distributed company, this means balancing flexibility and responsibility — flexibility to live our lives to the fullest, and responsibility to each other, to our customers, and to our shareholders. For most roles, we offer the flexibility to work both remotely and at Zuora offices. Our Commitment to an Inclusive Workplace Think, be and do you! At Zuora, different perspectives, experiences and contributions matter. Everyone counts. Zuora is proud to be an Equal Opportunity Employer committed to creating an inclusive environment for all. Zuora does not discriminate on the basis of, and considers individuals seeking employment with Zuora without regards to, race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics.
We encourage candidates from all backgrounds to apply. Applicants in need of special assistance or accommodation during the interview process or in accessing our website may contact us by sending an email to assistance(at)zuora.com.

Posted 6 days ago

Apply

3.0 years

2 - 6 Lacs

Noida

On-site

Site Reliability Engineer I Job Summary Site Reliability Engineers (SREs) cover the intersection of Software Engineer and Systems Administrator. In other words, they can both create code and manage the infrastructure on which the code runs. This is a very wide skillset, but the end goal of an SRE is always the same: to ensure that all SLAs are met, but not exceeded, so as to balance performance and reliability with operational costs. As a Site Reliability Engineer I, you will be learning our systems, improving your craft as an engineer, and taking on tasks that improve the overall reliability of the VP platform. Key Responsibilities: Design, implement, and maintain robust monitoring and alerting systems. Lead observability initiatives by improving metrics, logging, and tracing across services and infrastructure. Collaborate with development and infrastructure teams to instrument applications and ensure visibility into system health and performance. Write Python scripts and tools for automation, infrastructure management, and incident response. Participate in and improve the incident management and on-call process, driving down Mean Time to Resolution (MTTR). Conduct root cause analysis and postmortems following incidents, and champion efforts to prevent recurrence. Optimize systems for scalability, performance, and cost-efficiency in cloud and containerized environments. Advocate and implement SRE best practices, including SLOs/SLIs, capacity planning, and reliability reviews. Required Skills & Qualifications: 3+ years of experience in a Site Reliability Engineer or similar role. Proficiency in Python for automation and tooling. Hands-on experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, New Relic, OpenTelemetry, etc. Experience with log aggregation and analysis tools like ELK Stack (Elasticsearch, Logstash, Kibana) or Fluentd.
Good understanding of cloud platforms (AWS, GCP, or Azure) and container orchestration (Kubernetes). Familiarity with infrastructure-as-code (Terraform, Ansible, or similar). Strong debugging and incident response skills. Knowledge of CI/CD pipelines and release engineering practices.
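The SLO/SLI practices this role advocates come down to simple error-budget arithmetic; a minimal sketch in Python (the function name and the figures are illustrative, not taken from the posting):

```python
# Error-budget arithmetic behind SLOs/SLIs: given a target availability
# and observed request counts, how much of the budget is left?

def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent (can go negative)."""
    allowed_failures = (1.0 - slo_target) * total_requests  # budget expressed in requests
    if allowed_failures == 0:
        return 0.0 if failed_requests == 0 else float("-inf")
    return 1.0 - failed_requests / allowed_failures

# A 99.9% SLO over 1,000,000 requests allows ~1,000 failures;
# 250 observed failures leaves roughly 75% of the budget.
remaining = error_budget_remaining(0.999, 1_000_000, 250)
print(f"{remaining:.2%}")  # → 75.00%
```

A negative result signals the SLO has been blown for the period, which is typically what gates feature releases in an error-budget policy.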

Posted 6 days ago

Apply

15.0 years

0 Lacs

Chandigarh, India

On-site


We are seeking a seasoned Observability Architect to define and lead our end-to-end observability strategy across highly distributed, cloud-native, and hybrid environments. This role requires a visionary leader with deep hands-on experience in New Relic and a strong working knowledge of other modern observability platforms like Datadog, Prometheus/Grafana, Splunk, OpenTelemetry, and more. You will design scalable, resilient, and intelligent observability solutions that empower engineering, SRE, and DevOps teams to proactively detect issues, optimize performance, and ensure system reliability. This is a senior leadership role with significant influence over platform architecture, monitoring practices, and cultural transformation across global teams. Key Responsibilities Architect and implement full-stack observability platforms, covering metrics, logs, traces, synthetics, real user monitoring (RUM), and business-level telemetry using New Relic and other tools like Datadog, Prometheus, ELK, or AppDynamics. Design and enforce observability standards and instrumentation guidelines for microservices, APIs, front-end applications, and legacy systems across hybrid cloud environments. Experience in OpenTelemetry adoption, ensuring vendor-neutral, portable observability implementations where appropriate. Build multi-tool dashboards, health scorecards, SLOs/SLIs, and integrated alerting systems tailored for engineering, operations, and executive consumption. Collaborate with engineering and DevOps teams to integrate observability into CI/CD pipelines, GitOps, and progressive delivery workflows. Partner with platform, cloud, and security teams to provide end-to-end visibility across AWS, Azure, GCP, and on-prem systems. Lead root cause analysis, system-wide incident reviews, and reliability engineering initiatives to reduce MTTR and improve MTBF. 
Evaluate, pilot, and implement new observability tools/technologies aligned with enterprise architecture and scalability requirements. Deliver technical mentorship and enablement, evangelizing observability best practices and nurturing a culture of ownership and data-driven decision-making. Drive observability governance and maturity models, ensuring compliance, consistency, and alignment with business SLAs and customer experience goals. Required Qualifications 15+ years of overall IT experience, hands-on with application development, system architecture, operations in complex distributed environments, troubleshooting and integration for applications and other cloud technology with observability tools. 5+ years of hands-on experience with observability tools such as New Relic, Datadog, Prometheus, etc., including APM, infrastructure monitoring, logs, synthetics, alerting, and dashboard creation. Proven experience and willingness to work with multiple observability stacks, such as: Datadog, Dynatrace, AppDynamics; Prometheus, Grafana; Elasticsearch, Fluentd, Kibana (EFK/ELK); Splunk; OpenTelemetry. Solid knowledge of Kubernetes, service mesh (e.g., Istio), containerization (Docker) and orchestration strategies. Strong experience with DevOps and SRE disciplines, including CI/CD, IaC (Terraform, Ansible), and incident response workflows. Fluency in one or more programming/scripting languages: Java, Python, Go, Node.js, Bash. Hands-on expertise in cloud-native observability services (e.g., CloudWatch, Azure Monitor, GCP Operations Suite). Excellent communication and stakeholder management skills, with the ability to align technical strategies with business goals. Preferred Qualifications Architect-level certifications in New Relic, Datadog, Kubernetes, AWS/Azure/GCP, or SRE/DevOps practices. Experience with enterprise observability rollouts, including organizational change management.
Understanding of ITIL, TOGAF, or COBIT frameworks as they relate to monitoring and service management. Familiarity with AI/ML-driven observability, anomaly detection, and predictive alerting. Why Join Us? Lead enterprise-scale observability transformations impacting customer experience, reliability, and operational excellence. Work in a tool-diverse environment, solving complex monitoring challenges across multiple platforms. Collaborate with high-performing teams across development, SRE, platform engineering, and security. Influence strategy, tooling, and architecture decisions at the intersection of engineering, operations, and business.
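Reducing MTTR and improving MTBF, which this role is measured on, are simple to compute from incident records; a stdlib-only sketch (the incident timestamps are invented sample data):

```python
# MTTR = mean time to restore (resolved - started), averaged over incidents.
# MTBF = mean gap between successive incident start times.
from datetime import datetime, timedelta

incidents = [  # (started, resolved) - illustrative sample data
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),
    (datetime(2024, 1, 5, 2, 0),  datetime(2024, 1, 5, 3, 0)),
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 14, 45)),
]

mttr = sum(((end - start) for start, end in incidents), timedelta()) / len(incidents)

starts = sorted(start for start, _ in incidents)
gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
mtbf = sum(gaps, timedelta()) / len(gaps)

print(f"MTTR: {mttr}, MTBF: {mtbf}")  # → MTTR: 0:45:00, MTBF: 4 days, 2:00:00
```

In practice these figures come from an incident-management system rather than a hardcoded list, but the arithmetic a reliability review tracks is exactly this.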

Posted 6 days ago

Apply

5.0 years

0 Lacs

Pune, Maharashtra, India

On-site


Description And Requirements CareerArc Code CA-PS Hybrid "At BMC trust is not just a word - it's a way of life!" We are an award-winning, equal opportunity, culturally diverse, fun place to be. Giving back to the community drives us to be better every single day. Our work environment allows you to balance your priorities, because we know you will bring your best every day. We will champion your wins and shout them from the rooftops. Your peers will inspire, drive, support you, and make you laugh out loud! We help our customers free up time and space to become an Autonomous Digital Enterprise that conquers the opportunities ahead - and are relentless in the pursuit of innovation! The DSOM product line includes BMC’s industry-leading Digital Services and Operation Management products. We have many interesting SaaS products, in the fields of: Predictive IT service management, Automatic discovery of inventories, intelligent operations management, and more! We continuously grow by adding and implementing the most cutting-edge technologies and investing in Innovation! Our team is a global and versatile group of professionals, and we LOVE to hear our employees’ innovative ideas. So, if Innovation is close to your heart – this is the place for you! BMC is looking for a Senior QA Engineer to join a QE team working on complex and distributed software, developing test plans, executing tests, developing automation & assuring product quality. Here is how, through this exciting role, YOU will contribute to BMC's and your own success: Define and execute comprehensive test strategies for service management platforms and observability pipelines. Develop, maintain, and optimize automated tests covering incident, problem, change management workflows, and observability data (metrics, logs, traces, events). Collaborate with product, engineering, and SRE teams to embed quality throughout service delivery and monitoring processes. 
Validate the accuracy, completeness, and reliability of telemetry data and alerts used in observability. Drive continuous integration of quality checks into CI/CD pipelines for rapid feedback and deployment confidence. Investigate production incidents using observability tools and testing outputs to support root cause analysis. Mentor and guide junior engineers on quality best practices for service management and observability domains. Generate detailed quality metrics and reports to inform leadership and drive continuous improvement. To ensure you’re set up for success, you will bring the following skillset & experience: 5+ years of experience in quality engineering or software testing with a focus on service management and observability. Strong programming and scripting skills (Java, Python, JavaScript, or similar). Hands-on experience with service management tools such as BMC Helix, ServiceNow, Jira Service Management. Proficient in observability platforms and frameworks (Prometheus, Grafana, ELK Stack, OpenTelemetry, Jaeger). Solid understanding of CI/CD processes and tools (Jenkins, GitHub Actions, Azure DevOps). Experience with cloud environments (AWS, Azure, GCP) and container technologies (Docker, Kubernetes). Whilst these are nice to have, our team can help you develop in the following skills: Experience in Site Reliability Engineering (SRE) practices. Knowledge of security and performance testing methodologies. QA certifications such as ISTQB or equivalent. BMC Software maintains a strict policy of not requesting any form of payment in exchange for employment opportunities, upholding a fair and ethical hiring process. At BMC we believe in pay transparency and have set the midpoint of the salary band for this role at 2,790,000 INR. 
Actual salaries depend on a wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training, licensure, and certifications; and other business and organizational needs. The salary listed is just one component of BMC's employee compensation package. Other rewards may include a variable plan and country-specific benefits. We are committed to ensuring that our employees are paid fairly and equitably, and that we are transparent about our compensation practices. ( Returnship@BMC ) Had a break in your career? No worries. This role is eligible for candidates who have taken a break in their career and want to re-enter the workforce. If your expertise matches the above job, visit https://bmcrecruit.avature.net/returnship to know more and how to apply. Min salary 2,092,500 Our commitment to you! BMC’s culture is built around its people. We have 6000+ brilliant minds working together across the globe. You won’t be known just by your employee number, but for your true authentic self. BMC lets you be YOU! If after reading the above, you’re unsure if you meet the qualifications of this role but are deeply excited about BMC and this team, we still encourage you to apply! We want to attract talents from diverse backgrounds and experience to ensure we face the world together with the best ideas! BMC is committed to equal opportunity employment regardless of race, age, sex, creed, color, religion, citizenship status, sexual orientation, gender, gender expression, gender identity, national origin, disability, marital status, pregnancy, disabled veteran or status as a protected veteran. If you need a reasonable accommodation for any part of the application and hiring process, visit the accommodation request page. Mid point salary 2,790,000 Max salary 3,487,500

Posted 6 days ago

Apply

3.0 - 4.0 years

0 Lacs

Delhi, India

On-site


Role: Snapmint DevOps team is looking for a DevOps Engineer Snapmint DevOps team is looking for a DevOps Engineer with a passion for working on cutting-edge technology and who thrives on the challenge of building something new that will operate at massive scale. We are looking for a DevOps Engineer with 3-4 years of hands-on experience in managing modern cloud infrastructure and CI/CD pipelines. The ideal candidate should be well-versed with Kubernetes, cloud services (AWS/GCP), observability tools like Grafana, Loki, Fluentd, and have strong Linux, scripting, and automation skills. You will be completely building and owning one of the areas of DevOps - CI/CD, Scaling microservices and distributed applications using containers and Kubernetes or related technologies, db clusters, Data Lake platform, Centralized logging & monitoring, and security. Responsibilities Manage and maintain Kubernetes clusters (EKS/GKE) using Karpenter and self-managed node groups. Handle deployments and scaling for multi-language applications (React, Node.js, Django, Ruby on Rails). Design and maintain CI/CD pipelines using Jenkins and GitHub Actions. Automate deployment workflows and manage environment configurations. Set up and manage monitoring and alerting using Grafana, CloudWatch, and Prometheus. Implement log aggregation using Fluentd, Loki, and Elasticsearch. Configure and maintain Sentry for error tracking and OpenTelemetry for distributed tracing. Manage NGINX ingress controllers, CoreDNS custom configurations, and Cloudflare WAF/CDN. Maintain secure networking across VPCs, NAT Gateways, and subnets in AWS. Work with RDS (MySQL/PostgreSQL), including replication, backups, and read-replica optimization for analytics. Support analytics platforms like Redash, Metabase, and Jupyter with backend reliability. Write scripts in Bash, Python, or Go for automation and infrastructure maintenance tasks. Use Docker extensively for containerization and local development environments.
Requirements 3 to 7 years of hands-on experience in a DevOps/Site Reliability role. Strong knowledge of Linux, Kubernetes, Docker, and Helm. Proficiency with CI/CD tools (e.g., Jenkins, GitHub Actions). Experience with observability stack: Grafana, Loki, Fluentd, Prometheus, and Sentry. Good understanding of networking fundamentals (DNS, Load Balancing, Firewalls). Working knowledge of at least one public cloud (AWS or GCP). Familiarity with monitoring distributed applications and log management. Basic understanding of application deployment (Node.js, PHP, Django, Ruby on Rails). Good understanding of Terraform. Good understanding of networking layers.
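The log-aggregation work this role describes (Fluentd/Loki feeding Grafana) ultimately reduces to parsing and counting structured log lines before alerting; a stdlib-only sketch, with invented sample lines and an assumed `timestamp LEVEL message` format:

```python
# Count log lines per severity level - the kind of roll-up a logging
# pipeline performs before alerting. Lines and format are illustrative.
import re
from collections import Counter

LOG_RE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<msg>.*)$")

lines = [
    "2024-06-01 12:00:01 INFO request served in 42ms",
    "2024-06-01 12:00:02 ERROR upstream timeout",
    "2024-06-01 12:00:03 WARN retrying connection",
    "2024-06-01 12:00:04 ERROR upstream timeout",
]

by_level = Counter(
    m.group("level") for line in lines if (m := LOG_RE.match(line))
)
print(dict(by_level))  # → {'INFO': 1, 'ERROR': 2, 'WARN': 1}
```

A real pipeline would stream lines from files or containers and ship the counts as metrics; the parse-then-aggregate shape stays the same.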

Posted 6 days ago

Apply

3.0 years

0 Lacs

Srinagar, Jammu & Kashmir, India

On-site


About Cryptlex: Cryptlex is a SaaS company providing software licensing, activation, and distribution solutions to businesses and developers globally. We’re growing fast and need a strong DevOps and Security Engineer to help scale and secure our infrastructure. Responsibilities: Manage and optimize our AWS account and all cloud resources Administer and maintain RDS PostgreSQL databases Operate and scale Kubernetes clusters used in production Ensure security best practices across infrastructure and application layers Work with third-party auditors to achieve and maintain ISO 27001 compliance Coordinate GDPR compliance efforts with auditors Set up and manage observability using OpenTelemetry (OTEL) and Grafana Support CI/CD pipelines and infrastructure automation Collaborate with engineering teams on DevSecOps initiatives Requirements: 3+ years of experience in DevOps, Cloud, or Infrastructure Engineering Proficiency in AWS, Kubernetes, and PostgreSQL Experience with ISO 27001 and GDPR compliance processes Familiarity with observability tools like OTEL and Grafana Knowledge of CI/CD tools and Infrastructure as Code (e.g., Terraform, Helm) Strong communication skills and experience in fast-paced SaaS environments

Posted 6 days ago

Apply

1.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site


This is a position based in Bangalore, India. The duration of the internship program will be 1 year. In this role, you will work with cross functional teams ranging from Cloud Operation, Platform Engineering, Development and Testing to automate, build and integrate a large-scale infrastructure system. Responsibilities and Duties: Design our infrastructure platform components to be highly available, scalable, reliable, and secure. Automate the entire lifecycle of platform components from provisioning through decommissioning. Ensure Observability is an integral part of the infrastructure platforms and provides adequate visibility about their health, utilization, and cost. Build Cloud native CI/CD pipelines, tools and automation that enables developer autonomy and improves their productivity. Build tools that predict saturations/failures and take preventive actions through automation. Collaborate extensively with cross functional teams to understand their requirements. Qualifications: Familiar with building Infrastructure Platforms and CI/CD pipelines in a major public cloud provider – GCP preferred. Familiar with Kubernetes and its ecosystem – Containerd/Docker, Helm/Pulumi, ServiceMesh, Terraform/Terragrunt, Ansible, ArgoCD/Workflow, Tekton, Keptn etc. Familiar with observability platforms/tools like ELK/Fluentd/Fluent Bit, Grafana/Prometheus/OpenTelemetry, Cortex/InfluxDB and Jaeger/Zipkin etc. Familiar with Python, Golang or a similar language. Must understand basic computer programming concepts, data structures, and object-oriented programming. Familiar with source control applications such as GIT. Strong communication skills and excellent telephone presence. Bachelor’s or Master’s in Computer Science / Information Technology / Electronics & Communication / Electrical & Electronics. Should have graduated in 2024 / 2025.

Posted 1 week ago

Apply

5.0 years

2 - 3 Lacs

Cochin

On-site

Joining Gadgeon offers a dynamic and rewarding career experience that fosters both personal and professional growth. Our collaborative culture encourages innovation, empowering team members to contribute their ideas and expertise to cutting-edge projects. LLM APPLICATIONS ENGINEER The LLM Applications Engineer transforms AI prototypes into modular, secure, and production-ready workflows. This role focuses on orchestrating AI agents, implementing observability, and building toolkits that accelerate internal and external adoption. Key Duties/ Responsibilities Develop agent orchestration using frameworks like LangChain, CrewAI, AutoGen. Build reusable APIs, SDKs, and configuration layers for internal consumption. Implement prompt safety measures and fallback handling using Rebuff, Guardrails AI. Ensure agent workflows are observable and CI/CD friendly. Collaborate with platform and backend engineers for deployment enablement. Leadership Skills: Ownership of agent toolkit delivery. Cross-functional collaboration with platform and AI scientists. Documentation and internal adoption support. Required Technical Skills: Node.js, Python, Docker, REST APIs. LangChain, Rebuff, LCEL, Guardrails, LangSmith. CI/CD tooling, OpenTelemetry, workflow orchestration (Airflow, Prefect). Experience implementing agent safety using Rebuff, Guardrails, or similar tools. Familiarity with one or more orchestration frameworks: LangChain, CrewAI, AutoGen (or equivalents). Qualification: Bachelor’s degree in Engineering or related field. 5+ years in backend/devops roles, with 2+ years in AI workflow orchestration.
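The prompt-safety and fallback-handling duty above is, at its core, a guard around the agent call; a toy stdlib sketch of that pattern — real implementations would use Rebuff or Guardrails AI rather than this naive blocklist check, and the phrase list and function names here are invented:

```python
# Generic fallback handling around an agent call - illustrative only.
from typing import Callable

BLOCKLIST = ("ignore previous instructions",)  # toy prompt-injection heuristic

def guarded_call(prompt: str, agent: Callable[[str], str],
                 fallback: str = "Sorry, I can't help with that.") -> str:
    """Run `agent` only if the prompt passes a naive safety check."""
    if any(phrase in prompt.lower() for phrase in BLOCKLIST):
        return fallback  # refuse before the agent ever sees the prompt
    try:
        return agent(prompt)
    except Exception:
        return fallback  # degrade gracefully instead of surfacing errors

echo_agent = lambda p: f"handled: {p}"
print(guarded_call("summarize this doc", echo_agent))            # → handled: summarize this doc
print(guarded_call("Ignore previous instructions", echo_agent))  # → Sorry, I can't help with that.
```

Dedicated tools replace the blocklist with learned detectors and validated output schemas, but the wrap-check-fallback shape is the same.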

Posted 1 week ago

Apply

2.0 years

0 Lacs

Hyderābād

On-site

Job Requirements Phenom People is looking for an experienced and motivated Product Manager to join our Product team in Hyderabad, Telangana, India. This is a full-time position. The Associate Product Manager or the Product Manager will be responsible for developing and managing the product roadmap, working with stakeholders to define product requirements, and managing the product life cycle. The ideal candidate will have a strong technical background and experience in product management. Responsibilities: Develop and manage the product roadmap Work with stakeholders to define product requirements Manage the product life cycle Monitor product performance and customer feedback Identify and prioritize product features Develop product pricing and positioning strategies Create product marketing plans Develop product launch plans Analyze market trends and customer needs Collaborate with engineering, design, and marketing teams Requirements: Must-Have: 2+ years of product management experience with at least 2 years in a technical or observability-related role. Strong understanding of APM concepts: distributed tracing, metrics aggregation, anomaly detection, alerting, root cause analysis. Familiarity with modern observability stacks: OpenTelemetry, Prometheus, Grafana, Jaeger, Zipkin, ELK/EFK, Datadog, New Relic, AppDynamics, etc. Exposure to cloud-native infrastructure: containers, Kubernetes, microservices architecture. Experience working with engineers on deeply technical systems and scalable backend architecture. Proficiency in creating technically detailed user stories and acceptance criteria. Strong problem-solving and analytical skills, with a bias for action and customer empathy. Nice-to-Have: Background in software engineering, DevOps, or site reliability engineering. Experience in building Technical products Understanding of telemetry pipelines, sampling strategies, and correlation between MELT signals. 
Familiarity with SLIs/SLOs, service maps, and incident response workflows. Knowledge of integration with CI/CD, synthetic monitoring, or real-user monitoring (RUM). We prefer candidates with these experiences Experience in product management - worked as PO or PM in a SaaS product organization Experience working on integrations, APIs, etc. Experience collaborating with customers and internal business partners Experience working with distributed / international teams Experience with JIRA or equivalent product development management tools Minimum Qualifications 1 to 3 years of experience in product management - as a Product Manager or Product owner or Associate Product Manager Experience in HR Tech industry is a plus but not mandatory Bachelor’s degree or equivalent years of experience. MBA is highly desirable. Benefits Competitive salary for a startup Gain experience rapidly Work directly with executive team Fast-paced work environment #LI-JG1

Posted 1 week ago

Apply

5.0 years

18 - 24 Lacs

Hyderābād

On-site

No. of Positions: 2 Position: Observability Engineer Exp: 5-10 Years Location: Hyderabad Mode: 2 Days WFO Mandatory Skills: Observability, Grafana, and writing queries using Prometheus and Loki. Note: Candidates will be deployed at Vialto premises. Job Description: We are looking for a highly skilled Observability Engineer to design, develop, and maintain observability solutions that provide deep visibility into our infrastructure, applications, and services. You will be responsible for implementing monitoring, logging, and tracing solutions to ensure the reliability, performance, and availability of our systems. Working closely with development, Infra Engineers, DevOps, and SRE teams, you will play a critical role in optimizing system observability and improving incident response. Key Responsibilities: ● Design and implement observability solutions for monitoring, logging, and tracing across cloud and on-premises environments. ● Develop and maintain monitoring tools such as Prometheus, Grafana, Datadog, New Relic, and AppDynamics. ● Implement distributed tracing using OpenTelemetry, Jaeger, Zipkin, or similar tools to improve application performance and troubleshooting. ● Optimize log management and analysis with tools like Elasticsearch, Splunk, Loki, or Fluentd. ● Create alerting and anomaly detection strategies to proactively identify system issues and reduce mean time to resolution (MTTR). ● Collaborate with development and SRE teams to enhance observability in CI/CD pipelines and microservices architectures. ● Automate observability processes using scripting languages like Python, Bash, or Golang. ● Ensure scalability and efficiency of monitoring solutions to handle large-scale distributed systems. ● Support incident response and root cause analysis by providing actionable insights through observability data. ● Stay up to date with industry trends in observability and site reliability engineering (SRE).
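The mandatory Prometheus/Loki query-writing skill is usually exercised against their standard HTTP APIs; a sketch that only builds the request URLs — the host names and queries are illustrative, and a real Loki range query would also pass start/end time parameters:

```python
# Build Prometheus /api/v1/query and Loki /loki/api/v1/query_range URLs.
# Endpoint paths follow the documented HTTP APIs; hosts are placeholders.
from urllib.parse import urlencode

def prom_query_url(base: str, promql: str) -> str:
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"

def loki_query_url(base: str, logql: str, limit: int = 100) -> str:
    return f"{base}/loki/api/v1/query_range?{urlencode({'query': logql, 'limit': limit})}"

# Typical queries this role would write:
cpu = prom_query_url("http://prometheus:9090",
                     'sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)')
errors = loki_query_url("http://loki:3100",
                        '{app="checkout"} |= "ERROR"')
print(cpu)
print(errors)
```

The PromQL example computes per-pod CPU rate over a 5-minute window; the LogQL example filters a labeled stream to ERROR lines, the building blocks of most dashboards and alerts.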
Required Qualifications: ● 3+ years of experience in observability, SRE, DevOps, or a related field. ● Proficiency in observability tools such as Prometheus, Grafana, Datadog, New Relic, or AppDynamics. ● Experience with logging platforms like Elasticsearch, Splunk, Loki, or Fluentd. ● Strong knowledge of distributed tracing (OpenTelemetry, Jaeger, Zipkin). ● Hands-on experience with Azure cloud platforms and Kubernetes. ● Proficiency in scripting languages (Python, Bash, PowerShell) and infrastructure as code (Terraform, Ansible). ● Solid understanding of system performance, networking, and troubleshooting. ● Strong problem-solving and analytical skills. ● Excellent communication and collaboration abilities. Preferred Qualifications: ● Experience with AI-driven observability and anomaly detection. ● Familiarity with microservices, serverless architectures, and event-driven systems. ● Experience working with on-call rotations and incident management workflows. ● Relevant certifications in observability tools, cloud platforms, or SRE practices. Job Type: Fresher Pay: ₹1,800,000.00 - ₹2,400,000.00 per year Benefits: Provident Fund Supplemental Pay: Performance bonus Work Location: In person
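The anomaly-detection strategies listed in this role often start with something as simple as a z-score against a trailing window of a metric series; a stdlib sketch, with invented latency values and an illustrative threshold:

```python
# Flag points whose z-score against the trailing window exceeds a threshold -
# a baseline anomaly detector for latency/error-rate series. Data is invented.
from statistics import mean, stdev

def anomalies(series, window=5, threshold=3.0):
    """Return indices whose value deviates > threshold sigmas from the trailing window."""
    flagged = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

latency_ms = [100, 102, 98, 101, 99, 100, 103, 450, 101, 100]
print(anomalies(latency_ms))  # → [7]
```

Production systems layer seasonality handling and learned baselines on top, but a rolling z-score like this is a common first alerting rule.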

Posted 1 week ago

Apply

6.0 years

0 Lacs

Trivandrum, Kerala, India

On-site


Role Description

Role Proficiency: Act creatively to develop applications and select appropriate technical options, optimizing application development, maintenance, and performance by employing design patterns and reusing proven solutions; account for others' developmental activities.

Outcomes:
● Interpret the application/feature/component design and develop it in accordance with specifications.
● Code, debug, test, document, and communicate product/component/feature development stages.
● Validate results with user representatives; integrate and commission the overall solution.
● Select appropriate technical options for development, such as reusing, improving, or reconfiguring existing components, or creating own solutions.
● Optimise efficiency, cost, and quality.
● Influence and improve customer satisfaction.
● Set FAST goals for self/team; provide feedback on FAST goals of team members.

Measures Of Outcomes:
● Adherence to engineering processes and standards (coding standards)
● Adherence to project schedule/timelines
● Number of technical issues uncovered during the execution of the project
● Number of defects in the code
● Number of defects post delivery
● Number of non-compliance issues
● On-time completion of mandatory compliance trainings

Outputs Expected:
● Code: Code as per design; follow coding standards, templates, and checklists; review code for team and peers.
● Documentation: Create/review templates, checklists, guidelines, and standards for design/process/development; create/review deliverable documents, design documentation, and requirements, test cases/results.
● Configure: Define and govern the configuration management plan; ensure compliance from the team.
● Test: Review and create unit test cases, scenarios, and execution; review the test plan created by the testing team; provide clarifications to the testing team.
● Domain Relevance: Advise software developers on the design and development of features and components with a deep understanding of the business problem being addressed for the client.
● Learn more about the customer domain, identifying opportunities to provide valuable additions to customers; complete relevant domain certifications.
● Manage Project: Manage delivery of modules and/or manage user stories.
● Manage Defects: Perform defect RCA and mitigation; identify defect trends and take proactive measures to improve quality.
● Estimate: Create and provide input for effort estimation for projects.
● Manage Knowledge: Consume and contribute to project-related documents, SharePoint libraries, and client universities; review the reusable documents created by the team.
● Release: Execute and monitor the release process.
● Design: Contribute to the creation of design (HLD, LLD, SAD)/architecture for applications/features/business components/data models.
● Interface With Customer: Clarify requirements and provide guidance to the development team; present design options to customers; conduct product demos.
● Manage Team: Set FAST goals and provide feedback; understand the aspirations of team members and provide guidance, opportunities, etc.; ensure the team is engaged in the project.
● Certifications: Take relevant domain/technology certifications.

Skill Examples:
● Explain and communicate the design/development to the customer
● Perform and evaluate test results against product specifications
● Break down complex problems into logical components
● Develop user interfaces and business software components
● Use data models
● Estimate time and effort required for developing/debugging features/components
● Perform and evaluate tests in the customer or target environment
● Make quick decisions on technical/project-related challenges
● Manage a team; mentor and handle people-related issues in the team
● Maintain high motivation levels and positive dynamics in the team
● Interface with other teams, designers, and other parallel practices
● Set goals for self and team
● Provide feedback to team members
● Create and articulate impactful technical presentations
● Follow a high level of business etiquette in emails and other business communication
● Drive conference calls with customers, addressing customer questions
● Proactively ask for and offer help
● Ability to work under pressure, determine dependencies and risks, facilitate planning, and handle multiple tasks
● Build confidence with customers by meeting deliverables on time with quality
● Estimate the time, effort, and resources required for developing/debugging features/components
● Make appropriate utilization of software/hardware
● Strong analytical and problem-solving abilities

Knowledge Examples:
● Appropriate software programs/modules
● Functional and technical design
● Programming languages: proficient in multiple skill clusters
● DBMS
● Operating systems and software platforms
● Software Development Life Cycle
● Agile: Scrum or Kanban methods
● Integrated development environments (IDE)
● Rapid application development (RAD)
● Modelling technology and languages
● Interface definition languages (IDL)
● Knowledge of the customer domain and deep understanding of the sub-domain where the problem is solved

Additional Comments

Musts:
● Strong understanding of object-oriented and functional programming principles
● Experience with RESTful APIs
● Knowledge of microservices architecture and cloud platforms
● Familiarity with CI/CD pipelines, Docker, and Kubernetes
● Strong problem-solving skills and ability to work in an Agile environment
● Excellent communication and teamwork skills

Nices:
● 6+ years of experience, with at least 3+ in Kotlin
● Experience with backend development using Kotlin (Ktor, Spring Boot, or Micronaut)
● Proficiency in working with databases such as PostgreSQL, MySQL, or MongoDB
● Experience with GraphQL and WebSockets

Additional Musts:
● Experience with backend development in the Java ecosystem (either Java or Kotlin will do)

Additional Nices:
● Experience with TypeScript and NodeJS
● Experience with Kafka
● Experience with frontend development (e.g. React)
● Experience with Gradle
● Experience with GitLab CI
● Experience with OpenTelemetry

Skills: Kotlin, Spring Boot, RESTful API, Java

Posted 1 week ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot


Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies