944 GitOps Jobs - Page 9

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

10.0 years

0 Lacs

India

On-site

🚀 Senior Engineer (GenAI & Prompt Engineering) | Xebia | Hybrid (India)
We’re hiring a GenAI expert to supercharge our DevOps and Engineering ecosystem with AI-native intelligence. If you’re passionate about Prompt Engineering, RAG pipelines, and vector databases, and can deliver production-ready LLM integrations, we want to hear from you!
🔍 Key Responsibilities: Architect and refine prompts for task-specific LLM workflows; Build RAG pipelines using FAISS, Weaviate, Pinecone, etc.; Seamlessly integrate GenAI into CI/CD, observability, and developer tools; Use Python frameworks like LangChain, LlamaIndex, and Haystack; Fine-tune and deploy LLMs (OpenAI, Claude, Mistral, etc.); Ensure privacy, security, and model performance; Collaborate with Platform, DevOps & Engineering teams.
🔧 Required Skills: 6–10+ years in engineering with 2+ years in GenAI/LLMs; Expert in Python, prompt engineering, and embedding techniques; Experience with vector DBs, LLM APIs, and GenAI frameworks; Strong understanding of NLP concepts and the LLM lifecycle.
💡 Nice to Have: Exposure to Kubernetes, GitOps, DevSecOps; Worked on changelog generation, log triage, AI copilots.
📍 Hybrid Work: 3 days a week from any Xebia office (Chennai, Bangalore, Hyderabad, Pune, Jaipur, Bhopal, Gurugram).
⏳ Joiners: Immediate to max 2 weeks’ notice only.
📩 To Apply, email your details to vijay.s@xebia.com with: Full Name; Total Experience; Current CTC; Expected CTC; Current Location; Preferred Xebia Location; Notice Period / Last Working Day; Primary Skills; LinkedIn Profile.
Join us to build the future of engineering productivity with AI.
#GenAI #PromptEngineering #RAG #Python #LLM #Hiring #Xebia

Posted 2 weeks ago

4.0 - 6.0 years

0 Lacs

Bangalore Urban, Karnataka, India

On-site

Smarsh is the leading provider of archiving & compliance solutions for companies in regulated and litigious industries. The solutions are delivered using the Smarsh product suite, which processes, controls, manages and stores a very large variety of electronic communication channels (e.g. social networks, group chat, instant messaging, email, blogs, wikis, SMS/MMS, voice) at cloud scale.
About the team: We are seeking a talented Engineer to join our team, focusing on developing scalable integrations, APIs, and open-source solutions that contribute to our Internal Developer Portal (IDP) ecosystem. As a key team member, you will collaborate with cross-functional teams to design, implement, and maintain APIs and data pipelines that enable seamless data flow into our IDP. If you are passionate about clean code, open-source contributions, and building developer-centric tools, we want to hear from you.
Key Responsibilities
API Development: Design, develop, and maintain robust APIs to push data into the IDP. Ensure high performance, scalability, and security in API implementations. Collaborate with teams to integrate APIs with existing systems.
Integration Development: Build and maintain open-source integrations for third-party tools (e.g., monitoring systems, CI/CD pipelines, container registries). Write reusable, testable, and efficient Python code to bridge systems with the IDP.
Data Processing and Transformation: Develop data pipelines to process, transform, and push data into the IDP. Implement error handling and logging mechanisms to ensure reliability. Design systems for data parsing and transformation, including robust handling of YAML, JSON, and other serialisation formats to normalise inputs from disparate sources.
Open-Source Contribution: Contribute to open-source projects that enhance the IDP ecosystem. Actively participate in the developer community by publishing and maintaining open-source tools.
Collaboration and Communication: Work closely with DevOps, Platform Engineering, and Security teams to understand data requirements. Document APIs, integrations, and workflows for internal and external stakeholders.
Code Quality and Testing: Write unit and integration tests to ensure code reliability. Perform code reviews and enforce best practices in Python development.
Required Experience/Skills
Education: Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience) with 4 - 6 years of total experience.
Technical Expertise: Proficiency in Python with a focus on building scalable applications. Experience with API frameworks such as FastAPI, Django Rest Framework, or Flask. Knowledge of data serialization formats (e.g., JSON, YAML). Knowledge of event-driven architecture. Knowledge of queuing systems such as Kafka, RabbitMQ, and SQS. Knowledge of Role-Based Access Control (RBAC) and least-privilege principles to secure all IDP interactions.
Integration Experience: Experience building integrations with third-party tools like Jenkins, GitLab, Prometheus, or AWS. Familiarity with APIs for monitoring tools, container registries, and CI/CD systems.
DevOps and Cloud: Understanding of Kubernetes, Docker, and cloud platforms (AWS, GCP, Azure). Familiarity with GitOps practices and tools like ArgoCD.
Data Processing: Experience with data pipelines and ETL workflows. Knowledge of PostgreSQL, MongoDB, or other relational/non-relational databases. Design systems for data parsing and transformation, including robust handling of YAML and JSON.
Open Source: Proven experience contributing to or maintaining open-source projects. Familiarity with Git and GitHub workflows.
Soft Skills: Strong communication skills and the ability to work in a collaborative environment. Analytical mindset with attention to detail and problem-solving skills.
Preferred Qualifications: Familiarity with Port or other Internal Developer Portal (IDP) tools. Experience with security practices, including API authentication and data encryption. Understanding of AWS, Kubernetes and DevOps practices. Knowledge of DORA metrics and CI/CD pipeline observability. Exposure to Infrastructure-as-Code tools (e.g., Terraform, Pulumi). Familiarity with testing frameworks like pytest or unittest.
Smarsh hires lifelong learners with a passion for innovating with purpose, humility and humor. Collaboration is at the heart of everything we do. We work closely with the most popular communications platforms and the world’s leading cloud infrastructure platforms. We use the latest in AI/ML technology to help our customers break new ground at scale. We are a global organization that values diversity, and we believe that providing opportunities for everyone to be their authentic self is key to our success. Smarsh leadership, culture, and commitment to developing our people have all garnered Comparably.com Best Places to Work Awards. Come join us and find out what the best work of your career looks like.

Posted 2 weeks ago

3.0 - 8.0 years

9 - 19 Lacs

Hyderabad, Ahmedabad, Bengaluru

Work from Office

Kubernetes Engineer
Build bulletproof infrastructure for regulated industries
At Ajmera Infotech, we're building planet-scale software for NYSE-listed clients with a 120+ strong engineering team. Our work powers mission-critical systems in HIPAA, FDA, and SOC2-compliant domains where failure is not an option.
Why You'll Love It: Own production-grade Kubernetes deployments at real scale; Drive TDD-first DevOps in CI/CD environments; Work in a compliance-first org (HIPAA, FDA, SOC2) with code-first values; Collaborate with top-tier engineers in multi-cloud deployments; Career growth via mentorship, deep-tech projects, and leadership tracks.
Key Responsibilities: Design, deploy, and manage resilient Kubernetes clusters (k8s/k3s); Automate workload orchestration using Ansible or custom scripting; Integrate Kubernetes deeply into CI/CD pipelines; Tune infrastructure for performance, scalability, and regulatory reliability; Support secure multi-tenant environments and compliance needs (e.g., HIPAA/FDA).
Must-Have Skills: 3–8 years of hands-on experience in production Kubernetes environments; Expert-level knowledge of containerization with Docker; Proven experience with CI/CD integration for k8s; Automation via Ansible, shell scripting, or similar tools; Infrastructure performance tuning within Kubernetes clusters.
Nice-to-Have Skills: Multi-cloud cluster management (AWS/GCP/Azure); Helm, ArgoCD, or Flux for deployment and GitOps; Service mesh, ingress controllers, and pod security policies.

Posted 2 weeks ago

3.0 - 8.0 years

7 - 17 Lacs

Hyderabad, Ahmedabad, Bengaluru

Work from Office

Sr. Cloud Infrastructure Engineer
Build the Backbone of Mission-Critical Software
Ajmera Infotech is a planet-scale engineering firm powering NYSE-listed clients with a 120+ strong team of elite developers. We build fail-safe, compliant software systems that cannot go down, and now we're hiring a senior cloud engineer to help scale our infrastructure to the next level.
Why You'll Love It: Terraform everything: zero-click, GitOps-driven provisioning pipelines; Hardcore compliance: build infrastructure aligned with HIPAA, FDA, and SOC2; Infra across OSes: automate for Linux, MacOS, and Windows environments; Own secrets & state: use Vault, Packer, Consul like a HashiCorp champion; Team of pros: collaborate with engineers who write tests before code; Dev-first culture: code reviews, mentorship, and CI/CD by default; Real-world scale: Azure-first systems powering critical applications.
Key Responsibilities: Design and automate infrastructure as code using Terraform, Ansible, and GitOps; Implement secure secret management via HashiCorp Vault; Build CI/CD-integrated infra automation across hybrid environments; Develop scripts and tooling in PowerShell, Bash, and Python; Manage cloud infrastructure primarily on Azure, with exposure to AWS; Optimize for performance, cost, and compliance at every layer; Support infrastructure deployments using containerization tools (e.g., Docker, Kubernetes).
Must-Have Skills: 3–8 years in infrastructure automation and cloud engineering; Deep expertise in Terraform (provisioning, state management); Hands-on with HashiCorp Vault, Packer, and Consul; Strong Azure experience; Proficiency with Ansible and GitOps workflows; Cross-platform automation: Linux, MacOS, Windows; CI/CD knowledge for infra pipelines; REST API usage for automation tasks; PowerShell, Python, and Bash scripting.
Nice-to-Have Skills: AWS exposure; Cost-performance optimization experience in cloud environments; Containerization for infra deployments (Docker, Kubernetes).

Posted 2 weeks ago

5.0 - 10.0 years

14 - 24 Lacs

Bengaluru

Hybrid

Factspan Overview: Factspan is a pure-play data and analytics services organization. We partner with Fortune 500 enterprises to build an analytics center of excellence, generating insights and solutions from raw data to solve business challenges, make strategic recommendations and implement new processes that help them succeed. With offices in Seattle, Washington and Bengaluru, India, we use a global delivery model to service our customers. Our customers include industry leaders from the Retail, Financial Services, Hospitality, and Technology sectors.
Responsibilities: We are seeking a highly skilled uDeploy Developer to manage and enhance deployment pipelines for enterprise applications. This role involves maintaining legacy IBM UrbanCode Deploy (uDeploy) systems, integrating with CI tools like Jenkins, collaborating with application and support teams, and supporting the phased transition to modern GitOps-based delivery using ArgoCD and GitLab. The candidate will be part of a larger DevOps transformation program closely aligned with SRE operations.
Key Responsibilities: Develop & implement machine learning models & algorithms to extract insights from large datasets. Own, manage, and support end-to-end uDeploy pipelines for critical enterprise applications across multiple environments (DEV, QA, UAT, PROD). Build reusable templates and promote consistent deployment standards. Work with app teams to configure uDeploy components: processes, applications, environments, component versions. Create and manage pre/post-deployment scripts, environment variable mappings, and rollback plans. Integrate Jenkins CI jobs with uDeploy processes using REST APIs or plugins. Collaborate with DevOps engineers and architects to transition uDeploy-based pipelines to GitOps-native deployments using ArgoCD. Troubleshoot deployment failures and support L2/L3 incident resolution. Maintain deployment documentation, audit trails, and release notes for compliance and traceability. Work with Retail-specific tools and frameworks to enable seamless holiday/seasonal deployment readiness.
Required Skills & Tools: Deployment Tools - IBM UrbanCode Deploy (uDeploy), UrbanCode CLI, Process Designer; CI Tools - Jenkins, GitLab CI; Scripting - Groovy, Shell, Python; Version Control - Git, GitLab; Containerization (Plus) - Docker, Helm, Kubernetes (basic awareness helpful); Infrastructure (Bonus) - GCP / AWS experience in context of deployments; Monitoring/Logging (Bonus) - Splunk, AppDynamics (integration-level exposure preferred).
Good to Have: Experience in migrating uDeploy pipelines to ArgoCD or similar tools. Exposure to Retail domain CI/CD patterns and high-availability deployment scheduling. Familiarity with ITIL-based change management processes. Understanding of DevOps-SRE collaboration models.
Join Us as a DevOps Engineer: Are you driven by precision, planning, and seamless delivery? We're looking for a UDeploy Release Engineer to orchestrate the deployment of our cutting-edge solutions. If you're ready to lead impactful releases and streamline innovation, we'd love to hear from you. Send your resume to sathishkumar.arumugam@factspan.com.

Posted 2 weeks ago

3.0 years

3 - 7 Lacs

Hyderābād

On-site

Job Description
Sr. Cloud Infrastructure Engineer: Build the Backbone of Mission-Critical Software (On-site only)
Ajmera Infotech is a planet-scale engineering firm powering NYSE-listed clients with a 120+ strong team of elite developers. We build fail-safe, compliant software systems that cannot go down, and now we’re hiring a senior cloud engineer to help scale our infrastructure to the next level.
Why You’ll Love It: Terraform everything: zero-click, GitOps-driven provisioning pipelines; Hardcore compliance: build infrastructure aligned with HIPAA, FDA, and SOC2; Infra across OSes: automate for Linux, MacOS, and Windows environments; Own secrets & state: use Vault, Packer, Consul like a HashiCorp champion; Team of pros: collaborate with engineers who write tests before code; Dev-first culture: code reviews, mentorship, and CI/CD by default; Real-world scale: Azure-first systems powering critical applications.
Requirements
Key Responsibilities: Design and automate infrastructure as code using Terraform, Ansible, and GitOps; Implement secure secret management via HashiCorp Vault; Build CI/CD-integrated infra automation across hybrid environments; Develop scripts and tooling in PowerShell, Bash, and Python; Manage cloud infrastructure primarily on Azure, with exposure to AWS; Optimize for performance, cost, and compliance at every layer; Support infrastructure deployments using containerization tools (e.g., Docker, Kubernetes).
Must-Have Skills: 3–8 years in infrastructure automation and cloud engineering; Deep expertise in Terraform (provisioning, state management); Hands-on with HashiCorp Vault, Packer, and Consul; Strong Azure experience; Proficiency with Ansible and GitOps workflows; Cross-platform automation: Linux, MacOS, Windows; CI/CD knowledge for infra pipelines; REST API usage for automation tasks; PowerShell, Python, and Bash scripting.
Nice-to-Have Skills: AWS exposure; Cost-performance optimization experience in cloud environments; Containerization for infra deployments (Docker, Kubernetes).
Benefits: Competitive salary + performance bonus. Comprehensive health insurance for you and your dependents. Flexible working hours and generous PTO.

Posted 2 weeks ago

3.0 years

3 - 8 Lacs

Hyderābād

Remote

Role: Senior DevOps Developer (SR1)
Location: Remote
Job Summary: This is a full-time role for a Senior DevOps Developer (SR1). We are seeking an experienced DevOps professional to lead our infrastructure strategy, design resilient systems, and drive continuous improvement in our deployment processes. In this role, you will architect scalable solutions, mentor junior engineers, and ensure the highest standards of reliability and security across our cloud infrastructure. The job location is flexible with preference for the Delhi NCR region.
Responsibilities: Lead comprehensive improvements to CI/CD systems and deployment pipelines. Design and implement resilient, secure, and scalable infrastructure solutions. Proactively identify and resolve infrastructure bottlenecks and performance challenges. Own deployment health, managing Service Level Objectives (SLOs) and Service Level Agreements (SLAs). Conduct thorough infrastructure audits and optimize cost-efficiency. Develop and maintain high availability and robust rollback strategies. Collaborate closely with Development and QA teams to streamline release automation. Mentor Mid-Level and Junior DevOps Engineers, fostering skill development and best practices. Provide technical leadership and guidance in architectural decisions. Lead complex project components with minimal supervision. Develop risk mitigation strategies for infrastructure and deployment challenges. Propose innovative technological solutions aligned with business goals.
Requirements
Technical Skills: Bachelor's or Master's degree in Computer Science, Engineering, or related field. 3-5 years of professional DevOps experience with demonstrated progression. Advanced Linux administration and shell scripting expertise. Comprehensive Git workflow knowledge, including advanced branching and collaboration strategies. Deep Kubernetes knowledge including Helm, StatefulSets, Horizontal Pod Autoscalers, and Network Policies. Advanced Terraform skills with module development, remote backend, and workspace management. Extensive experience with AWS services (EC2, S3, IAM, VPC, CloudWatch). Advanced Docker and Kubernetes container optimization and deployment strategies. Expertise in writing and maintaining complex CI/CD pipelines using Jenkins, GitHub Actions. Advanced secrets management using AWS SSM, HashiCorp Vault. Comprehensive logging and alerting system setup (ELK stack, Prometheus, Alertmanager). Advanced cloud security implementation (IAM roles, Key Management Service, Web Application Firewall). GitOps implementation experience with tools like ArgoCD and Flux. Performance tuning skills for infrastructure and containerized environments. Advanced observability practices covering metrics, logs, and distributed tracing.
Soft Skills: Cross-functional communication excellence with ability to lead technical discussions. Strong mentorship capabilities for junior and mid-level team members. Advanced strategic thinking and ability to propose innovative solutions. Excellent knowledge transfer skills through documentation and training. Ability to understand and align technical solutions with broader business strategy. Proactive problem-solving approach with focus on continuous improvement. Strong leadership skills in guiding team performance and technical direction. Effective collaboration across development, QA, and business teams. Ability to make complex technical decisions with minimal supervision. Strategic approach to risk management and mitigation.
Additional Preferred Qualifications: Experience with multi-cloud or hybrid-cloud environments. Exposure to incident management and on-call responsibilities. Advanced scripting skills in Groovy, Python, or Go for CI/CD. Experience with infrastructure testing tools like Terratest or Inspec. Advanced cost analysis and cloud cost optimization skills. Contributions to open-source projects or advanced technical certifications.
What We Offer
Professional Growth: Continuous learning opportunities through diverse projects and mentorship from experienced leaders.
Global Exposure: Work with clients from 20+ countries, gaining insights into different markets and business cultures.
Impactful Work: Contribute to projects that make a real difference, with solutions generating over $1B in revenue.
Work-Life Balance: Flexible arrangements that respect personal wellbeing while fostering productivity.
Career Advancement: Clear progression pathways as you develop skills within our growing organization.
Competitive Compensation: Attractive salary packages that recognize your contributions and expertise.

Posted 2 weeks ago

3.0 years

5 - 8 Lacs

Gurgaon

Remote

Role: Senior DevOps Developer (SR1)
Location: Remote
Job Summary: This is a full-time role for a Senior DevOps Developer (SR1). We are seeking an experienced DevOps professional to lead our infrastructure strategy, design resilient systems, and drive continuous improvement in our deployment processes. In this role, you will architect scalable solutions, mentor junior engineers, and ensure the highest standards of reliability and security across our cloud infrastructure. The job location is flexible with preference for the Delhi NCR region.
Responsibilities: Lead comprehensive improvements to CI/CD systems and deployment pipelines. Design and implement resilient, secure, and scalable infrastructure solutions. Proactively identify and resolve infrastructure bottlenecks and performance challenges. Own deployment health, managing Service Level Objectives (SLOs) and Service Level Agreements (SLAs). Conduct thorough infrastructure audits and optimize cost-efficiency. Develop and maintain high availability and robust rollback strategies. Collaborate closely with Development and QA teams to streamline release automation. Mentor Mid-Level and Junior DevOps Engineers, fostering skill development and best practices. Provide technical leadership and guidance in architectural decisions. Lead complex project components with minimal supervision. Develop risk mitigation strategies for infrastructure and deployment challenges. Propose innovative technological solutions aligned with business goals.
Requirements
Technical Skills: Bachelor's or Master's degree in Computer Science, Engineering, or related field. 3-5 years of professional DevOps experience with demonstrated progression. Advanced Linux administration and shell scripting expertise. Comprehensive Git workflow knowledge, including advanced branching and collaboration strategies. Deep Kubernetes knowledge including Helm, StatefulSets, Horizontal Pod Autoscalers, and Network Policies. Advanced Terraform skills with module development, remote backend, and workspace management. Extensive experience with AWS services (EC2, S3, IAM, VPC, CloudWatch). Advanced Docker and Kubernetes container optimization and deployment strategies. Expertise in writing and maintaining complex CI/CD pipelines using Jenkins, GitHub Actions. Advanced secrets management using AWS SSM, HashiCorp Vault. Comprehensive logging and alerting system setup (ELK stack, Prometheus, Alertmanager). Advanced cloud security implementation (IAM roles, Key Management Service, Web Application Firewall). GitOps implementation experience with tools like ArgoCD and Flux. Performance tuning skills for infrastructure and containerized environments. Advanced observability practices covering metrics, logs, and distributed tracing.
Soft Skills: Cross-functional communication excellence with ability to lead technical discussions. Strong mentorship capabilities for junior and mid-level team members. Advanced strategic thinking and ability to propose innovative solutions. Excellent knowledge transfer skills through documentation and training. Ability to understand and align technical solutions with broader business strategy. Proactive problem-solving approach with focus on continuous improvement. Strong leadership skills in guiding team performance and technical direction. Effective collaboration across development, QA, and business teams. Ability to make complex technical decisions with minimal supervision. Strategic approach to risk management and mitigation.
Additional Preferred Qualifications: Experience with multi-cloud or hybrid-cloud environments. Exposure to incident management and on-call responsibilities. Advanced scripting skills in Groovy, Python, or Go for CI/CD. Experience with infrastructure testing tools like Terratest or Inspec. Advanced cost analysis and cloud cost optimization skills. Contributions to open-source projects or advanced technical certifications.
What We Offer
Professional Growth: Continuous learning opportunities through diverse projects and mentorship from experienced leaders.
Global Exposure: Work with clients from 20+ countries, gaining insights into different markets and business cultures.
Impactful Work: Contribute to projects that make a real difference, with solutions generating over $1B in revenue.
Work-Life Balance: Flexible arrangements that respect personal wellbeing while fostering productivity.
Career Advancement: Clear progression pathways as you develop skills within our growing organization.
Competitive Compensation: Attractive salary packages that recognize your contributions and expertise.

Posted 2 weeks ago

3.0 years

0 Lacs

Delhi

Remote

Role: Senior DevOps Developer (SR1)
Location: Remote
Job Summary: This is a full-time role for a Senior DevOps Developer (SR1). We are seeking an experienced DevOps professional to lead our infrastructure strategy, design resilient systems, and drive continuous improvement in our deployment processes. In this role, you will architect scalable solutions, mentor junior engineers, and ensure the highest standards of reliability and security across our cloud infrastructure. The job location is flexible with preference for the Delhi NCR region.
Responsibilities: Lead comprehensive improvements to CI/CD systems and deployment pipelines. Design and implement resilient, secure, and scalable infrastructure solutions. Proactively identify and resolve infrastructure bottlenecks and performance challenges. Own deployment health, managing Service Level Objectives (SLOs) and Service Level Agreements (SLAs). Conduct thorough infrastructure audits and optimize cost-efficiency. Develop and maintain high availability and robust rollback strategies. Collaborate closely with Development and QA teams to streamline release automation. Mentor Mid-Level and Junior DevOps Engineers, fostering skill development and best practices. Provide technical leadership and guidance in architectural decisions. Lead complex project components with minimal supervision. Develop risk mitigation strategies for infrastructure and deployment challenges. Propose innovative technological solutions aligned with business goals.
Requirements
Technical Skills: Bachelor's or Master's degree in Computer Science, Engineering, or related field. 3-5 years of professional DevOps experience with demonstrated progression. Advanced Linux administration and shell scripting expertise. Comprehensive Git workflow knowledge, including advanced branching and collaboration strategies. Deep Kubernetes knowledge including Helm, StatefulSets, Horizontal Pod Autoscalers, and Network Policies. Advanced Terraform skills with module development, remote backend, and workspace management. Extensive experience with AWS services (EC2, S3, IAM, VPC, CloudWatch). Advanced Docker and Kubernetes container optimization and deployment strategies. Expertise in writing and maintaining complex CI/CD pipelines using Jenkins, GitHub Actions. Advanced secrets management using AWS SSM, HashiCorp Vault. Comprehensive logging and alerting system setup (ELK stack, Prometheus, Alertmanager). Advanced cloud security implementation (IAM roles, Key Management Service, Web Application Firewall). GitOps implementation experience with tools like ArgoCD and Flux. Performance tuning skills for infrastructure and containerized environments. Advanced observability practices covering metrics, logs, and distributed tracing.
Soft Skills: Cross-functional communication excellence with ability to lead technical discussions. Strong mentorship capabilities for junior and mid-level team members. Advanced strategic thinking and ability to propose innovative solutions. Excellent knowledge transfer skills through documentation and training. Ability to understand and align technical solutions with broader business strategy. Proactive problem-solving approach with focus on continuous improvement. Strong leadership skills in guiding team performance and technical direction. Effective collaboration across development, QA, and business teams. Ability to make complex technical decisions with minimal supervision. Strategic approach to risk management and mitigation.
Additional Preferred Qualifications: Experience with multi-cloud or hybrid-cloud environments. Exposure to incident management and on-call responsibilities. Advanced scripting skills in Groovy, Python, or Go for CI/CD. Experience with infrastructure testing tools like Terratest or Inspec. Advanced cost analysis and cloud cost optimization skills. Contributions to open-source projects or advanced technical certifications.
What We Offer
Professional Growth: Continuous learning opportunities through diverse projects and mentorship from experienced leaders.
Global Exposure: Work with clients from 20+ countries, gaining insights into different markets and business cultures.
Impactful Work: Contribute to projects that make a real difference, with solutions generating over $1B in revenue.
Work-Life Balance: Flexible arrangements that respect personal wellbeing while fostering productivity.
Career Advancement: Clear progression pathways as you develop skills within our growing organization.
Competitive Compensation: Attractive salary packages that recognize your contributions and expertise.

Posted 2 weeks ago

10.0 years

0 Lacs

Delhi, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.
About The Role: The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation.
What You’ll Need
Required Skills and Qualifications: 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, DataDog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes); Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash); Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK); Demonstrated experience architecting monitoring solutions at scale across hybrid environments; Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering; Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities; Expertise in log aggregation, metrics and KPIs collection, and distributed tracing implementations; Experience designing and implementing automated remediation systems; Strong understanding of Infrastructure as Code and GitOps principles; Proven ability to mentor junior engineers and provide technical leadership.
Shift timings: 12 PM - 9 PM IST
What You'll Do
Technical Leadership: Architect and implement enterprise-wide monitoring and observability solutions; Establish monitoring standards, best practices, and governance frameworks; Lead the evaluation and adoption of new monitoring technologies and approaches; Design scalable, resilient monitoring Infrastructure as Code; Serve as the technical escalation point for complex monitoring issues.
Reliability Engineering: Lead the implementation of SRE practices across the organization; Partner with service owners to define appropriate SLOs and error budgets; Drive reliability improvements through data-driven analysis and recommendations; Design and implement advanced alerting strategies; Develop comprehensive observability strategies covering metrics, logs, and traces.
Incident Management: Lead major incident response for critical service disruptions; Conduct thorough post-incident reviews and drive systematic improvements; Establish incident management processes and tooling improvements; Mentor team members on effective incident response techniques; Analyze incident patterns to identify and address systemic issues.
Strategic Initiatives: Develop the monitoring and observability roadmap aligned with business objectives; Lead monitoring platform migrations and major upgrades; Implement cost optimization strategies for monitoring infrastructure; Drive automation initiatives to reduce toil and improve operational efficiency; Collaborate with security teams to integrate security monitoring capabilities.
Team Development: Mentor junior engineers on monitoring best practices and SRE principles; Provide technical guidance and code reviews for monitoring implementations; Create documentation and knowledge-sharing materials for the broader organization; Contribute to hiring and team development activities; Foster a culture of continuous improvement and learning.
Bonus Points: Advanced certifications in cloud platforms or SRE practices; Experience leading incident response for complex, high-impact service disruptions; Experience with AIOps and ML-based monitoring approaches; Background in performance engineering or capacity management; Experience with chaos engineering and resilience testing; Bachelor's or Master's degree in Computer Science, Engineering, or related field.
Benefits Of Working At CrowdStrike: Remote-friendly and flexible work culture; Market leader in compensation and equity awards; Comprehensive physical and mental wellness programs; Competitive vacation and holidays for recharge; Paid parental and adoption leaves; Professional development opportunities for all employees regardless of level or role; Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections; Vibrant office culture with world class amenities; Great Place to Work Certified™ across the globe.
CrowdStrike is proud to be an equal opportunity employer.
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

10.0 years

0 Lacs

Odisha, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About The Role

The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation.

What You’ll Need

Required Skills and Qualifications
- 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, Datadog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes)
- Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash)
- Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK)
- Demonstrated experience architecting monitoring solutions at scale across hybrid environments
- Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering
- Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities
- Expertise in log aggregation, metric and KPI collection, and distributed tracing implementations
- Experience designing and implementing automated remediation systems
- Strong understanding of Infrastructure as Code and GitOps principles
- Proven ability to mentor junior engineers and provide technical leadership
- Shift timings: 12 PM to 9 PM IST

What You’ll Do

Technical Leadership
- Architect and implement enterprise-wide monitoring and observability solutions
- Establish monitoring standards, best practices, and governance frameworks
- Lead the evaluation and adoption of new monitoring technologies and approaches
- Design scalable, resilient monitoring Infrastructure as Code
- Serve as the technical escalation point for complex monitoring issues

Reliability Engineering
- Lead the implementation of SRE practices across the organization
- Partner with service owners to define appropriate SLOs and error budgets
- Drive reliability improvements through data-driven analysis and recommendations
- Design and implement advanced alerting strategies
- Develop comprehensive observability strategies covering metrics, logs, and traces

Incident Management
- Lead major incident response for critical service disruptions
- Conduct thorough post-incident reviews and drive systematic improvements
- Establish incident management processes and tooling improvements
- Mentor team members on effective incident response techniques
- Analyze incident patterns to identify and address systemic issues

Strategic Initiatives
- Develop the monitoring and observability roadmap aligned with business objectives
- Lead monitoring platform migrations and major upgrades
- Implement cost optimization strategies for monitoring infrastructure
- Drive automation initiatives to reduce toil and improve operational efficiency
- Collaborate with security teams to integrate security monitoring capabilities

Team Development
- Mentor junior engineers on monitoring best practices and SRE principles
- Provide technical guidance and code reviews for monitoring implementations
- Create documentation and knowledge-sharing materials for the broader organization
- Contribute to hiring and team development activities
- Foster a culture of continuous improvement and learning

Bonus Points
- Advanced certifications in cloud platforms or SRE practices
- Experience leading incident response for complex, high-impact service disruptions
- Experience with AIOps and ML-based monitoring approaches
- Background in performance engineering or capacity management
- Experience with chaos engineering and resilience testing
- Bachelor's or Master's degree in Computer Science, Engineering, or related field

Benefits Of Working At CrowdStrike
- Remote-friendly and flexible work culture
- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world-class amenities
- Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer.
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

10.0 years

0 Lacs

Chhattisgarh, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About The Role

The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation.

What You’ll Need

Required Skills and Qualifications
- 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, Datadog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes)
- Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash)
- Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK)
- Demonstrated experience architecting monitoring solutions at scale across hybrid environments
- Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering
- Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities
- Expertise in log aggregation, metric and KPI collection, and distributed tracing implementations
- Experience designing and implementing automated remediation systems
- Strong understanding of Infrastructure as Code and GitOps principles
- Proven ability to mentor junior engineers and provide technical leadership
- Shift timings: 12 PM to 9 PM IST

What You’ll Do

Technical Leadership
- Architect and implement enterprise-wide monitoring and observability solutions
- Establish monitoring standards, best practices, and governance frameworks
- Lead the evaluation and adoption of new monitoring technologies and approaches
- Design scalable, resilient monitoring Infrastructure as Code
- Serve as the technical escalation point for complex monitoring issues

Reliability Engineering
- Lead the implementation of SRE practices across the organization
- Partner with service owners to define appropriate SLOs and error budgets
- Drive reliability improvements through data-driven analysis and recommendations
- Design and implement advanced alerting strategies
- Develop comprehensive observability strategies covering metrics, logs, and traces

Incident Management
- Lead major incident response for critical service disruptions
- Conduct thorough post-incident reviews and drive systematic improvements
- Establish incident management processes and tooling improvements
- Mentor team members on effective incident response techniques
- Analyze incident patterns to identify and address systemic issues

Strategic Initiatives
- Develop the monitoring and observability roadmap aligned with business objectives
- Lead monitoring platform migrations and major upgrades
- Implement cost optimization strategies for monitoring infrastructure
- Drive automation initiatives to reduce toil and improve operational efficiency
- Collaborate with security teams to integrate security monitoring capabilities

Team Development
- Mentor junior engineers on monitoring best practices and SRE principles
- Provide technical guidance and code reviews for monitoring implementations
- Create documentation and knowledge-sharing materials for the broader organization
- Contribute to hiring and team development activities
- Foster a culture of continuous improvement and learning

Bonus Points
- Advanced certifications in cloud platforms or SRE practices
- Experience leading incident response for complex, high-impact service disruptions
- Experience with AIOps and ML-based monitoring approaches
- Background in performance engineering or capacity management
- Experience with chaos engineering and resilience testing
- Bachelor's or Master's degree in Computer Science, Engineering, or related field

Benefits Of Working At CrowdStrike
- Remote-friendly and flexible work culture
- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world-class amenities
- Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer.
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

10.0 years

0 Lacs

Andhra Pradesh, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About The Role

The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation.

What You’ll Need

Required Skills and Qualifications
- 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, Datadog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes)
- Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash)
- Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK)
- Demonstrated experience architecting monitoring solutions at scale across hybrid environments
- Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering
- Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities
- Expertise in log aggregation, metric and KPI collection, and distributed tracing implementations
- Experience designing and implementing automated remediation systems
- Strong understanding of Infrastructure as Code and GitOps principles
- Proven ability to mentor junior engineers and provide technical leadership
- Shift timings: 12 PM to 9 PM IST

What You’ll Do

Technical Leadership
- Architect and implement enterprise-wide monitoring and observability solutions
- Establish monitoring standards, best practices, and governance frameworks
- Lead the evaluation and adoption of new monitoring technologies and approaches
- Design scalable, resilient monitoring Infrastructure as Code
- Serve as the technical escalation point for complex monitoring issues

Reliability Engineering
- Lead the implementation of SRE practices across the organization
- Partner with service owners to define appropriate SLOs and error budgets
- Drive reliability improvements through data-driven analysis and recommendations
- Design and implement advanced alerting strategies
- Develop comprehensive observability strategies covering metrics, logs, and traces

Incident Management
- Lead major incident response for critical service disruptions
- Conduct thorough post-incident reviews and drive systematic improvements
- Establish incident management processes and tooling improvements
- Mentor team members on effective incident response techniques
- Analyze incident patterns to identify and address systemic issues

Strategic Initiatives
- Develop the monitoring and observability roadmap aligned with business objectives
- Lead monitoring platform migrations and major upgrades
- Implement cost optimization strategies for monitoring infrastructure
- Drive automation initiatives to reduce toil and improve operational efficiency
- Collaborate with security teams to integrate security monitoring capabilities

Team Development
- Mentor junior engineers on monitoring best practices and SRE principles
- Provide technical guidance and code reviews for monitoring implementations
- Create documentation and knowledge-sharing materials for the broader organization
- Contribute to hiring and team development activities
- Foster a culture of continuous improvement and learning

Bonus Points
- Advanced certifications in cloud platforms or SRE practices
- Experience leading incident response for complex, high-impact service disruptions
- Experience with AIOps and ML-based monitoring approaches
- Background in performance engineering or capacity management
- Experience with chaos engineering and resilience testing
- Bachelor's or Master's degree in Computer Science, Engineering, or related field

Benefits Of Working At CrowdStrike
- Remote-friendly and flexible work culture
- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world-class amenities
- Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer.
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

10.0 years

0 Lacs

Dehradun, Uttarakhand, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About The Role

The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation.

What You’ll Need

Required Skills and Qualifications
- 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, Datadog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes)
- Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash)
- Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK)
- Demonstrated experience architecting monitoring solutions at scale across hybrid environments
- Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering
- Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities
- Expertise in log aggregation, metric and KPI collection, and distributed tracing implementations
- Experience designing and implementing automated remediation systems
- Strong understanding of Infrastructure as Code and GitOps principles
- Proven ability to mentor junior engineers and provide technical leadership
- Shift timings: 12 PM to 9 PM IST

What You’ll Do

Technical Leadership
- Architect and implement enterprise-wide monitoring and observability solutions
- Establish monitoring standards, best practices, and governance frameworks
- Lead the evaluation and adoption of new monitoring technologies and approaches
- Design scalable, resilient monitoring Infrastructure as Code
- Serve as the technical escalation point for complex monitoring issues

Reliability Engineering
- Lead the implementation of SRE practices across the organization
- Partner with service owners to define appropriate SLOs and error budgets
- Drive reliability improvements through data-driven analysis and recommendations
- Design and implement advanced alerting strategies
- Develop comprehensive observability strategies covering metrics, logs, and traces

Incident Management
- Lead major incident response for critical service disruptions
- Conduct thorough post-incident reviews and drive systematic improvements
- Establish incident management processes and tooling improvements
- Mentor team members on effective incident response techniques
- Analyze incident patterns to identify and address systemic issues

Strategic Initiatives
- Develop the monitoring and observability roadmap aligned with business objectives
- Lead monitoring platform migrations and major upgrades
- Implement cost optimization strategies for monitoring infrastructure
- Drive automation initiatives to reduce toil and improve operational efficiency
- Collaborate with security teams to integrate security monitoring capabilities

Team Development
- Mentor junior engineers on monitoring best practices and SRE principles
- Provide technical guidance and code reviews for monitoring implementations
- Create documentation and knowledge-sharing materials for the broader organization
- Contribute to hiring and team development activities
- Foster a culture of continuous improvement and learning

Bonus Points
- Advanced certifications in cloud platforms or SRE practices
- Experience leading incident response for complex, high-impact service disruptions
- Experience with AIOps and ML-based monitoring approaches
- Background in performance engineering or capacity management
- Experience with chaos engineering and resilience testing
- Bachelor's or Master's degree in Computer Science, Engineering, or related field

Benefits Of Working At CrowdStrike
- Remote-friendly and flexible work culture
- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world-class amenities
- Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer.
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

10.0 years

0 Lacs

Kerala, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About The Role

The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation.

What You’ll Need

Required Skills and Qualifications
- 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, Datadog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes)
- Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash)
- Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK)
- Demonstrated experience architecting monitoring solutions at scale across hybrid environments
- Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering
- Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities
- Expertise in log aggregation, metric and KPI collection, and distributed tracing implementations
- Experience designing and implementing automated remediation systems
- Strong understanding of Infrastructure as Code and GitOps principles
- Proven ability to mentor junior engineers and provide technical leadership
- Shift timings: 12 PM to 9 PM IST

What You’ll Do

Technical Leadership
- Architect and implement enterprise-wide monitoring and observability solutions
- Establish monitoring standards, best practices, and governance frameworks
- Lead the evaluation and adoption of new monitoring technologies and approaches
- Design scalable, resilient monitoring Infrastructure as Code
- Serve as the technical escalation point for complex monitoring issues

Reliability Engineering
- Lead the implementation of SRE practices across the organization
- Partner with service owners to define appropriate SLOs and error budgets
- Drive reliability improvements through data-driven analysis and recommendations
- Design and implement advanced alerting strategies
- Develop comprehensive observability strategies covering metrics, logs, and traces

Incident Management
- Lead major incident response for critical service disruptions
- Conduct thorough post-incident reviews and drive systematic improvements
- Establish incident management processes and tooling improvements
- Mentor team members on effective incident response techniques
- Analyze incident patterns to identify and address systemic issues

Strategic Initiatives
- Develop the monitoring and observability roadmap aligned with business objectives
- Lead monitoring platform migrations and major upgrades
- Implement cost optimization strategies for monitoring infrastructure
- Drive automation initiatives to reduce toil and improve operational efficiency
- Collaborate with security teams to integrate security monitoring capabilities

Team Development
- Mentor junior engineers on monitoring best practices and SRE principles
- Provide technical guidance and code reviews for monitoring implementations
- Create documentation and knowledge-sharing materials for the broader organization
- Contribute to hiring and team development activities
- Foster a culture of continuous improvement and learning

Bonus Points
- Advanced certifications in cloud platforms or SRE practices
- Experience leading incident response for complex, high-impact service disruptions
- Experience with AIOps and ML-based monitoring approaches
- Background in performance engineering or capacity management
- Experience with chaos engineering and resilience testing
- Bachelor's or Master's degree in Computer Science, Engineering, or related field

Benefits Of Working At CrowdStrike
- Remote-friendly and flexible work culture
- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world-class amenities
- Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer.
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

Apply

10.0 years

0 Lacs

Bihar, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you. About The Role The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation. 
What You’ll Need Required Skills and Qualifications 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, DataDog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes) Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash) Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK) Demonstrated experience architecting monitoring solutions at scale across hybrid environments Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities Expertise in log aggregation, metrics and KPIs collection, and distributed tracing implementations Experience designing and implementing automated remediation systems Strong understanding of Infrastructure as Code and GitOps principles Proven ability to mentor junior engineers and provide technical leadership Shift timings- 12PM -9PM IST What You'll Do Technical Leadership Architect and implement enterprise-wide monitoring and observability solutions Establish monitoring standards, best practices, and governance frameworks Lead the evaluation and adoption of new monitoring technologies and approaches Design scalable, resilient monitoring Infrastructure as Code Serve as the technical escalation point for complex monitoring issues Reliability Engineering Lead the implementation of SRE practices across the organization Partner with service owners to define appropriate SLOs and error budgets Drive reliability improvements through data-driven analysis and recommendations Design and implement advanced alerting strategies Develop comprehensive observability strategies covering metrics, logs, and traces Incident Management Lead major incident response for critical service disruptions Conduct thorough post-incident reviews and drive systematic improvements Establish incident management processes and tooling 
improvements Mentor team members on effective incident response techniques Analyze incident patterns to identify and address systemic issues Strategic Initiatives Develop the monitoring and observability roadmap aligned with business objectives Lead monitoring platform migrations and major upgrades Implement cost optimization strategies for monitoring infrastructure Drive automation initiatives to reduce toil and improve operational efficiency Collaborate with security teams to integrate security monitoring capabilities Team Development Mentor junior engineers on monitoring best practices and SRE principles Provide technical guidance and code reviews for monitoring implementations Create documentation and knowledge-sharing materials for the broader organization Contribute to hiring and team development activities Foster a culture of continuous improvement and learning Bonus Points Advanced certifications in cloud platforms or SRE practices Experience leading incident response for complex, high-impact service disruptions Experience with AIOps and ML-based monitoring approaches Background in performance engineering or capacity management Experience with chaos engineering and resilience testing Bachelor's or Master's degree in Computer Science, Engineering, or related field Benefits Of Working At CrowdStrike Remote-friendly and flexible work culture Market leader in compensation and equity awards Comprehensive physical and mental wellness programs Competitive vacation and holidays for recharge Paid parental and adoption leaves Professional development opportunities for all employees regardless of level or role Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections Vibrant office culture with world class amenities Great Place to Work Certified™ across the globe CrowdStrike is proud to be an equal opportunity employer. 
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

Apply

10.0 years

0 Lacs

Pune, Maharashtra, India

On-site

🚀 We're Hiring: Solution Architect – Pune (Hybrid) 🌐 Are you a tech visionary with deep expertise in cloud platforms, FinOps, IaC, and architecture design? We're looking for a Solution Architect who can bridge business goals with scalable cloud solutions. 🔍 Role Highlights: Own operations & support of the IBM Turbonomic tool Collaborate with DevOps teams on IaC, GitOps & automation Architect modern cloud-native systems & drive app modernization Ensure optimal system performance and cloud cost optimization Align software and hardware strategies for performance gains 🧩 You Bring: 10+ years of experience (2–3 yrs in cloud platforms – GCP preferred) Expertise in IaC, GitOps, FinOps, cloud cost tools, and security Strong grasp of cloud architectures, networking, and compliance Hands-on with scripting, automation & DevOps tools 💼 Nice to Have: Enterprise architecture or application development background 🤝 Soft Skills That Set You Apart: Strong analytical & problem-solving mindset Clear communicator across technical and business teams Skilled in translating business needs into cloud solutions Experience leading cloud transformation initiatives 📍 Location: Pune 🏠 Work Mode: Hybrid (Flexibility with collaboration) Sound like you? Let’s connect. 📩 DM us or apply at moumita.po@peoply.com today to be a part of our cloud-first journey! #SolutionArchitect #PuneJobs #HybridWork #CloudArchitecture #FinOps #IaC #GitOps #Turbonomic #NowHiring #TechLeadership #CloudCareers #DevOpsJobs

Posted 2 weeks ago

Apply

10.0 years

0 Lacs

Punjab, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you. About The Role The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation. 
What You’ll Need Required Skills and Qualifications 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, DataDog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes) Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash) Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK) Demonstrated experience architecting monitoring solutions at scale across hybrid environments Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities Expertise in log aggregation, metrics and KPIs collection, and distributed tracing implementations Experience designing and implementing automated remediation systems Strong understanding of Infrastructure as Code and GitOps principles Proven ability to mentor junior engineers and provide technical leadership Shift timings- 12PM -9PM IST What You'll Do Technical Leadership Architect and implement enterprise-wide monitoring and observability solutions Establish monitoring standards, best practices, and governance frameworks Lead the evaluation and adoption of new monitoring technologies and approaches Design scalable, resilient monitoring Infrastructure as Code Serve as the technical escalation point for complex monitoring issues Reliability Engineering Lead the implementation of SRE practices across the organization Partner with service owners to define appropriate SLOs and error budgets Drive reliability improvements through data-driven analysis and recommendations Design and implement advanced alerting strategies Develop comprehensive observability strategies covering metrics, logs, and traces Incident Management Lead major incident response for critical service disruptions Conduct thorough post-incident reviews and drive systematic improvements Establish incident management processes and tooling 
improvements Mentor team members on effective incident response techniques Analyze incident patterns to identify and address systemic issues Strategic Initiatives Develop the monitoring and observability roadmap aligned with business objectives Lead monitoring platform migrations and major upgrades Implement cost optimization strategies for monitoring infrastructure Drive automation initiatives to reduce toil and improve operational efficiency Collaborate with security teams to integrate security monitoring capabilities Team Development Mentor junior engineers on monitoring best practices and SRE principles Provide technical guidance and code reviews for monitoring implementations Create documentation and knowledge-sharing materials for the broader organization Contribute to hiring and team development activities Foster a culture of continuous improvement and learning Bonus Points Advanced certifications in cloud platforms or SRE practices Experience leading incident response for complex, high-impact service disruptions Experience with AIOps and ML-based monitoring approaches Background in performance engineering or capacity management Experience with chaos engineering and resilience testing Bachelor's or Master's degree in Computer Science, Engineering, or related field Benefits Of Working At CrowdStrike Remote-friendly and flexible work culture Market leader in compensation and equity awards Comprehensive physical and mental wellness programs Competitive vacation and holidays for recharge Paid parental and adoption leaves Professional development opportunities for all employees regardless of level or role Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections Vibrant office culture with world class amenities Great Place to Work Certified™ across the globe CrowdStrike is proud to be an equal opportunity employer. 
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

Apply

10.0 years

0 Lacs

Himachal Pradesh, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you. About The Role The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation. 
What You’ll Need Required Skills and Qualifications 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, DataDog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes) Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash) Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK) Demonstrated experience architecting monitoring solutions at scale across hybrid environments Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities Expertise in log aggregation, metrics and KPIs collection, and distributed tracing implementations Experience designing and implementing automated remediation systems Strong understanding of Infrastructure as Code and GitOps principles Proven ability to mentor junior engineers and provide technical leadership Shift timings- 12PM -9PM IST What You'll Do Technical Leadership Architect and implement enterprise-wide monitoring and observability solutions Establish monitoring standards, best practices, and governance frameworks Lead the evaluation and adoption of new monitoring technologies and approaches Design scalable, resilient monitoring Infrastructure as Code Serve as the technical escalation point for complex monitoring issues Reliability Engineering Lead the implementation of SRE practices across the organization Partner with service owners to define appropriate SLOs and error budgets Drive reliability improvements through data-driven analysis and recommendations Design and implement advanced alerting strategies Develop comprehensive observability strategies covering metrics, logs, and traces Incident Management Lead major incident response for critical service disruptions Conduct thorough post-incident reviews and drive systematic improvements Establish incident management processes and tooling 
improvements Mentor team members on effective incident response techniques Analyze incident patterns to identify and address systemic issues Strategic Initiatives Develop the monitoring and observability roadmap aligned with business objectives Lead monitoring platform migrations and major upgrades Implement cost optimization strategies for monitoring infrastructure Drive automation initiatives to reduce toil and improve operational efficiency Collaborate with security teams to integrate security monitoring capabilities Team Development Mentor junior engineers on monitoring best practices and SRE principles Provide technical guidance and code reviews for monitoring implementations Create documentation and knowledge-sharing materials for the broader organization Contribute to hiring and team development activities Foster a culture of continuous improvement and learning Bonus Points Advanced certifications in cloud platforms or SRE practices Experience leading incident response for complex, high-impact service disruptions Experience with AIOps and ML-based monitoring approaches Background in performance engineering or capacity management Experience with chaos engineering and resilience testing Bachelor's or Master's degree in Computer Science, Engineering, or related field Benefits Of Working At CrowdStrike Remote-friendly and flexible work culture Market leader in compensation and equity awards Comprehensive physical and mental wellness programs Competitive vacation and holidays for recharge Paid parental and adoption leaves Professional development opportunities for all employees regardless of level or role Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections Vibrant office culture with world class amenities Great Place to Work Certified™ across the globe CrowdStrike is proud to be an equal opportunity employer. 
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

Apply

10.0 years

0 Lacs

Madhya Pradesh, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you. About The Role The CrowdStrike Information Technology team is looking for a Staff IT Monitoring Engineer/Site Reliability Engineer (SRE) to lead the design, implementation, and evolution of our enterprise monitoring and observability platforms. In this leadership role, you will architect scalable monitoring solutions, drive reliability initiatives, and serve as a technical authority for monitoring best practices. You will mentor junior team members, collaborate with cross-functional teams to establish SLOs, and play a key role in major incident management. This position requires advanced technical expertise, strategic thinking, and the ability to balance operational excellence with innovation. 
What You’ll Need Required Skills and Qualifications 10+ years of experience with enterprise monitoring platforms and observability tools (LogicMonitor, DataDog, LogScale, Zscaler Digital Experience (ZDX), ThousandEyes) Advanced proficiency in multiple scripting/programming languages (Python, Go, Bash) Expert knowledge of modern monitoring ecosystems (Prometheus, Grafana, ELK) Demonstrated experience architecting monitoring solutions at scale across hybrid environments Strong background in SRE practices, including SLO definition, error budgets, and reliability engineering Advanced knowledge of cloud platforms (AWS, GCP) and their native monitoring capabilities Expertise in log aggregation, metrics and KPIs collection, and distributed tracing implementations Experience designing and implementing automated remediation systems Strong understanding of Infrastructure as Code and GitOps principles Proven ability to mentor junior engineers and provide technical leadership Shift timings- 12PM -9PM IST What You'll Do Technical Leadership Architect and implement enterprise-wide monitoring and observability solutions Establish monitoring standards, best practices, and governance frameworks Lead the evaluation and adoption of new monitoring technologies and approaches Design scalable, resilient monitoring Infrastructure as Code Serve as the technical escalation point for complex monitoring issues Reliability Engineering Lead the implementation of SRE practices across the organization Partner with service owners to define appropriate SLOs and error budgets Drive reliability improvements through data-driven analysis and recommendations Design and implement advanced alerting strategies Develop comprehensive observability strategies covering metrics, logs, and traces Incident Management Lead major incident response for critical service disruptions Conduct thorough post-incident reviews and drive systematic improvements Establish incident management processes and tooling 
improvements Mentor team members on effective incident response techniques Analyze incident patterns to identify and address systemic issues Strategic Initiatives Develop the monitoring and observability roadmap aligned with business objectives Lead monitoring platform migrations and major upgrades Implement cost optimization strategies for monitoring infrastructure Drive automation initiatives to reduce toil and improve operational efficiency Collaborate with security teams to integrate security monitoring capabilities Team Development Mentor junior engineers on monitoring best practices and SRE principles Provide technical guidance and code reviews for monitoring implementations Create documentation and knowledge-sharing materials for the broader organization Contribute to hiring and team development activities Foster a culture of continuous improvement and learning Bonus Points Advanced certifications in cloud platforms or SRE practices Experience leading incident response for complex, high-impact service disruptions Experience with AIOps and ML-based monitoring approaches Background in performance engineering or capacity management Experience with chaos engineering and resilience testing Bachelor's or Master's degree in Computer Science, Engineering, or related field Benefits Of Working At CrowdStrike Remote-friendly and flexible work culture Market leader in compensation and equity awards Comprehensive physical and mental wellness programs Competitive vacation and holidays for recharge Paid parental and adoption leaves Professional development opportunities for all employees regardless of level or role Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections Vibrant office culture with world class amenities Great Place to Work Certified™ across the globe CrowdStrike is proud to be an equal opportunity employer. 
We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 2 weeks ago

Apply

5.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Job Description Role: DevOps Engineer About The Role We are looking for an experienced DevOps Engineer to join our Development Operations Group. Development Operations is responsible for our DevOps Platform hosted on Microsoft Azure, providing Self Service & Operational excellence for Client's Azure cloud-based applications. Responsibilities include managing all aspects of the DevOps Platform (Kubernetes, Cloud, Networking, Monitoring & CI/CD). Successful candidates will engage with, troubleshoot, design solutions for, and improve all aspects of the cloud journey. This role requires a 'can do' attitude and a passion for DevOps. Suitable candidates must thrive on the challenges of working in a fast-paced environment and help us release outstanding software. Within this role, your responsibilities will include the building and running of large-scale, massively distributed, fault-tolerant systems. Qualifications A bachelor's or master's degree in IT (preferably computer science) 5+ years of experience in software development 2+ years of experience providing automation within end-to-end deployment and operations of Kubernetes. OpenShift is an advantage Experience Experience designing, deploying and managing orchestration of microservices using Kubernetes. 
OpenShift is an advantage Experience with implementing end to end monitoring, alerting & tracing on Kubernetes Implementing and Delivering robust Infrastructure as code Managing Desired State Configuration of Kubernetes & Cloud Resources Day 1 and Day 2 Operations of Cloud PaaS Services (Oracle, PostgreSQL, Elastic Cloud, Kafka) Implementation of Software Defined Networks using Kubernetes & Cloud Native Networking Implementation of Key Technical Practices to enable Continuous Delivery for Client's Cloud based Financial Services Primary Responsibilities Service Ownership from Development to Production of Platform Core Services including GitOps, Network, Container Management, Monitoring, Pipelines & Service Catalog on Kubernetes Skills Programming Skills (Java, JavaScript, Python, Bash) Kubernetes/OpenShift & Helm Terraform, Ansible and/or Puppet Azure DevOps Prometheus, Grafana, Loki Azure Cloud Networking (Application Gateway, Global Traffic Manager, Front Door) Fluent English skills

Posted 2 weeks ago

Apply

7.0 - 10.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

DevOps Engineer (OCP) FSS is seeking a highly skilled DevOps Engineer with hands-on experience in Red Hat OpenShift Container Platform (OCP) and associated tools like Argo CD, Jenkins, and Data Grid. The ideal candidate will drive automation, manage containerized environments, and ensure smooth CI/CD pipelines across hybrid infrastructure to support our financial technology solutions. Experience: 7-10 years CTC: 20-30 LPA Location: Chennai/Mumbai Key Responsibilities: OpenShift Platform Engineering: Deploy, manage, and maintain applications on OpenShift Container Platform. Configure and manage Operators, Helm charts, and OpenShift GitOps (Argo CD). Manage Red Hat Data Grid deployments and integrations. Support OCP cluster upgrades, patching, and troubleshooting. CI/CD Implementation & Automation: Design, implement, and manage CI/CD pipelines using Jenkins and Argo CD. Ensure seamless code integration, testing, and deployment processes with development teams. Infrastructure as Code (IaC): Automate infrastructure provisioning with tools like Terraform and Ansible. Manage hybrid infrastructure across on-prem and public clouds (AWS, Azure, or GCP). Monitoring & Performance Optimization: Implement and manage observability stacks (Prometheus, Grafana, ELK, etc.) for OCP and underlying services. Proactively identify and resolve system performance bottlenecks. Security & Compliance: Enforce security best practices in containerized and cloud environments. Conduct vulnerability assessments and ensure compliance with industry standards. Collaboration & Support: Collaborate with developers, QA, and IT teams to optimize DevOps workflows. Provide ongoing support and incident response for production and non-production environments. Required Skills & Qualifications: Technical Skills: Strong hands-on experience with OpenShift (v4.x) administration and operations. Proficiency in CI/CD tools: Jenkins, Argo CD, GitHub Actions, GitLab CI/CD. 
Deep understanding of Kubernetes, Docker, and container orchestration. Experience with Red Hat Data Grid or other in-memory data grids. Skilled in IaC tools: Terraform, Ansible, CloudFormation. Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK, Splunk). Proficient in scripting languages: Bash, Python, or Shell.

Posted 2 weeks ago

Apply

0.0 - 5.0 years

0 Lacs

Delhi, Delhi

Remote

Role : Senior DevOps Developer (SR1) Location : Remote Job Summary : This is a full-time role for a Senior DevOps Developer (SR1) . We are seeking an experienced DevOps professional to lead our infrastructure strategy, design resilient systems, and drive continuous improvement in our deployment processes. In this role, you will architect scalable solutions, mentor junior engineers, and ensure the highest standards of reliability and security across our cloud infrastructure. The job location is flexible with preference for the Delhi NCR region. Responsibilities Lead comprehensive improvements to CI/CD systems and deployment pipelines. Design and implement resilient, secure, and scalable infrastructure solutions. Proactively identify and resolve infrastructure bottlenecks and performance challenges. Own deployment health, managing Service Level Objectives (SLOs) and Service Level Agreements (SLAs). Conduct thorough infrastructure audits and optimize cost-efficiency. Develop and maintain high availability and robust rollback strategies. Collaborate closely with Development and QA teams to streamline release automation. Mentor Mid-Level and Junior DevOps Engineers, fostering skill development and best practices. Provide technical leadership and guidance in architectural decisions. Lead complex project components with minimal supervision. Develop risk mitigation strategies for infrastructure and deployment challenges. Propose innovative technological solutions aligned with business goals. Requirements Technical Skills Bachelor's or Master's degree in Computer Science, Engineering, or related field. 3-5 years of professional DevOps experience with demonstrated progression. Advanced Linux administration and shell scripting expertise. Comprehensive Git workflow knowledge, including advanced branching and collaboration strategies. Deep Kubernetes knowledge including Helm, StatefulSets, Horizontal Pod Autoscalers, and Network Policies. 
Advanced Terraform skills with module development, remote backend, and workspace management. Extensive experience with AWS services (EC2, S3, IAM, VPC, CloudWatch). Advanced Docker and Kubernetes container optimization and deployment strategies. Expertise in writing and maintaining complex CI/CD pipelines using Jenkins, GitHub Actions. Advanced secrets management using AWS SSM, HashiCorp Vault. Comprehensive logging and alerting system setup (ELK stack, Prometheus, Alertmanager). Advanced cloud security implementation (IAM roles, Key Management Service, Web Application Firewall). GitOps implementation experience with tools like ArgoCD and Flux. Performance tuning skills for infrastructure and containerized environments. Advanced observability practices covering metrics, logs, and distributed tracing. Soft Skills Cross-functional communication excellence with ability to lead technical discussions. Strong mentorship capabilities for junior and mid-level team members. Advanced strategic thinking and ability to propose innovative solutions. Excellent knowledge transfer skills through documentation and training. Ability to understand and align technical solutions with broader business strategy. Proactive problem-solving approach with focus on continuous improvement. Strong leadership skills in guiding team performance and technical direction. Effective collaboration across development, QA, and business teams. Ability to make complex technical decisions with minimal supervision. Strategic approach to risk management and mitigation. Additional Preferred Qualifications Experience with multi-cloud or hybrid-cloud environments. Exposure to incident management and on-call responsibilities. Advanced scripting skills in Groovy, Python, or Go for CI/CD. Experience with infrastructure testing tools like Terratest or Inspec. Advanced cost analysis and cloud cost optimization skills. Contributions to open-source projects or advanced technical certifications. 
What We Offer Professional Growth : Continuous learning opportunities through diverse projects and mentorship from experienced leaders Global Exposure : Work with clients from 20+ countries, gaining insights into different markets and business cultures Impactful Work : Contribute to projects that make a real difference, with solutions generating over $1B in revenue Work-Life Balance : Flexible arrangements that respect personal wellbeing while fostering productivity Career Advancement : Clear progression pathways as you develop skills within our growing organization Competitive Compensation : Attractive salary packages that recognize your contributions and expertise

Posted 2 weeks ago

Apply

1.0 - 5.0 years

0 Lacs

chandigarh

On-site

You will be a part of our team as a Junior DevOps Engineer, where you will contribute to building, maintaining, and optimizing our cloud-native infrastructure. Your role will involve collaborating with senior DevOps engineers and development teams to automate deployments, monitor systems, and ensure the high availability, scalability, and security of our applications. Your key responsibilities will include managing and optimizing Kubernetes (EKS) clusters, Docker containers, and Helm charts for deployments. You will support CI/CD pipelines using tools like Jenkins, Bitbucket, and GitHub Actions, and help deploy and manage applications using ArgoCD for GitOps workflows. Monitoring and troubleshooting infrastructure will be an essential part of your role, utilizing tools such as Grafana, Prometheus, Loki, and OpenTelemetry. Working with various AWS services like EKS, ECR, ALB, EC2, VPC, S3, and CloudFront will also be a crucial aspect to ensure reliable cloud infrastructure. Automating infrastructure provisioning using IaC tools like Terraform and Ansible will be another key responsibility. Additionally, you will assist in maintaining Docker image registries and collaborate with developers to enhance observability, logging, and alerting while adhering to security best practices for cloud and containerized environments. To excel in this role, you should have a basic understanding of Kubernetes, Docker, and Helm, along with familiarity with AWS cloud services like EKS, EC2, S3, VPC, and ALB. Exposure to CI/CD tools such as Jenkins, GitHub/Bitbucket pipelines, basic scripting skills (Bash, Python, or Groovy), and knowledge of observability tools like Prometheus, Grafana, and Loki will be beneficial. Understanding GitOps (ArgoCD) and infrastructure as code (IaC), experience with Terraform/CloudFormation, and knowledge of Linux administration and networking are also required skills. This is a full-time position that requires you to work in person. 
If you are interested in this opportunity, please feel free to reach out to us at +91 6284554276.

Posted 2 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

pune, maharashtra

On-site

As a Senior Software Engineer in the Platform Engineering team at Mastercard, you will be instrumental in designing and developing software related to internet traffic engineering technologies. Your primary responsibilities will include working on public and private CDNs, load balancing, DNS, DHCP, and IPAM solutions. You will have the exciting opportunity to contribute to building new platform technologies from the ground up in our high-impact environment. In this role, you will be responsible for developing and maintaining Public and Private REST APIs while upholding high code and quality standards. You will also provide timely and competent support for the technologies owned and built by the team, bridging automation gaps by writing scripts that enhance automation and improve service quality. Your drive and curiosity to continuously learn and teach yourself new skills will be highly valued as you collaborate effectively with cross-functional teams to ensure project success. To excel in this position, you should possess a Bachelor's degree in Computer Science or a related technical field, or equivalent practical experience. Strong fundamentals in internet and intranet traffic engineering, OSI Layers & Protocols, DNS, DHCP, IP address management, and TCP/HTTP processing are essential. Additionally, a practical understanding of data structures, algorithms, and database fundamentals is required. Proficiency in Java, Python, SQL, NoSQL, Kubernetes, PCF, Jenkins, Chef, and related platforms is crucial for this role. Knowledge of cloud-native and multi-tiered applications development, as well as experience with programming around Network services, Domain Nameservers, DHCP, and IPAM solutions, will be advantageous. An understanding of CI/CD pipelines, DevSecOps, GitOps, and related best practices is necessary, along with the ability to write and maintain scripts for automation to drive efficiency and innovation. 
If you are motivated by innovation, have a curious mindset, and are dedicated to enhancing customer experiences through technological advancements, we invite you to join our dynamic team at Mastercard and contribute to shaping the future of our Network team.

Posted 2 weeks ago

Apply
Featured Companies