5.0 years
0 Lacs
Chhattisgarh, India
Remote
As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.
About The Role
Design, implement, and maintain scalable and reliable cloud infrastructure using industry best practices.
What You'll Do
· Collaborate with cross-functional teams to define and implement automation solutions.
· Improve engineer productivity through platform automation.
· Develop and maintain CI/CD pipelines to automate the deployment and testing of IT I&O applications.
· Reduce deployment complexity and time-to-market.
· Design and build internal developer platforms and self-service tools.
· Monitor and optimize the performance, security, and cost-efficiency of our hybrid cloud infrastructure.
· Troubleshoot and resolve issues related to infrastructure and applications.
· Stay up-to-date with the latest trends and technologies in cloud computing and DevOps practices.
· Mentor and guide junior team members to help them grow their skills and knowledge, and participate in architectural reviews and technical decision-making.
What You’ll Need
· Bachelor’s degree in Computer Science, Engineering, or a related field.
· 5+ years of experience working as a Cloud Software Engineer/DevOps Engineer.
· Strong knowledge of public and hybrid cloud platforms such as AWS, GCP, and VMware.
· Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
· Proficiency in programming and scripting languages such as Python, JavaScript, Bash, or PowerShell.
· Internal tooling development experience, including API development and integration experience.
· Foundational knowledge of DevSecOps principles and security best practices in CI/CD pipelines.
· Hands-on experience with cloud-native technologies like Docker and Kubernetes.
· Experience with GitOps workflows (ArgoCD, Flux).
· Service mesh experience (Istio, Linkerd).
· Solid understanding of networking concepts and protocols.
· Familiarity with monitoring and logging tools such as Grafana, Datadog, Splunk, or similar.
· Observability beyond monitoring (distributed tracing).
· Strong problem-solving skills and the ability to work well under pressure.
· Excellent communication and collaboration skills.
Bonus Points
· Certification in AWS or GCP.
· Experience with serverless computing and cloud-native architecture.
· Knowledge of configuration management tools such as Ansible.
· Familiarity with Agile/Scrum methodologies.
Benefits Of Working At CrowdStrike
· Remote-friendly and flexible work culture
· Market leader in compensation and equity awards
· Comprehensive physical and mental wellness programs
· Competitive vacation and holidays for recharge
· Paid parental and adoption leaves
· Professional development opportunities for all employees regardless of level or role
· Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
· Vibrant office culture with world class amenities
· Great Place to Work Certified™ across the globe
CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.
CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.
Posted 2 weeks ago
5.0 years
0 Lacs
Himachal Pradesh, India
Remote
As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.
About The Role
The CrowdStrike Information Technology team is looking for a skilled Sr. IT Monitoring Engineer/Site Reliability Engineer (SRE) to join our IT Operations team. In this role, you will be responsible for designing, implementing, and maintaining monitoring solutions that ensure the reliability, availability, and performance of our critical IT infrastructure and applications. You will work at the intersection of operations and development, applying software engineering principles to operations tasks while focusing on system reliability and automation. This position requires a proactive approach to identifying and resolving issues before they impact business operations, as well as participating in on-call rotations to address incidents when they occur.
What You’ll Need
· 5+ years of experience with enterprise monitoring tools (Prometheus, LogicMonitor, Datadog, ThousandEyes, Zscaler Digital Experience (ZDX))
· Strong proficiency in scripting languages (Python, Bash, PowerShell) for automation
· Experience with log management platforms (ELK stack, Splunk, LogScale)
· Working knowledge of cloud services monitoring (AWS CloudWatch, GCP)
· Experience with application performance monitoring (APM), digital experience monitoring (DEM) and infrastructure monitoring
· Knowledge of SRE principles, SLOs, error budgets, and incident management
· Experience with automated alerting, remediation workflows, and CI/CD pipeline monitoring
· Familiarity with Infrastructure as Code (Terraform, Ansible) and containerization (Docker, Kubernetes)
· Strong incident triage, root cause analysis, and documentation skills
· Experience participating in on-call rotations and emergency response
What You'll Do
Monitoring and Reliability
· Design and maintain comprehensive monitoring solutions across infrastructure and applications
· Configure appropriate alerting thresholds to ensure timely response to potential issues
· Define and track SLOs and error budgets for critical services
· Create and maintain dashboards providing real-time visibility into system health
· Conduct regular reviews of system reliability and recommend improvements
Incident Management and Operations
· Participate in on-call rotation to respond to alerts and incidents
· Lead incident response efforts and conduct thorough post-incident reviews
· Document incidents, resolutions, and lessons learned
· Develop and refine incident response procedures to improve MTTR
· Implement proactive monitoring to detect potential issues before they impact users
Automation and Collaboration
· Develop scripts and automation to streamline monitoring tasks and reduce manual effort
· Create self-healing systems that can automatically remediate common issues
· Integrate monitoring tools with other operational systems
· Work closely with development, infrastructure, and security teams
· Provide guidance on monitoring best practices and observability
· Maintain comprehensive documentation for monitoring systems and procedures
Continuous Improvement
· Stay current with industry trends in monitoring and site reliability engineering
· Analyze monitoring data to identify patterns and improvement opportunities
· Implement metrics to track the effectiveness of monitoring processes
· Contribute to the evolution of the organization's monitoring strategy
Bonus Points
· SRE, cloud platform, or monitoring tool certifications
· ITIL Foundation certification
· Bachelor's degree in Computer Science, Information Technology, or related field
Shift Timings: 12PM - 9PM IST
Benefits Of Working At CrowdStrike
· Remote-friendly and flexible work culture
· Market leader in compensation and equity awards
· Comprehensive physical and mental wellness programs
· Competitive vacation and holidays for recharge
· Paid parental and adoption leaves
· Professional development opportunities for all employees regardless of level or role
· Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
· Vibrant office culture with world class amenities
· Great Place to Work Certified™ across the globe
CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment.
The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements. If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.
Posted 2 weeks ago
4.0 - 6.0 years
0 Lacs
Hyderabad, Telangana, India
On-site
Description
We are looking for a visionary and hands-on DevOps Engineer to drive the strategic direction, implementation, and continuous improvement of our DevOps practices across the organization.
Requirements
Qualifications:
· Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical discipline.
· 4 to 6 years of overall experience in infrastructure engineering, DevOps, systems administration, or platform engineering.
· Hands-on expertise in cloud platforms (AWS, Azure, or GCP), with deep knowledge of networking, IAM, VPCs, storage, and compute services.
· Strong proficiency in Infrastructure as Code (IaC) using Terraform, Ansible, or equivalent.
· Experience building and managing CI/CD pipelines using tools such as Jenkins, GitLab CI, CircleCI, or ArgoCD.
· Strong background in Linux/Unix systems, system administration, scripting (e.g., Bash, Python, Go), and configuration management.
· Experience implementing containerization and orchestration using Docker, Kubernetes, Helm.
· Familiarity with observability tools and logging frameworks (e.g., ELK, Datadog, Fluentd, Prometheus, Grafana).
· Solid understanding of DevOps principles, Agile/Lean methodologies, and modern SDLC practices.
Job responsibilities
The ideal candidate is a technical leader with deep expertise in automation, cloud operations, configuration management, and infrastructure-as-code (IaC). This role requires strong collaboration across engineering, security, product, and QA to enable a culture of continuous delivery, operational excellence, and system reliability.
What we offer
Culture of caring. At GlobalLogic, we prioritize a culture of caring. Across every region and department, at every level, we consistently put people first. From day one, you’ll experience an inclusive culture of acceptance and belonging, where you’ll have the chance to build meaningful connections with collaborative teammates, supportive managers, and compassionate leaders.
Learning and development. We are committed to your continuous learning and development. You’ll learn and grow daily in an environment with many opportunities to try new things, sharpen your skills, and advance your career at GlobalLogic. With our Career Navigator tool as just one example, GlobalLogic offers a rich array of programs, training curricula, and hands-on opportunities to grow personally and professionally.
Interesting & meaningful work. GlobalLogic is known for engineering impact for and with clients around the world. As part of our team, you’ll have the chance to work on projects that matter. Each is a unique opportunity to engage your curiosity and creative problem-solving skills as you help clients reimagine what’s possible and bring new solutions to market. In the process, you’ll have the privilege of working on some of the most cutting-edge and impactful solutions shaping the world today.
Balance and flexibility. We believe in the importance of balance and flexibility. With many functional career areas, roles, and work arrangements, you can explore ways of achieving the perfect balance between your work and life. Your life extends beyond the office, and we always do our best to help you integrate and balance the best of work and life, having fun along the way!
High-trust organization. We are a high-trust organization where integrity is key. By joining GlobalLogic, you’re placing your trust in a safe, reliable, and ethical global company. Integrity and trust are a cornerstone of our value proposition to our employees and clients. You will find truthfulness, candor, and integrity in everything we do.
About GlobalLogic
GlobalLogic, a Hitachi Group Company, is a trusted digital engineering partner to the world’s largest and most forward-thinking companies. Since 2000, we’ve been at the forefront of the digital revolution, helping create some of the most innovative and widely used digital products and experiences.
Today we continue to collaborate with clients in transforming businesses and redefining industries through intelligent products, platforms, and services.
Posted 2 weeks ago
7.0 years
0 Lacs
Pune, Maharashtra, India
On-site
What you’ll do
· Develop and maintain observability using AWS/GCP tools and Datadog.
· Keep monitoring tool software versions current across the cloud and legacy landscape.
· Keep Engineering updated on logging/tracing standards.
· Have good knowledge of Splunk or other logging tools such as the ELK stack.
· Have a good understanding of Application Performance Management.
· Implement best practices for observability, including metrics, logging, and tracing.
· Collaborate with engineering and operations teams to troubleshoot and resolve performance issues.
· Automate observability processes and integrate them into CI/CD pipelines.
· Analyze and interpret monitoring data to provide actionable insights and recommendations.
· Stay updated with the latest advancements in GCP and Datadog to continuously improve our observability capabilities.
· Have good knowledge of Linux/Windows environments.
· Work in the Scaled Agile Framework.
· Solve problems and triage complex distributed architecture service maps.
· Be on call for high-severity application incidents and improve runbooks to reduce MTTR.
· Lead blameless availability postmortems and own the action items to prevent recurrences.
What Experience You Need
· BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent job experience required
· 7-10 years of experience with monitoring tools: Google/AWS Cloud Monitoring, AppDynamics, Datadog, Splunk, Elasticsearch or similar
· 7+ years’ experience in system support, coding or operations
· Hands-on experience with Windows/Linux environments
· Excellent problem-solving and communication skills
· Ability to provide step-by-step technical help, both written and verbal
· Experience in languages such as Python, Bash, Java, Go, JavaScript and/or Node.js
· Demonstrable cross-functional knowledge with systems, storage, networking, security and databases
· System administration skills, including automation and orchestration of Linux/Windows using Terraform, Chef, Ansible and/or containers (Docker, Kubernetes, etc.)
· Proficiency with continuous integration and continuous delivery tooling and practices
· Cloud certification strongly preferred
What Could Set You Apart
· You take a system problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
· Experience managing infrastructure as code via tools such as Terraform or CloudFormation
· Passion for automation with a desire to eliminate toil whenever possible
· You’ve built software or maintained systems in a highly secure, regulated or compliant industry
· Experience and passion for working within a DevOps culture and as part of a team
· Proficiency with continuous integration and continuous delivery tooling and practices
Posted 2 weeks ago
4.0 years
0 Lacs
Noida, Uttar Pradesh, India
On-site
About Client: Our Client is a global IT services company headquartered in Southborough, Massachusetts, USA. Founded in 1996, with revenue of $1.8B and 35,000+ associates worldwide, it specializes in digital engineering and IT services, helping clients modernize their technology infrastructure, adopt cloud and AI solutions, and accelerate innovation. It partners with major firms in banking, healthcare, telecom, and media. Our Client is known for combining deep industry expertise with agile development practices, enabling scalable and cost-effective digital transformation. The company operates in over 50 locations across more than 25 countries, has delivery centers in Asia, Europe, and North America, and is backed by Baring Private Equity Asia.
We are hiring for the position below.
Job Title: CloudOps Engineer
Key Skills: Hotfix & Sequential, CloudOps, CI/CD, AWS
Job Locations: Pan India
Experience: 10 - 16 Yrs
Budget: 25 LPA
Education Qualification: Any Graduation
Work Mode: Hybrid
Employment Type: Contract
Notice Period: Immediate - 15 Days
Interview Mode: 2 Rounds of Technical Interview + Including Client round
Job Description: CloudOps Engineer/Senior CloudOps Engineer – L2
Position Summary: We are currently seeking a Managed Services CloudOps engineer for IoT projects in the Smart Products & IoT Strategic Innovation Centre in India team. This role is responsible for supporting managed services and application/product operations for IoT projects.
Duties & Responsibilities:
· Apply best practices and strategies regarding Prod application and infrastructure maintenance (provisioning, alerting, monitoring, etc.)
  o Knowledge of the various environments (QA, UAT/Staging, Prod) and their purpose.
  o Understanding of Git and AWS CodeCommit.
  o Follow the Hotfix & Sequential configuration process.
  o Understanding of repositories.
  o Understanding and use of CI/CD pipelines.
  o AWS CLI use and implementation.
· Ensure proactive monitoring of applications and AWS infrastructure
  o Real-time monitoring of AWS services.
  o CloudWatch alert configurations.
  o Alert configuration in third-party tools such as New Relic, Datadog, Splunk, etc.
  o Awareness of pre- and post-deployment changesets in AWS infrastructure.
· Manage cloud environments in accordance with company security guidelines.
  o Config Register management.
  o Daily data monitoring of deployed services.
  o Apply best security practices for deployed infrastructure.
  o Suggest regular optimization of infrastructure by scaling up and down.
· Troubleshoot incidents, identify root cause, fix and document problems, and implement preventive measures
  o Lambda logs configuration.
  o API logs configuration.
  o Good understanding of CloudWatch Logs Insights.
· Educate teams on the implementation of new cloud-based initiatives, providing associated training as required
· Employ exceptional problem-solving skills, with the ability to see and solve issues before they affect business productivity.
  o Experience in the CloudOps process.
· Participate in all aspects of the software development life cycle for AWS solutions, including planning, requirements, development, testing, and quality assurance.
· Billing management and analysis for various AWS accounts, with alert configurations as per the defined threshold.
  o Understanding of the AWS billing console.
  o Able to analyze daily/monthly costs of on-demand services.
· Python and Bash scripting is a must for automating regular tasks such as data fetches from S3/DDB and job deployment
Qualifications and Experience:
· Mandatory
  o Bachelor’s degree in Electrical Engineering, Software Engineering, Computer Science, Computer Engineering, or related Engineering discipline.
  o 4+ years of experience in deployment and monitoring of AWS serverless services.
  o 1+ years of experience in the Smart/Connected Products & IoT workflow.
  o Hands-on experience in:
    § Mobile or web app issue troubleshooting
    § AWS platform, or certified in AWS (SysOps or DevOps)
      · Serverless/headless architecture
      · Lambda, API Gateway, Kinesis, Elasticsearch, ElastiCache, DynamoDB, Athena, AWS IoT, CodeCommit, CloudTrail, CodeBuild
    § Understanding of CloudFormation templates for configuration changes
    § NoSQL database (DynamoDB preferred)
    § Trouble ticketing tools (Jira Software & Jira Service Desk preferred)
  o Good hands-on experience in scripting languages:
    § Python, Bash, Node, Git Bash, CodeCommit
  o Experience of impact analysis for infrastructure configuration changes.
· Preferred
  o Hands-on experience with New Relic/Kibana/Splunk and AWS CloudWatch tools
  o Prior experience in operation support for IoT projects (50,000+ live devices) will be an added advantage
  o Experience in AWS Cloud IoT Core platform
  o L2 support experience in addition to CloudOps
Skills and Abilities Required:
· Willingness to work in a 24x7 shift environment
· Flexible to travel short term at short notice to facilitate field trials and soft launches of products
· Excellent troubleshooting and analytical skills
· Highly customer-focused and always eager to find a way to enhance customer experience
· Able to pinpoint business needs and deliver innovative solutions
· Can-do positive attitude, always looking to accelerate development
· Self-driven and committed to high standards of performance, demonstrating personal ownership for getting the job done
· Innovative and entrepreneurial attitude; stays up to speed on the latest technologies and industry trends; healthy curiosity to evaluate, understand and utilize new technologies
· Excellent verbal and written communication skills
Interested candidates, please share your CV with sushma.n@people-prime.com
Posted 2 weeks ago
6.0 years
0 Lacs
Noida, Uttar Pradesh, India
On-site
About Client: Our Client is a global IT services company headquartered in Southborough, Massachusetts, USA. Founded in 1996, with revenue of $1.8B and 35,000+ associates worldwide, it specializes in digital engineering and IT services, helping clients modernize their technology infrastructure, adopt cloud and AI solutions, and accelerate innovation. It partners with major firms in banking, healthcare, telecom, and media. Our Client is known for combining deep industry expertise with agile development practices, enabling scalable and cost-effective digital transformation. The company operates in over 50 locations across more than 25 countries, has delivery centers in Asia, Europe, and North America, and is backed by Baring Private Equity Asia.
Job Title: CloudOps Engineer/Senior CloudOps Engineer – L2
Key Skills: Apply best practices and strategies regarding Prod application and infrastructure maintenance (provisioning, alerting, monitoring, etc.).
Job Locations: Noida
Experience: 4 – 6 Years
Budget: Based on your experience
Education Qualification: Any Graduation
Work Mode: Hybrid
Employment Type: Contract
Notice Period: Immediate - 15 Days
Interview Mode: 2 Rounds of Technical Interview + Including Client round
Job Description:
Position Summary: Pentair is currently seeking a Managed Services CloudOps engineer for IoT projects in the Smart Products & IoT Strategic Innovation Centre in India team. This role is responsible for supporting managed services and application/product operations for IoT projects.
Duties & Responsibilities:
· Apply best practices and strategies regarding Prod application and infrastructure maintenance (provisioning, alerting, monitoring, etc.)
  o Knowledge of the various environments (QA, UAT/Staging, Prod) and their purpose.
  o Understanding of Git and AWS CodeCommit.
  o Follow the Hotfix & Sequential configuration process.
  o Understanding of repositories.
  o Understanding and use of CI/CD pipelines.
  o AWS CLI use and implementation.
· Ensure proactive monitoring of applications and AWS infrastructure
  o Real-time monitoring of AWS services.
  o CloudWatch alert configurations.
  o Alert configuration in third-party tools such as New Relic, Datadog, Splunk, etc.
  o Awareness of pre- and post-deployment changesets in AWS infrastructure.
· Manage cloud environments in accordance with company security guidelines.
  o Config Register management.
  o Daily data monitoring of deployed services.
  o Apply best security practices for deployed infrastructure.
  o Suggest regular optimization of infrastructure by scaling up and down.
· Troubleshoot incidents, identify root cause, fix and document problems, and implement preventive measures
  o Lambda logs configuration.
  o API logs configuration.
  o Good understanding of CloudWatch Logs Insights.
· Educate teams on the implementation of new cloud-based initiatives, providing associated training as required
· Employ exceptional problem-solving skills, with the ability to see and solve issues before they affect business productivity.
  o Experience in the CloudOps process.
· Participate in all aspects of the software development life cycle for AWS solutions, including planning, requirements, development, testing, and quality assurance.
· Billing management and analysis for various AWS accounts, with alert configurations as per the defined threshold.
  o Understanding of the AWS billing console.
  o Able to analyze daily/monthly costs of on-demand services.
· Python and Bash scripting is a must for automating regular tasks such as data fetches from S3/DDB and job deployment
Interested candidates, please share your CV with jyothi.a@people-prime.com
Posted 2 weeks ago
5.0 years
0 Lacs
Hyderabad, Telangana, India
On-site
About Forsys: Forsys Inc. is a leader in Lead-to-Revenue transformation, combining strategy, technology, and business transformation to drive growth. With a team of over 500 professionals spread across the US, India, UK, Colombia, and Brazil, and headquartered in the Bay Area, Forsys epitomizes innovation and excellence. Our role as an implementation partner for major vendors like Conga, Salesforce, and Oracle, and as an incubator for pioneering ideas and solutions, positions us uniquely in the consulting industry. We are dedicated to unlocking new revenue streams for our clients and fostering a culture of innovation. Discover our vision and the impact we’re making at forsysinc.com.
Job Summary: We are looking for a highly experienced and self-driven AWS Cloud Operations Engineer. The ideal candidate will have deep expertise in AWS services, infrastructure automation, monitoring, incident response, and continuous improvement of cloud operations. This role is critical to ensuring the scalability, reliability, and security of our cloud infrastructure.
Key Responsibilities:
· Manage and maintain AWS infrastructure using Infrastructure as Code (IaC) tools such as Terraform or AWS CloudFormation.
· Design and implement highly available, fault-tolerant, and secure cloud environments.
· Automate infrastructure provisioning, configuration management, and deployment processes.
· Monitor system health and performance, and support capacity planning, using AWS CloudWatch, Datadog, Prometheus, or other observability tools.
· Ensure proper incident and problem management including root cause analysis and remediation.
· Implement and manage CI/CD pipelines for automated deployments and updates.
· Manage backup, disaster recovery, and business continuity planning in AWS.
· Collaborate with development, security, and DevOps teams to optimize system operations.
· Support and enforce security best practices, including IAM policies, encryption, and vulnerability management.
· Stay updated on AWS best practices and new services and recommend improvements.
Required Qualifications:
· 5+ years of experience in Cloud Operations with a strong focus on AWS.
· Hands-on experience with core AWS services including EC2, S3, RDS, VPC, IAM, Lambda, ECS/EKS, CloudFront, and Route 53.
· Proficiency in Infrastructure as Code using Terraform, CloudFormation, or CDK.
· Strong scripting skills in Bash, Python, or PowerShell.
· Deep understanding of networking concepts, security groups, NACLs, load balancers, DNS, etc.
· Experience with monitoring and alerting tools such as CloudWatch, Datadog, Grafana, or ELK Stack.
· Familiarity with CI/CD tools such as Jenkins, GitHub Actions, CodePipeline, or CircleCI.
· Solid understanding of security best practices and compliance (e.g., HIPAA, SOC 2, ISO).
Preferred Qualifications:
· AWS Certifications (e.g., AWS Certified SysOps Administrator, Solutions Architect, DevOps Engineer).
· Experience with containerization and orchestration (Docker, ECS, EKS, or Kubernetes).
· Experience with incident management tools like PagerDuty, OpsGenie, or ServiceNow.
Posted 2 weeks ago
3.0 years
0 Lacs
Chennai, Tamil Nadu
On-site
Chennai, Tamil Nadu, India
IT Services
Full-time
Description: Acolad is the global leader in content and language solutions. Its mission is to support companies in every industry to scale across markets and enable growth through cutting-edge technology and localization expertise. Established in 1993, the group is present in 23 countries across Europe, North America, and Asia, with over 1,800 employees supported by a network of 20,000+ linguists around the world. At Acolad, every position is key to our global growth: we know that we will only succeed if our people succeed. Joining Acolad means a unique opportunity for professional development through a collaborative global environment that promotes talent and creativity. We are continuously looking for new talent (like you!) to support our mission to drive growth and innovation across some of the world’s leading brands.
Acolad is committed to creating a diverse and equitable workforce. We believe that diversity, equity, and inclusion in all its forms (gender, age, disability, marital status, ethnic or social origin, religion, belief, or sexual orientation) enrich the workplace. It opens opportunities for individuals to express their talents, both individually and collectively, and strengthens our ability to adapt to a changing world. As an equal opportunity employer, we welcome and consider applications from all qualified candidates, regardless of their backgrounds. This is Acolad - Content That Empowers, Anywhere.
Acolad Content Solutions India Private Limited
Sai Samuthra Plot No. 41B & 41 C North Phase 1st Floor, Sidco Industrial Estate Ekkatuthangal, Chennai-600032
Landmark: Near Ekkatuthangal Overbridge (Jaya TV Office)
Phone: 04466841999
The Job Role: We are seeking a skilled professional to manage and optimize virtualized environments (VMware, Hyper-V) and multi-cloud infrastructures (AWS, Azure, GCP).
This role is responsible for provisioning, configuration, and deployment processes using modern clouds and tools. The ideal candidate will contribute to secure, scalable, and cost-efficient solutions while maintaining business continuity and compliance. You will also support backup and recovery processes (e.g., Veeam), oversee infrastructure documentation, and collaborate with cross-functional teams to achieve cloud and automation objectives. A strong background in Linux/Windows systems administration, and configuration management is essential. Main Responsibilities: Design and implement secure, scalable infrastructure across AWS, Azure, and GCP. Manage VMware and Hyper-V systems including provisioning, monitoring, and maintenance. Manage and maintain Kubernetes clusters. Terraform is a plus. Perform day-to-day system administration and optimize performance for Linux and Windows-based servers. Apply patches and assist with compliance and audit readiness. Administer and support Veeam-based and cloud-native backup and disaster recovery solutions. Monitor cloud costs and recommend efficiency improvements. Maintain detailed infrastructure documentation. Requirements: Degree in Information Technology and Computer Engineering or similar Minimum 3 years of experience with server Operating Systems, including UNIX and Windows Two or more years of hands-on experience designing and deploying cloud architecture on cloud providers (IaaS), such as Amazon AWS and MS Azure Experience with Veeam backup products Experience working in vast networking and infrastructure landscapes Experience with Containerized environments and with virtual desktop and virtual application cloud-based solutions is a plus. 
Job Specific & Knowledge § Expertise with ITSM tools and ITIL v3/v4 processes § Good communication skills § Strong knowledge of virtualization technologies, performance optimization, and resource allocation § Consultancy spirit focused on optimizing virtualized environments for reliability and efficiency § Project management skills are valued § Exposure to disaster recovery and business continuity planning § Fluency in English is mandatory § Experience in troubleshooting virtualization environments and managing virtual workloads § Exposure to hybrid environments, integrating on-premises virtualization with cloud services Virtualization Platforms § Expertise in managing VMware vSphere/ESXi, including vCenter for centralized management § Experience with Microsoft Hyper-V for virtual machine management in Windows Server environments § Familiarity with KVM or other open-source hypervisors is a plus Cloud Management § Experience with cloud management platforms (e.g., VMware Cloud on AWS, Azure Arc) § Understanding of cloud-based virtualization services (e.g., Azure Virtual Machines, AWS EC2) § Familiarity with hybrid cloud architectures, integrating on-premises virtual machines with cloud-based workloads Backup & Disaster Recovery § Expertise in backup solutions for virtualized environments (e.g., Veeam Backup & Replication) § Experience with cloud backup services (e.g., AWS Backup, Azure Backup, Google Cloud Backup) § Familiarity with replication and disaster recovery tools to ensure business continuity (e.g., VMware Site Recovery, Azure Site Recovery) Resource Management & Optimization § Knowledge of resource allocation, load balancing, and clustering for high availability § Experience in capacity planning and optimizing compute, storage, and network resources in virtualized environments Automation & Scripting § Experience with automation of virtualization tasks using PowerCLI, PowerShell, or Ansible is valued § Knowledge of Infrastructure as Code 
(IaC) tools for managing virtual environments (e.g., Terraform) Monitoring & Performance Tuning § Familiarity with virtualization monitoring tools (e.g., VMware vRealize Operations, Nagios, Datadog) § Experience in troubleshooting and tuning virtual infrastructure to optimize performance and uptime Security & Compliance § Understanding of virtual machine security best practices, including network segmentation, patch management, and hardening of virtual hosts § Familiarity with role-based access control (RBAC) and encryption in virtualized environments Languages § English Benefits: National and Festival Holidays Five days work week Medical Insurance
Posted 2 weeks ago
0.0 - 8.0 years
0 Lacs
Pune, Maharashtra
On-site
Pune,Maharashtra Full Time Posted Date: 21-07-2025 Openings: 01 About us EverExpanse is a dynamic technology-driven organization specializing in modern web and e-commerce solutions. We pride ourselves on building scalable, high-performance applications that drive user engagement and business success. Our development team thrives on innovation and collaboration, delivering impactful digital experiences across diverse industries. About Us Job Overview EverExpanse Pvt. Ltd. Pune,Maharashtra 5 - 8 Years Full Time Bachelor’s degree in Computer Science, IT, or a related field. Job Description We are looking for a skilled Benchmarking Engineer with 5–8 years of experience in performance testing of large-scale distributed applications. The ideal candidate will have strong hands-on experience with tools like LoadRunner, JMeter, and Playwright, and expertise in scripting, performance diagnostics, monitoring tools, and troubleshooting across multiple platforms. Experience with cloud platforms, containerization, and storage systems is highly desirable. Key Responsibilities Conduct performance benchmarking and testing of enterprise-scale distributed systems. Use tools such as LoadRunner (Web HTTP, TruClient), JMeter, Ranorex, or Playwright. Script performance scenarios using SQL, Python, VuGen, Korn shell (KSH), Bash, or PowerShell. Monitor and diagnose system performance using tools like AppDynamics, Dynatrace, Datadog, Grafana, Kibana, FIO, PerfMon, SysInternals, IOSTAT, NMON, VMSTAT, NETSTAT, TOP, and IOMeter. Manage and troubleshoot environments across Linux and Windows Server. Troubleshoot performance issues across hardware, OS, network, and application layers. Work with cloud environments like AWS, Azure, GCP, and Kubernetes. Understand and manage storage systems (SAN, NAS, NFS) and storage protocols like FCoE, Fibre Channel, CIFS, and JBOD. Preferred Skills: Strong understanding of distributed architecture and systems. 
Hands-on experience with modern observability and monitoring tools. Effective communication and cross-functional collaboration skills. Cloud-native experience with Kubernetes and container performance. Why Join Us? Opportunity to work on high-impact systems in cloud-native environments. Exposure to cutting-edge tools in monitoring and diagnostics. Collaborative, high-performance culture with career growth. Be at the forefront of performance engineering in real-world scenarios. To Apply send your Resume to jobs@everexpanse.com
Posted 2 weeks ago
12.0 years
0 Lacs
Pune, Maharashtra, India
On-site
Job Title: VP-Digital Expert Support Lead Experience : 12 + Years Location : Pune Mandatory-Should have a stable track record Suitable candidate shall be notified the same day Position Overview The Digital Expert Support Lead is a senior-level leadership role responsible for ensuring the resilience, scalability, and enterprise-grade supportability of AI-powered expert systems deployed across key domains like Wholesale Banking, Customer Onboarding, Payments, and Cash Management . This role requires technical depth, process rigor, stakeholder fluency , and the ability to lead cross-functional squads that ensure seamless operational performance of GenAI and digital expert agents in production environments. The candidate will work closely with Engineering, Product, AI/ML, SRE, DevOps, and Compliance teams to drive operational excellence and shape the next generation of support standards for AI-driven enterprise systems. Role-Level Expectations Functionally accountable for all post-deployment support and performance assurance of digital expert systems. Operates at L3+ support level , enabling L1/L2 teams through proactive observability, automation, and runbook design. Leads stability engineering squads , AI support specialists, and DevOps collaborators across multiple business units. Acts as the bridge between operations and engineering , ensuring technical fixes feed into product backlog effectively. Supports continuous improvement through incident intelligence, root cause reporting, and architecture hardening . Sets the support governance framework (SLAs/OLAs, monitoring KPIs, downtime classification, recovery playbooks). Position Responsibilities Operational Leadership & Stability Engineering Own the production health and lifecycle support of all digital expert systems across onboarding, payments, and cash management. Build and govern the AI Support Control Center to track usage patterns, failure alerts, and escalation workflows. 
Define and enforce SLAs/OLAs for LLMs, GenAI endpoints, NLP components, and associated microservices. Establish and maintain observability stacks (Grafana, ELK, Prometheus, Datadog) integrated with model behavior. Lead major incident response and drive cross-functional war rooms for critical recovery. Ensure AI pipeline resilience through fallback logic, circuit breakers, and context caching. Review and fine-tune inference flows, timeout parameters, latency thresholds, and token usage limits. Engineering Collaboration & Enhancements Drive code-level hotfixes or patches in coordination with Dev, QA, and Cloud Ops. Implement automation scripts for diagnosis, log capture, reprocessing, and health validation. Maintain well-structured GitOps pipelines for support-related patches, rollback plans, and enhancement sprints. Coordinate enhancement requests based on operational analytics and feedback loops. Champion enterprise integration and alignment with Core Banking, ERP, H2H, and transaction processing systems. Governance, Planning & People Leadership Build and mentor a high-caliber AI Support Squad – support engineers, SREs, and automation leads. Define and publish support KPIs , operational dashboards, and quarterly stability scorecards. Present production health reports to business, engineering, and executive leadership. Define runbooks, response playbooks, knowledge base entries, and onboarding plans for newer AI support use cases. Manage relationships with AI platform vendors, cloud ops partners, and application owners. Must-Have Skills & Experience 12+ years of software engineering, platform reliability, or AI systems management experience. Proven track record of leading support and platform operations for AI/ML/GenAI-powered systems . Strong experience with cloud-native platforms (Azure/AWS), Kubernetes , and containerized observability . Deep expertise in Python and/or Java for production debugging and script/tooling development. 
Proficient in monitoring, logging, tracing, and alerts using enterprise tools (Grafana, ELK, Datadog). Familiarity with token economics , prompt tuning, inference throttling, and GenAI usage policies. Experience working with distributed systems, banking APIs, and integration with Core/ERP systems . Strong understanding of incident management frameworks (ITIL) and ability to drive postmortem discipline . Excellent stakeholder management, cross-functional coordination, and communication skills. Demonstrated ability to mentor senior ICs and influence product and platform priorities. Nice-to-Haves Exposure to enterprise AI platforms like OpenAI, Azure OpenAI, Anthropic, or Cohere. Experience supporting multi-tenant AI applications with business-driven SLAs. Hands-on experience integrating with compliance and risk monitoring platforms. Familiarity with automated root cause inference or anomaly detection tooling. Past participation in enterprise architecture councils or platform reliability forums
Posted 2 weeks ago
4.0 - 5.0 years
0 Lacs
Kochi, Kerala, India
On-site
Hiring PostgreSQL Database Administrator to join our team at P Square Solutions (part of Neology Inc www.neology.com) Number of Open Positions – 1 Experience – 4-5 years Industry - IT Product & Services Employment Type - Hybrid Work Location - Smart City, Kochi, Kerala Shift timing - Based on project. Role Description We are hiring an experienced PostgreSQL Database Administrator with 4–5 years of hands-on experience in managing databases on both AWS RDS / Aurora PostgreSQL and on-premises virtual machines . You will be responsible for administration, performance tuning, high availability, replication, and automation in a hybrid cloud setup. Key Responsibilities Manage PostgreSQL on AWS RDS/Aurora and Linux-based on-prem VMs (KVM, VMware, etc.) Perform installation, upgrades, patches, and configuration. Configure replication (streaming/logical) and high availability (e.g., Patroni, Pgpool-II). Handle backups and disaster recovery using pgBackRest, Barman, or AWS snapshots. Tune queries and optimize performance using pg_stat_statements, EXPLAIN, and indexing. Monitor database health using CloudWatch, Datadog, Prometheus/Grafana, etc. Implement database security, roles, and encryption (KMS, SSL). Automate tasks using Shell scripting, Bash, or Ansible. Work on migration projects from on-prem to AWS or across regions. Required Skills 4–5 years of PostgreSQL DBA experience Strong expertise in AWS RDS / Aurora PostgreSQL Experience in managing PostgreSQL on Linux-based on-prem virtual machines Proficiency in SQL, shell scripting, and backup/recovery tools Hands-on experience with monitoring, performance tuning, and security best practices Good to have AWS Certification (Database or Solutions Architect – Associate) Experience with CI/CD and infrastructure automation Familiarity with Oracle, MySQL, or MongoDB is a plus. P Square Solutions LLC (part of Neology Inc - www.neology.com) is a leading firm in Toll systems solutions and systems Integration Services since 2005. 
We are committed to delivering innovative toll solutions and exceptional service to our clients. Our core values include integrity, collaboration, and excellence, and we are dedicated to fostering a diverse and inclusive workplace. At P Square, we offer a good work culture and career opportunities with a competitive salary, complemented by excellent employee benefits. We provide opportunities to learn and evolve in your career, and we support work-life balance through a balanced leave policy and other benefits for working from the office. Our assessments focus on your strengths and help you progress and explore career possibilities. We take a holistic approach to building talent and nurturing our work culture. We are always keen to listen and open to feedback, which helps enhance the work environment at P Square.
Posted 2 weeks ago
6.0 - 10.0 years
11 - 21 Lacs
Gurugram
Hybrid
Looking for an Application Production Support Engineer for Java applications. Please find the details below. Provide Java application support, including thread/heap analysis and debugging. Troubleshoot applications hosted in AWS, ensuring high availability. Utilize Splunk for in-depth log analysis, developing dynamic queries to pinpoint issues. Monitor and optimize performance. Perform Linux system administration, optimizing OS settings for performance. Provide L2/L3 support for Java applications, resolving complex production issues. Conduct root cause analysis (RCA) and implement fixes to improve system stability.
Posted 2 weeks ago
0 years
0 Lacs
Bangalore Urban, Karnataka, India
On-site
Position: Sr. Associate (Cloud and DevOps) Work mode: Hybrid (3 days) Location: Bangalore, Hyderabad, Chennai, Gurugram, Noida, Mumbai, Pune Preferred candidate profile Expertise in the DevOps & Cloud tools below: AWS (EC2, IAM, VPC, S3, Lambda, RDS, SNS, CloudWatch) Configuring and monitoring DNS, app servers, load balancers, and firewalls for high-volume traffic Extensive experience in designing, implementing, and maintaining infrastructure as code, preferably using Terraform or CloudFormation/ARM Templates/Deployment Manager/Pulumi Experience managing container infrastructure (on-prem & managed, e.g., AWS ECS, EKS, or GKE) Design, implement, and upgrade container infrastructure, e.g., K8s clusters & node pools Create and maintain deployment manifest files for microservices using Helm Utilize the Istio service mesh to create gateways, virtual services, traffic routing, and fault injection Troubleshoot and resolve container infrastructure & deployment issues Continuous Integration & Continuous Deployment Develop and maintain CI/CD pipelines for software delivery using Git and tools such as Jenkins, GitLab, CircleCI, Bamboo, and Travis CI Automate build, test, and deployment processes to ensure efficient release cycles and enforce software development best practices, e.g., quality gates, vulnerability scans, etc. Automate build & deployment processes using Groovy, Go, Python, Shell, PowerShell Implement DevSecOps practices and tools to integrate security into the software development and deployment lifecycle. Manage artifact repositories such as Nexus and JFrog Artifactory for version control and release management. 
Design, implement, and maintain observability, monitoring, logging and alerting using below tools Observability: Jaeger, Kiali, CloudTrail, Open Telemetry, Dynatrace Logging: Elastic Stack (Elasticsearch, Logstash, Kibana), Fluentd, Splunk Monitoring: Prometheus, Grafana, Datadog, New Relic Perks and benefits Gender-Neutral Policy 18 paid holidays throughout the year for NCR/BLR (22 For Mumbai) Generous parental leave and new parent transition program Flexible work arrangements Employee Assistance Programs to help you in wellness and well being
Posted 2 weeks ago
5.0 years
0 Lacs
Pune
Remote
Subject Matter Expert- Datadog: AppDynamics Monitoring We are seeking a skilled Subject Matter Expert (SME) in AppDynamics Monitoring to support the development of high-quality, technically accurate learning content. The SME will work closely with our instructional design and content teams to review, create, and validate course material intended for professionals in performance monitoring and observability. The content will include scripts, labs, assessments, demos, and recorded videos. Key Responsibilities: Learning Design & Validation Analyze and/or create learning objectives for each course. Review and refine course outlines, ensuring alignment with technical goals. Content Review & Development Review video scripts (7–9 per course) for technical accuracy; suggest edits and updates. Review reading materials (4–6 per course, each up to 1200 words) and provide feedback, edits, or rewrites for accuracy and clarity. Suggest suitable freeware, open-source tools, or alternatives where applicable. Hands-On Demonstrations Provide static or recorded demos/screencasts for integration into learning content. Review and validate codes and tools used; incorporate 1 round of internal and 2 rounds of client feedback. Lab & Activities Creation Develop 1–2 hands-on labs or activities per course aligned with course objectives. Write, test, and validate technical code used in labs or assignments. Assessments Review Review quizzes and graded assessments (5 sets per course, 5–10 questions each) for accuracy, difficulty, and relevance. Suggest improvements and ensure they align with course goals. Video Recordings Record ~20–25 minutes of talking head video content per course (onsite or via Zoom). Ensure clarity, accuracy, and consistency in delivery. Collaboration & QA Participate in internal and client review meetings and content discussions. Incorporate feedback (1 internal + 2 client rounds) into all deliverables. Provide digital signature for final course completion confirmation. 
✅ Required Skills & Experience: 5+ years of professional experience in AppDynamics Monitoring and Performance Management Prior experience in training, instructional content development, or technical writing Strong knowledge of application performance monitoring (APM) , DevOps, observability tools, and full-stack monitoring Comfortable with video recording and presenting technical content Excellent communication, content editing, and technical validation skills Ability to independently test and troubleshoot code, labs, and technical demos Bonus If You Have: AppDynamics certification(s) Experience with related APM tools (e.g., Dynatrace, Datadog, New Relic) Prior content development or e-learning collaboration experience Timelines and Payout: Project start date : Immediate Time Availability : 25 hours per course Job Type : Part-time, Contract Work Location : Remote Job Type: Part-time Experience: AppDynamics Monitoring: 5 years (Preferred) Work Location: In person
Posted 2 weeks ago
0 years
7 - 8 Lacs
Pune
On-site
Job Description NIQ is a global measurement and data analytics company providing the most complete and trusted view of consumers and markets in 90 countries covering 90% of the world’s population. Focusing on consumer-packaged goods manufacturers and FMCG and retailers, we enable customers to defy what’s possible. How? We combine unparalleled datasets, pioneering technology, and the industry’s top talent to create insights that unlock innovation. Join us and change the landscape. Embrace the role of a Support Specialist on our team, where you'll play a vital part in ensuring the smooth operation of our enterprise software solutions. Candidates will be part of Global Command Center and will be responsible to lead a team of L1 & L2 Support specialists providing application production support for various applications. Your duties encompass monitoring applications, active triaging, collaborating with developers, and conducting thorough troubleshooting to identify and resolve application issues. RESPONSIBILITIES: Ensure team is achieving 99.95% system availability across all customer facing systems. Experience in observability on production application support Effectively audit the threshold of the monitors, pinpoint affected application components, identify significant events and patterns based on real-time analysis. Follow the standard operating procedures to minimize the downtime, optimizing the system performance to maintain uninterrupted service delivery. Monitor service-level dashboards, ensure team is performing daily health checks, and review system capacity. Ensure team is meeting the SLA of various critical alerts and maintain consistent updates on tickets in ServiceNow and JIRA. Provide meaningful analysis of issues, timely updates on ongoing incidents to stakeholders and internal teams. Coordinate end-to-end issue resolution with users, support teams, operations, technical delivery teams, and vendors if required. 
Knowledge on azure cloud for monitoring the resources and take necessary actions if needed. Building AI monitoring on azure or any other inhouse tool Kubernetes knowledge to provide application support Ensure team has clear understanding of risk associated with alerts, identify and mitigate new risks. Drive long-term solutions for high-impact production issues across technical, operations, and product teams. Assist team in identifying the loopholes in existing processes, share best practices, conduct weekly team meetings and fast-track feedback. Share and collect process improvement ideas to identify trends in issues and propose automation ideas to higher-level management. Demonstrate strong verbal and written communication skills. Ensure SLAs are met with Business-As-Usual (BAU) tasks and work collaboratively in a cross-functional environment. . Drive team meetings effectively and ensure the latest data is available for presentations for monthly workshops with leaders. Create and maintain a Knowledge Base (KEDB) with bug information and workarounds. Facilitate collaboration and communication among internal teams, stakeholders, and external partners. This includes maintaining open channels of communication, facilitating cross-functional collaboration, and fostering a culture of transparency and teamwork. Handling application releases and provide full support for the signoff Qualifications Relevant experience in Business Application Support, with expertise in ITIL and ITSM. Bachelor’s degree in engineering, computer science, or a related field. Automation experience is great to have. Good knowledge on Azure docker and kubernetes services Proficient in monitoring and observability. Exposure to native monitoring skills and troubleshooting tools, including but not limited to Datadog, LogicMonitor, PagerDuty, OpsGenie. Ability to create a dashboard and perform log analysis in monitoring tools like DataDog, LogicMonitor, AppDynamics. 
Good to have hands on experience on data visualization tools like PowerBI. Cloud and ITIL certifications will be a plus. Advanced knowledge in infrastructure components, including cloud services, containerization, compute, storage, and networking systems is good to have. Must-Have: Exceptional communication skills. Flexibility to work in 24x7 shift rotations, including weekends. Ability to work flexible and extended hours as needed. Positive attitude, team player, self-starter; takes initiative and can work independently. Comfortable working in an Agile environment. Application support experience in Microsoft Azure and Google Cloud Platforms. Hands on experience on UNIX, SQL, Microsoft Office Tools, and application monitoring tools but not limited to Datadog, LogicMonitor, PagerDuty, and OpsGenie. Docker knowledge will be an added advantage Experience working in Global Command Center and Site Reliability Engineering (SRE) practices. Additional Information Our Benefits Flexible working environment Volunteer time off LinkedIn Learning Employee-Assistance-Program (EAP) About NIQ NIQ is the world’s leading consumer intelligence company, delivering the most complete understanding of consumer buying behavior and revealing new pathways to growth. In 2023, NIQ combined with GfK, bringing together the two industry leaders with unparalleled global reach. With a holistic retail read and the most comprehensive consumer insights—delivered with advanced analytics through state-of-the-art platforms—NIQ delivers the Full View™. NIQ is an Advent International portfolio company with operations in 100+ markets, covering more than 90% of the world’s population. For more information, visit NIQ.com Want to keep up with our latest updates? Follow us on: LinkedIn | Instagram | Twitter | Facebook Our commitment to Diversity, Equity, and Inclusion NIQ is committed to reflecting the diversity of the clients, communities, and markets we measure within our own workforce. 
We exist to count everyone and are on a mission to systematically embed inclusion and diversity into all aspects of our workforce, measurement, and products. We enthusiastically invite candidates who share that mission to join us. We are proud to be an Equal Opportunity/Affirmative Action-Employer, making decisions without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability status, age, marital status, protected veteran status or any other protected class. Our global non-discrimination policy covers these protected classes in every market in which we do business worldwide. Learn more about how we are driving diversity and inclusion in everything we do by visiting the NIQ News Center: https://nielseniq.com/global/en/news-center/diversity-inclusion
Posted 2 weeks ago
6.0 years
4 - 5 Lacs
Bengaluru
Remote
Overview: The DevOps Engineer strengthens the collaboration between symplr's Engineering, Development, and Operations teams by streamlining and automating the software development lifecycle (SDLC) to achieve faster delivery, improved quality, and higher reliability. This role leverages scripting and automation expertise to bridge the gap between development and operations, fostering a culture of continuous integration and continuous delivery (CI/CD). Duties & Responsibilities: Promote collaboration and shared responsibility throughout the SDLC. Design, develop, and implement automated build, test, and deployment pipelines using scripting languages and CI/CD tools (GitHub Actions). Design, build, and maintain scalable and reliable infrastructure to support our applications and services. Automate infrastructure provisioning and configuration management using tools such as Terraform, Ansible, or similar. Work closely with developers, testers, security professionals, and operations teams to break down silos and ensure seamless software delivery. Ensure adherence to security best practices and compliance regulations throughout the development and deployment process. Troubleshoot complex technical issues related to deployments, infrastructure, and automation. Mentor and guide junior team members, promoting a culture of continuous improvement and collaboration. Document processes, tools, and solutions to promote knowledge transfer within the team. Have HEART. To work here, you must be: Humble – self-aware and respectful Effective – measurably move the needle & immeasurably add Adaptable – innately curious and constantly Remarkable – stand out in some Transparent – openly and honestly sharing Skills Required: 6+ years of Engineering experience in the following areas: Solid understanding of system administration principles with experience managing both Windows Server (Active Directory, GPO, DFS, RBAC) and Linux servers (Ubuntu/Debian, RHEL/CentOS). 
Scripting proficiency in PowerShell, Bash, or Python to automate infrastructure and application deployment tasks. Experience with Infrastructure as Code (IaC), particularly Terraform, to define and manage infrastructure in a repeatable and scalable manner. In-depth knowledge of development infrastructure tools such as Terraform, Ansible, Helm, and container orchestration platforms (e.g., Kubernetes) for building and deploying applications efficiently. Hands-on experience with CI/CD tools (e.g., GitHub, Jenkins, Octopus Deploy, ArgoCD/Drone) to automate the software development lifecycle, enabling continuous integration and continuous delivery (CI/CD). Familiarity with monitoring tools such as DataDog, NewRelic Solid understanding of networking and other supporting components like Load Balancer, Application Gateway, Web Application Firewall Experience with application hosting platforms like IIS, Apache, and Tomcat to deploy and manage applications in various environments. Bachelor’s degree with Computer Science background Experience and/or knowledge in the implementation and support of Cloud Services, such as Azure, AWS, Private Cloud or Hybrid Cloud is About symplr: As a leader in healthcare operations solutions, we empower healthcare organizations to navigate the complexities of integrating critical business operations. Our customers are at the heart of everything we do, and they rely on our mission-critical systems to drive better operations and better outcomes. We are a remote-first company with employees working across the United States, India, and the Netherlands. Guided by values, we focus on teamwork, championing our customers, being rooted in action and outcomes, overcoming challenges, and leading through equality and integrity. Read more about symplr's culture and values at symplr.com/careers.
Posted 2 weeks ago
15.0 years
0 Lacs
Bengaluru, Karnataka, India
On-site
Role Purpose The Software Architect is responsible for driving the technical roadmap and direction in all aspects of the development of platforms and applications in this multi-dimensional role. Subject to their specialization, they will lead the design, development, testing, publishing, and support of different cloud-based products and solutions. They will provide subject matter expertise for customer implementation and cloud platform support. They will work closely with a global team of engineers to build robust solutions that meet our business objectives, following continuous integration and continuous deployment processes. Additionally, they will recommend process, technology, or other improvements to management that are intended to benefit the productivity, efficiency, and quality of the solutions developed by the team, and they will lead the implementation of new solutions. Job Metrics Lead technical requirements and translate them into software system design Meet quality gates for deliverables by ensuring that all content and information distribution channels are available at high quality and are current Provide technical leadership and mentor senior members of the Engineering team Assist with complex customer issues Define and adhere to coding standards and guidelines Specific MBOs as agreed with Manager Technical Requirement: Degree in Computer Science or Engineering or Equivalent with 15+ years of working Core Technology: Strong experience with Java (Spring Boot, JPA), MERN (MongoDB, Express.js, React.js, Node.js) and/or Python. Extensive experience with microservices architecture, event-driven architecture, and RESTful APIs. FrontEnd: 8+ years of experience with Vue.js/React.js, JavaScript, HTML5, CSS, TypeScript, Node.js, RESTful Web Services, SQL (PostgreSQL/MySQL), and NoSQL (MongoDB / DynamoDB). 
10+ years of working experience in Web Technologies. Solid understanding of object-oriented programming languages. AI: 5+ years of working experience with Python, Machine Learning technologies, NLP, and Python libraries (Pandas, Keras, TensorFlow, etc.). DevOps: Experience with cloud tools (Kubernetes, Docker, GitHub) and CI/CD tools (Harness, Jenkins). Monitoring/Observability: Strong knowledge of monitoring tools such as Datadog and/or OpsGenie. Cloud: Strong experience with Cloud (preferably AWS/Azure/GCP) is a must. Working experience in Cloud (OKD/OpenShift preferred) development. Working knowledge of WebSockets and event-driven applications preferred. Design and implement scalable, robust, and secure microservices architectures. Able to utilize design patterns to solve complex software design problems. Knowledge of Amazon SQS for message queuing and caching tools like Redis/Valkey. 10+ years of experience with Kafka/SQS and other message brokers (e.g., RabbitMQ). Experience with voice and video recording platforms is advantageous. Good understanding of Computer Vision, Speech Analytics and Deep Learning tools and techniques. Previous experience in at least 5-6 computer vision / Deep Learning use cases on detection, tracking, classification, recognition, and intent understanding using the Keras/TensorFlow/PyTorch frameworks. Strong programming background, with the ability to design and deliver solutions quickly. Apply: https://fa-epcb-saasfaprod1.fa.ocs.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX/job/3463/?utm_medium=jobshare&utm_source=External+Job+Share
Posted 2 weeks ago
8.0 years
0 Lacs
India
On-site
🔹 We’re Hiring: Senior Data Engineer | Hybrid | All Xebia Locations 📩 Apply by sending your resume to vijay.s@xebia.com. Locations: Chennai, Hyderabad, Bangalore, Pune, Bhopal, Jaipur, Gurugram. Mode: Hybrid, 3 days/week from office. Experience: 8+ years. Joining: Immediate or within 2 weeks. 🚀 About the Role: Xebia is seeking a Senior Data Engineer to join our Data Warehouse team. You’ll be building cloud-native, data-intensive applications using modern tech stacks including Databricks, Airflow, Spark, and Terraform. Ideal for professionals with a strong foundation in Python, AWS, and modern data platforms. 🔧 Key Responsibilities: Build and optimize robust Data Warehouse solutions. Develop cloud-native, data-intensive applications (AWS experience required). Architect and implement Airflow-based workflow management systems. Design, develop, and maintain Spark applications. Work with modern data formats: Parquet, Delta Lake, OTFs. Use IaC tools like Terraform/CDK/CloudFormation for infrastructure automation. Establish observability via Datadog, Prometheus, or Grafana. Drive CI/CD with GitHub Actions, Jenkins, or ArgoCD. Collaborate across teams with business-aligned documentation and code. 🔍 Required Skills: 5+ years in Python, JVM, and Shell scripting (production experience). 3+ years in cloud-native data applications (must: AWS; good to have: GCP). Strong hands-on experience in Databricks, dbt, and Spark. IaC tool expertise (Terraform preferred). Airflow experience is mandatory. Understanding of containerization (Docker/Kubernetes). Experience with unit testing, CI/CD pipelines, and code reviews. Excellent written and verbal communication skills. Ability to convert business needs into data solutions. 📬 How to Apply: Send your CV along with the following details to vijay.s@xebia.com: Full Name; Total Experience; Current CTC; Expected CTC; Current Location; Preferred Xebia Location (Chennai, Hyderabad, Bangalore, Pune, Bhopal, Jaipur, Gurugram); Notice Period / Last Working Day (if serving); Primary Skills; LinkedIn Profile. 🛑 Only apply if you're available to join immediately or within 2 weeks.
Posted 2 weeks ago
8.0 years
0 Lacs
Noida, Uttar Pradesh, India
On-site
About the Job. Who we are and what we do: India has witnessed a journey of innovation in digital payments, and today it leads the world with over 45% of the global digital transaction volume. At NPST, we believe that our decade-long journey has carved out an opportunity to build a future roadmap for the world to follow. We are determined to contribute immensely to the nation’s growth story with our vision “to provide digital technology across financial value chain” and our mission to create a leadership position in the digital payment space. Founded in 2013, NPST is a leading fintech firm in India, part of the Make in India initiative and listed on BSE and National Stock Exchange. We specialize in Digital Payments, operating as a Technology Service Provider to regulated entities and providing a Payment Platform to industry, empowered by a payment processing engine, Financial Super app, Risk Intelligence engine and digital merchant solution. While we drive 3% of global digital transaction volume for 100+ clients, we aim to increase our market share by 5X in the next five years through innovation and industry-first initiatives. What will you do: We are looking for a seasoned DevOps Lead to join our TechOps team within the IT Infra & Security department. You will be responsible for designing, implementing, and managing CI/CD pipelines, infrastructure automation, and cloud-native operations to ensure high availability, scalability, and security of fintech products. This is a hands-on leadership role involving cross-functional collaboration with engineering, security, and product teams. Job responsibilities: Lead and manage DevOps practices including CI/CD automation, infrastructure as code, system monitoring, and environment management across staging and production. Design and maintain high-availability and scalable systems using tools such as Docker, Kubernetes, Terraform, Jenkins, Ansible, and GitLab CI/CD.
Manage cloud infrastructure across platforms like AWS, GCP, or Azure, ensuring cost optimization, reliability, and performance. Implement and enforce security best practices, including network/firewall policies, secrets management, and compliance audits (e.g., ISO, PCI-DSS). Monitor and maintain system performance using tools like Prometheus, Grafana, ELK Stack, Datadog, or CloudWatch. Collaborate with development teams to streamline release processes and improve build/deploy strategies. Guide and mentor junior DevOps engineers; drive adoption of modern DevOps and SRE principles. Take ownership of disaster recovery planning, backup solutions, and incident response playbooks. Maintain documentation for infrastructure, deployment processes, and recovery procedures. Work in an Agile/Scrum environment and participate in sprint planning and stand-ups as required. What are we looking for: Proven experience as a DevOps Engineer with hands-on leadership responsibilities. Strong command of Linux system administration, scripting (Shell, Python, or Bash), and version control tools (Git). Proficiency in setting up and maintaining CI/CD pipelines, automation frameworks, and containerization tools (Docker, Kubernetes). Hands-on experience with Infrastructure as Code using Terraform, Ansible, or CloudFormation. Cloud expertise (preferably AWS) including EC2, S3, RDS, IAM, VPC, and load balancing concepts. Good understanding of networking, VPNs, firewalls, and DNS management. Familiarity with monitoring/logging tools and performance tuning. Exposure to DevSecOps and integration of security tools into DevOps workflows is a plus. Excellent problem-solving skills, communication, and collaboration abilities. Willingness to be available for on-call support and incident response as needed. Certifications like AWS Certified DevOps Engineer, Certified Kubernetes Administrator (CKA), or equivalent are highly desirable. Experience in the fintech or banking domain will be an added advantage.
Familiarity with compliance frameworks (ISO 27001, SOC2, etc.). Entrepreneurial skills and the ability to observe, innovate and own your work. Detail-oriented and organized with strong time management skills. Influencing skills and the ability to create positive working relationships with team members at all levels. Excellent communication and interpersonal skills. A collaborative approach, treating quality work as a group effort to achieve the organization's goals. Education Qualification: BE/B.Tech. Experience: 8-12 years. Industry: IT/Software/BFSI/Banking/Fintech. Location: Noida. Work arrangement: 5 days working from office. What do we offer: An organization where we strongly believe in one organization, one goal. A fun workplace which compels us to challenge ourselves and aim higher. A team that strongly believes in collaboration and celebrating success together. Benefits that resonate ‘We Care’. If this opportunity excites you, we invite you to apply and contribute to our success story. If your resume is shortlisted, you will hear back from us.
Posted 2 weeks ago
4.0 years
0 Lacs
Delhi, India
On-site
Job Summary We are seeking a skilled Cloud Engineer with strong hands-on experience in AWS to join our cloud infrastructure team. Responsibilities The ideal candidate will be responsible for implementing, maintaining, and supporting scalable and secure cloud infrastructure, driving operational excellence, and collaborating with development, DevOps, and security teams to ensure seamless service Responsibilities : Design, deploy, and manage secure, scalable, and highly available infrastructure on AWS. Provision and configure core AWS services including EC2, VPC, S3, IAM, RDS, Lambda, CloudFormation/Terraform, and more. Implement and manage CI/CD pipelines for automated deployments and infrastructure as code (IaC). Monitor, troubleshoot, and optimize system performance, availability, and reliability. Implement cloud security best practices including identity/access management, encryption, and compliance monitoring. Perform cost optimization, resource tagging, and usage analysis. Maintain infrastructure documentation and standard operating procedures. Participate in on-call rotation and incident response processes. Collaborate with development, DevOps, security, and operations teams to support cloud-native Skills & Qualifications : Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience). 4+ years of experience as a Cloud Engineer or in a similar role. Proven hands-on expertise in AWS services: EC2, VPC, IAM, S3, CloudWatch, RDS, Lambda, ECS/EKS, Route 53, CloudTrail, etc. Proficient in Infrastructure as Code (IaC) using CloudFormation and/or Terraform. Experience with configuration management tools (e.g., Ansible, Chef, or Puppet). Strong scripting skills (Python, Bash, or PowerShell). Familiarity with monitoring and alerting tools (e.g., CloudWatch, Datadog, Prometheus). Understanding of networking concepts, security controls, and cloud governance. 
Experience with DevOps practices, CI/CD tools (e.g., Jenkins, GitLab CI, AWS Qualifications : AWS Certified Solutions Architect Associate or Professional. Experience in hybrid cloud environments or multi-cloud strategies. Exposure to containerization (Docker, Kubernetes). Working knowledge of Agile/Scrum methodologies. (ref:hirist.tech)
Posted 2 weeks ago
3.0 - 7.0 years
0 Lacs
Haryana
On-site
As a Node.js Developer at Fitelo, a fast-growing health and wellness platform, you will play a crucial role in leading the data strategy. Collaborating with a team of innovative thinkers, front-end experts, and domain specialists, you will be responsible for designing robust architectures, implementing efficient APIs, and ensuring that our systems are both lightning-fast and rock-solid. Your role goes beyond mere coding; it involves shaping the future of health and wellness technology by crafting elegant solutions, thinking creatively, and making a significant impact on our platform. Your responsibilities will include taking complete ownership of designing, developing, deploying, and maintaining server-side components and APIs using Node.js. You will manage the database operations lifecycle with MongoDB and PostgreSQL, collaborate with front-end developers for seamless integration, optimize application performance and scalability, and implement security protocols to safeguard data integrity. Additionally, you will oversee the entire development process, conduct code reviews, maintain documentation, research and integrate new technologies, and drive collaboration across teams to ensure successful project delivery. The ideal candidate for this role would have at least 3 years of experience in backend development, primarily with Node.js. Advanced proficiency in JavaScript and TypeScript, along with experience in frameworks like Express.js or Nest.js, is required. A strong understanding of asynchronous programming, event-driven architecture, SQL and NoSQL databases, RESTful APIs, GraphQL services, microservices architecture, and front-end integration is essential. Proficiency in version control tools, CI/CD pipelines, cloud platforms, debugging, and testing frameworks, along with strong problem-solving skills, is also a key qualification for this role.
If you are passionate about technology, enjoy crafting innovative solutions, and want to contribute to the future of health and wellness tech, we welcome you to join our team at Fitelo. Qualifications: Bachelor's degree in technology. This is a full-time position with a day shift schedule, based in Gurugram.
Posted 2 weeks ago
5.0 years
0 Lacs
Andhra Pradesh, India
On-site
At PwC, our people in infrastructure focus on designing and implementing robust, secure IT systems that support business operations. They enable the smooth functioning of networks, servers, and data centres to optimise performance and minimise downtime. Those in cloud operations at PwC will focus on managing and optimising cloud infrastructure and services to enable seamless operations and high availability for clients. You will be responsible for monitoring, troubleshooting, and implementing industry leading practices for cloud-based systems. Focused on relationships, you are building meaningful client connections, and learning how to manage and inspire others. Navigating increasingly complex situations, you are growing your personal brand, deepening technical expertise and awareness of your strengths. You are expected to anticipate the needs of your teams and clients, and to deliver quality. Embracing increased ambiguity, you are comfortable when the path forward isn’t clear, you ask questions, and you use these moments as opportunities to grow. Skills Examples of the skills, knowledge, and experiences you need to lead and deliver value at this level include but are not limited to: Respond effectively to the diverse perspectives, needs, and feelings of others. Use a broad range of tools, methodologies and techniques to generate new ideas and solve problems. Use critical thinking to break down complex concepts. Understand the broader objectives of your project or role and how your work fits into the overall strategy. Develop a deeper understanding of the business context and how it is changing. Use reflection to develop self awareness, enhance strengths and address development areas. Interpret data to inform insights and recommendations. Uphold and reinforce professional and technical standards (e.g. refer to specific PwC tax and audit guidance), the Firm's code of conduct, and independence requirements. 
Job Title: Site Reliability Engineer (SRE) – Senior Associate Location : Bangalore (Hybrid) Department : Managed Services – Core Automation Team Job Overview We’re seeking a Senior Associate with deep hands-on experience in scripting, automation, and RPA to help build intelligent, resilient systems across Managed Services. You’ll work at the intersection of platform reliability and automation—developing scripts, automating runbooks, and integrating low-code/no-code solutions to eliminate manual work and improve operational efficiency. This role is ideal for someone who thrives in solving real-world production challenges with code, automation, and curiosity. Key Responsibilities Automate repetitive infrastructure and application support activities using scripting (Python, Bash, PowerShell) and RPA/low-code platforms. Develop and maintain scripts and reusable components to drive system configuration, monitoring, and auto-remediation. Build self-healing workflows to identify and resolve issues proactively—minimizing human intervention. Integrate observability and alerting tools with automation pipelines to enable real-time anomaly detection and resolution. Leverage low-code/no-code automation platforms (e.g., Power Automate, UiPath, Automation Anywhere) to streamline manual business processes. Collaborate with operations, engineering, and platform teams to build reliable automation frameworks and support scaled delivery. Use GenAI and AI-driven tools to enhance decision automation and support proactive operations management. Create and maintain runbooks and documentation that evolve into automation-first playbooks. Continuously analyze operational inefficiencies and develop automation to close gaps. Required Skills And Qualifications 5+ years of hands-on experience in Site Reliability Engineering, Automation Engineering, or RPA roles. Strong scripting proficiency in Python, Bash, and PowerShell for infrastructure and application automation. 
Practical experience with low-code/no-code platforms and RPA tools like UiPath, Power Automate, Automation Anywhere, or similar. Solid understanding of automation across monitoring, alerting, configuration management, and incident response. Exposure to log aggregation tools (e.g., Elastic Stack, Splunk) for troubleshooting and automation triggers. Experience building self-healing systems and integrating with event-based automation platforms. Familiarity with cloud environments (AWS, Azure, GCP) and integrating automation across hybrid infrastructure. Experience applying GenAI/AI-driven solutions to automate operations and support predictive monitoring. Strong analytical and root cause analysis skills for solving recurring issues via automation. Ability to work independently and collaborate effectively in cross-functional teams. Desired Skills And Qualifications Experience working in a Managed Services or enterprise support environment with a focus on automation maturity. Understanding of ITIL/ITSM processes and how automation can improve service quality and consistency. Exposure to containerized environments (e.g., Docker, Kubernetes) and automation of application deployments. Experience with observability platforms like Datadog, Prometheus, or AppDynamics is a plus. Strong communication and stakeholder engagement skills to align automation initiatives with business needs. Education Requirements Bachelor’s degree in Computer Science, IT, Engineering, or a related technical field. Certifications in RPA platforms, cloud technologies, or scripting/automation tools are a plus.
Posted 2 weeks ago
3.0 - 4.0 years
0 Lacs
Delhi, India
On-site
About The Role: We are seeking a skilled and motivated DevOps Engineer with 3-4+ years of hands-on experience in automation, CI/CD pipelines, container orchestration, and infrastructure monitoring. This role is ideal for professionals who thrive in fast-paced environments and are passionate about enabling scalable and secure DevOps practices. Key Responsibilities: Design, implement, and manage scalable CI/CD pipelines using tools like Jenkins or similar. Manage source control repositories (e.g., Git, SVN) and ensure versioning best practices are followed. Build, deploy, and maintain containerized applications using Docker and Kubernetes. Leverage container registry solutions such as Harbor, JFrog, or Quay for secure image storage. Implement and maintain configuration management using tools like Ansible or Chef. Optimize Kubernetes clusters with best practices in networking, security, and resource management. Set up and monitor system performance using tools like Datadog, Prometheus, Nagios, ELK, or other open-source monitoring solutions. Collaborate closely with development and infrastructure teams to automate and streamline operations and processes. Required Skills & Qualifications: Strong hands-on experience with Kubernetes and container orchestration. Proficiency in CI/CD tools (e.g., Jenkins, GitLab CI) and version control systems (e.g., Git, SVN). Expertise in configuration management tools (e.g., Ansible, Chef). Familiarity with container registries such as Harbor, JFrog Artifactory, or Quay. Solid understanding of Kubernetes networking and security best practices. Experience with system monitoring and log management tools (e.g., Prometheus, Nagios, ELK, Datadog). Strong troubleshooting and problem-solving skills. Preferred Qualifications: Relevant certifications such as Certified Kubernetes Administrator (CKA), Red Hat Certified Engineer (RHCE), or VMware certifications. Exposure to Infrastructure as Code (IaC) tools like Terraform is an added advantage.
Knowledge of scripting languages like Bash, Python, or Shell scripting. (ref:hirist.tech)
Posted 2 weeks ago
8.0 years
0 Lacs
Pune, Maharashtra, India
On-site
Job Title: Senior Performance Engineer. Experience: 8+ years. Location: Pune. Key Responsibilities: Design and develop automated suites/tools, identify performance bottlenecks in design and implementation, and be involved in deployment, troubleshooting/analysis, and preparing performance engineering reports. Work within the Agile framework, executing performance tasks along with the sprint team. Interact and coordinate between different stakeholders in Engineering/Architect/PO/Doc functions. Prioritize the tasks and communicate them to all stakeholders. Adopt, enable, and ensure timely decision-making across the stakeholders. Desired Skills & Experience: Professional degree (Bachelor's / Master's) in engineering with a consistent academic record. Professional hands-on experience of 5 to 8 years in Performance Engineering activities and developing performance tools. Must Have Skills: Good knowledge of Java, JMeter, and Python. Good knowledge of databases and DB queries in PostgreSQL and MongoDB. Good knowledge of performance monitoring tools like New Relic, Datadog, and Grafana. Good knowledge of performance profiling tools like YourKit, JProfiler, and VisualVM. Exposure to virtualization, Kubernetes (K8s), AWS, Azure and Unix flavors. Exposure to microservices benchmarking and horizontal scalability. Confident in complex determinate analysis, managing trade-offs between technical benefits, risk and efficiency. Very strong communicator and critical thinker, natural at articulating complex technical topics and ideas to technical and non-technical stakeholders. Demonstrated success in providing clarity and delivering the sprint goals during ambiguities. Nice To Have Skills: Exposure to working with auto-scaling and redundant technologies is a huge plus. Familiarity with message queues like Kafka, large storage systems like S3, and NoSQL DBs is a plus. (ref:hirist.tech)
Posted 2 weeks ago
10.0 years
0 Lacs
Pune, Maharashtra, India
On-site
Technical Product Manager. As a Technical Product Manager (TPM) for our internal Observability & Insights Platform, you will be responsible for defining the product strategy, owning discovery and delivery, and ensuring our engineers and stakeholders across 350+ services can build, debug, and operate confidently. You will own and evolve a platform that includes logging (ELK stack), metrics (Prometheus, Grafana, Thanos), tracing (Jaeger), structured audit logs, and SIEM integrations, while competing with high-cost solutions like Datadog and Honeycomb. Your impact will be both technical and strategic: improving developer experience, reducing operational noise, and driving platform efficiency and cost visibility. Key Deliverables (Quarterly Outcomes): Successfully manage and deliver initiatives from the Observability Roadmap / Job Jar, tracked via RAG status and Jira epics. Complete structured discoveries for upcoming capabilities (e.g., SIEM exporter, SDK adoption, trace sampling). Design and roll out scorecards (in Port) to measure observability maturity across teams. Ensure feature parity and stakeholder migration in cost-saving initiatives (e.g., Datadog to Prometheus). Track and report platform usage, reliability, and cost metrics aligned to business outcomes. Drive feature documentation, adoption plans, and enablement sessions across engineering. Jobs To Be Done: Define and evolve the observability product roadmap (Logs, Metrics, Traces, SDK, Dashboards, SIEM). Lead dual-track agile product discovery for upcoming initiatives: gather context, define the problem, validate feasibility. Partner with engineering managers to break down initiatives into quarterly deliverables, epics, and sprint-level execution. Maintain the Observability Job Jar and present RAG status every 2 weeks with confidence backed by Jira hygiene. Define and track metrics to measure the success of every platform capability (SLOs, cost savings, adoption %, etc.).
Work closely with FinOps, Security, and Platform teams to ensure observability aligns with cost, compliance, and operational goals. Champion the adoption of SDKs, scorecards, and dashboards via enablement, documentation, and evangelism. Ways Of Working: Work in dual-track agile: discover next quarter's priorities while delivering this quarter's committed outcomes. Maintain a GPS PRD (Product Requirements Doc) for each major initiative: What problem are we solving? Why now? How do we measure value? Collaborate deeply with engineers in backlog grooming, planning, demos, and retrospectives. Follow RAG-based reporting with stakeholders: escalate risks early, present mitigation paths clearly. Operate with full visibility in Jira (Initiative > Epics > Stories > Subtasks), driving delivery rhythm across sprints. Use quarterly Job Jar reviews to recalibrate product priorities, staffing needs, and stakeholder alignment. You Should Have: 10+ years of product management experience, ideally in platform/infrastructure products. Proven success managing internal developer platforms or observability tooling. Experience launching or migrating enterprise-scale telemetry stacks (e.g., Datadog to Prometheus/Grafana, Honeycomb to Jaeger). Ability to break down complex engineering requirements into structured product plans with measurable outcomes. Strong technical grounding in cloud-native environments (EKS, Kafka, Elasticsearch, etc.). Excellent documentation and storytelling skills, especially to influence engineers and non-technical stakeholders. Success Metrics: Reduction in Datadog/Honeycomb usage and cost post migration. Uptime and latency of observability pipelines (Jaeger, ELK, Prometheus). Scorecard improvement across teams (Bronze to Silver to Gold). Number of issues detected/resolved using the new observability stack. Time to incident triage with new tracing/logging capabilities. (ref:hirist.tech)
Posted 2 weeks ago
Accenture
39817 Jobs | Dublin
Wipro
19388 Jobs | Bengaluru
Accenture in India
15458 Jobs | Dublin 2
EY
14907 Jobs | London
Uplers
11185 Jobs | Ahmedabad
Amazon
10459 Jobs | Seattle,WA
IBM
9256 Jobs | Armonk
Oracle
9226 Jobs | Redwood City
Accenture services Pvt Ltd
7971 Jobs |
Capgemini
7704 Jobs | Paris,France