Jobs
Interviews

1633 Grafana Jobs - Page 42

Set up a Job Alert
JobPe aggregates job listings for easy discovery, but applications are submitted directly on each employer's job portal.

7.0 - 10.0 years

20 - 35 Lacs

Chennai, Bengaluru

Work from Office

Role Overview: Zolvit is seeking a Node.js Backend Lead Engineer to lead our engineering efforts in building scalable systems with a microservices architecture. The ideal candidate will have 7+ years of experience in backend development, platformization expertise, and the ability to mentor junior engineers. You will play a key role in driving architectural decisions, ensuring system scalability, and fostering a strong engineering culture.

Responsibilities: Design and implement scalable backend systems using Node.js and microservices architecture. Lead the development of platform components to enable efficient code reuse, modularity, and scalability. Collaborate with stakeholders to define system architecture and the technical roadmap. Design and build solutions using event-driven architecture and middleware such as Kafka. Develop and maintain SQL and NoSQL databases, ensuring optimal performance and scalability. Define and implement high-level and low-level designs, documenting key decisions and ensuring junior engineers understand the architecture. Mentor junior engineers, conduct code reviews, and promote best practices in coding, design, and system architecture. Lead technical discussions, participate in hiring processes, and contribute to building a high-performance engineering team. Implement and maintain CI/CD pipelines to ensure seamless integration and deployment. Leverage AWS services for scalable infrastructure and deployment solutions.

Requirements: 7+ years of hands-on experience in building scalable backend systems using Node.js. Strong understanding of microservices architecture, event-driven systems, and middleware like Kafka. Experience in building platform solutions with a focus on reusability and modularity. Proficient in SQL and NoSQL databases with a clear understanding of their tradeoffs. Solid knowledge of high-level and low-level system design concepts. Proven experience in mentoring engineers, conducting code reviews, and driving engineering excellence. Experience working with CI/CD pipelines and modern DevOps practices. Proficient in leveraging AWS services for building scalable infrastructure. Strong problem-solving skills, effective communication, and ability to thrive in a fast-paced environment.

What We Offer: Opportunity to lead technical initiatives and shape the platform architecture. Work on cutting-edge technologies with a team that values innovation and engineering excellence. A collaborative environment where mentorship and growth are highly encouraged. Competitive compensation and growth opportunities aligned with your contributions.
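The listing above centers on event-driven architecture with middleware such as Kafka. As a rough, broker-free illustration of the pattern (not part of the posting; `EventBus`, `drain`, and the `orders` topic are invented names), the sketch below models producers publishing to topics and subscribers consuming events in publish order:

```python
from collections import defaultdict, deque

class EventBus:
    """Minimal in-memory stand-in for a broker such as Kafka:
    producers append events to a topic, subscribers receive them in order."""

    def __init__(self):
        self._topics = defaultdict(deque)
        self._subscribers = defaultdict(list)

    def publish(self, topic, event):
        self._topics[topic].append(event)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def drain(self, topic):
        """Deliver all pending events on a topic to every subscriber."""
        while self._topics[topic]:
            event = self._topics[topic].popleft()
            for handler in self._subscribers[topic]:
                handler(event)

bus = EventBus()
received = []
bus.subscribe("orders", received.append)
bus.publish("orders", {"id": 1, "status": "created"})
bus.publish("orders", {"id": 2, "status": "created"})
bus.drain("orders")
# received now holds both events, in publish order
```

A real Kafka deployment adds partitioning, consumer groups, and durable offsets on top of this same publish/subscribe core.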

Posted 1 month ago

Apply

2.0 - 5.0 years

10 - 20 Lacs

Bengaluru

Work from Office

Experience: 2+ years. Expected Notice Period: 15 Days. Shift: (GMT+05:30) Asia/Kolkata (IST). Must-have skills: Bash, Dynatrace, ELK, Grafana, Prometheus, Terraform, AWS, Kubernetes, Linux, Python.

Job Overview: We are looking for a Site Reliability Engineer (SRE) with 2.5 to 5 years of experience to join our team. The ideal candidate will be responsible for ensuring the availability, scalability, and reliability of our distributed systems, improving observability, automating infrastructure, and enhancing system performance. This role provides an opportunity to work on high-scale, mission-critical environments and contribute to building a resilient infrastructure.

Key Responsibilities: Improve observability by implementing and managing monitoring, logging, and alerting solutions using Prometheus, the ELK stack, and Grafana. Work with APMs like Dynatrace and New Relic to monitor performance metrics and define SLIs, SLOs, and error budgets. Participate in incident management, including on-call rotation and Root Cause Analysis (RCA). Automate infrastructure provisioning using Terraform and Infrastructure as Code (IaC) principles. Ensure system scalability, reliability, and performance in a distributed environment. Strengthen security by applying cybersecurity best practices, vulnerability assessments, and compliance policies. Collaborate with cross-functional teams to establish SRE best practices, improve release pipelines, and minimize deployment risks. Maintain and improve disaster recovery plans to enhance resilience. Manage and optimize workflows using Apache Airflow to ensure efficient scheduling and execution of data pipelines. Support Snowflake data operations, ensuring high availability, performance optimization, and security compliance.

Qualifications & Certifications: Education: Bachelor's degree in Computer Science, Engineering, or related fields. Experience: 2.5 to 5 years of experience in Site Reliability Engineering, Observability, or Performance Monitoring. Hands-on experience in: monitoring and observability using Prometheus, ELK, and Grafana; Application Performance Monitoring (APM) tools like Dynatrace, New Relic, or Datadog; incident response and on-call rotation management; infrastructure automation using Terraform; distributed systems operations and scaling; load testing and performance analysis using tools like JMeter, k6, or Locust; security at scale, including vulnerability scanning and compliance automation; workflow automation and orchestration using Apache Airflow; Snowflake, including query optimization, data management, and security controls.

Technical Skills: Strong knowledge of cloud platforms (AWS preferred). Experience troubleshooting distributed systems and high-traffic environments. Hands-on knowledge of Linux, networking, and security fundamentals. Familiarity with container orchestration (Kubernetes, Docker). Ability to write automation scripts using Python, Bash, or Go.

Preferred Certifications: AWS Certified DevOps Engineer Professional (or equivalent AWS certification). HashiCorp Certified: Terraform Associate. Certified Kubernetes Administrator (CKA). Google SRE Professional Certificate (preferred but not mandatory).

Skills: Bash, Dynatrace, ELK, Grafana, Prometheus, Terraform, AWS, Kubernetes, Linux, Python.
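The SRE listing above asks for SLIs, SLOs, and error budgets. As an illustrative aside (not from the posting), the arithmetic behind an error budget is small enough to sketch: a 99.9% availability SLO over a 30-day window allows roughly 43.2 minutes of downtime, and the budget is "spent" as downtime accumulates. Function names here are my own:

```python
def error_budget(slo_target: float, window_minutes: int) -> float:
    """Allowed downtime (in minutes) for a given SLO over a window.
    A 99.9% SLO over 30 days allows ~43.2 minutes."""
    return window_minutes * (1.0 - slo_target)

def budget_remaining(slo_target: float, window_minutes: int,
                     observed_downtime_minutes: float) -> float:
    """Fraction of the error budget still unspent (negative = budget blown)."""
    budget = error_budget(slo_target, window_minutes)
    return (budget - observed_downtime_minutes) / budget

thirty_days = 30 * 24 * 60  # 43200 minutes
print(round(error_budget(0.999, thirty_days), 1))            # 43.2
print(round(budget_remaining(0.999, thirty_days, 10.8), 2))  # 0.75
```

Teams typically freeze risky releases once the remaining fraction approaches zero, which is what makes the number actionable.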

Posted 1 month ago

Apply

8.0 - 12.0 years

13 - 17 Lacs

Chennai, Bengaluru

Work from Office

Job Description: Lead the team in developing platforms based on GCP, Kubernetes, and Grafana. Work with System Engineers, Developers, Architects, and DevOps engineers to design and implement automated solutions with CI/CD pipelines for infrastructure and software build, deployment, and configuration. Design, deploy, and support cloud networking, including cross-cloud and site-to-cloud connections. Ensure Digital Platform service level expectations are clearly defined and measurable through Service Level Objectives (SLOs) and Service Level Indicators (SLIs). Ensure that instances deployed to cloud platforms meet patching, monitoring, and DR/backup standards. Leverage scripting and IaC languages, such as Python and Terraform, to build automated solutions and integrations on an ad-hoc basis. Maintain code and scripting Git version-control repositories and related systems supporting initiatives around infrastructure as code. Identify, troubleshoot, and resolve problems with the build, deployment, and continuous integration process. Ensure security is integrated into all architecture solutions to meet compliance standards. Design, build, and test cloud architecture to ensure large amounts of data can be transferred and stored efficiently. Create standard operating procedures and policy documentation while adhering to industry standards and best practices. Provide guidance to cross-functional teams on using and accessing data on the cloud, especially during major transfers, updates, or changes. Work in tandem with our engineering team to identify and implement the most optimal cloud-based solutions. Ensure application performance, uptime, and scale, maintaining high standards for code quality and thoughtful design. Manage cloud environments following company security guidelines. Leverage scripting to build required automation and tools. Interface between application development, support, and infrastructure support teams. Create knowledge-sharing presentations and documentation to help developers and operations teams understand and leverage the system's capabilities. Lead in the design, development, debugging, and maintenance of software components using a combination of technologies. Experience with evaluating, designing, developing, and building PoCs and architecting infrastructure solutions. Deep understanding of networking, storage technologies, and troubleshooting. Serve as the cloud architecture subject-matter expert for the cloud practice within Mindsprint.

Profile Description: Bachelor's degree (or equivalent); Master's degree preferred. GCP certification. 10+ years of experience in the IT industry with good experience in cloud engineering. Proficiency in one or more of Python, Perl, Ruby, Bash, or Java. Experience using GCP resource technologies to develop and deploy enterprise solutions. Experience managing scaled cloud systems with a focus on operational excellence. Experience working with high-availability, distributed systems and services in a hosting environment, including hardware, OS, storage, network, and database solutions. Experience with DNS, DHCP, SSH, HTTP, TCP/IP, and other common network protocols. Experience with system analysis and troubleshooting in large-scale Linux environments. Experience with database technology (both relational and NoSQL). Working knowledge of agile development methods. Working knowledge of data structures/algorithms. Deep understanding of cloud computing technologies, applications, and trends. Knowledge of cloud infrastructure, software applications, and design. Strong cloud migration and data management skills with an emphasis on data privacy and security. Excellent problem-solving capabilities; able to thrive in a fast-paced work environment. Strong communication skills with the willingness to collaborate with cross-functional departments and teams.

We are Mindsprint! A leading-edge technology and business services firm that provides impact-driven solutions to businesses, enabling them to outpace the speed of change. For over three decades we have been accelerating technology transformation for the Olam Group and their large base of global clients. Working with leading technologies and empowered with the freedom to create new solutions and better existing ones, we have been inspiring businesses with pioneering initiatives.

Awards bagged in recent years: Great Place to Work certified 2023; Best Shared Services in India Award by Shared Services Forum 2019; Asia's No. 1 Shared Services in Process Improvement and Value Creation by Shared Services and Outsourcing Network Forum 2019; International Innovation Award for Best Services and Solutions 2019; Kincentric Best Employer India 2020; Creative Talent Management Impact Award, SSON Impact Awards 2021; The Economic Times Best Workplaces for Women 2021 & 2022; #SSFExcellenceAward for Delivering Business Impact through Innovative People Practices 2022.
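The listing above repeatedly mentions ad-hoc automation against cloud APIs. One pattern nearly every such script needs is retrying transient failures with exponential backoff; the sketch below is an illustrative, generic version (my own function names, not anything from the posting or a specific SDK), with the sleep function injectable so it can be exercised without real delays:

```python
import time

def retry(fn, attempts=4, base_delay=0.01, backoff=2.0, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff.
    `sleep` is injectable so callers/tests can avoid real waiting."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= backoff

calls = {"n": 0}
def flaky():
    """Simulated cloud API call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient API error")
    return "ok"

print(retry(flaky, sleep=lambda _: None))  # ok
```

Production SDKs (e.g., Google Cloud client libraries) ship their own retry policies; a hand-rolled helper like this is mainly useful in glue scripts.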

Posted 1 month ago

Apply

6.0 - 10.0 years

8 - 12 Lacs

Mumbai

Work from Office

We are looking for an experienced DevOps Engineer (Level 2 & 3) to design, automate, and optimize cloud infrastructure. You will play a key role in CI/CD automation, cloud management, observability, and security, ensuring scalable and reliable systems.

Key Responsibilities: Design and manage AWS environments using Terraform/Ansible. Build and optimize deployment pipelines (Jenkins, ArgoCD, AWS CodePipeline). Deploy and maintain EKS and ECS clusters. Implement OpenTelemetry, Prometheus, and Grafana for logs, metrics, and tracing. Manage and scale cloud-native microservices efficiently.

Required Skills: Proven experience in DevOps, system administration, or software development. Strong knowledge of AWS. Programming languages: Python, Go, and Bash are good to have. Experience with IaC tools like Terraform and Ansible. Solid understanding of CI/CD tools (Jenkins, ArgoCD, AWS CodePipeline). Experience with containers and orchestration tools like Kubernetes (EKS). Understanding of the OpenTelemetry observability stack (logs, metrics, traces).

Good to Have: Experience with container orchestration platforms (e.g., EKS, ECS). Familiarity with serverless architecture and tools (e.g., AWS Lambda). Experience using monitoring tools like Datadog/New Relic, CloudWatch, Prometheus/Grafana. Experience managing 20+ cloud-native microservices. Prior experience working in a startup.

Education & Experience: Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent work experience). Relevant experience in DevOps or a similar role.

About Kissht: Kissht, a Great Place to Work certified organization, is a consumer-first credit app that is transforming the landscape of consumer credit. As one of the fastest-growing and most respected FinTech companies, Kissht is a pioneer in data- and machine-based lending. With over 15 million customers, including 40% from tier-2 cities and beyond, we offer both short- and long-term loans for personal consumption, business needs, and recurring expenses. Founded by Ranvir and Krishnan, alumni of IIT and IIM, and backed by renowned investors like Endiya Partners, the Brunei Investment Authority, and the Singapore Government, Kissht is synonymous with excellence in the industry. Join us and be a part of a dynamic, innovative company that is changing the future of financial technology.
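The listing above asks for an observability stack covering logs, metrics, and traces. As an illustrative aside (not from the posting), the core of latency monitoring is percentile math over samples, which Prometheus histograms approximate with buckets; the nearest-rank version is simple enough to show exactly:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of
    samples at or below it. Assumes a non-empty sample list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds; two slow outliers.
latencies_ms = [12, 15, 11, 240, 14, 13, 16, 12, 18, 500]
print(percentile(latencies_ms, 50))  # 14
print(percentile(latencies_ms, 95))  # 500
```

The p50/p95 gap here is why SLOs are usually stated against high percentiles rather than averages: the mean (~85 ms) describes no actual request.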

Posted 1 month ago

Apply

8.0 - 12.0 years

35 - 50 Lacs

Bengaluru

Work from Office

We are looking for an experienced DevOps Architect to lead the design, development, and maintenance of scalable DevOps solutions. The ideal candidate will have deep expertise in cloud platforms, CI/CD pipelines, automation frameworks, and infrastructure-as-code (IaC) principles. You will work closely with development, IT operations, and security teams to streamline processes and enhance the reliability and scalability of our applications.

Key Responsibilities: Architect and Implement: Design and implement scalable, secure, and high-performance DevOps pipelines. Infrastructure as Code (IaC): Manage infrastructure using tools like Terraform, CloudFormation, or Ansible. CI/CD Management: Build and maintain robust CI/CD pipelines for automated testing, integration, and deployment. Cloud Management: Architect solutions on cloud platforms such as AWS, Azure, or GCP. Monitoring and Logging: Set up monitoring tools (Prometheus, Grafana, ELK Stack) to ensure application reliability and performance. Security and Compliance: Implement security best practices in CI/CD pipelines, infrastructure, and cloud environments. Collaboration: Work closely with development and operations teams to automate and optimize their workflows. Documentation: Maintain clear documentation of architecture, configurations, and processes.

Skills and Qualifications: Technical Skills: Proficiency in cloud platforms: AWS, Azure, or GCP. Expertise in CI/CD pipelines: Jenkins, GitLab CI, CircleCI, or similar. Hands-on experience with Infrastructure as Code (IaC): Terraform, CloudFormation, Ansible. Strong scripting skills: Bash, Python, or PowerShell. Knowledge of containerization and orchestration: Docker, Kubernetes. Familiarity with monitoring and logging tools: Prometheus, Grafana, ELK Stack. Solid understanding of networking, security, and performance optimization. Soft Skills: Excellent problem-solving and troubleshooting abilities. Strong communication and collaboration skills. Ability to work in an agile environment and adapt to change.
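The listing above is built around CI/CD pipelines for automated testing, integration, and deployment. As a tool-agnostic illustration (not from the posting; stage names are invented), the essential behaviour of such a pipeline is an ordered run of stages with fail-fast semantics:

```python
def run_pipeline(stages):
    """Run named stages in order, stopping at the first failure:
    the fail-fast behaviour most CI/CD systems default to."""
    results = []
    for name, step in stages:
        try:
            step()
            results.append((name, "passed"))
        except Exception as exc:
            results.append((name, f"failed: {exc}"))
            break  # later stages never run
    return results

def build():
    pass  # stand-in for compiling/packaging

def unit_tests():
    raise RuntimeError("2 unit tests failed")  # simulated failing stage

def deploy():
    pass  # stand-in for shipping artifacts

report = run_pipeline([("build", build), ("test", unit_tests), ("deploy", deploy)])
for name, status in report:
    print(name, status)
# build passes, test fails, deploy is never attempted
```

Real systems (Jenkins, GitLab CI) layer parallelism, caching, and retries on top, but the gate-before-deploy ordering is the part that protects production.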

Posted 1 month ago

Apply

2.0 - 4.0 years

4 - 9 Lacs

Bengaluru

Work from Office

Skills Required: Technical areas (hands-on experience in academic projects/internships). Experience with Kubernetes, Jenkins, GitLab, GitHub, CI/CD, Terraform, Linux, Bash, Python, AWS, GCP, GKE, and EKS. Understanding of public/private/hybrid cloud solutions. Own responsibility for platform management, supporting services, and all related tooling and automation. Proficient in cloud-native technologies, automation, and containerization. Experience in setting up and managing cloud infrastructure and services for a wide range of applications. Some experience in ReactJS/NodeJS, PHP, Python, and UNIX shell, so a background in system-oriented languages is important. Managing and deploying cloud-native applications on Kubernetes clusters; setting up CI/CD pipelines (Jenkins, GitLab, GitHub); database migration (MySQL, PostgreSQL, Cassandra); setting up monitoring (Grafana, Loki, Prometheus, Mimir, ELK Stack). Certified in Kubernetes and Jenkins. Experienced in using Terraform to automate infrastructure provisioning. We are looking for bright, passionate, and dedicated people with clearly demonstrated initiative and a history of success in their past positions to join our growing team.

Posted 1 month ago

Apply

6.0 - 10.0 years

15 - 25 Lacs

Noida, Gurugram, Delhi

Work from Office

Mandatory Skills (Docker and Kubernetes): Good understanding of the various components of a Kubernetes cluster. Hands-on experience provisioning Kubernetes clusters. Expertise in managing and upgrading Kubernetes clusters / the Red Hat OpenShift platform. Good experience with container storage. Good experience with CI/CD workflows (preferably Azure DevOps, Ansible, and Jenkins). Hands-on experience with Linux operating system administration. Understanding of cloud infrastructure, preferably VMware Cloud. Good understanding of application lifecycle management on container platforms. Basic understanding of cloud networks and container networks. Good understanding of Helm and Helm charts. Good at performance optimization of container platforms. Good understanding of container monitoring tools like Prometheus, Grafana, and ELK. Able to handle Severity 1 and Severity 2 incidents. Good communication skills. Able to provide support. Analytical and problem-solving capabilities; ability to work with teams. Experience with a 24x7 operations support framework. Knowledge of the ITIL process.

Preferred Skills/Knowledge: Container platforms - Docker, CRI-O, Kubernetes, and OpenShift. Automation platforms - shell scripts, Ansible, Jenkins. Cloud platforms - GCP/Azure/OpenStack. Operating systems - Linux/CentOS/Ubuntu. Container storage and backup.
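The listing above covers application lifecycle management on Kubernetes. One concrete piece of that is rolling-update math: to my knowledge, Kubernetes rounds a percentage `maxSurge` up and `maxUnavailable` down when converting them to pod counts. The helper below (my own names, illustrative only) makes that arithmetic explicit:

```python
import math

def rolling_update_bounds(replicas, max_surge_pct, max_unavailable_pct):
    """Pod-count limits during a Deployment rolling update.
    Percentage maxSurge rounds up; maxUnavailable rounds down."""
    surge = math.ceil(replicas * max_surge_pct / 100)
    unavailable = math.floor(replicas * max_unavailable_pct / 100)
    return {
        "max_pods": replicas + surge,          # old + new pods at peak
        "min_ready_pods": replicas - unavailable,  # capacity floor
    }

# With the common 25%/25% defaults and 10 replicas:
print(rolling_update_bounds(10, 25, 25))
# {'max_pods': 13, 'min_ready_pods': 8}
```

Note the small-cluster case: with 3 replicas and 25%/25%, `maxUnavailable` floors to 0, so the rollout can only proceed by surging, which is exactly why the opposing rounding directions exist.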

Posted 1 month ago

Apply

0.0 - 5.0 years

1 - 6 Lacs

Ahmedabad

Work from Office

Design and implement CI/CD pipelines for automated deployments. Manage infrastructure using tools like Docker and Kubernetes. Monitor and troubleshoot system performance. Collaborate with developers. Ensure the scalability, security, and reliability of systems.

Required Candidate Profile: Proficiency in CI/CD tools (Jenkins, GitLab). Experience with cloud platforms (AWS, Azure, GCP). Knowledge of Docker, Kubernetes, and infrastructure automation tools (Terraform, Ansible). Scripting skills.

Posted 1 month ago

Apply

6.0 - 10.0 years

0 - 0 Lacs

Bengaluru

Work from Office

What You'll Do: Take complex engineering problems, design appropriate solutions, and deliver on them fairly independently. Work on a variety of technologies - from system implementations, to software and tools built in house, to application systems delivering acceleration as a service. Architect and implement robust, well-tested services. Provide technical design and code reviews for peers within your team. Provide insights into opportunity areas for the platform, influencing priorities and team roadmaps in close partnership with Engineering and Product leadership. Be a multiplier: mentor other engineers on the team, help them become more productive, and implement engineering best practices. Promote a culture of engineering excellence, up-leveling the technical expertise of engineers across the Engineering Effectiveness org.

What You'll Need: 6+ years of experience in software engineering and designing systems at scale. Experience in development of new applications using technologies such as Java, Python, or C#, plus SQL. Experience with cloud-native architecture on one of the big three providers (GCP, Azure, AWS). Experience with Continuous Integration (CI/CD) practices and tools (Buildkite, Jenkins, etc.). Experience leveraging monitoring and logging technologies (e.g., Datadog, Elasticsearch, InfluxDB). Track record of being a hands-on developer efficiently building technically sound systems. Strong verbal and written communication skills. Ability to work effectively with engineers, product managers, and business stakeholders alike. Experience mentoring engineers and leading code reviews. Proficiency in effective troubleshooting and issue-resolution techniques.

Posted 1 month ago

Apply

7.0 - 11.0 years

4 - 7 Lacs

Bengaluru

Work from Office

Skill required: Delivery - Marketing Analytics and Reporting. Designation: I&F Decision Sci Practitioner Specialist. Qualifications: Any Graduation. Years of Experience: 7 to 11 years.

About Accenture: Combining unmatched experience and specialized skills across more than 40 industries, we offer Strategy and Consulting, Technology and Operations services, and Accenture Song, all powered by the world's largest network of Advanced Technology and Intelligent Operations centers. Our 699,000 people deliver on the promise of technology and human ingenuity every day, serving clients in more than 120 countries. Visit us at www.accenture.com

What would you do: Data & AI - analytical processes and technologies applied to marketing-related data to help businesses understand and deliver relevant experiences for their audiences, understand their competition, measure and optimize marketing campaigns, and optimize their return on investment.

What are we looking for: Python (programming language), Structured Query Language (SQL), machine learning, data science. Written and verbal communication; ability to manage multiple stakeholders; strong analytical skills; detail orientation. Expertise in AWS, Azure, or Google Cloud for ML workflows. Hands-on experience with Kubernetes, Docker, Jenkins, or GitLab CI/CD. Familiarity with MLflow, TFX, Kubeflow, or SageMaker. Knowledge of Prometheus, Grafana, or similar tools for tracking system health and model performance. Understanding of ETL processes, data pipelines, and big data tools like Spark or Kafka. Proficiency in Git and model-versioning best practices.

Roles and Responsibilities: In this role you are required to analyze and solve moderately complex problems. You may create new solutions, leveraging and, where needed, adapting existing methods and procedures. The role requires understanding the strategic direction set by senior management as it relates to team goals. Primary upward interaction is with your direct supervisor; you may interact with peers and/or management levels at a client and/or within Accenture. Guidance is provided when determining methods and procedures on new assignments. Your decisions will often impact the team in which you reside. You may manage small teams and/or work efforts (if in an individual contributor role) at a client or within Accenture. Work closely with data scientists, engineers, and DevOps teams to operationalize ML. Optimize ML pipelines for performance, cost, and scalability in production. Automate deployment pipelines for ML models, ensuring fast and reliable transitions from development to production environments. Set up and manage scalable cloud or on-premise environments for ML workflows.

Qualification: Any Graduation.
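The listing above asks for model-versioning best practices and reliable promotion of models from development to production (what tools like MLflow provide). As an illustrative toy (not MLflow's actual API; all names are mine), a registry reduces to three operations: register a version, promote it, and roll back:

```python
class ModelRegistry:
    """Toy model registry: register versions with metrics, promote one to
    production, and roll back to the previously promoted version."""

    def __init__(self):
        self._versions = {}   # version -> metrics
        self._history = []    # production versions, most recent last

    def register(self, version, metrics):
        self._versions[version] = metrics

    def promote(self, version):
        if version not in self._versions:
            raise KeyError(f"unknown version: {version}")
        self._history.append(version)

    @property
    def production(self):
        return self._history[-1] if self._history else None

    def rollback(self):
        if len(self._history) < 2:
            raise RuntimeError("no earlier production version to roll back to")
        self._history.pop()
        return self.production

reg = ModelRegistry()
reg.register("v1", {"auc": 0.81})
reg.register("v2", {"auc": 0.84})
reg.promote("v1")
reg.promote("v2")
print(reg.production)   # v2
print(reg.rollback())   # v1
```

Keeping promotion history (rather than a single pointer) is what makes rollback a one-step, auditable operation.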

Posted 1 month ago

Apply

4.0 - 9.0 years

25 - 40 Lacs

Bengaluru

Hybrid

Key Responsibilities: Design, develop, and maintain scalable microservices using Python, Go, or Node.js. Build and maintain RESTful APIs to support web and mobile applications. Develop event-driven and asynchronous systems using Apache Kafka. Deploy and manage services using Kubernetes in AWS cloud environments. Work with SQL and NoSQL databases to store and retrieve data efficiently. Write clean, maintainable, and well-tested code following Test-Driven Development (TDD) practices. Ensure systems are fault-tolerant, scalable, and performant under load. Collaborate cross-functionally with frontend engineers, DevOps, and product teams. Participate in code reviews, architecture discussions, and team ceremonies. Continuously improve system design and development workflows.

Required Skills & Qualifications: 4-7 years of professional backend development experience. Proficiency in Python, Go, or Node.js (at least one language required). Strong understanding and hands-on experience with microservices architecture. Experience with Kafka or other messaging systems (e.g., RabbitMQ). Solid understanding of both SQL (e.g., PostgreSQL, MySQL) and NoSQL (e.g., MongoDB, DynamoDB) databases. Working experience with AWS services (e.g., EC2, S3, RDS, Lambda). Hands-on experience with Kubernetes and containerized application deployment. Proven experience in writing and maintaining RESTful APIs. Commitment to Test-Driven Development (TDD) and clean coding practices. Strong debugging, problem-solving, and analytical skills. Ability to thrive in a fast-paced, dynamic startup environment. Familiarity with observability and monitoring tools (e.g., Prometheus, Grafana, ELK Stack).

Nice to Have: Exposure to gRPC APIs. Familiarity with Large Language Models (LLMs) and their integration into applications. Experience with voice technologies such as speech recognition, text-to-speech (TTS), and conversational AI. Understanding of real-time streaming and event-driven systems.
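The listing above emphasizes Test-Driven Development. As an illustrative example of the workflow (entirely my own; the posting names no code), the tests below would be written first, fail against an empty implementation, and drive the function into existence:

```python
import unittest

def mask_pii(text: str, visible: int = 4) -> str:
    """Mask all but the last `visible` characters of an identifier."""
    if len(text) <= visible:
        return "*" * len(text)
    return "*" * (len(text) - visible) + text[-visible:]

class MaskPiiTest(unittest.TestCase):
    # In TDD these assertions exist before mask_pii does; they fail
    # first, and the implementation above is written to satisfy them.
    def test_masks_long_values(self):
        self.assertEqual(mask_pii("4111111111111111"), "************1111")

    def test_short_values_fully_masked(self):
        self.assertEqual(mask_pii("123"), "***")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(MaskPiiTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```

The red-green-refactor loop scales from a helper like this up to service endpoints; the listing's "well-tested code" requirement is this loop applied consistently.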

Posted 1 month ago

Apply

5.0 - 10.0 years

10 - 20 Lacs

Gurugram

Work from Office

Experience: 5+ years. Expected Notice Period: 30 Days. Shift: (GMT+05:30) Asia/Kolkata (IST). Opportunity Type: Office (Gurugram). Placement Type: Full-Time Permanent position. Must-have skills: Bash, Grafana, serverless architectures, Ansible, CI/CD tools, Cloud, IaC, Docker, Kubernetes, Python.

We are seeking a highly experienced DevOps Engineer to join our engineering team. In this role, you will design and manage robust CI/CD pipelines, optimize cloud infrastructure, and ensure the scalability, reliability, and security of our systems. You will work closely with development teams to drive automation, improve deployment practices, and contribute to our evolving DevOps strategy.

Key Responsibilities: Design, implement, and maintain end-to-end CI/CD pipelines for automated testing, building, and deployment. Architect, manage, and optimize secure and scalable cloud infrastructure (AWS, Azure, or GCP). Automate infrastructure provisioning and configuration using IaC tools like Terraform or CloudFormation. Collaborate with development teams to troubleshoot production issues and implement performance improvements. Develop and maintain monitoring, logging, and alerting systems for high availability and rapid incident response. Implement and manage disaster recovery and business continuity plans. Enforce security best practices across the software development lifecycle. Drive continuous improvement of DevOps tools, practices, and workflows. Participate in on-call rotations for operational support and incident management.

Required Qualifications: Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience). 3+ years of proven experience in a DevOps or similar engineering role. Proficiency in at least one major cloud platform (AWS, Azure, or GCP). Experience with configuration management tools (Ansible, Puppet, Chef). Strong knowledge of CI/CD tools (Jenkins, GitLab CI, CircleCI, etc.) and version control systems (Git). Deep understanding of containerization (Docker) and orchestration (Kubernetes). Solid grasp of networking, security, and system administration. Hands-on experience with Infrastructure as Code (Terraform, CloudFormation). Excellent problem-solving, automation, and communication skills.

Preferred Qualifications: Experience with scripting languages such as Python or Bash. Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK stack). Exposure to serverless architectures. Experience with database administration and performance tuning. Prior experience in an eCommerce environment is highly preferred.

Posted 1 month ago

Apply

3.0 - 8.0 years

6 - 10 Lacs

Mumbai

Work from Office

Experience: 3+ years. Shift: (GMT+05:30) Asia/Kolkata (IST). Opportunity Type: Office (Mumbai). Placement Type: Full-Time Permanent Position. Must-have skills: Jenkins, Kubernetes, AWS.

Why Join Us? Be at the forefront of privacy AI, solving real-world compliance challenges using cutting-edge AI. Work on challenging AI problems in contract analysis, PII classification, and regulatory automation. Collaborate with top legal, compliance, and AI researchers. Competitive salary, flexible work culture, and a high-impact role in a fast-growing privacy tech company.

We are looking for Cloud Architects & DevOps professionals with 3+ years of experience to design, build, and manage IDfy's cloud-native digital fraud platform. We are the perfect match if you: Have an understanding of product development methodologies and microservices architecture. Have hands-on experience with at least one major cloud provider (AWS, GCP, or Azure); multi-cloud experience is a strong advantage. Have expertise in designing, implementing, and managing cloud architectures focusing on scalability, security, and resilience. Have an understanding of and experience with cloud fundamentals like networking, IAM, and compute, and managed services like DB, storage, GKE/EKS, and KMS. Have hands-on experience with cloud architecture design and setup. Have an in-depth understanding of Infrastructure as Code tools like Terraform or equivalent. Have practical experience deploying, maintaining, and scaling applications on Kubernetes clusters using Helm charts or Kustomize. Have hands-on experience with CI/CD tools like GitLab CI, Jenkins, and GitHub Actions, and GitOps tools like Argo CD and Flux. Have experience with monitoring and logging tools like Prometheus, Grafana, and Elastic Stack.

Posted 1 month ago

Apply

4.0 - 6.0 years

1 - 5 Lacs

Chandigarh, Bengaluru

Work from Office

Role & responsibilities: Vulnerability management (Nessus, EVMS, infra patching, code patching). Windows Server 2019 operating system. Cloudflare. Integration specialist. MSSQL administration, configuration, replication, optimization, and monitoring. Kibana, Grafana. DevSecOps methodology. Knowledge of Total Materia is an asset.

Posted 1 month ago

Apply

8.0 - 13.0 years

85 - 90 Lacs

Noida

Work from Office

About the Role: We are looking for a Staff Engineer, Real-time Data Processing, to design and develop highly scalable, low-latency data streaming platforms and processing engines. This role is ideal for engineers who enjoy building core systems and infrastructure that enable mission-critical analytics at scale. You'll work on solving some of the toughest data engineering challenges in healthcare.

A Day in the Life: Architect, build, and maintain a large-scale real-time data processing platform. Collaborate with data scientists, product managers, and engineering teams to define system architecture and design. Optimize systems for scalability, reliability, and low-latency performance. Implement robust monitoring, alerting, and failover mechanisms to ensure high availability. Evaluate and integrate open-source and third-party streaming frameworks. Contribute to the overall engineering strategy and promote best practices for stream and event processing. Mentor junior engineers and lead technical initiatives.

What You Need: 8+ years of experience in backend or data engineering roles, with a strong focus on building real-time systems or platforms. Hands-on experience with stream processing frameworks like Apache Flink, Apache Kafka Streams, or Apache Spark Streaming. Proficiency in Java, Scala, Python, or Go for building high-performance services. Strong understanding of distributed systems, event-driven architecture, and microservices. Experience with Kafka, Pulsar, or other distributed messaging systems. Working knowledge of containerization tools like Docker and orchestration tools like Kubernetes. Proficiency in observability tools such as Prometheus, Grafana, and OpenTelemetry. Experience with cloud-native architectures and services (AWS, GCP, or Azure). Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
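The listing above is about stream processing with frameworks like Flink or Kafka Streams. The most basic aggregation those frameworks provide is the tumbling (fixed-size, non-overlapping) time window; as an illustrative sketch (my own function, not any framework's API), the windowing itself is just integer division on event timestamps:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_ms):
    """Count events per fixed-size (tumbling) time window, keyed by the
    window's start timestamp. Events are (timestamp_ms, payload) pairs."""
    counts = defaultdict(int)
    for timestamp_ms, _payload in events:
        window_start = (timestamp_ms // window_ms) * window_ms
        counts[window_start] += 1
    return dict(counts)

# Hypothetical event stream: timestamps in ms since some epoch.
events = [(10, "a"), (950, "b"), (1010, "c"), (1990, "d"), (2000, "e")]
print(tumbling_window_counts(events, 1000))
# {0: 2, 1000: 2, 2000: 1}
```

What the real frameworks add on top of this arithmetic is the hard part: out-of-order events, watermarks deciding when a window can be emitted, and fault-tolerant state, which is where most of the listed distributed-systems experience is spent.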

Posted 1 month ago

Apply

10.0 - 15.0 years

20 - 35 Lacs

Gurugram

Work from Office

Key Responsibilities Design and implement scalable, secure, and highly available infrastructure solutions. Architect and maintain CI/CD pipelines for efficient, reliable software delivery. Drive adoption of DevOps tools, practices, and automation across engineering teams. Lead cloud infrastructure strategy (AWS/GCP/Azure), cost optimization, and security controls. Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or Pulumi. Ensure monitoring, alerting, and observability best practices (Prometheus, ELK, Datadog, etc.). Guide container orchestration using Docker, Kubernetes, EKS, or AKS. Collaborate with development, QA, and security teams to ensure high-quality delivery. Mentor and support a team of DevOps engineers and promote a DevSecOps culture. Participate in architecture discussions and contribute to system design decisions. Required Skills & Qualifications 10+ years of experience in DevOps, Site Reliability Engineering, or Infrastructure Engineering. Expertise in at least one cloud platform (AWS, Azure, or GCP). Strong hands-on experience with CI/CD tools (Jenkins, GitLab CI/CD, CircleCI, etc.). Proficiency with Docker, Kubernetes, Helm. Infrastructure as Code (Terraform, CloudFormation, Ansible). Solid scripting skills in Bash, Python, or Go. Deep understanding of networking, security, and cloud architecture patterns. Strong knowledge of monitoring/logging tools like Prometheus, Grafana, ELK, or Splunk. Experience with version control systems (Git, GitHub, Bitbucket). Excellent problem-solving and communication skills.
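Among the listed duties is observability best practice with Prometheus-style alerting. The essential behaviour of an alert rule's `for:` duration, firing only after the condition holds across several consecutive scrapes, can be sketched as follows (the threshold and sample values are hypothetical):

```python
def should_alert(samples, threshold, consecutive):
    """Fire only after `consecutive` successive samples exceed `threshold`,
    mimicking the 'for:' clause of a Prometheus alerting rule that
    suppresses one-off spikes."""
    streak = 0
    for value in samples:
        streak = streak + 1 if value > threshold else 0
        if streak >= consecutive:
            return True
    return False

cpu = [0.62, 0.91, 0.95, 0.97, 0.40]
should_alert(cpu, threshold=0.90, consecutive=3)  # True: three samples in a row above 0.90
```

The point of the streak reset is alert hygiene: a single noisy scrape never pages anyone.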

Posted 1 month ago

Apply

8.0 - 13.0 years

85 - 90 Lacs

Noida

Work from Office

About the Role We are seeking a highly skilled Staff Engineer to lead the architecture, development, and scaling of our Marketplace platform - including portals & core services such as Identity & Access Management (IAM), Audit, and Tenant Management services. This is a hands-on technical leadership role where you will drive engineering excellence, mentor teams, and ensure our platforms are secure, compliant, and built for scale. A Day in the Life Design and implement scalable, high-performance backend systems for all the platform capabilities Lead the development and integration of IAM, audit logging, and compliance frameworks, ensuring secure access, traceability, and regulatory adherence. Champion best practices for reliability, availability, and performance across all marketplace and core service components. Mentor engineers, conduct code/design reviews, and establish engineering standards and best practices. Work closely with product, security, compliance, and platform teams to translate business and regulatory requirements into technical solutions. Evaluate and integrate new technologies, tools, and processes to enhance platform efficiency, developer experience, and compliance posture. Take end-to-end responsibility for the full software development lifecycle, from requirements and design through deployment, monitoring, and operational health. What You Need 8+ years of experience in backend or infrastructure engineering, with a focus on distributed systems, cloud platforms, and security. Proven expertise in building and scaling marketplace platforms and developer/admin/API portals. Deep hands-on experience with IAM, audit logging, and compliance tooling. Strong programming skills in languages such as Python or Go. Experience with cloud infrastructure (AWS, Azure), containerization (Docker, Kubernetes), and service mesh architectures. Understanding of security protocols (OAuth, SAML, TLS), authentication/authorization, and regulatory compliance. 
Demonstrated ability to lead technical projects and mentor engineering teams, with excellent problem-solving, communication, and collaboration skills. Proficiency in observability tools such as Prometheus, Grafana, and OpenTelemetry. Prior experience with Marketplace & Portals. Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
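The listing emphasises audit logging with traceability and regulatory adherence. One common building block for tamper-evident audit trails is a hash chain, sketched below in plain Python; the event fields are invented for illustration and this is not tied to any specific compliance framework:

```python
import hashlib
import json

def append_audit_event(log, event):
    """Append an event to a hash-chained audit log: each entry commits to
    the previous entry's hash, so silent edits to history are detectable."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = json.dumps(event, sort_keys=True)  # deterministic serialization
    entry_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    log.append({"event": event, "prev": prev_hash, "hash": entry_hash})
    return log

def verify_chain(log):
    """Recompute every hash; any mutated or reordered entry breaks the chain."""
    prev = "0" * 64
    for entry in log:
        body = json.dumps(entry["event"], sort_keys=True)
        if entry["prev"] != prev or \
           hashlib.sha256((prev + body).encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log = []
append_audit_event(log, {"actor": "admin", "action": "tenant.create"})
append_audit_event(log, {"actor": "admin", "action": "role.grant"})
verify_chain(log)  # True; mutating any earlier event makes this False
```

Production systems usually anchor the chain head in external storage so the whole log cannot be rewritten at once.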

Posted 1 month ago

Apply

7.0 - 12.0 years

15 - 30 Lacs

Hyderabad

Work from Office

Site Reliability Engineer. Required technical skill set: • Practical experience with monitoring tools such as Grafana, Azure Monitor, Log Analytics, and network monitoring and alerting tools (e.g. BigPanda). • Experience with automation tooling such as Azure OpenAI, Amelia Automation, ServiceNow Orchestration, Power Apps / Power Platform, Python, and PowerShell. • Good foundational understanding of Agile methodologies, AI/ML for automating operational initiatives, and ITIL / change management processes. • Knowledge of core Azure cloud computing concepts (AZ-900 certification as a minimum requirement, with AZ-104 certification preferred). • Knowledge of Azure Chaos Studio for chaos engineering. Mandatory skills (minimum five, each with a two- or three-line description): 1. Implementing proactive remediation automation based on past issues / incidents, and hypothesising use cases where an issue / incident may occur, thereby automatically restoring stability should the incident occur. 2. Track record of implementing monitoring tooling to encompass health state of infrastructure, network, log & events, performance, capacity, and synthetic monitoring. 3. Experience in data correlation & analysis and configuring alerts for detected issues / incidents. 4. Knowledge of Azure OpenAI, and how various data sources can be integrated with the AI for data analysis, in order to initiate events based on informed decision making. 5. Experience in leading blameless post-mortems following production incidents / outages, in order to identify opportunities for improvement.
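Mandatory skill 1 asks for proactive remediation automation keyed off past incidents. A deliberately simplified version of the matching step, pairing an incident description with a known runbook by keyword overlap, might look like this (the runbook names and the overlap threshold are made up for illustration):

```python
def match_runbook(incident_text, runbooks):
    """Pick the remediation runbook whose keywords best overlap the
    incident description; return None below a minimum overlap so that
    unknown incidents still go to a human."""
    words = set(incident_text.lower().split())
    best, best_score = None, 0
    for name, keywords in runbooks.items():
        score = len(words & set(keywords))
        if score > best_score:
            best, best_score = name, score
    return best if best_score >= 2 else None  # threshold is arbitrary here

runbooks = {
    "restart-app-service": ["service", "unresponsive", "timeout"],
    "expand-disk": ["disk", "full", "space"],
}
match_runbook("app service unresponsive with request timeout", runbooks)
```

Real auto-remediation platforms replace the keyword overlap with ML classification over incident telemetry, but the gate-before-acting structure is the same.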

Posted 1 month ago

Apply

8.0 - 12.0 years

15 - 27 Lacs

Pune

Hybrid

*****GenAI DevOps Engineer AWS Bedrock***** *****Pune Hinjewadi***** *****Immediate Joiners Preferred***** *****Minimum 4 Days WFO***** Job Description: We are seeking a highly experienced GenAI DevOps Engineer to join our dynamic team in Pune. The ideal candidate will have a strong background in building, deploying, and optimizing Generative AI applications on AWS Bedrock, along with expertise in DevOps practices. You will be responsible for automating infrastructure, managing CI/CD pipelines, and ensuring high performance and reliability of AI models. Key Responsibilities: Design, develop, and deploy Generative AI applications leveraging AWS Bedrock and SageMaker. Automate infrastructure provisioning and deployment processes. Build and maintain robust CI/CD pipelines using CodePipeline and CodeBuild. Monitor application and model performance using CloudWatch, Prometheus, and Grafana. Optimize AI models for performance, scalability, and cost-efficiency. Work with RAG (Retrieval-Augmented Generation) tools such as LangChain, Haystack, and LlamaIndex for building advanced AI solutions. Collaborate with data scientists and developers to streamline model deployment and monitoring. Required Skills: Extensive hands-on experience with AWS Bedrock and SageMaker. Strong expertise in CI/CD tools: CodePipeline, CodeBuild. Proficiency with monitoring tools: CloudWatch, Prometheus, Grafana. Experience with RAG frameworks like LangChain, Haystack, and LlamaIndex. Solid understanding of DevOps best practices and automation. Ability to troubleshoot and optimize AI deployment pipelines. Excellent problem-solving and communication skills. Preferred Skills: Knowledge of containerization (Docker, Kubernetes). Familiarity with scripting languages (Python, Bash). Experience with cloud security best practices. Understanding of machine learning lifecycle management. Mandatory Skills: AWS Bedrock and SageMaker expertise. CI/CD pipeline automation. Monitoring and performance optimization. 
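The RAG frameworks named here (LangChain, Haystack, LlamaIndex) all start from the same retrieval step: rank candidate documents by similarity to the query, then feed the top hits to the model. A dependency-free sketch using bag-of-words cosine similarity in place of real embeddings (the documents are illustrative):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """The retrieval half of RAG: rank documents against the query and
    return the top-k. Real systems swap Counter for embedding vectors."""
    q = Counter(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return scored[:k]

docs = [
    "Bedrock hosts foundation models behind a managed API",
    "Terraform provisions cloud infrastructure declaratively",
]
retrieve("which service hosts foundation models", docs)
```

The retrieved snippets would then be concatenated into the prompt, which is the "augmented generation" half the frameworks automate.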
RAG-based application development.

Posted 1 month ago

Apply

8.0 - 13.0 years

25 - 30 Lacs

Hyderabad, Chennai, Bengaluru

Hybrid

Develop and maintain Kafka-based data pipelines for real-time processing. Implement Kafka producer and consumer applications for efficient data flow. Optimize Kafka clusters for performance, scalability, and reliability. Design and manage Grafana dashboards for monitoring Kafka metrics. Integrate Grafana with Elasticsearch or other data sources. Set up alerting mechanisms in Grafana for Kafka system health monitoring. Collaborate with DevOps, data engineers, and software teams. Ensure security and compliance in Kafka and Grafana implementations. Requirements: 8+ years of experience in configuring Kafka, Elasticsearch, and Grafana. Strong understanding of Apache Kafka architecture and Grafana visualization. Proficiency in .NET or Python for Kafka development. Experience with distributed systems and message-oriented middleware. Knowledge of time-series databases and monitoring tools. Familiarity with data serialization formats like JSON. Expertise in Azure platforms and Kafka monitoring tools. Good problem-solving and communication skills. Mandate: Kafka dashboard creation, Python/.NET. Note: Candidate must be an immediate joiner.
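The producer/consumer pattern this role revolves around can be illustrated without a running broker. The toy class below mimics two Kafka ideas, an append-only topic log and per-consumer-group offsets; it is a teaching sketch, not the Kafka client API:

```python
from collections import defaultdict

class MiniBroker:
    """In-memory stand-in for a Kafka topic: producers append to a log,
    and each consumer group tracks its own read offset, so independent
    groups each see every message exactly once in order."""

    def __init__(self):
        self.topics = defaultdict(list)
        self.offsets = defaultdict(int)  # (topic, group) -> next offset to read

    def produce(self, topic, message):
        self.topics[topic].append(message)

    def consume(self, topic, group, max_records=10):
        key = (topic, group)
        start = self.offsets[key]
        records = self.topics[topic][start:start + max_records]
        self.offsets[key] += len(records)  # "commit" the offset
        return records

broker = MiniBroker()
broker.produce("metrics", {"cpu": 0.7})
broker.produce("metrics", {"cpu": 0.9})
broker.consume("metrics", group="grafana-feed")  # returns both records once
```

A Grafana-facing consumer group and an archival group could each call `consume` against the same topic without interfering, which is the property that makes fan-out pipelines cheap in Kafka.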

Posted 1 month ago

Apply

3.0 - 8.0 years

3 - 7 Lacs

Noida

Hybrid

Job Title: DevOps Engineer (Kubernetes & Terraform) Location: Noida Experience: 3 to 8 years Type: Full-time About the Role: We are looking for a DevOps Engineer with 3 to 8 years of experience who specializes in Kubernetes and Terraform. This role is ideal for someone passionate about automation, infrastructure scalability, and cloud-native technologies. You will be responsible for designing and maintaining infrastructure platforms that support continuous delivery and scalability across our development and production environments. Key Responsibilities: Design, deploy, and manage scalable and secure Kubernetes clusters in production. Develop and manage Infrastructure as Code (IaC) using Terraform to provision cloud infrastructure. Build and maintain CI/CD pipelines to automate build, test, and deployment workflows. Ensure system availability, performance, and security across all environments. Work closely with development and QA teams to enable efficient DevOps practices. Automate system provisioning, configuration, and application deployments. Monitor infrastructure using tools like Prometheus, Grafana, ELK, or similar. Implement security best practices in container orchestration and infrastructure management. Must-Have Qualifications: 3-8 years of experience in DevOps, SRE, or infrastructure engineering roles. Hands-on experience with Kubernetes (deployment patterns, Helm, RBAC, ingress controllers, etc.). Proficiency in Terraform, including module creation and state management. Strong background in at least one public cloud provider (AWS, Azure, or GCP). Experience with CI/CD tools such as Jenkins, GitLab CI, GitHub Actions, or ArgoCD. Solid Linux administration skills. Experience with containerization using Docker. Scripting skills in Bash, Python, or Go. What You'll Get: Competitive compensation and benefits. Exposure to cutting-edge DevOps tools and practices. A collaborative, remote-friendly engineering culture. Opportunities for upskilling and certifications.
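Whether rendered by Helm or provisioned through a Terraform provider, a Kubernetes Deployment ultimately reduces to a structured manifest. A minimal sketch of that shape as a Python dict, useful for programmatic generation or validation (the name and image below are placeholders):

```python
def deployment_manifest(name, image, replicas=2):
    """Build a minimal Kubernetes apps/v1 Deployment as a plain dict.
    Note the invariant Kubernetes enforces: spec.selector.matchLabels
    must match spec.template.metadata.labels."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }

m = deployment_manifest("api-gateway", "registry.example.com/api:1.4.2",
                        replicas=3)
```

Serializing this dict to YAML yields a file `kubectl apply -f` would accept (resource limits, probes, and rollout strategy omitted for brevity).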
Involvement in end-to-end infrastructure design and decisions.

Posted 1 month ago

Apply

3.0 - 8.0 years

10 - 20 Lacs

Pune

Hybrid

Lead Site Reliability Engineer. Lead Site Reliability Engineers at UKG are critical team members that have a breadth of knowledge encompassing all aspects of service delivery. They develop software solutions to enhance, harden and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering and auto remediation. Lead Site Reliability Engineers must be passionate about learning and evolving with current technology trends. They strive to innovate and are relentless in pursuing a flawless customer experience. They have an "automate everything" mindset, helping us bring value to our customers by deploying services with incredible speed, consistency, and availability. Job Responsibilities: Engage in and improve the lifecycle of services from conception to EOL, including system design consulting and capacity planning Define and implement standards and best practices related to: System Architecture, Service delivery, metrics and the automation of operational tasks Support services, product & engineering teams by providing common tooling and frameworks to deliver increased availability and improved incident response.
Improve system performance, application delivery and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis Collaborate closely with engineering professionals within the organization to deliver reliable services Increase operational efficiency, effectiveness, and quality of services by treating operational challenges as a software engineering problem (reduce toil) Guide junior team members and serve as a champion for Site Reliability Engineering Actively participate in incident response, including on-call responsibilities Partner with stakeholders to influence and help drive the best possible technical and business outcomes Required Qualifications Engineering degree, or a related technical discipline, or equivalent work experience Experience coding in higher-level languages (e.g., Python, JavaScript, C++, or Java) Knowledge of Cloud based applications & Containerization Technologies Demonstrated understanding of best practices in metric generation and collection, log aggregation pipelines, time-series databases, and distributed tracing Working experience with industry standards like Terraform, Ansible Demonstrable fundamentals in 2 of the following: Computer Science, Cloud architecture, Security, or Network Design Must have at least 5 years of hands-on experience working in Engineering or Cloud Minimum 5 years' experience with public cloud platforms (e.g. GCP, AWS, Azure) Minimum 3 years' experience in configuration and maintenance of applications and/or systems infrastructure for a large-scale customer-facing company Experience with distributed system design and architecture
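A recurring SRE task implied by these responsibilities (alerting, release gating, postmortem follow-up) is error-budget accounting against an availability SLO. A minimal sketch of the arithmetic, with invented SLO and traffic numbers:

```python
def error_budget(slo, total_requests, failed_requests):
    """Error-budget bookkeeping for an availability SLO: a 99.9% SLO
    permits 0.1% of requests to fail; burn rate is the fraction of that
    allowance already consumed."""
    allowed_failures = total_requests * (1 - slo)
    remaining = allowed_failures - failed_requests
    burn = (failed_requests / allowed_failures
            if allowed_failures else float("inf"))
    return {"allowed": allowed_failures,
            "remaining": remaining,
            "burn_rate": burn}

b = error_budget(slo=0.999, total_requests=1_000_000, failed_requests=600)
# Budget is roughly 1000 failures; 600 used leaves about 400,
# i.e. about 60% of the budget burned.
```

Teams typically gate risky releases on `remaining` and page on a fast `burn_rate`, which turns reliability targets into concrete engineering decisions.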

Posted 1 month ago

Apply

15.0 - 20.0 years

45 - 60 Lacs

Mumbai

Work from Office

This position is for a Site Reliability Engineer within the Client Engagement and Protection (CEP) APS team. The primary purpose is to be accountable for all core engineering / transformation activities of ISPL Transversal CEP APS. Responsibilities Direct Responsibilities Automate away toil using a combination of scripting, tooling, and process improvements Drive transformation strategies involving infrastructure hygiene / end of life Implementing new technologies or processes to improve efficiency and reduce costs, e.g. CI/CD implementation Monitoring system performance and capacity levels to ensure high availability of applications with minimal downtime Investigating any service disruptions or other service issues to identify their causes Performing regular audits of computer systems to check for signs of degradation or malfunction Developing and implementing new methods of measuring service quality and customer satisfaction Conducting capacity planning to ensure that new technologies can be accommodated without impacting existing users Conducting post-mortem examinations of failed systems to identify and address root causes Drive various Automation, Monitoring & Tooling common-purpose initiatives across CEP APS and other teams within CIB APS Accountable for generation, reporting and improvement of various Production KPIs, SLs and dashboards for APS teams Accountable for improvements in service and presentations for all governances and steering committees Accountable for maintenance and improvement of IT continuity plans (ICP) Contributing Responsibilities Technical & Behavioural Competencies Strong knowledge of DevOps methodology and toolsets Strong knowledge of Cloud based applications/services Strong knowledge of APM tools, e.g. Dynatrace / AppDynamics Strong Distributed Computing and Database technologies skillset Strong knowledge of Jenkins, Ansible, Python, scripting, etc. Good understanding of log aggregators, e.g. Splunk / ELK Good understanding of observability tools, e.g. Grafana / Prometheus Ability to work with various APS, Development, and Operations stakeholders, locally and globally Dynamic, proactive and teamwork oriented Independent, self-starter and fast learner Good communication and interpersonal skills Practical knowledge of change, incident & problem management tools Innovative and transformational mindset Flexible attitude Ability to perform under pressure Strong analytical skills Preferred: ITIL, Docker/Kubernetes, prior Site Reliability Engineering / DevOps / Application Production Support / development background Specific Qualifications (if required) Graduate in any discipline or Bachelor in Information Technology; 15 years of IT experience Skills Referential Behavioural Skills: Ability to collaborate / Teamwork Creativity & Innovation / Problem solving Ability to deliver / Results driven Communication skills, oral & written Transversal Skills: Ability to manage a project Ability to set up relevant performance indicators Ability to anticipate business / strategic evolution Ability to develop and adapt a process Analytical Ability Education Level: Bachelor Degree or equivalent Experience Level: At least 15 years

Posted 1 month ago

Apply

3.0 - 8.0 years

15 - 30 Lacs

Bengaluru

Remote

Hiring for a USA-based multinational company (MNC). The Cloud Engineer is responsible for designing, implementing, and managing cloud-based infrastructure and services. This role involves working with cloud platforms such as AWS, Microsoft Azure, or Google Cloud to ensure scalable, secure, and efficient cloud environments that meet the needs of the organization. Design, deploy, and manage cloud infrastructure in AWS, Azure, GCP, or hybrid environments. Automate cloud infrastructure provisioning and configuration using tools like Terraform, Ansible, or CloudFormation. Ensure cloud systems are secure, scalable, and reliable through best practices in architecture and monitoring. Work closely with development, operations, and security teams to support cloud-native applications and services. Monitor system performance and troubleshoot issues to ensure availability and reliability. Manage CI/CD pipelines and assist in DevOps practices to streamline software delivery. Implement and maintain disaster recovery and backup procedures. Optimize cloud costs and manage billing/reporting for cloud resources. Ensure compliance with data security standards and regulatory requirements. Stay current with new cloud technologies and make recommendations for continuous improvement. Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field. 3+ years of experience working with cloud platforms such as AWS, Azure, or Google Cloud. Proficiency in infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation). Experience with CI/CD tools (e.g., Jenkins, GitLab CI, Azure DevOps). Familiarity with containerization and orchestration (e.g., Docker, Kubernetes). Strong scripting skills (e.g., Python, Bash, PowerShell). Solid understanding of networking, security, and identity management in the cloud. Excellent problem-solving and communication skills. Ability to work independently and as part of a collaborative team.

Posted 1 month ago

Apply

8.0 - 12.0 years

30 - 45 Lacs

Hyderabad

Work from Office

Responsibilities: Design, implement, and maintain scalable cloud infrastructure primarily on AWS, with some exposure to Azure. Manage and optimize CI/CD pipelines using Jenkins and Git-based version control systems (GitHub/GitLab). Build and maintain containerized applications using Docker, Kubernetes (including AWS EKS), and Helm. Automate infrastructure provisioning and configuration using Terraform and Ansible. Implement GitOps-style deployment processes using ArgoCD and similar tools. Ensure observability through monitoring and logging with Prometheus, Grafana, Datadog, Splunk, and Kibana. Develop automation scripts using Python, Shell, and GoLang. Implement and enforce security best practices in CI/CD pipelines and container orchestration environments using tools like Trivy, OWASP, SonarQube, Aqua Security, Cosign, and HashiCorp Vault. Support blue/green deployments and other advanced deployment strategies. Required Qualifications: 8-12 years of professional experience in a DevOps, SRE, or related role. Strong hands-on experience with AWS (EC2, S3, IAM, EKS, RDS, Lambda, Secrets Manager). Solid experience with CI/CD tools (Jenkins, GitHub/GitLab, Maven). Proficient with containerization and orchestration tools: Docker, Kubernetes, Helm. Experience with Infrastructure as Code tools: Terraform and Ansible. Proficiency in scripting languages: Python, Shell, and GoLang. Strong understanding of observability, monitoring, and logging frameworks. Familiarity with security practices and tools integrated into DevOps workflows. Excellent problem-solving and troubleshooting skills. Certifications (good to have): AWS Certified DevOps Engineer Certified Kubernetes Administrator (CKA) Azure Administrator/Developer Certifications
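The blue/green strategy mentioned in the responsibilities boils down to a gated traffic flip: promote the idle colour only when its health checks pass, and keep the old colour warm for instant rollback. A deliberately simplified sketch (the router dict stands in for a real load-balancer or service-mesh API):

```python
def promote(router, candidate_health_checks):
    """Flip traffic from the live colour to the idle candidate only if
    every health check on the candidate passed; otherwise leave traffic
    untouched. The old colour stays deployed for fast rollback."""
    if candidate_health_checks and all(candidate_health_checks):
        router["live"], router["idle"] = router["idle"], router["live"]
        return True
    return False  # keep serving from the current colour

router = {"live": "blue", "idle": "green"}
promote(router, [True, True, True])  # checks pass, so "green" becomes live
```

Rollback is the same swap in reverse, which is why blue/green trades extra standing capacity for near-zero-downtime releases.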

Posted 1 month ago

Apply