4.0 - 7.0 years
11 - 16 Lacs
Bengaluru
Work from Office
In this role, you will build and maintain an observability stack for IBM's Cloud Object Storage (COS) service using managed as well as custom-built services. This stack is used by COS SREs and developers to understand the health of the service. Work duties and responsibilities include:
- Design, set up, configure, and implement the COS Monitoring System using technologies such as Elasticsearch, Logstash, Kibana, Kafka, Kafka Mirrors, Filebeat, Grafana, and Sysdig.
- Automate CI/CD tasks and infrastructure using Ansible, Terraform, Jenkins, and Travis.
- Work with microservices and distributed application architectures, such as containers and Kubernetes.
- Administer Linux systems and work in programming languages such as Java, Python, and SQL.
- Tune performance and configuration to support the increasing load of data flowing into the COS Monitoring System.
- Provide design recommendations and thought leadership to deliver best-in-class observability as part of the COS Monitoring System.
- Provide 24x7 on-call customer support on a rotational basis.
- Design and develop dashboards for metrics analysis.
- Design, develop, and configure an alerting solution for an end-to-end incident management and recovery process by integrating Sysdig with PagerDuty, email, and Slack.
Required education: Bachelor's Degree. Required technical and professional expertise:
- Ability and tenacity to solve increasingly complex technical issues through analysis and a variety of problem-solving techniques.
- Working knowledge of object-oriented Python with demonstrable experience in applying these skills.
- Working knowledge of Linux environments.
- Experience working in an Agile-Scrum development environment.
- Experience using tools such as Jira, GitHub, and logging and monitoring tools.
- BS in CS, CE, or a similar field, plus 10-12 years of relevant work experience.
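The stack described above pushes service-health data into Elasticsearch for SREs to query. As a rough illustration of that flow, here is a minimal Python sketch that indexes one log event and counts recent errors, assuming a local Elasticsearch cluster and the official `elasticsearch` client; the index name and document fields are hypothetical, not IBM's schema.

```python
# Minimal sketch: index a service-health log event and query recent errors.
# Assumes Elasticsearch at localhost:9200; index name is hypothetical.
from datetime import datetime, timezone
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index one log document (in production, Filebeat/Logstash would do this).
es.index(
    index="cos-monitoring",
    document={
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "service": "cos-frontend",
        "level": "ERROR",
        "message": "slow response from storage backend",
    },
)

# Count ERROR-level events from the last 15 minutes.
resp = es.search(
    index="cos-monitoring",
    query={
        "bool": {
            "filter": [
                {"term": {"level": "ERROR"}},
                {"range": {"@timestamp": {"gte": "now-15m"}}},
            ]
        }
    },
)
print("recent errors:", resp["hits"]["total"]["value"])
```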
Posted 3 weeks ago
1.0 - 3.0 years
3 - 7 Lacs
Bengaluru
Work from Office
At IBM, we are driven to shift our technology to an as-a-service model and to help our clients transform themselves to take full advantage of the cloud. With industry leadership in AI, analytics, security, commerce, and quantum computing, and with unmatched hardware and software design and industrial research capabilities, no other company is as well positioned to address the full opportunity of enterprise cloud computing. We are looking for a backend developer to join our IBM Cloud VPC Observability team. This team is part of the IBM Cloud VPC Service, dedicated to ensuring that the IBM Cloud is at the forefront of reliable enterprise cloud technology. We are building observability platforms to deliver performance, reliability, and predictability for our customers' most demanding workloads, at global scale and with leadership efficiency, resiliency, and security. In this role, you will be responsible for producing and enhancing features that collect, transform, and surface data on the various components of our cloud. The ability to take in requirements on an agile basis and work autonomously with a high-level perspective is a must. You understand cloud-native concepts and have experience with highly tunable and scalable Kubernetes-based cloud deployments. You will participate in the design of the service, writing tools and automation, building containers, developing tests, determining monitoring best practices, and handling complex escalations. If you are the kind of person who is collaborative, able to handle responsibility, and enjoys not only sharing a vision but getting your hands dirty to make sure that vision becomes reality in a fast-paced, challenging environment, then we want to talk to you!
Required education: Bachelor's Degree. Required technical and professional expertise:
- Bachelor's in Engineering, Computer Science, or relevant experience
- 2+ years of experience and expertise in programming in at least one of Python, Go, or Node.js
- 1+ years of experience developing and deploying applications on Kubernetes and containerization technologies like Docker
- 2+ years of familiarity working in a CI/CD environment
- 2+ years of experience developing and operating highly available, distributed applications in production environments on Kubernetes
- Experience with building automated tests and handling customer escalations
- 1+ years of experience managing service dependencies via Terraform or Ansible
- At least 2 years of experience coding and troubleshooting applications written in Go, Python, Node.js, or Express.js
- 1+ years of experience operating with secure principles
- At least 3 years of experience with microservice development
- At least 1 year of experience with NoSQL database systems such as MongoDB
- At least 1 year of experience operating, configuring, and developing with caching systems like Redis
- Proven understanding of REST principles and architecture
- Familiarity with working with cloud services (IBM Cloud, GCP, AWS, Azure)
Preferred technical and professional experience:
- Advanced experience with Kubernetes
- Experience with development on PostgreSQL, Kafka, Elastic, MySQL, Redis, or MongoDB
- 2 years of experience managing Linux machines using configuration management (e.g., Chef, Puppet, Ansible); Debian experience is preferred
- 2+ years of experience automating with scripting languages like Python and Shell
- Experience troubleshooting, using, and configuring Linux systems
- 2+ years of experience with infrastructure automation
- 2+ years of experience with monitoring tooling like Grafana and Prometheus
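Several of these listings ask for experience instrumenting services for Prometheus/Grafana-style observability. A minimal sketch of what that looks like in Python, using the official `prometheus_client` library; the metric names and the simulated handler are illustrative only:

```python
# Minimal sketch: expose request metrics so Prometheus can scrape them.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("app_requests_total", "Total requests", ["status"])
LATENCY = Histogram("app_request_seconds", "Request latency in seconds")

def handle_request():
    with LATENCY.time():                        # observe handler duration
        time.sleep(random.uniform(0.01, 0.1))   # simulated work
    REQUESTS.labels(status="200").inc()

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        handle_request()
```

A Prometheus server would then scrape the `/metrics` endpoint, and Grafana dashboards would query the resulting series.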
Posted 3 weeks ago
4.0 - 9.0 years
8 - 12 Lacs
Pune
Work from Office
Primary Skills Node.js Backend Development - Proficient in building scalable and efficient server-side applications using Node.js, with a strong understanding of asynchronous programming, event-driven architecture, and non-blocking I/O. RESTful and GraphQL API Design - Experience in designing, developing, and maintaining RESTful APIs and GraphQL endpoints, including versioning, documentation, and adherence to best practices for performance and security. Express.js and Middleware Architecture - Skilled in using Express.js for routing, middleware integration, and request/response handling to build modular and maintainable backend services. Database Integration - Hands-on experience with both SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Redis) databases, including schema design, query optimization, and ORM/ODM tools like Sequelize or Mongoose. Authentication and Authorization - Implementation of secure authentication mechanisms using JWT, OAuth2, and session-based strategies, along with role-based access control (RBAC) and API security best practices. Testing and Debugging - Proficient in writing unit, integration, and end-to-end tests using tools like Mocha, Chai, Jest, or Supertest, and debugging using Node Inspector or Chrome DevTools. API Documentation and Tools - Experience with API documentation tools such as Swagger (OpenAPI), Postman, and Insomnia for testing and sharing API specifications. Deployment and CI/CD - Familiarity with deploying Node.js applications on cloud platforms (AWS, Azure, GCP) and integrating with CI/CD pipelines using tools like GitHub Actions, Jenkins, or GitLab CI. Secondary Skills Knowledge of containerization using Docker and orchestration with Kubernetes Experience with message brokers like RabbitMQ, Kafka, or Redis Pub/Sub Familiarity with microservices architecture and service discovery patterns Understanding of serverless frameworks (e.g., AWS Lambda, Azure Functions) Exposure to frontend technologies (React, Angular) for full-stack collaboration Basic knowledge of logging and monitoring tools like ELK Stack, Prometheus, or Grafana Strong communication and collaboration skills for working in Agile teams
Posted 3 weeks ago
3.0 - 5.0 years
60 - 65 Lacs
Mumbai, Delhi / NCR, Bengaluru
Work from Office
We are seeking a talented and passionate Engineer to design, develop, and enhance our SaaS platform. As a key member of the team, you will work to create the best developer tools, collaborate with designers and engineers, and ensure our platform scales as it grows. The ideal candidate will have strong expertise in backend development, cloud infrastructure, and a commitment to delivering reliable systems. Location: Remote, Delhi NCR, Bangalore, Chennai, Pune, Kolkata, Ahmedabad, Mumbai, Hyderabad.
Posted 3 weeks ago
8.0 - 12.0 years
35 - 60 Lacs
Pune
Work from Office
About the Role: We are seeking a skilled Site Reliability Engineer (SRE) / DevOps Engineer to join our infrastructure team. In this role, you will design, build, and maintain scalable infrastructure, CI/CD pipelines, and observability systems to ensure high availability, reliability, and security of our services. You will work cross-functionally with development, QA, and security teams to automate operations, reduce toil, and enforce best practices in cloud-native environments.
Key Responsibilities:
- Design, implement, and manage cloud infrastructure (GCP/AWS/Azure) using Infrastructure as Code (Terraform).
- Maintain and improve CI/CD pipelines using tools like CircleCI, GitLab CI, or ArgoCD.
- Ensure high availability and performance of services using Kubernetes (GKE/EKS/AKS) and container orchestration.
- Implement monitoring, logging, and alerting using Prometheus, Grafana, ELK, or similar tools.
- Collaborate with developers to optimize application performance and deployment processes.
- Manage and automate security controls such as IAM, RBAC, network policies, and vulnerability scanning.
Basic Qualifications:
- Strong knowledge of Linux.
- Experience with scripting languages such as Python, Bash, or Go.
- Experience with cloud platforms (GCP preferred; AWS or Azure acceptable).
- Proficiency in Kubernetes operations, including Helm, operators, and service meshes.
- Experience with Infrastructure as Code (Terraform).
- Solid experience with CI/CD pipelines (GitLab CI, CircleCI, ArgoCD, or similar).
- Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK, etc.).
- Knowledge of networking concepts (TCP/IP, DNS, load balancers, firewalls).
Preferred Qualifications:
- Experience with advanced networking solutions.
- Familiarity with SRE principles such as SLOs, SLIs, and error budgets.
- Exposure to multi-cluster or hybrid-cloud environments.
- Knowledge of service meshes (Istio).
- Experience participating in incident management and postmortem processes.
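The SLO/SLI/error-budget vocabulary in the preferred qualifications reduces to simple arithmetic. A small worked example in Python; all numbers are illustrative:

```python
# Minimal sketch of error-budget math behind SLOs and burn rates.
SLO = 0.999                  # 99.9% availability target
PERIOD_MIN = 30 * 24 * 60    # 30-day window, in minutes

budget_min = (1 - SLO) * PERIOD_MIN   # allowed downtime: ~43.2 min/window
downtime_min = 18                     # observed downtime so far (example)

remaining = budget_min - downtime_min
burn = downtime_min / budget_min      # fraction of budget already consumed

print(f"error budget: {budget_min:.1f} min per 30 days")
print(f"consumed:     {downtime_min} min ({burn:.0%})")
print(f"remaining:    {remaining:.1f} min")
```

A burn rate well above the elapsed fraction of the window is the usual trigger for paging, which is how error budgets connect back to alerting design.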
Posted 3 weeks ago
1.0 - 6.0 years
3 - 8 Lacs
Bengaluru
Work from Office
We are seeking an experienced OpenShift Engineer to design, deploy, and manage containerized applications on Red Hat OpenShift. Key Responsibilities: Deploy, configure, and manage OpenShift clusters in hybrid/multi-cloud environments. Automate deployments using CI/CD pipelines (Jenkins, GitLab CI/CD, ArgoCD). Troubleshoot Kubernetes/OpenShift-related issues and optimize performance. Implement security policies and best practices for containerized workloads. Work with developers to containerize applications and manage microservices. Monitor and manage OpenShift clusters using Prometheus, Grafana, and logging tools.
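Day-to-day troubleshooting on clusters like these often starts with finding unhealthy pods. A minimal sketch using the official `kubernetes` Python client, which also works against OpenShift's core APIs; it assumes a kubeconfig with access to the cluster:

```python
# Minimal sketch: list pods that are not Running/Succeeded, cluster-wide.
from kubernetes import client, config

config.load_kube_config()   # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces(watch=False).items:
    if pod.status.phase not in ("Running", "Succeeded"):
        ns, name = pod.metadata.namespace, pod.metadata.name
        print(f"{ns}/{name}: {pod.status.phase}")
```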
Posted 3 weeks ago
1.0 - 5.0 years
3 - 8 Lacs
Bengaluru
Work from Office
We are seeking an experienced OpenShift Engineer to design, deploy, and manage containerized applications on Red Hat OpenShift. Key Responsibilities: Deploy, configure, and manage OpenShift clusters in hybrid/multi-cloud environments. Automate deployments using CI/CD pipelines (Jenkins, GitLab CI/CD, ArgoCD). Troubleshoot Kubernetes/OpenShift-related issues and optimize performance. Implement security policies and best practices for containerized workloads. Work with developers to containerize applications and manage microservices. Monitor and manage OpenShift clusters using Prometheus, Grafana, and logging tools.
Posted 3 weeks ago
6.0 - 11.0 years
16 - 31 Lacs
Noida, Hyderabad, Gurugram
Hybrid
Sr Site Reliability Engineer Role Overview: Our team supports a range of critical functions, including: Resiliency and Reliability Initiatives: partnering with teams on various improvement projects. Observability: ensuring comprehensive visibility into our systems. Alert Analysis and Optimization: refining alert mechanisms to minimize disruptions. Automation and Self-Healing: implementing automated solutions to proactively address issues. Incident and Problem Management: supporting priority incidents, assisting with restoration, root cause analysis, and preventative actions. Release Management: streamlining the release process for seamless updates. Monitor Engineering: supporting installation, configuration, and review of monitoring instrumentation. Cloud Operations Support: supporting and augmenting activities handled by the OptumRx Public Cloud team. Responsibilities: Support priority incidents. Gather requirements for automation opportunities and instrumentation improvements. Recommend improvements and changes to monitoring configurations and service architecture design. Summarize and provide updates to key stakeholders. Assist with incident/problem root cause analysis (RCA) and identification of trends. Help teams define service level objectives and build views within monitoring tools. Conduct analysis on alerts and incident data and recommend changes and improvements. Drive improvements to monitoring and instrumentation for services. Assess and monitor overall application stability and performance, providing insights for potential improvements. Build automation and self-healing capabilities to improve efficiency, stability, and reliability of services. Participate in rotational on-call support. Technical Skills: Proficiency in monitoring and instrumentation tools. Understanding of application performance monitoring (APM) and log management tools, with a preference for Dynatrace and Splunk. Experience with automation and scripting languages (e.g., Python, Bash, PowerShell), with a preference for Python. Experience implementing comprehensive monitoring for services to detect anomalies and trigger timely alerts. Understanding of cloud platforms (e.g., AWS, Azure, Google Cloud). Knowledge of incident management and root cause analysis (RCA) processes. Familiarity with service level objectives (SLOs) and service level agreements (SLAs). Analytical Skills: Ability to analyze alerts and incident data to identify trends and recommend improvements. Strong problem-solving skills and attention to detail. Communication Skills: Excellent verbal and written communication skills. Ability to summarize and present updates to key stakeholders effectively. Collaboration Skills: Experience working in cross-functional teams. Ability to collaborate with different teams to define service level objectives, gather requirements, discuss opportunities, and recommend improvements. Experience working across geographies. Operational Skills: Ability to participate in rotational on-call support.
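For the alert analysis and optimization duties above, a first pass is often just aggregating exported alert data to find noisy rules. A minimal Python sketch using pandas; the CSV file and its columns (`rule_name`, `actionable`, `fired_at`) are hypothetical:

```python
# Minimal sketch: rank alert rules by volume and how often they were actionable.
import pandas as pd

alerts = pd.read_csv("alerts_export.csv", parse_dates=["fired_at"])

summary = (
    alerts.groupby("rule_name")
    .agg(total=("rule_name", "size"), actionable=("actionable", "mean"))
    .sort_values("total", ascending=False)
)
print(summary.head(10))

# Tuning candidates: high-volume rules that are rarely actionable.
noisy = summary[(summary.total > 50) & (summary.actionable < 0.1)]
print("candidates for threshold or routing changes:\n", noisy)
```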
Posted 3 weeks ago
10.0 - 20.0 years
10 - 20 Lacs
Hyderabad, Chennai, Bengaluru
Work from Office
Job Description: Strong and assertive communication skills; able to take command of bridges and discussions. Understanding of web applications, a typical Java stack, APIs, and common errors; familiarity with technical terminology. Know-how of application hosting (cloud/container/on-prem) and the components involved (GTM, LTM, firewall, network, infra). Ability to leverage tools and dashboards like Splunk, AppDynamics, and Grafana to ask the right questions during triage. Work experience in a production support environment; has handled projects of this nature and worked on P1/P2 incident bridges.
Posted 3 weeks ago
3.0 - 5.0 years
15 - 20 Lacs
Pune
Work from Office
About the job: Sarvaha would like to welcome a Kafka Platform Engineer (or a seasoned backend engineer aspiring to move into platform architecture) with a minimum of 4 years of solid experience in building, deploying, and managing Kafka infrastructure on Kubernetes platforms. Sarvaha is a niche software development company that works with some of the best-funded startups and established companies across the globe. Please visit our website.
What You'll Do
- Deploy and manage scalable Kafka clusters on Kubernetes using Strimzi, Helm, Terraform, and StatefulSets
- Tune Kafka for performance, reliability, and cost-efficiency
- Implement Kafka security: TLS, SASL, ACLs, Kubernetes Secrets, and RBAC
- Automate deployments across AWS, GCP, or Azure
- Set up monitoring and alerting with Prometheus, Grafana, and JMX Exporter
- Integrate Kafka ecosystem components: Connect, Streams, Schema Registry
- Define autoscaling, resource limits, and network policies for Kubernetes workloads
- Maintain CI/CD pipelines (ArgoCD, Jenkins) and container workflows
You Bring
- BE/BTech/MTech (CS/IT or MCA), with an emphasis in Software Engineering
- Strong foundation in the Apache Kafka ecosystem and internals (brokers, ZooKeeper/KRaft, partitions, storage)
- Proficiency in Kafka setup, tuning, scaling, and topic/partition management
- Skill in managing Kafka on Kubernetes using Strimzi, Helm, and Terraform
- Experience with CI/CD, containerization, and GitOps workflows
- Monitoring expertise using Prometheus, Grafana, and JMX
- Experience on EKS, GKE, or AKS preferred
- Strong troubleshooting and incident-response mindset
- High sense of ownership and automation-first thinking
- Excellent collaboration with SREs, developers, and platform teams
- Clear communicator, documentation-driven, and eager to mentor and share knowledge.
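Topic and partition management of the kind described above can be scripted against the Kafka Admin API. A minimal Python sketch using the `confluent-kafka` client; the bootstrap address (Strimzi's default service-naming pattern), topic name, and configs are illustrative:

```python
# Minimal sketch: create a tuned topic via the Kafka Admin API.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "my-cluster-kafka-bootstrap:9092"})

topic = NewTopic(
    "orders",
    num_partitions=12,
    replication_factor=3,
    config={
        "retention.ms": str(7 * 24 * 3600 * 1000),  # keep data for 7 days
        "min.insync.replicas": "2",                 # durability vs. availability
        "compression.type": "lz4",
    },
)

for name, future in admin.create_topics([topic]).items():
    future.result()  # raises on failure (e.g., topic already exists)
    print(f"created topic {name}")
```

In a Strimzi-managed cluster the same intent is usually expressed declaratively through a KafkaTopic custom resource; the Admin API route above is useful for ad-hoc tooling and tests.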
Posted 3 weeks ago
3.0 - 5.0 years
4 - 8 Lacs
Bengaluru
Work from Office
Role Purpose: The purpose of the role is to resolve, maintain, and manage the client's software/hardware/network based on the service requests raised by end users, as per the defined SLAs, ensuring client satisfaction.
Do:
- Ensure timely response to all tickets raised by the client end user.
- Solution service requests while maintaining quality parameters.
- Act as a custodian of the client's network/server/system/storage/platform/infrastructure and other equipment, keeping track of their proper functioning and upkeep.
- Keep a check on the number of tickets raised (dial home/email/chat/IMS), ensuring the right solutioning as per the defined resolution timeframe.
- Perform root cause analysis of the tickets raised and create an action plan to resolve the problem, ensuring client satisfaction.
- Provide acceptance and immediate resolution for high-priority tickets/service requests.
- Install and configure software/hardware requirements based on service requests.
- Adhere 100% to timelines as per the priority of each issue, to manage client expectations and ensure zero escalations.
- Provide application/user access as per client requirements and requests to ensure timely solutioning.
- Track all tickets from acceptance to resolution stage, as per the resolution time defined by the customer.
- Maintain timely backup of important data/logs and management resources to ensure the solution is of acceptable quality and maintains client satisfaction.
- Coordinate with the on-site team for complex problem resolution and ensure timely client servicing.
- Review the logs that chat bots gather and ensure all service requests/issues are resolved in a timely manner.
Deliver:
No. | Performance Parameter | Measure
1. | 100% adherence to SLA/timelines | Multiple cases of red time; zero customer escalations; client appreciation emails
Mandatory Skills: AIOps, Grafana, Observability. Experience: 3-5 years.
Posted 3 weeks ago
7.0 - 10.0 years
6 - 11 Lacs
Bengaluru
Work from Office
Job Title: DevOps Lead. Experience: 7-10 years. Location: Bengaluru.
- Overall 7-10 years of experience in IT.
- In-depth knowledge of GCP services and resources to design, deploy, and manage cloud infrastructure efficiently; certification is a big plus.
- Proficiency in Java, Shell, or Python scripting.
- Develop, maintain, and optimize Infrastructure as Code scripts and templates using tools like Terraform and Ansible, ensuring resource automation and consistency.
- Strong expertise in Kubernetes using Helm, HAProxy, and containerization technologies.
- Manage and fine-tune databases, including Neo4j, MySQL, PostgreSQL, and Redis cache clusters, to ensure performance and data integrity.
- Skill in managing and optimizing Apache Kafka and RabbitMQ to facilitate efficient data processing and communication.
- Design and maintain Virtual Private Cloud (VPC) network architecture for secure and efficient data transmission.
- Implement and maintain monitoring tools such as Prometheus, Zipkin, Loki, and Grafana.
- Utilize Helm charts and Kubernetes (K8s) manifests for containerized application management.
- Proficient with Git, Jenkins, and ArgoCD to set up and enhance CI and CD pipelines.
- Utilize Google Artifact Registry and Google Container Registry for artifact and container image management.
- Familiarity with CI/CD practices, version control and branching, and DevOps methodologies.
- Strong understanding of cloud network design, security, and best practices.
- Strong Linux and network debugging skills.
Primary Skills: strong Kubernetes (GKE clusters), Grafana, Prometheus, Terraform and Ansible (good working knowledge), DevOps.
Why Join Us: Opportunity to work in a fast-paced and innovative environment. Collaborative team culture with continuous learning and growth opportunities.
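Monitoring work like this often means talking to the Prometheus HTTP API directly, since it is the same interface Grafana panels query. A minimal Python sketch; the Prometheus service URL and the PromQL expression are placeholders:

```python
# Minimal sketch: run an instant PromQL query against Prometheus' HTTP API.
import requests

PROM = "http://prometheus.monitoring.svc:9090"   # placeholder service URL
query = 'sum(rate(http_requests_total{job="api"}[5m]))'

resp = requests.get(f"{PROM}/api/v1/query", params={"query": query}, timeout=10)
resp.raise_for_status()

for series in resp.json()["data"]["result"]:
    ts, value = series["value"]   # [unix_timestamp, string_value]
    print(f"{series['metric']} -> {value} req/s")
```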
Posted 3 weeks ago
8.0 - 12.0 years
2 - 6 Lacs
Bengaluru
Work from Office
Job Title: Performance Testing. Experience: 8-12 years. Location: Bangalore. JMeter (min 5+ yrs).
- 8+ years of strong experience in performance testing; the candidate should be able to code and design performance test scripts.
- Able to set up, maintain, and execute performance test scripts from scratch.
- Good in JMeter, Azure DevOps, Grafana (good to have).
- Excellent business communication skills.
- Experience in performance testing for both web and API.
- Experience in Agile methodology and process.
- Customer interaction and the ability to work independently on daily tasks.
- Good in team handling, effort estimation, performance metrics tracking, and task management.
- Should be proactive, a solution provider, and good at status reporting.
Posted 3 weeks ago
8.0 - 12.0 years
25 - 40 Lacs
Kolkata, Hyderabad, Bengaluru
Hybrid
Job Title: ELK Developer. Experience Required: 8-12 years. Location: Hyderabad, Bangalore (preferred); also open to Chennai, Mumbai, Pune, Kolkata, Gurgaon. Work Mode: On-site/Hybrid.
Job Summary: We are seeking a highly experienced ELK Developer with a strong background in designing and implementing monitoring, logging, and visualization solutions using the ELK Stack (Elasticsearch, Logstash, Kibana). The ideal candidate should also have hands-on expertise with Linux/Solaris administration, scripting for automation, and performance testing. Additional experience with modern DevOps tools and monitoring platforms like Grafana and Prometheus is a plus.
Primary Responsibilities:
- Design, implement, and maintain solutions using the ELK Stack: Elasticsearch, Logstash, Kibana, and Beats.
- Create dashboards and visualizations in Kibana to support real-time data analysis and operational monitoring.
- Define and apply indexing strategies, configure log forwarding, and manage log parsing with Regex.
- Set up and manage data aggregation, pipeline testing, and performance evaluation.
- Develop and maintain custom rules for alerting, anomaly detection, and reporting.
- Troubleshoot log ingestion, parsing, and query performance issues.
- Automate jobs and notifications through scripts (Bash, PowerShell, Python, etc.).
- Perform Linux/Solaris system administration tasks: monitor services and system health, manage memory and disk usage, schedule jobs, update packages, and maintain uptime.
- Work closely with DevOps, infrastructure, and application teams to ensure system integrity and availability.
Must-Have Skills:
- Strong hands-on experience with the ELK Stack (Elasticsearch, Logstash, Kibana).
- Proficient in Regex, SQL, JSON, YAML, XML.
- Deep understanding of indexing, aggregation, and log parsing.
- Experience in AppDynamics and related observability platforms.
- Proven skills in Linux/Solaris system administration.
- Proficiency in scripting (Shell, Python, PowerShell, Bash) for log handling, jobs, and notifications.
- Experience in performance testing and optimization.
Good-to-Have / Secondary Skills:
- Experience with Grafana and Prometheus for metrics and visualization.
- Knowledge of web and middleware components: HTTP server, HAProxy, Keepalived, Tomcat, NGINX.
- Familiarity with DevOps tools: Git, Bitbucket, GitHub, Helm charts, Terraform, JMeter.
- Programming/scripting experience in Perl, Java, JavaScript.
- Hands-on with CI/CD tools: TeamCity, Octopus, Nexus.
- Working knowledge of Agile methodologies and JIRA.
Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
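As a small illustration of the Regex-driven log parsing this role centers on, the sketch below extracts structured fields from a log line and indexes them into Elasticsearch. The log format, index name, and cluster address are hypothetical; in a real pipeline, Logstash grok patterns would typically do this at ingest time.

```python
# Minimal sketch: parse a log line with a named-group regex, then index it.
import re
from elasticsearch import Elasticsearch

LINE = '2024-05-01T10:32:11Z ERROR payment-svc "timeout talking to gateway"'
PATTERN = re.compile(
    r'(?P<ts>\S+)\s+(?P<level>\w+)\s+(?P<service>\S+)\s+"(?P<message>[^"]*)"'
)

match = PATTERN.match(LINE)
if match:
    doc = match.groupdict()   # {'ts': ..., 'level': ..., 'service': ..., ...}
    Elasticsearch("http://localhost:9200").index(index="app-logs", document=doc)
    print("indexed:", doc)
```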
Posted 3 weeks ago
3.0 - 5.0 years
13 - 15 Lacs
Gurugram
Work from Office
A skilled DevOps Engineer to manage and optimize both on-premises and AWS cloud infrastructure. The ideal candidate will have expertise in DevOps tools, automation, system administration, and CI/CD pipeline management while ensuring security, scalability, and reliability.
Key Responsibilities:
1. AWS & On-Premises Solution Architecture:
- Design, deploy, and manage scalable, fault-tolerant infrastructure across both on-premises and AWS cloud environments.
- Work with AWS services like EC2, IAM, VPC, CloudWatch, GuardDuty, AWS Security Hub, Amazon Inspector, AWS WAF, and Amazon RDS with Multi-AZ.
- Configure ASGs and implement load-balancing techniques such as ALB and NLB.
- Optimize cost and performance leveraging Elastic Load Balancing and EFS.
- Implement logging and monitoring with CloudWatch, CloudTrail, and on-premises monitoring solutions.
2. DevOps Automation & CI/CD:
- Develop and maintain CI/CD pipelines using Jenkins and GitLab for seamless code deployment across cloud and on-premises environments.
- Automate infrastructure provisioning using Ansible and CloudFormation.
- Implement CI/CD pipeline setups using GitLab, Maven, and Gradle, and deploy on Nginx and Tomcat.
- Ensure code quality and coverage using SonarQube.
- Monitor and troubleshoot pipelines and infrastructure using Prometheus, Grafana, Nagios, and New Relic.
3. System Administration & Infrastructure Management:
- Manage and maintain Linux and Windows systems across cloud and on-premises environments, ensuring timely updates and security patches.
- Configure and maintain application servers like Apache Tomcat and web servers like Nginx and Node.js.
- Implement robust security measures, SSL/TLS configurations, and secure communications.
- Configure DNS and SSL certificates.
- Maintain and optimize on-premises storage, networking, and compute resources.
4. Collaboration & Documentation:
- Collaborate with development, security, and operations teams to optimize deployment and infrastructure processes.
- Provide best practices and recommendations for hybrid cloud and on-premises architecture, DevOps, and security.
- Document infrastructure designs, security configurations, and disaster recovery plans for both environments.
Required Skills & Qualifications:
- Cloud & On-Premises Expertise: extensive knowledge of AWS services (EC2, IAM, VPC, RDS, etc.) and experience managing on-premises infrastructure.
- DevOps Tools: proficiency in SCM tools (Git, GitLab), CI/CD (Jenkins, GitLab CI/CD), and containerization.
- Code Quality & Monitoring: experience with SonarQube, Prometheus, Grafana, Nagios, and New Relic.
- Operating Systems: experience managing Linux/Windows servers and working with CentOS, Fedora, Debian, and Windows platforms.
- Application & Web Servers: hands-on experience with Apache Tomcat, Nginx, and Node.js.
- Security & Networking: expertise in DNS configuration, SSL/TLS implementation, and AWS security services.
- Soft Skills: strong problem-solving abilities, effective communication, and proactive learning.
Preferred Qualifications: AWS certifications (Solutions Architect, DevOps Engineer) and a bachelor's degree in Computer Science or a related field. Experience with hybrid cloud environments and on-premises infrastructure automation.
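Much of the CloudWatch monitoring above can be scripted with boto3. A minimal sketch that reads an hour of EC2 CPU utilization; the region and instance ID are placeholders, and AWS credentials are assumed to be configured in the environment or via an IAM role:

```python
# Minimal sketch: fetch hourly EC2 CPU utilization from CloudWatch.
from datetime import datetime, timedelta, timezone
import boto3

cw = boto3.client("cloudwatch", region_name="ap-south-1")

resp = cw.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=300,                 # 5-minute datapoints
    Statistics=["Average"],
)

for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], f"{point['Average']:.1f}%")
```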
Posted 3 weeks ago
6.0 - 10.0 years
10 - 12 Lacs
Gurugram
Work from Office
We are seeking a Senior Solution Consultant with 6-10 years of experience in API integration, chatbot development, and data analysis tools. The role involves designing solutions, managing virtual assistants' architecture, gathering business requirements, and leading end-to-end delivery. The candidate should have experience with APIs such as OData, REST, and SOAP, as well as tools like Postman, Excel, and Google Sheets. Familiarity with NoSQL databases such as MongoDB, natural language processing, chatbot platforms (Dialogflow, Microsoft Bot Framework), and analytical tools like Power BI, Tableau, or Grafana is essential. The consultant will work closely with cross-functional teams, mentor other consultants, and ensure seamless integration with customer systems. Skills: API Integration, Chatbot Development, OData, REST, SOAP, Node.js, Postman, Google Sheets, MongoDB, Power BI, Tableau, Grafana, Natural Language Processing, Microsoft Bot Framework, Dialogflow, Data Analysis.
Posted 3 weeks ago
5.0 - 8.0 years
30 - 32 Lacs
Gurugram, Sector-39
Work from Office
Responsibilities: As a Senior DevOps Engineer at SquareOps, you'll be expected to:
- Drive the scalability and reliability of our customers' cloud applications.
- Work directly with clients, engineering, and infrastructure teams to deliver high-quality solutions.
- Design and develop various systems from scratch with a focus on scalability, security, and compliance.
- Develop deployment strategies and build configuration management systems.
- Lead a team of junior DevOps engineers, providing guidance and support on day-to-day activities.
- Drive innovation within the team, promoting the adoption of new technologies and practices to improve project outcomes.
- Demonstrate ownership and accountability for project implementations, ensuring projects are delivered on time and within budget.
- Act as a mentor to junior team members, fostering a culture of continuous learning and growth.
The Ideal Candidate:
- A proven track record in architecting complex production systems with multi-tier application stacks.
- Expertise in designing solutions tailored to industry-specific requirements such as SaaS, AI, Data Ops, and highly compliant enterprise architectures.
- Extensive experience working with Kubernetes, various CI/CD tools, and cloud service providers, preferably AWS.
- Proficiency in automating cloud infrastructure management, primarily with tools like Terraform, shell scripting, AWS Lambda, and EventBridge.
- Solid understanding of cloud financial management strategies to ensure cost-effective use of cloud resources.
- Experience in setting up high availability and disaster recovery for cloud infrastructure.
- Strong problem-solving skills with an innovative mindset.
- Excellent communication skills, capable of effectively liaising with clients, engineering, and infrastructure teams.
- The ability to lead and mentor a team, guiding them to achieve their objectives.
- High levels of empathy and emotional intelligence, with a talent for managing and resolving conflict.
- An adaptable nature, comfortable working in a fast-paced, dynamic environment.
At SquareOps, we believe in the power of diversity and inclusion. We encourage applicants of all backgrounds, experiences, and perspectives to apply.
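Automation with AWS Lambda and EventBridge, as mentioned above, often takes the form of a small scheduled handler. A hypothetical sketch that flags EC2 instances missing a required tag; the tag policy is an assumption, and any follow-up action is left to policy:

```python
# Minimal sketch: Lambda handler for a scheduled EventBridge rule that
# reports EC2 instances missing a required tag.
import boto3

REQUIRED_TAG = "owner"  # hypothetical tagging policy

def lambda_handler(event, context):
    ec2 = boto3.client("ec2")
    untagged = []
    # Pagination omitted for brevity; use get_paginator("describe_instances")
    # for larger fleets.
    for reservation in ec2.describe_instances()["Reservations"]:
        for inst in reservation["Instances"]:
            tags = {t["Key"] for t in inst.get("Tags", [])}
            if REQUIRED_TAG not in tags:
                untagged.append(inst["InstanceId"])
    # Downstream: notify (SNS/Slack) or stop the instances, per policy.
    return {"untagged": untagged}
```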
Posted 3 weeks ago
7.0 - 10.0 years
13 - 23 Lacs
Bengaluru
Work from Office
Title: DevOps Engineer. Location: Bangalore office (4 days WFO). Experience: 7 to 10 years. Skills: DevOps, Kubernetes, CI/CD, Prometheus or Grafana, AWS, basic SRE.
Posted 3 weeks ago
8.0 - 10.0 years
13 - 15 Lacs
Pune
Work from Office
We are seeking a hands-on Lead Data Engineer to drive the design and delivery of scalable, secure data platforms on Google Cloud Platform (GCP). In this role you will own architectural decisions, guide service selection, and embed best practices across data engineering, security, and performance disciplines. You will partner with data modelers, analysts, security teams, and product owners to ensure our pipelines and datasets serve analytical, operational, and AI/ML workloads with reliability and cost efficiency. Familiarity with Microsoft Azure data services (Data Factory, Databricks, Synapse, Fabric) is valuable, as many existing workloads will transition from Azure to GCP.
Key Responsibilities:
- Lead end-to-end development of high-throughput, low-latency data pipelines and lakehouse solutions on GCP (BigQuery, Dataflow, Pub/Sub, Dataproc, Cloud Composer, Dataplex, etc.).
- Define reference architectures and technology standards for data ingestion, transformation, and storage.
- Drive service-selection trade-offs (cost, performance, scalability, and security) across streaming and batch workloads.
- Conduct design reviews and performance-tuning sessions; ensure adherence to partitioning, clustering, and query-optimization standards in BigQuery.
- Contribute to long-term cloud data strategy, evaluating emerging GCP features and multi-cloud patterns (Azure Synapse, Data Factory, Purview, etc.) for future adoption.
- Lead code reviews and oversee development activities delegated to data engineers.
- Implement best practices recommended by Google Cloud.
- Provide effort estimates for data engineering activities.
- Participate in discussions to migrate existing Azure workloads to GCP; provide solutions to migrate the workloads for selected data pipelines.
Must-Have Skills:
- 8-10 years in data engineering, with 3+ years leading teams or projects on GCP.
- Expert in GCP data services (BigQuery, Dataflow/Apache Beam, Dataproc/Spark, Pub/Sub, Cloud Storage) and orchestration with Cloud Composer or Airflow.
- Proven track record designing and optimizing large-scale ETL/ELT pipelines (streaming + batch).
- Strong fluency in SQL and one major programming language (Python, Java, or Scala).
- Deep understanding of data lake/lakehouse architectures, dimensional and data-vault modeling, and data governance frameworks.
- Excellent communication and stakeholder-management skills; able to translate complex technical topics to non-technical audiences.
Nice-to-Have Skills:
- Hands-on experience with Microsoft Azure data services (Azure Synapse Analytics, Data Factory, Event Hub, Purview).
- Experience integrating ML pipelines (Vertex AI, Dataproc ML) or real-time analytics (BigQuery BI Engine, Looker).
- Familiarity with open-source observability stacks (Prometheus, Grafana) and FinOps tooling for cloud cost optimization.
Preferred Certifications:
- Google Professional Data Engineer (strongly preferred) or Google Professional Cloud Architect.
- Microsoft Certified: Azure Data Engineer Associate (nice to have).
Education: Bachelor's or Master's degree in Computer Science, Information Systems, Engineering, or a related technical field. Equivalent professional experience will be considered.
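The BigQuery partition-pruning standards mentioned above come down to filtering on the partition column so only the relevant partitions are scanned. A minimal sketch with the official `google-cloud-bigquery` client; the project, dataset, and table names are hypothetical, and `event_date` is assumed to be the table's partitioning column:

```python
# Minimal sketch: a partition-pruned aggregation query in BigQuery.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

sql = """
    SELECT user_id, COUNT(*) AS events
    FROM `my-project.analytics.events`
    WHERE event_date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
                         AND CURRENT_DATE()   -- prunes partitions by date
    GROUP BY user_id
    ORDER BY events DESC
    LIMIT 10
"""

for row in client.query(sql).result():
    print(row.user_id, row.events)
```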
Posted 3 weeks ago
2.0 - 4.0 years
11 - 12 Lacs
Bengaluru
Work from Office
Employment Type: Contract. iSource Services is hiring for one of their clients for the position of Commerce - DevOps - Engineer II.
About the Role: We are looking for a skilled DevOps Engineer (Level II) to support our Commerce platform. The ideal candidate will have 2-4 years of experience with a strong foundation in DevOps practices, CI/CD pipelines, and solid exposure to React.js, Node.js, and MongoDB for build and deployment automation.
Key Responsibilities:
- Manage CI/CD pipelines and deployment automation for commerce applications.
- Collaborate with development teams using React.js, Node.js, and MongoDB.
- Monitor system performance, automate infrastructure, and troubleshoot production issues.
- Maintain and improve infrastructure as code using tools like Terraform, Ansible, or similar.
- Ensure security, scalability, and high availability of environments.
- Participate in incident response and post-mortem analysis.
Qualifications:
- 2-4 years of hands-on experience in DevOps engineering.
- Proficiency in CI/CD tools (e.g., Jenkins, GitHub Actions, GitLab CI).
- Working knowledge of React.js, Node.js, and MongoDB.
- Experience with containerization (Docker, Kubernetes).
- Familiarity with monitoring tools (e.g., Prometheus, Grafana, ELK stack).
- Good scripting skills (Shell, Python, or similar).
Posted 3 weeks ago
3.0 - 5.0 years
5 - 7 Lacs
Pune
Work from Office
Role Overview: Join our Pune AI Center of Excellence to drive software and product development in the AI space. As an AI/ML Engineer, you'll build and ship core components of our AI products, owning end-to-end RAG pipelines, persona-driven fine-tuning, and scalable inference systems that power next-generation user experiences.
Key Responsibilities:
- Model Fine-Tuning & Persona Design: adapt and fine-tune open-source large language models (LLMs) (e.g., CodeLlama, StarCoder) to specific product domains. Define and implement "personas" (tone, knowledge scope, guardrails) at inference time to align with product requirements.
- RAG Architecture & Vector Search: build retrieval-augmented generation systems: ingest documents, compute embeddings, and serve with FAISS, Pinecone, or ChromaDB. Design semantic chunking strategies and optimize context-window management for product scalability.
- Software Pipeline & Product Integration: develop production-grade Python data pipelines (ETL) for real-time vector indexing and updates. Containerize model services in Docker/Kubernetes and integrate into CI/CD workflows for rapid iteration.
- Inference Optimization & Monitoring: quantize and benchmark models for CPU/GPU efficiency; implement dynamic batching and caching to meet product SLAs. Instrument monitoring dashboards (Prometheus/Grafana) to track latency, throughput, error rates, and cost.
- Prompt Engineering & UX Evaluation: craft, test, and iterate prompts for chatbots, summarization, and content extraction within the product UI. Define and track evaluation metrics (ROUGE, BLEU, human feedback) to continuously improve the product's AI outputs.
Must-Have Skills:
- ML/AI Experience: 3-4 years in machine learning and generative AI, including 18 months on LLM-based products.
- Programming & Frameworks: Python, PyTorch (or TensorFlow), Hugging Face Transformers.
- RAG & Embeddings: hands-on with FAISS, Pinecone, or ChromaDB and semantic chunking.
- Fine-Tuning & Quantization: experience with LoRA/QLoRA, 4-bit/8-bit quantization, and model context protocol (MCP).
- Prompt & Persona Engineering: deep expertise in prompt-tuning and persona specification for product use cases.
- Deployment & Orchestration: Docker, Kubernetes fundamentals, CI/CD pipelines, and GPU setup.
Nice-to-Have:
- Multi-modal AI combining text, images, or tabular data.
- Agentic AI systems with reasoning and planning loops.
- Knowledge-graph integration for enhanced retrieval.
- Cloud AI services (AWS SageMaker, GCP Vertex AI, or Azure Machine Learning).
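The retrieval half of the RAG pipelines described above can be sketched in a few lines. A minimal example using `sentence-transformers` for embeddings and FAISS for nearest-neighbour search; the model choice and sample chunks are illustrative assumptions:

```python
# Minimal RAG retrieval sketch: embed chunks, index in FAISS, search.
import faiss
from sentence_transformers import SentenceTransformer

chunks = [
    "Kafka brokers persist messages to partitioned logs.",
    "FAISS performs fast nearest-neighbour search over vectors.",
    "LoRA fine-tunes large models with low-rank adapter matrices.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode(chunks, normalize_embeddings=True)

index = faiss.IndexFlatIP(emb.shape[1])  # inner product = cosine on unit vectors
index.add(emb)

query = model.encode(["How does vector search work?"], normalize_embeddings=True)
scores, ids = index.search(query, 2)  # top-2 neighbours
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {chunks[i]}")
```

In a full pipeline the retrieved chunks would be stuffed into the LLM's context window along with the user query, which is where the chunking and context-management strategies mentioned above come in.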
Posted 3 weeks ago
8.0 - 11.0 years
35 - 37 Lacs
Kolkata, Ahmedabad, Bengaluru
Work from Office
Dear Candidate, We are hiring an SRE to improve reliability and scalability of production systems. Ideal for engineers passionate about automation, monitoring, and performance optimization. Key Responsibilities: Design and implement SLOs, SLAs, and alerting systems Automate operational tasks and incident responses Build robust observability into all services Conduct post-incident reviews and root cause analysis Required Skills & Qualifications: Strong coding/scripting skills (Python, Go, Bash) Experience with cloud services and Kubernetes Knowledge of monitoring/logging tools (Datadog, Prometheus, ELK) Bonus: Background in performance engineering or chaos testing Soft Skills: Strong troubleshooting and problem-solving skills. Ability to work independently and in a team. Excellent communication and documentation skills. Note: If interested, please share your updated resume and preferred time for a discussion. If shortlisted, our HR team will contact you. Kandi Srinivasa Delivery Manager Integra Technologies
Posted 3 weeks ago
1.0 - 6.0 years
2 - 6 Lacs
Bengaluru
Work from Office
We are seeking an experienced OpenShift Engineer to design, deploy, and manage containerized applications on Red Hat OpenShift. Key Responsibilities: Design, deploy, and manage OpenShift container platforms in on-premises and cloud environments. Configure and optimize OpenShift clusters to ensure high availability and scalability. Implement CI/CD pipelines and automation for containerized applications. Monitor and troubleshoot OpenShift environments, identifying and resolving issues proactively. Work closely with development teams to support containerized application deployment and orchestration. Manage security policies, access controls, and compliance for OpenShift environments. Perform upgrades, patches, and maintenance of OpenShift infrastructure. Develop and maintain documentation for OpenShift architecture, configurations, and best practices. Stay updated with industry trends and emerging technologies in containerization and Kubernetes. Deploy, configure, and manage OpenShift clusters in hybrid/multi-cloud environments. Automate deployments using CI/CD pipelines (Jenkins, GitLab CI/CD, ArgoCD). Troubleshoot Kubernetes/OpenShift-related issues and optimize performance. Implement security policies and best practices for containerized workloads. Work with developers to containerize applications and manage microservices. Monitor and manage OpenShift clusters using Prometheus, Grafana, and logging tools.
Posted 3 weeks ago
3.0 - 8.0 years
15 - 20 Lacs
Pune
Work from Office
About the job: Sarvaha would like to welcome a Kafka Platform Engineer (or a seasoned backend engineer aspiring to move into platform architecture) with a minimum of 4 years of solid experience in building, deploying, and managing Kafka infrastructure on Kubernetes platforms. Sarvaha is a niche software development company that works with some of the best-funded startups and established companies across the globe. Please visit our website.
What You'll Do
- Deploy and manage scalable Kafka clusters on Kubernetes using Strimzi, Helm, Terraform, and StatefulSets
- Tune Kafka for performance, reliability, and cost-efficiency
- Implement Kafka security: TLS, SASL, ACLs, Kubernetes Secrets, and RBAC
- Automate deployments across AWS, GCP, or Azure
- Set up monitoring and alerting with Prometheus, Grafana, and JMX Exporter
- Integrate Kafka ecosystem components: Connect, Streams, Schema Registry
- Define autoscaling, resource limits, and network policies for Kubernetes workloads
- Maintain CI/CD pipelines (ArgoCD, Jenkins) and container workflows
You Bring
- BE/BTech/MTech (CS/IT or MCA), with an emphasis in Software Engineering
- Strong foundation in the Apache Kafka ecosystem and internals (brokers, ZooKeeper/KRaft, partitions, storage)
- Proficiency in Kafka setup, tuning, scaling, and topic/partition management
- Skill in managing Kafka on Kubernetes using Strimzi, Helm, and Terraform
- Experience with CI/CD, containerization, and GitOps workflows
- Monitoring expertise using Prometheus, Grafana, and JMX
- Experience on EKS, GKE, or AKS preferred
- Strong troubleshooting and incident-response mindset
- High sense of ownership and automation-first thinking
- Excellent collaboration with SREs, developers, and platform teams
- Clear communicator, documentation-driven, and eager to mentor and share knowledge
Why Join Sarvaha?
- Top-notch remuneration and excellent growth opportunities
- An excellent, no-nonsense work environment with the very best people to work with
- Highly challenging software implementation problems
- Hybrid mode: we offered complete work from home even before the pandemic.
Posted 3 weeks ago
1.0 - 2.0 years
6 - 8 Lacs
Bengaluru
Work from Office
CI/CD Developer || 1-2 years exp || Bangalore || Work from office
Roles & Responsibilities
- Automate and optimize CI/CD workflows to enhance efficiency and developer productivity.
- Design, implement, and maintain automated CI/CD pipelines for seamless code testing, building, and deployment.
- Integrate automated testing (unit, integration, performance) to ensure code quality before deployment.
- Manage and monitor CI/CD/DevOps infrastructure to ensure high availability.
- Embed security best practices in the DevOps pipeline, addressing vulnerabilities early and ensuring compliance.
- Oversee monitoring, logging, root cause analysis, and preventive measures for system failures.
- Manage user roles and permissions, and enforce security policies across environments.
- Generate actionable insights through interactive reports and visualizations using Power BI.
- Collaborate with development teams to understand CI/CD needs and deliver effective solutions.
- Possess strong analytical, technical, and problem-solving skills with a research-driven approach.
- Be a self-starter, contributing to the adoption of DevOps/CI/CD practices.
- Research and evaluate new DevOps tools for continuous improvement.
- Document CI/CD/DevOps infrastructure, workflows, and automation processes.
Technical Skills
- Programming and automation: Python, Windows batch scripts/PowerShell
- Good knowledge of the Windows platform
- Build tool: Jenkins
- Version control: Subversion
- Visualization and reporting: Power BI
- Cloud computing, containerization, and orchestration
You are best equipped for this role if you have
- Expertise and working knowledge of Agile software development methodology
- Expert knowledge and hands-on experience in scripting (PowerShell/batch/Python), automation, and DevOps tools and methodologies
- Expert knowledge and working experience in build automation using Jenkins
- Hands-on experience creating and managing Jenkins pipelines
- Skill in Jenkins server administration
- Hands-on experience with version control tools: Subversion (SVN), Git
- Skill in administering version control tools on the server: Subversion (SVN), Git
- The ability to use and integrate different industry-standard tools that fit the different parts of the SDLC
- Knowledge of Power BI for visualization and reporting
- Knowledge of cloud computing and containerization orchestration
- A team-player attitude with good communication skills
Nice to Have
- Knowledge of and exposure to containerization using Docker, Kubernetes, OpenShift
- Knowledge of and exposure to monitoring and logging using Prometheus, Grafana
- Understanding of the complete software development life cycle (SDLC)
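Build automation against Jenkins, as this role requires, is commonly driven through its remote-access API. A minimal Python sketch that triggers a parameterized job and waits for the queue item to become a build; the server URL, job name, and credentials are placeholders:

```python
# Minimal sketch: trigger a parameterized Jenkins build and poll the queue.
import time
import requests

JENKINS = "https://jenkins.example.com"   # placeholder server
AUTH = ("ci-bot", "api-token")            # username + API token (placeholder)

resp = requests.post(
    f"{JENKINS}/job/nightly-build/buildWithParameters",
    params={"BRANCH": "main"},
    auth=AUTH,
    timeout=10,
)
resp.raise_for_status()
queue_url = resp.headers["Location"]  # Jenkins returns the queue-item URL

# Poll until the queue item is assigned a build.
while True:
    item = requests.get(f"{queue_url}api/json", auth=AUTH, timeout=10).json()
    if item.get("executable"):
        print("build started:", item["executable"]["url"])
        break
    time.sleep(2)
```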
Posted 3 weeks ago
Grafana is a popular tool used for monitoring and visualizing metrics, logs, and other data. In India, the demand for Grafana professionals is on the rise as more companies are adopting this tool for their monitoring and analytics needs.
The average salary range for Grafana professionals in India varies based on experience level: - Entry-level: ₹4-6 lakhs per annum - Mid-level: ₹8-12 lakhs per annum - Experienced: ₹15-20 lakhs per annum
A typical career path in Grafana may include roles such as: 1. Junior Grafana Developer 2. Grafana Developer 3. Senior Grafana Developer 4. Grafana Tech Lead
In addition to Grafana expertise, professionals in this field often benefit from having knowledge or experience in: - Monitoring tools such as Prometheus - Data visualization tools like Tableau - Scripting languages (e.g., Python, Bash) - Understanding of databases (e.g., SQL, NoSQL)
As the demand for Grafana professionals continues to grow in India, it is essential to stay updated with the latest trends and technologies in this field. Prepare thoroughly for interviews and showcase your skills confidently to land your dream job in Grafana. Good luck!