609 Prometheus Jobs - Page 19

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

7 - 12 years

9 - 14 Lacs

Bengaluru

Work from Office

Project Role: Application Support Engineer
Project Role Description: Act as software detectives, provide a dynamic service identifying and solving issues within multiple components of critical business systems.
Must-have skills: Site Reliability Engineering
Good-to-have skills: NA
Minimum 7.5 years of experience is required.
Educational Qualification: 15 years of education
Summary: As a Site Reliability Engineer in Application/Cloud Support, you will be responsible for ensuring the reliability, scalability, and performance of critical business systems. Your typical day will involve identifying and solving issues within multiple components of these systems, utilizing your expertise in Site Reliability Engineering.
Roles & Responsibilities:
- Lead efforts to improve the reliability, scalability, and performance of critical business systems, utilizing your expertise in Site Reliability Engineering.
- Collaborate with cross-functional teams to identify and solve issues within multiple components of these systems, acting as a software detective to ensure their smooth operation.
- Develop and implement monitoring and alerting systems to proactively identify and address potential issues before they impact system performance.
- Automate manual processes and tasks to improve system efficiency and reduce the risk of human error.
- Stay up to date with the latest advancements in Site Reliability Engineering and related technologies, integrating innovative approaches for sustained competitive advantage.
Professional & Technical Skills:
- Must-have skills: Expertise in Site Reliability Engineering.
- Good-to-have skills: Experience with cloud technologies such as AWS or Azure, and proficiency in programming languages such as Python or Java.
- Strong understanding of system architecture and design principles, including experience with microservices and containerization technologies such as Docker and Kubernetes.
- Experience with monitoring and alerting tools such as Prometheus, Grafana, or Nagios.
- Solid grasp of automation and configuration management tools such as Ansible, Chef, or Puppet.
- Experience with incident management and root cause analysis processes, including experience with tools such as JIRA or ServiceNow.
Additional Information:
- The candidate should have a minimum of 7.5 years of experience in Site Reliability Engineering.
- The ideal candidate will possess a strong educational background in computer science, engineering, or a related field, along with a proven track record of delivering impactful solutions in Application/Cloud Support.
- This position is based at our Bengaluru office.
Qualification: 15 years of education

Posted 2 months ago

7 - 12 years

8 - 14 Lacs

Coimbatore

Work from Office

Key Responsibilities: Monitor and manage Linux servers to ensure high availability and optimal performance. Implement, configure, and maintain monitoring tools (e.g., Nagios, Zabbix, Prometheus, Grafana, Splunk, ELK). Develop and customize monitoring dashboards, alerts, and reports for system health and performance. Analyze system logs, troubleshoot server performance issues, and take preventive measures.

Posted 2 months ago

7 - 12 years

8 - 14 Lacs

Chandigarh

Work from Office

Key Responsibilities: Monitor and manage Linux servers to ensure high availability and optimal performance. Implement, configure, and maintain monitoring tools (e.g., Nagios, Zabbix, Prometheus, Grafana, Splunk, ELK). Develop and customize monitoring dashboards, alerts, and reports for system health and performance. Analyze system logs, troubleshoot server performance issues, and take preventive measures.

Posted 2 months ago

7 - 12 years

8 - 14 Lacs

Hyderabad

Work from Office

Key Responsibilities: Monitor and manage Linux servers to ensure high availability and optimal performance. Implement, configure, and maintain monitoring tools (e.g., Nagios, Zabbix, Prometheus, Grafana, Splunk, ELK). Develop and customize monitoring dashboards, alerts, and reports for system health and performance. Analyze system logs, troubleshoot server performance issues, and take preventive measures.

Posted 2 months ago

5 - 10 years

10 - 15 Lacs

Hyderabad

Work from Office

The Provider Technology Shared Services Engineering team is seeking a Software Engineer Lead Analyst for a Band 3 Contributor Career Track position. The Software Engineer Lead Analyst will play a critical role in system development within the broader Provider Technology Solutions and Engineering organization, significantly influencing Operations and Technology Product Management. This position will provide expertise in the engineering, design, installation, and startup of automated systems, including a self-service onboarding kit that enables users to begin utilizing the solution within minutes. The solutions developed will be accessible to individuals with minimal technical skills and will require no additional coding, ensuring zero maintenance is needed. As a member of our team, you will operate within a high-performance, high-frequency enterprise technology environment. This role entails collaborating closely with IT management and staff to identify automated solutions that leverage existing resources with tailored configurations for each use case. The objective is to minimize redundancy in solutions while promoting an enterprise mindset focused on reusability and maintaining high standards, ultimately ensuring minimal future maintenance requirements. The Software Engineer Lead Analyst demonstrates significant creativity, foresight, and sound judgment in the conception, planning, and execution of initiatives. This role requires extensive professional knowledge and expertise to effectively advise functional leaders. Additionally, the Lead Analyst stays informed about the latest advancements in technology, including AI and machine learning, to enhance both existing and new automation solutions. These solutions are designed to optimize production costs while facilitating the addition or updating of features aimed at improving the overall software development lifecycle experiences. 
Responsibilities:
- Provide comprehensive consultation to business unit and IT management, as well as personnel, regarding all facets of application development, testing and automation solutions across diverse development, financial, operational, and computing environments.
- Offer leadership and strategic vision in architectural design and DevOps guidance for the team.
- Conduct comprehensive research and evaluation of all potential solutions to recommend the most efficient and cost-effective automation solution that can be reused with an enterprise mindset, facilitating scalability for both existing and new applications with minimal modifications.
- Ensure that engineering solutions are aligned with the overall Technology strategy while addressing all application requirements.
- Demonstrate industry-leading technical abilities that enhance product quality and optimize day-to-day operations.
- Understand how changes impact work upstream and downstream, including various back-end and front-end architectural modules.
- Enhance personnel effectiveness using heat matrices to prioritize Quality and Development Engineering resources on high-impact interfaces while identifying areas of lesser focus.
- Proactively monitor and manage the design of supported automation solutions, ensuring scalability, stability, flexibility, simplicity, performance, availability, security, and capacity.
- Develop and implement automation solutions to improve engineering and operational efficiency.
- Troubleshoot and optimize automated solutions and related artifacts to ensure seamless execution in CI/CD pipelines and on local machines, minimizing software and package dependencies or conflicts to reduce cycle time.
- Execute on a strategy to hand over the automation solutions to every Agile team for adoption and use within their areas of focus, requiring zero maintenance and minimal effort for any enhancements without delving into coding.
- Encourage and build automated processes wherever possible.
- Be recognized internally as a subject matter expert.
Required Skills:
- A solid foundation and practical experience in programming languages are essential, with a particular emphasis on Python, Shell Scripts, Bash Scripts, Groovy Scripts, Ansible Scripts, and Docker Scripts being highly desirable.
- Experience with Linux-based infrastructure.
- Strong understanding of object-oriented programming concepts.
- Knowledge of version control systems (e.g., GitHub).
- Practical experience in software packaging and importing packages at runtime (e.g., JFrog, Quay).
- Proficient with continuous integration and continuous deployment (CI/CD) tools (e.g., Jenkins, CloudBees).
- Familiar with containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Knowledgeable about infrastructure monitoring tools (e.g., Prometheus, Grafana).
- Understanding of Infrastructure as Code (IaC) principles (e.g., Terraform, Helm).
- Demonstrated expertise in cloud infrastructure and platforms, including Amazon Web Services and OpenShift.
- Proven experience in developing and managing automated infrastructure, overseeing CI/CD pipelines, monitoring application performance, and working collaboratively with development and operations teams to enhance software delivery processes.
Required Experience & Education:
- A Bachelor's degree in Computer Science or a related field is required.
- A minimum of 5 years of professional experience in Automation or DevOps engineering is necessary.
- At least 3 years of experience in Agile methodologies is required.
- Familiarity with an onshore/offshore operational model is essential.
- Demonstrated experience in the architecture, design, and development of large-scale enterprise application solutions is required.
Desired Experience:
- Proficient in DevOps practices and automation techniques.
- Experienced in programming languages and scripting, including Python, Shell, Bash, Groovy, Ansible, and Docker.
- Providing coaching and guidance to team members.
Location & Hours of Work: Full-time position, working 40 hours per week. Expected overlap with US hours as appropriate. Primarily based in the Innovation Hub in Hyderabad, India, in a hybrid working model (3 days WFO and 2 days WAH).

Posted 2 months ago

10 - 20 years

30 - 35 Lacs

Chennai, Bengaluru, Kolkata

Work from Office

Exp.: M & SM (10+ and above)
- Must have 12+ years of experience as an SRE (Site Reliability Engineer).
- Grafana OSS stack for observability (Mimir, Loki, Tempo, Grafana Agent).
- Hands-on Azure/GCP experience, including pulling observability data from managed services.
- Golang/Python coding, or a solutioning background with experience in SRE development and OpenTelemetry implementation.
- Deploying, managing, and optimizing an enterprise-level observability platform for Grafana OSS products such as Mimir, Loki, and Tempo.
Exp.: 10-27 yrs
Contact Person: Megan
Contact Number: 9840758962
Email: Megan@gojobs.biz

Posted 2 months ago

5 - 10 years

5 - 10 Lacs

Bengaluru, Hyderabad

Work from Office

Role & responsibilities:
- Design, develop, and maintain high-quality Java and Kotlin applications, with a minimum of 5 years of experience.
- Demonstrated expertise in microservices architecture and implementation, utilizing frameworks like Spring Boot.
- Hands-on experience with CI/CD processes and tools, including GitHub, GitHub Actions, Azure DevOps, Prometheus, and Grafana/ELK stack.
- Proficient in event-driven architecture and messaging, specifically with Kafka.
- Experience with containerization technologies like Docker and orchestration with Kubernetes.
- Ability to work effectively in Agile/Scrum teams, contributing to planning and execution.
Location: Hyderabad/Bangalore

Posted 2 months ago

5 - 7 years

8 - 10 Lacs

Hyderabad

Work from Office

Position Overview: The Data Platform and Analytics Services (DPaAS) team in Finance IT is looking for a DevOps Senior Analyst to contribute to and provide guidance on the shared cloud infrastructure/data platform for Corporate Applications and US Market Solutions. This is a key growth area for the Finance organization and will lay a strong foundation for the cloud-based data platform to enable accurate and timely insights for the stakeholders to make informed and strategic decisions. The DevOps Analyst will be responsible for building a shared framework of cloud services as well as related tools and processes. This will enable the build-out of new data platforms or the enhancement of existing data platforms, leveraging AWS and/or Oracle cloud. The ideal candidate will have experience in design, development and automation of scalable cloud infrastructure supporting data workloads. The Senior DevOps Analyst must possess a combination of systems, technology and architecture experience optimizing cost, reliability, security, performance, and operational efficiency in order to drive innovation in both DevOps technology and processes.
Responsibilities:
- Collaborate with the Solution/Data Architect to provision and automate cloud infrastructure for hosting data applications using Terraform.
- Create, maintain, and enhance pipelines for Continuous Integration (CI) and Continuous Deployment (CD) of infrastructure and application code.
- Implement Cigna's security standards and controls governing cloud-based systems by partnering with the information protection/security team.
- Adhere to Cigna's cloud compliance requirements for AWS accounts.
- Monitor and log important network, system and application activity utilizing industry-standard tools.
- Troubleshoot issues based on alerts and logs.
- Administer Linux and Windows based systems.
- Develop KPIs that provide in-depth visibility into system health.
- Establish interfaces with SAML/SSO providers in Cigna.
- Ensure high availability, scalability, and security of production systems.
- Maintain awareness of industry best practices in the area of DevOps and evaluate their application to the data platform.
Qualifications:
Required Skills:
- The ideal candidate must have a broad and deep technical understanding of the technologies in this field, including but not limited to IaaC, RaaS, PaaC.
- Experience creating integration and deployment pipelines for data platforms is required.
- Experience in AWS compute (EC2, Lambda), networking (VPC, Subnets, Firewalls, etc.), storage (S3, EBS, EFS), security (IAM), encryption (KMS, TLS), data and analytics (Redshift, Glue, RDS), AI/ML (SageMaker), and containers (ECS, EKS) is needed for this role.
- Experience in open-source tools/technologies such as Airflow, Jenkins, Git, GitHub Actions, Terraform, Ansible, Prometheus, Grafana, Python, etc.
- AWS certifications (for example, AWS Certified DevOps Engineer) preferred.
Desired Skills:
- Excellent written and verbal communication skills to effectively communicate across teams and roles.
- Excellent analytical/troubleshooting skills and willingness to learn and apply innovative technologies.
- Ability to work collaboratively in a fast-paced, agile environment.
- Demonstrable ability to deliver projects on time, with high quality, and within budget.
Required Experience & Education:
- Bachelor of Science in Computer Science, Software Engineering, IT or related technical discipline, or equivalent combination of training and experience.
- 5+ years of hands-on technical expertise in DevOps using AWS cloud.
Work Shift: 1 to 10 PM IST, with door-to-door pick-up and drop

Posted 2 months ago

3 - 8 years

5 - 10 Lacs

Coimbatore

Work from Office

Project Role: Application Developer
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must-have skills: Kubernetes
Good-to-have skills: NA
Minimum 3 years of experience is required.
Educational Qualification: 15 years full-time education
Summary: As an Application Developer, you will design, build, and configure applications to meet business process and application requirements. You will be responsible for developing and implementing software solutions using Kubernetes. Your typical day will involve collaborating with cross-functional teams, analyzing user requirements, designing application architecture, coding, testing, and debugging applications. You will also provide technical support and troubleshoot issues to ensure the smooth functioning of applications.
Roles & Responsibilities:
- Expected to perform independently and become an SME.
- Active participation/contribution in team discussions is required.
- Contribute to providing solutions to work-related problems.
- Design, build, and configure applications to meet business process and application requirements.
- Collaborate with cross-functional teams to analyze user requirements and design application architecture.
- Code, test, and debug applications using Kubernetes.
- Provide technical support and troubleshoot issues to ensure the smooth functioning of applications.
- Stay updated with emerging technologies and industry trends.
- Contribute to the continuous improvement of software development processes and practices.
Professional & Technical Skills:
- Must-have skills: Proficiency in Kubernetes.
- Good-to-have skills: Experience with containerization technologies like Docker.
- Strong understanding of microservices architecture and deployment.
- Experience with cloud platforms like AWS or Azure.
- Knowledge of CI/CD pipelines and automation tools like Jenkins.
- Familiarity with monitoring and logging tools such as Prometheus and ELK stack.
Additional Information:
- The candidate should have a minimum of 3 years of experience in Kubernetes.
- This position is based at our Bengaluru office.
- A 15 years full-time education is required.
Qualifications: 15 years full-time education

Posted 2 months ago

3 - 7 years

13 - 17 Lacs

Bengaluru

Work from Office

Site Reliability Engineer - Private Cloud
Our mission at Booking.com is to create transformative, innovative, and personalized travel experiences for millions of customers all across the world. We want customers to have an amazing experience wherever and whenever they choose: mobile, web, and through partners and 3rd parties.
About the team - Private Cloud: The Private Cloud group operates, orchestrates, and optimizes Booking-managed cloud infrastructure. The Private Cloud capabilities are provided on platform instances that are privately owned and centrally managed by Booking.com. These platform instances, and the workloads running on them, are hosted both in Booking datacenters (on-premises) and on public cloud infrastructure (AWS). The Private Cloud platform has three primary internal customer-facing verticals: virtualization, containerization, and serverless, corresponding to the three types of workloads it supports. At the highest level, the Booking Private Cloud drives three primary business outcomes:
- Agility in provisioning and using cloud infrastructure.
- Efficiency in cost and utilization of cloud infrastructure, as well as toil reduction for developers and engineers.
- Trust in the safety, reliability, and performance of our cloud infrastructure.
Years of Experience: 2-5 years
Key Job Responsibilities and Duties: The core premise for the Booking SRE lies in treating operational issues as a software problem. We code our way out of problems where operations are concerned, addressing availability, scalability, latency, and efficiency challenges within the vast infrastructure here at Booking.
- You will impact millions of people all over the globe with your creative solutions.
- You will work in one of the biggest e-commerce companies in the world.
- You will solve exciting problems at scale by writing and deploying code across tens of thousands of servers.
- You will have the opportunity to collaborate with many of the world's leading SREs.
- You will be free to launch your own ideas and solutions within our sophisticated production environment.
Here are some of the tools and technologies we use to achieve this: Python, Go, Puppet, Kubernetes, Elasticsearch, Prometheus, HAProxy, Cassandra, Kafka, etc.
What you'll be doing:
- Design, develop and implement systems software that improves the stability, scalability, availability and latency of the Booking.com products;
- Take ownership of one or more services and have the freedom to do what is best for our business and customers;
- Solve problems occurring with our highly available production systems and build solutions and automation to prevent them from happening again;
- Build effective monitoring to monitor the health of your system, and jump in to handle outages;
- Build and run capacity tests to handle the growth of your systems;
- Plan for reliability by designing systems to work across our multinational data centers;
- Develop tools to assist the product development teams with successfully deploying 1000s of change sets every day;
- Share the on-call rotation and be an escalation contact for incidents (depending on level of role).
What you'll bring:
- Solid experience in at least one programming language;
- Experience with building, operating and maintaining scalable distributed systems, and with operations automation;
- Experience with Infrastructure as Code technologies;
- Knowledge of cloud computing fundamentals;
- Solid foundation in Linux administration and troubleshooting;
- Understanding of service level agreements and objectives;
- Additional experience in OpenStack, Kubernetes, Networking, Security or Storage is desirable;
- Monitoring/observability technologies like Prometheus, Graphite, Grafana, Kibana, Elasticsearch are a plus;
- Good interpersonal skills;
- Proficient command of the English language, both written and spoken.

Posted 2 months ago

2 - 7 years

5 - 10 Lacs

Gurgaon

Work from Office

Title: Sr. EMS Analyst
Location: Gurgaon, India
Who We Are: Fareportal is a travel technology company powering a next-generation travel concierge service. Utilizing its innovative technology and company owned and operated global contact centers, Fareportal has built strong industry partnerships providing customers access to over 600 airlines, a million lodgings, and hundreds of car rental companies around the globe. With a portfolio of consumer travel brands including CheapOair and OneTravel, Fareportal enables consumers to book online, on mobile apps for iOS and Android, by phone, or live chat. Fareportal provides its airline partners with access to a broad customer base that books high-yielding international travel and add-on ancillaries. Fareportal is one of the leading sellers of airline tickets in the United States. We are a progressive company that leverages technology and expertise to deliver optimal solutions for our suppliers, customers, and partners.
Fareportal Highlights:
- Fareportal is the number 1 privately held online travel company in flight volume.
- Fareportal partners with over 600 airlines, 1 million lodgings, and hundreds of car rental companies worldwide.
- 2019 annual sales exceeded $5 billion.
- Fareportal sees over 150 million unique visitors annually to our desktop and mobile sites.
- Fareportal, with its global workforce of over 2,600 employees, is strategically positioned with 9 offices in 6 countries and headquartered in New York City.
Job Description and Responsibilities:
- Monitor, on a 24x7 basis, the health of servers, network, applications, websites and APIs, and report alarms utilizing network/systems/application/website monitoring tools.
- Hands-on work with enterprise monitoring tools, i.e. MS SCOM/SolarWinds/AppInsight/Elastic/Grafana/Prometheus, along with SaaS-based monitoring solutions, i.e. Rigor/Website pulse/Catchpoint, etc.
- Should have a clear understanding of public cloud, i.e. AWS and Azure, and its monitoring solutions, Datadog/CloudWatch/AppInsight, etc.
- Identify, diagnose, and resolve issues.
- Create and maintain comprehensive documentation.
- Strong problem-solving and troubleshooting skills.
- Monitor performance, capacity, and availability of the IT components on an ongoing basis.
- Recommend improvements in technologies and practices to increase uptime.
- Collect and review performance reports for various systems, and report trends in network, server, application and website performance.
- Provide timely response to all incidents, outages and performance alerts.
- Categorize issues for escalation to appropriate technical teams.
- Should have a clear understanding of Incident Management; experience with P1/P2 incidents is mandatory.
- Should have a clear understanding of Change Management as well as Problem Management.
- Ensure timely follow-up with cross-functional teams via e-mail, phone calls and Slack.
- Willing to do rotational shifts, 24x7.
Required Skills & Qualifications:
- 2+ years of experience in enterprise monitoring tools, application and server monitoring and troubleshooting, or a similar role.
- Work experience with Windows Servers, SQL, and network equipment.
- Familiarity with scripting, network security, firewalls or Linux environments.
- Bachelor's Degree/Diploma in Computer Science, Information Systems, Engineering, Business or a technical discipline.
Preferred Skills & Qualifications:
- Willing to do rotational shifts, 24x7.
- Strong problem-solving and troubleshooting skills.
- Aptitude for learning new technologies, interest in professional development.
- Good technical skills in networks, firewalls, and servers.
- Good communicator with a natural aptitude for dealing with people.
- Should be a team player.
- Quick learner and able to deal with a wide range of issues.
- Good analytical skills and able to collate and interpret data from various sources.
- Ability to assess and prioritize faults and respond or escalate accordingly.
Disclaimer: This job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee. Fareportal reserves the right to change the job duties, responsibilities, expectations or requirements posted here at any time at the Company's sole discretion, with or without notice.

Posted 2 months ago

6 - 10 years

8 - 12 Lacs

Bengaluru

Work from Office

About The Role:
Job Title: ELK & Grafana Architect
- Design, implement, and optimize ELK solutions to meet data analytics and search requirements.
- Collaborate with development and operations teams to enhance logging capabilities.
- Implement and configure components of the Elastic Stack, including Filebeat, Metricbeat, Winlogbeat, Logstash, and Kibana.
- Create and maintain comprehensive documentation for Elastic Stack configurations and processes.
- Ensure seamless integration between various Elastic Stack components.
- Develop and maintain advanced Kibana dashboards and visualizations.
- Design and implement solutions for centralized logs, infrastructure health metrics, and distributed tracing for different applications.
- Implement Grafana for visualization and monitoring, including Prometheus and Loki for metrics and logs management.
- Build detailed technical designs related to monitoring as part of complex projects.
- Ensure engagement with customers and deliver business value.
Requirements:
- 6+ years of experience as an ELK Architect/Elasticsearch Architect.
- Hands-on experience with Prometheus, Loki, OpenTelemetry, and Azure Monitor.
- Experience with data pipelines and redirecting Prometheus metrics.
- Proficiency in scripting and programming languages such as Python, Ansible, and Bash.
- Familiarity with CI/CD deployment pipelines (Ansible, Git).
- Strong knowledge of performance monitoring, metrics, capacity planning, and management.
- Excellent communication skills with the ability to articulate technical details to different audiences.
- Experience with application onboarding, capturing requirements, understanding data sources, and architecture diagrams.
- Experience with OpenTelemetry monitoring and logging solutions.
Competency Building and Branding:
- Ensure completion of necessary trainings and certifications.
- Develop Proof of Concepts (POCs), case studies, demos, etc. for new growth areas based on market and customer research.
- Develop and present Wipro's point of view on solution design and architecture by writing white papers, blogs, etc.
- Attain market referenceability and recognition through highest analyst rankings, client testimonials and partner credits.
- Be the voice of Wipro's thought leadership by speaking in forums (internal and external).
- Mentor developers, designers and junior architects in the project for their further career development and enhancement.
- Contribute to the architecture practice by conducting selection interviews, etc.

Posted 2 months ago

7 - 12 years

8 - 14 Lacs

Ahmedabad

Work from Office

Key Responsibilities: Monitor and manage Linux servers to ensure high availability and optimal performance. Implement, configure, and maintain monitoring tools (e.g., Nagios, Zabbix, Prometheus, Grafana, Splunk, ELK). Develop and customize monitoring dashboards, alerts, and reports for system health and performance. Analyze system logs, troubleshoot server performance issues, and take preventive measures.

Posted 2 months ago

7 - 12 years

8 - 14 Lacs

Patna

Work from Office

Key Responsibilities: Monitor and manage Linux servers to ensure high availability and optimal performance. Implement, configure, and maintain monitoring tools (e.g., Nagios, Zabbix, Prometheus, Grafana, Splunk, ELK). Develop and customize monitoring dashboards, alerts, and reports for system health and performance. Analyze system logs, troubleshoot server performance issues, and take preventive measures.

Posted 2 months ago

7 - 12 years

8 - 14 Lacs

Pune

Work from Office

Key Responsibilities: Monitor and manage Linux servers to ensure high availability and optimal performance. Implement, configure, and maintain monitoring tools (e.g., Nagios, Zabbix, Prometheus, Grafana, Splunk, ELK). Develop and customize monitoring dashboards, alerts, and reports for system health and performance. Analyze system logs, troubleshoot server performance issues, and take preventive measures.

Posted 2 months ago

8 - 13 years

20 - 30 Lacs

Chennai

Work from Office

  • Cloud networking and security
  • Linux and Windows experience
  • Scripting and automation skills (e.g., Python, PowerShell)
  • Proficiency in infrastructure as code (IaC) and configuration management tools (e.g., Terraform, Ansible)

Required Candidate Profile

  • Containerization and orchestration technologies (e.g., Docker, Kubernetes, Rancher)
  • Monitoring and logging tools (e.g., Nagios, Prometheus, ELK stack)
  • Virtualization, VPN, RDP, SSO, Kafka

Posted 2 months ago

5 - 9 years

14 - 24 Lacs

Chennai

Work from Office

  • Cloud networking and security
  • Linux and Windows experience
  • Scripting and automation skills (e.g., Python, PowerShell)
  • Proficiency in infrastructure as code (IaC) and configuration management tools (e.g., Terraform, Ansible)

Required Candidate Profile

  • Containerization and orchestration technologies (e.g., Docker, Kubernetes, Rancher)
  • Monitoring and logging tools (e.g., Nagios, Prometheus, ELK stack)
  • Virtualization, VPN, RDP, SSO, Kafka

Posted 2 months ago

5 - 10 years

8 - 18 Lacs

Bengaluru

Hybrid

EMS and Observability Consultant. Location: Bangalore (Hybrid). Experience: 5+ years. Skills: implementation, solutioning, Prometheus, Grafana, OpsRamp, ELK, Ansible, Shell, Terraform, Python.

Posted 2 months ago

10 - 15 years

15 - 20 Lacs

Pune

Remote

NEOGOV is the market and technology leader in human resources software for the public sector. Our innovative HR platform automates the entire employee lifecycle, from recruitment through offboarding, empowering governments, educational institutions, and public sector organizations to better serve their communities. We are looking for a Senior DevOps Engineer to join our dynamic team, supporting mission-critical environments across both Windows and Linux platforms.

Role & responsibilities: As a Senior DevOps Engineer, you will play a key role in managing and scaling our production systems, release pipelines, and container environments in complex private cloud setups. This is a hands-on position working closely with Development and QA teams, supporting enterprise applications across a diverse technology stack while driving automation and reliability at scale.

Preferred candidate profile: 10+ years of experience in DevOps roles supporting complex, high-availability systems. Proven success in diagnosing and resolving complex application support issues in production environments. Experience working in Agile or Lean development teams. Expertise in: Windows IIS and WebLogic administration; Linux environments (Ubuntu, RHEL, CentOS) with a focus on performance and reliability; technologies such as Solr, Redis, Elasticsearch, MongoDB, and RabbitMQ; container platforms, particularly Docker and Rancher; CI/CD tooling (Bitbucket, Bamboo) and automation with Puppet and Ansible; monitoring and log analysis tools like AppDynamics, Logstash, SolarWinds, Grafana, Prometheus, and Loki; scripting and automation (PowerShell, Python, Bash, Ruby, WLST). Familiarity with databases (SQL Server, Oracle D...). Exposure to hybrid cloud environments (AWS, Azure, GCP) is an add-on.

You can find more about us here: About Our HRMS Systems For Public Sector | NEOGOV; NGV Software India | NEOGOV

Posted 2 months ago

8 - 12 years

30 - 35 Lacs

Noida

Work from Office

What this role includes:

  • Manage Data Centre, IT Infrastructure, Cloud Operations (Public, Private & Hybrid Cloud) and production of the entire SaaS platform.
  • Direct business operations by deploying, automating, managing & maintaining cloud-based systems to ensure availability, performance, scalability and security.
  • Manage various migration projects.
  • Develop world-class & cost-optimized IT Infrastructure, Network, Security Framework, DR & BCP Framework, Hardware Management and Access Control Management.
  • Define Change & Release Management to improve the capability of the team and deploy work with repeatable success.
  • Lead organizational IT security policies, migration activities, IT audit, risk and compliance related records.
  • Formulate budgets effectively and manage Information Services infrastructure projects with a key focus on ROI for all IT spends.
  • Extend high-end technical support on various servers and ensure high customer satisfaction.
  • Assign and monitor the work of technical personnel, ensuring that application development and deployment is done in the best possible way, and implement quality control and review systems throughout the development and deployment processes.
  • Take responsibility for the architecture and technical leadership of the entire DevOps infrastructure.
  • Provide process improvement recommendations based on best practices and industry standards.

Educational Qualifications: B.Tech/B.E

Skills Required: DevOps, Web Servers, Nginx, CI/CD, Jenkins, Kubernetes, Docker, ELK, Prometheus, Terraform, Unix, Node.js, networking, cloud, security

Candidate Attributes:

Skillsets - Must Have:

  • Knowledge of Kubernetes, Docker, ELK, Prometheus, Nagios, Chef/Ansible, Terraform, GitLab, Jenkins, etc.
  • Must be comfortable with Unix administration.
  • Deep understanding of kernel, networking and OS fundamentals.
  • Strong experience with web technologies: Nginx, HAProxy, Apache, Node.js.
  • Proven record of infra automation and programming skills in any of these languages: PHP, Golang, JavaScript.
  • Hands-on experience in web application development and associated skills in a high-stakes environment: HTTP, REST, Web Services, SOA.

Good to Have:

  • Good database understanding.
  • Experience with AWS/GCP/Azure.
  • Good understanding of asynchronous messaging platforms like RabbitMQ, Kafka.
  • Strong experience in managing both development and operations.
  • Good communication and interpersonal skills.
  • Must be comfortable working with a distributed team.
  • Must have worked in an Agile development environment.
  • Ability to drive to big-picture goals and milestones while valuing and maintaining a strong attention to detail.
  • Ability to quickly identify and drive to the optimal solution when presented with a series of constraints.
  • Demonstrated ability in people management, strategic planning, risk management, change management, and project management.
  • Thorough knowledge of developing cross-platform applications for both web and mobile web.

Posted 2 months ago

7 - 12 years

8 - 14 Lacs

Kolkata

Work from Office

Key Responsibilities: Monitor and manage Linux servers to ensure high availability and optimal performance. Implement, configure, and maintain monitoring tools (e.g., Nagios, Zabbix, Prometheus, Grafana, Splunk, ELK). Develop and customize monitoring dashboards, alerts, and reports for system health and performance. Analyze system logs, troubleshoot server performance issues, and take preventive measures.

Posted 2 months ago

2 - 7 years

4 - 9 Lacs

Maharashtra

Work from Office

Description

Mumbai/Bangalore Generic JD

What will SREs do?

  • Provide hands-on SRE with 24x7 SRE support, including incident management, problem management, root cause analysis, monitoring, alerting, and maintenance of infrastructure, compliance.
  • Track, audit, monitor and implement on technical work streams.
  • Act as portfolio SME (Subject Matter Expert): understand and document common components, core functionalities, and infrastructure of supported applications.
  • Be an escalation point in the on-call rotation, and support our maintenance, scheduled work, support and release deployment requirements.
  • Lead in incident management and problem management for applications in scope, and in RCA action items fulfillment/ownership.
  • Focus on continuous improvement and technical standards.
  • Drive improvements in productivity, monitoring, tooling and best practices.
  • Manage technology currency (server patching, certificate renewal, compliance, etc.) with a keen eye on automation opportunities.
  • Drive best-in-class technical solutions by closely tracking industry-leading solutions and applying them to the RBC environment and needs.
  • Leverage the value in unit, department, and enterprise-wide teams to develop better solutions and achieve a cross-enterprise mindset.

Engineering

  • Develop SRE solutions (monitoring and alerting, machine learning anomaly detection, self-healing and reliability testing).
  • Apply design thinking and an agile mindset in working with SREs, Scrum Masters and Incident Leads.
  • Contribute to and leverage best practices in SRE.
  • Simplify development by building repeatable solutions to manual tasks.
  • Support the unit's goals to adopt automation solutions for applications in scope.

Production Support

  • Perform production support role, including off-hours support and rotational on-call support, to be compensated accordingly with overtime pay, lieu time, and on-call allowance.
  • Assist in incident management and problem management for applications in scope.
  • Evaluate continuously what went well, what went wrong, and what can be done to improve and prevent in future.
  • Maintain technology currency (perform server patching, certificate renewal, etc.) with a keen eye on automation opportunities.
  • Ensure availability and uptime of applications in scope, as per service level objectives.
  • Ensure compliance of all systems and applications in scope, including maintaining segregation of duties.

Technical Consultation

  • Support initiatives outside of application or squad level scope.
  • Consult other teams in RBPT and the enterprise on products built.

Innovation and Learning

  • Stay abreast of technology change and learn constantly, through official training assignments and self-assigned learning.
  • Provide demos of new technology findings to the team at large.

Advanced knowledge of the following SRE practices and technologies

  • 3-5 years of experience in a related field
  • Python, YAML, Shell scripting
  • Azure, Linux
  • Dynatrace, Prometheus, PagerDuty, Moog, Splunk, Elastic, Azure Monitor
  • Chaos Engineering
  • MQ, Kafka
  • Perform production support role, including off-hours support
  • In-depth hands-on experience in a variety of SRE tools (Ansible, Azure Automation, Catchpoint)

Additional Details

  • Global Grade: C
  • Level: To Be Defined
  • Named Job Posting? (if Yes - needs to be approved by SCSC): No
  • Remote work possibility: Yes
  • Global Role Family: To be defined
  • Local Role Name: To be defined
  • Local Skills: reliability metrics; reliability controls
  • Languages Required: ENGLISH
  • Role Rarity: To Be Defined

Posted 2 months ago

2 - 7 years

4 - 9 Lacs

Uttar Pradesh

Work from Office

Requirements

  • Excellent technical skills, enabling the implementation of future-proof, complex global solutions.
  • Excellent documentation and dashboarding skills.
  • Excellent interpersonal communication and organizational skills that are required to operate as a member of global, distributed teams that deliver quality services and solutions.
  • Ability to rapidly gain knowledge of the organizational structure of the firm to facilitate work with groups outside of the immediate technical team.
  • Ability to work independently, understanding the timeline and dependencies.
  • Familiarity with IT methodologies and life cycles.

Required Technical and Professional Expertise

  • Expertise in DevOps activities supporting CI/CD delivery using code management, orchestration management and automation tools.
  • Experience with Cloud Administration, understanding the basic principles and common practices.
  • Familiarity with REST, API, JSON, SOAP for system integration.
  • Basic knowledge of multi-tier system design and implementation.
  • Proficiency in:
      • Infrastructure-as-Code (especially Bicep, Terraform, or ARM Templates)
      • Configuration management platforms (Ansible / AWX)
      • Code management tools (Git / GitHub / Azure DevOps)
      • Release management tools (Azure DevOps)
      • Databases like MongoDB, PostgreSQL
      • At least one programming/scripting language (PowerShell, Bash, Python)
      • Serverless / containerized solutions
      • Any monitoring solutions like Grafana, Prometheus
  • Experience working in large companies is a plus.
  • Bachelor's degree or equivalency (CS, CE, CIS, IS, MIS, or engineering discipline).

Preferred technologies experience

  • Azure (must know)
  • Ansible Automation Platform, Ansible Tower (must know)
  • Bicep (good to know), Terraform or similar
  • MongoDB, PostgreSQL (good to know)
  • Azure ARM Templates
  • Azure DevOps pipelines / GitHub Actions
  • GitHub
  • Knowledge of fundamental network protocols and security aspects
  • Kubernetes and Docker are a plus but not mandatory

Desired Experience

  • Experience with building, deploying and operating applications (CI/CD), including mission-critical support.
  • Basic knowledge of application development using .NET Core.
  • Ability to communicate complex technical challenges to a non-technical audience.
  • Experience coordinating the intersection of complex system dependencies and interactions.
  • Experience in solution delivery using common methodologies, especially Agile working iterations.

Posted 2 months ago

2 - 7 years

4 - 9 Lacs

Bengaluru

Work from Office

Description

Experience: 3 to 6 years of experience in DevOps/Build Release with the following skills.

Skill Set

  • Hands-on experience in Git and Git version control tools for the desktop and web app build routine.
  • Ability to understand the Azure DevOps build pipelines and how the artifactory mechanism works (for various packages).
  • Able to create and maintain the Azure build jobs and dashboards.
  • Hands-on experience with Docker and Kubernetes orchestrated deployments in Azure or AWS.
  • Good knowledge of Azure or AWS services and various resources.
  • Good experience with Helm charts.
  • Maintenance of build agents (on premise and on cloud) used for projects.
  • Hands-on experience with Conan and NuGet packaging.
  • Knowledge of SDLC.
  • Ability to use repositories and knowledge of yml definitions.
  • Ability to manage the staging process in Git repositories.
  • Resolve conflicts and build the pipelines.
  • Must have good knowledge of operating systems, registry and installer debugging tools.
  • Collaborate with development teams in the Build Engineering process.
  • Maintain the projects on AWS EC2 instances.
  • Proficient scripting with Bash, PowerShell.

Capabilities: Excellent problem-solving skills, proactive, self-motivated, result-oriented, good oral and written communication.

Additional Details

  • Global Grade: B
  • Level: To Be Defined
  • Named Job Posting? (if Yes - needs to be approved by SCSC): No
  • Remote work possibility: No
  • Global Role Family: To be defined
  • Local Role Name: To be defined
  • Local Skills: DevOps
  • Languages Required: ENGLISH
  • Role Rarity: To Be Defined

Posted 2 months ago

Exploring Prometheus Jobs in India

Prometheus is a popular monitoring and alerting tool used in the field of DevOps and software development. In India, the demand for professionals with expertise in Prometheus is on the rise. Job seekers looking to build a career in this field have a promising outlook in the Indian job market.

Top Hiring Locations in India

  1. Bangalore
  2. Pune
  3. Hyderabad
  4. Mumbai
  5. Chennai

These cities are known for their vibrant tech industry and have a high demand for professionals skilled in Prometheus.

Average Salary Range

The salary range for Prometheus professionals in India varies with experience. Entry-level professionals can expect to earn around ₹5-8 lakhs per annum, whereas experienced professionals can earn up to ₹15-20 lakhs per annum.

Career Path

A typical career path in Prometheus may include roles such as:

  • Junior Prometheus Engineer
  • Prometheus Developer
  • Senior Prometheus Engineer
  • Prometheus Architect
  • Prometheus Consultant

As professionals gain experience and expertise, they can progress to higher roles with increased responsibilities.

Related Skills

In addition to Prometheus, professionals in this field are often expected to have knowledge and experience in:

  • Kubernetes
  • Docker
  • Grafana
  • Time series databases
  • Linux system administration

Having a strong foundation in these related skills can enhance job prospects in the Prometheus domain.

Interview Questions

  • What is Prometheus and how does it differ from traditional monitoring systems? (basic)
  • Explain the architecture of Prometheus. (medium)
  • How do you set up alerting in Prometheus? (medium)
  • What are exporters in Prometheus and why are they important? (basic)
  • How can you visualize data collected by Prometheus? (medium)
  • Explain the concept of time series data and how it is used in Prometheus. (medium)
  • How does Prometheus store its data? (medium)
  • What is the role of PromQL in Prometheus? (medium)
  • How can you troubleshoot performance issues using Prometheus? (medium)
  • Describe the process of setting up Prometheus alerts. (medium)
  • What are the best practices for monitoring with Prometheus? (advanced)
  • How does federation work in Prometheus? (advanced)
  • Explain the role of relabeling in Prometheus configuration. (medium)
  • How can you secure Prometheus endpoints? (medium)
  • What is the role of service discovery in Prometheus? (basic)
  • Describe the benefits of using Prometheus for monitoring microservices. (medium)
  • How can you scale Prometheus for large deployments? (medium)
  • What are the limitations of Prometheus and how can they be mitigated? (medium)
  • Explain the concept of recording rules in Prometheus. (medium)
  • How can you monitor non-containerized applications with Prometheus? (medium)
  • Describe the process of backing up and restoring Prometheus data. (medium)
  • How do you handle high availability in Prometheus? (medium)
  • What are the common pitfalls to avoid when using Prometheus? (medium)
  • How can you integrate Prometheus with other monitoring tools or systems? (medium)
  • What trends do you see in the future of Prometheus and monitoring tools in general? (advanced)

Closing Remark

As you explore opportunities in the Prometheus job market in India, remember to continuously upgrade your skills and stay updated with the latest trends in monitoring and alerting technologies. With dedication and preparation, you can confidently apply for roles in this dynamic field. Good luck!

Featured Companies