628 Prometheus Jobs - Page 25

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

5 - 10 years

7 - 12 Lacs

Bengaluru

Work from Office

Responsibilities: We are looking for a Software Developer with container platform and systems-level experience to join our Fabric Development team in Bangalore, India. We seek individuals who innovate and share our passion for winning in the cloud marketplace. The Fabric Development team is dedicated to ensuring that the IBM Cloud is at the forefront of cloud technology, from bootstrapping data centres, to application architecture, to flexible infrastructure services. We are running IBM's next generation cloud platform to deliver performance and predictability for our customers' most demanding workloads, at global scale and with leadership efficiency, resiliency and security. It is an exciting time, and as a team we are driven by this incredible opportunity to thrill our clients. Designing and developing innovative, company and industry impacting services using open source and commercial technologies at scale. Designing and architecting enterprise solutions to complex problems. Presenting technical solutions and designs to the engineering team. Following compliant procedures and secure engineering best practices. Collaborating on and reviewing technical designs with architecture and offering management. Taking ownership of, and keen involvement in, projects that vary in size and scope depending on requirements. Writing and executing unit, functional, and integration test cases.
Required education: Bachelor's Degree
Preferred education: Bachelor's Degree
Required technical and professional expertise: Bachelor's degree in Computer Science, Information Technology, or a related field. 5+ years of experience as a software developer, with a focus on Python and Ansible. 3+ years of experience in Ansible for automation and configuration management. Strong Python programming skills for scripting and automation tasks. 3+ years of in-depth knowledge of networking protocols and security principles. Experience with security tools and best practices. 3+ years of experience with CI/CD pipelines and Ansible CI/CD practices. 3+ years of experience with Kubernetes and Docker deployments. Familiarity with Agile development methodologies. Experience with cloud platforms. Excellent problem-solving skills and attention to detail. Strong communication and team collaboration skills. Demonstrated skills with troubleshooting, debugging, maintaining and improving existing software.
Preferred technical and professional experience: Familiarity with CI/CD pipelines. Experience with version control systems (e.g., Git). Experience with Ansible Collections and writing custom Ansible modules. Experience with monitoring and logging tools for load balancers (Prometheus, Grafana, ELK Stack). An understanding of, and hands-on experience with, networking methodologies.

Posted 3 months ago

10 - 15 years

12 - 17 Lacs

Kochi

Work from Office

Responsibilities: Software Developers at IBM are the backbone of our strategic initiatives to design, code, test, and provide industry-leading solutions that make the world run today: planes and trains take off on time, bank transactions complete in the blink of an eye and the world remains safe because of the work our software developers do. Whether you are working on projects internally or for a client, software development is critical to the success of IBM and our clients worldwide. At IBM, you will use the latest software development tools, techniques and approaches and work with leading minds in the industry to build solutions you can be proud of. Design, develop, test, operate and maintain database features in our products, services and tools to provide a secure environment for the product to be used by customers in the cloud. Evaluate new technologies and processes that enhance our service capabilities. Document and share your experience, and mentor others.
Required education: Bachelor's Degree
Preferred education: Master's Degree
Required technical and professional expertise: 10+ years of relevant experience in software development. Strong software programming experience and skills using languages like Java/Go. Exposure to UI development frameworks like ReactJS. Exposure to shell scripting languages such as Bash/Perl/Python/Ruby. Exposure to best practices in design, development and testing of software. Experience leading a feature/project, coaching junior developers on technology and reviewing solutions. Working experience with SQL databases (Db2, PostgreSQL, MySQL, Oracle, SQL Server, etc.). Understanding of virtualization and containerization technologies. Developer knowledge and experience with Docker and Kubernetes frameworks. Development familiarity with the usage of cloud services (IBM Cloud, Amazon Web Services, Microsoft Azure). Knowledge of Linux/UNIX operating systems.
Preferred technical and professional experience: Familiarity with Compute, Storage and Networking components from IBM Cloud, AWS, Azure. Familiarity with Red Hat OpenShift. Familiarity with LogDNA/Sysdig/Prometheus for cluster analysis in a Kubernetes environment.

Posted 3 months ago

6 - 11 years

8 - 14 Lacs

Bengaluru

Work from Office

Primary Skills: Proven experience with Azure DevOps services, including Azure Pipelines, Azure Repos, Azure Artifacts, and Azure Boards. Strong knowledge of CI/CD concepts and implementation in cloud environments. Hands-on experience with Azure Cloud Services, including Azure Compute, Azure Networking, Azure Storage, and Azure Kubernetes Service (AKS). Expertise in Infrastructure as Code (IaC) using tools such as Terraform, ARM templates, or Bicep. Familiarity with containerization and orchestration tools such as Docker, Kubernetes, and Helm. Experience with monitoring and logging tools like Azure Monitor, Log Analytics, Grafana, and Prometheus. Proficiency in scripting and automation using languages like PowerShell, Bash, or Python. Experience with version control systems like Git.
Secondary Skills: Knowledge of Agile methodologies and tools such as JIRA or Azure Boards. Strong understanding of security best practices for cloud environments, particularly in Azure. Experience working with team collaboration tools such as Slack, Microsoft Teams, or Confluence.

Posted 3 months ago

6 - 9 years

8 - 11 Lacs

Mumbai

Work from Office

Primary Skills: Proven experience with Azure Kubernetes Service (AKS) and Kubernetes orchestration. Strong experience in containerization technologies, especially Docker. Proficiency in Azure Cloud services, including but not limited to Azure Resource Manager (ARM), Azure DevOps, and Azure Container Registry (ACR). Strong understanding of Infrastructure as Code (IaC) practices using Terraform, ARM Templates, or similar. Experience with CI/CD tools like Jenkins, GitLab CI, or Azure DevOps for building and deploying applications. Experience with monitoring and logging tools such as Prometheus, Grafana, and Azure Monitor. Solid knowledge of networking, security, and authentication strategies in cloud environments, particularly in Azure. Experience in troubleshooting and debugging AKS, Kubernetes, and containerized environments.
Secondary Skills: Familiarity with Helm for Kubernetes package management. Understanding of Agile methodologies and best practices in DevOps.

Posted 3 months ago

6 - 11 years

8 - 14 Lacs

Mumbai

Work from Office

Primary skill: Strong expertise in AWS services (EC2, S3, Lambda, RDS, DynamoDB, VPC, IAM, CloudFront, API Gateway). Hands-on experience with Infrastructure as Code (Terraform, CloudFormation, AWS CDK). Experience in cloud security, networking, and IAM policies. Strong knowledge of AWS compute, storage, and database services. Experience with Kubernetes (EKS) and containerization (Docker). Proficiency in scripting (Python, Bash, PowerShell) for automation. Knowledge of DevOps practices, CI/CD, and monitoring tools (CloudWatch, Prometheus, Grafana). Experience in AWS cost management and optimization.
Secondary skill: AWS Certified Solutions Architect Professional or Associate. Experience with serverless computing (AWS Lambda, Step Functions, Fargate). Familiarity with hybrid cloud and multi-cloud architectures. Knowledge of data analytics and big data solutions (AWS Glue, Redshift, Athena).

Posted 3 months ago

7 - 11 years

9 - 13 Lacs

Bengaluru

Work from Office

Skill required: Delivery - Marketing Analytics and Reporting
Designation: I&F Decision Sci Practitioner Specialist
Qualifications: Any Graduation
Years of Experience: 7 to 11 years
What would you do? Data & AI: analytical processes and technologies applied to marketing-related data to help businesses understand and deliver relevant experiences for their audiences, understand their competition, measure and optimize marketing campaigns, and optimize their return on investment.
What are we looking for? Python (Programming Language). Structured Query Language (SQL). Machine Learning. Data Science. Written and verbal communication. Ability to manage multiple stakeholders. Strong analytical skills. Detail orientation. Expertise in AWS, Azure, or Google Cloud for ML workflows. Hands-on experience with Kubernetes, Docker, Jenkins, or GitLab CI/CD. Familiarity with MLflow, TFX, Kubeflow, or SageMaker. Knowledge of Prometheus, Grafana, or similar tools for tracking system health and model performance. Understanding of ETL processes, data pipelines, and big data tools like Spark or Kafka. Proficiency in Git and model versioning best practices.
Roles and Responsibilities: In this role you are required to analyze and solve moderately complex problems. May create new solutions, leveraging and, where needed, adapting existing methods and procedures. The person would require understanding of the strategic direction set by senior management as it relates to team goals. Primary upward interaction is with direct supervisor. May interact with peers and/or management levels at a client and/or within Accenture. Guidance would be provided when determining methods and procedures on new assignments. Decisions made by you will often impact the team in which they reside. Individual would manage small teams and/or work efforts (if in an individual contributor role) at a client or within Accenture. Work closely with data scientists, engineers, and DevOps teams to operationalize ML. Optimize ML pipelines for performance, cost, and scalability in production. Automate deployment pipelines for ML models, ensuring fast and reliable transitions from development to production environments. Set up and manage scalable cloud or on-premise environments for ML workflows.

Posted 3 months ago

3 - 8 years

5 - 10 Lacs

Gurgaon

Work from Office

Project Role: Application Developer
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must have skills: Site Reliability Engineering
Good to have skills: NA
Minimum 3 year(s) of experience is required
Educational Qualification: 15 years full time education
Summary: As an Application Developer, you will design, build, and configure applications to meet business process and application requirements. You will play a crucial role in ensuring the reliability and performance of our applications.
Roles & Responsibilities: Expected to perform independently and become an SME. Required active participation/contribution in team discussions. Contribute in providing solutions to work-related problems. Collaborate with cross-functional teams to gather and analyze requirements. Design, develop, and test high-quality software applications. Troubleshoot and debug applications to identify and resolve issues. Ensure the reliability, availability, and performance of applications. Implement best practices for application development and deployment. Stay updated with the latest industry trends and technologies. Provide technical guidance and mentorship to junior team members.
Professional & Technical Skills: Must-have skills: Proficiency in Site Reliability Engineering. Strong understanding of software development principles and methodologies. Experience with cloud platforms such as AWS or Azure. Knowledge of containerization technologies like Docker and Kubernetes. Familiarity with monitoring and logging tools like Prometheus and ELK stack. Good-to-have skills: Experience with DevOps practices and tools. Knowledge of scripting languages like Python or Bash. Understanding of networking concepts and protocols.
Additional Information: The candidate should have a minimum of 3 years of experience in Site Reliability Engineering. This position is based at our Gurugram office. A 15 years full-time education is required.

Posted 3 months ago

7 - 12 years

9 - 14 Lacs

Kolkata

Work from Office

Project Role: Service Management Practitioner
Project Role Description: Support the delivery of programs, projects or managed services. Coordinate projects through contract management and shared service coordination. Develop and maintain relationships with key stakeholders and sponsors to ensure high levels of commitment and enable strategic agenda.
Must have skills: Site Reliability Engineering
Good to have skills: Service Integration and Management (SIAM)
Minimum 7.5 year(s) of experience is required
Educational Qualification: 15 years full time education
We are seeking an experienced SRE Observability Engineer to join our team and lead the development, enhancement, and extension of SRE driven observability and alerting platforms for our global clients. As an SRE Observability Engineer, your role will be to build, enhance and maintain best in class observability platforms that can effectively monitor the full technology stack for cloud and on-prem systems. An SRE Observability Engineer will play a pivotal role in shaping the evolving needs of our customers, including instrumentation of Service Level Indicators and Objectives (SLI/SLO) and development/enhancement of SLI/SLO driven observability dashboards and alerting.
Key Responsibilities: Gather and analyze logs, metrics and traces from operating systems, infrastructure and network as well as applications to assist in performance tuning and fault finding. Implement, enhance and maintain observability and alerting capabilities, especially those built on SLI/SLO/Error Budget. Analyze an existing observability and alerting platform and identify how it can be further improved. Help build our unified observability stack using various observability tools. Improve automation and increase the systems' self-healing capability. Build monitoring that alerts on symptoms rather than on outages.
Qualifications: Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering or related field, or a combination of education and equivalent work experience.
Required Experience: Overall 5-8 years of working experience. 3-5 years of experience of building observability platforms with tools such as Dynatrace, AppDynamics, New Relic, Prometheus, Splunk, Sensu, Nagios, Datadog, OpenTelemetry, etc. Very good understanding and strong working knowledge of log collection and aggregation, custom metric development and distributed tracing. Experience of building observability dashboards (preferably SLO driven) in visualization tools like Grafana. Good understanding of SLIs/SLOs, especially their implementation designs. Good working understanding of monitoring cloud platforms: AWS, Azure and GCP.
Good to have experience: Prior experience of implementing SLI/SLO/Error Budget driven observability and alerting. Strong proficiency with cloud platforms. Experience programming with one or more of the following: Python, Go, Java/Scala or C. Experience with J2EE, NoSQL/SQL datastores, Spring Boot, GCP, AWS, Azure and Docker or K8s in developing multi-tier applications. Overall good understanding of SRE principles and practices. Understanding and ability to implement effective observability strategies to improve MTTD/R. Experience with RESTful APIs and microservices platforms. Working knowledge of the TCP/IP stack, internet routing and load balancing. Solve complex architecture/design and business problems; work to simplify, optimize, remove bottlenecks, etc.
You may not check every box, or your experience may look a little different from what we've outlined, but if you think you can bring value to Ford Motor Company, we encourage you to apply.

Posted 3 months ago

5 - 9 years

10 - 20 Lacs

Pune, Bengaluru, Noida

Hybrid

Experience: 5 to 8 years
Location: Hybrid (preferably Bangalore), Pune, Noida
Job Description: Basic knowledge of Python / shell scripting (1+ yrs). Strong Unix/Linux fundamentals. Experience in configuring and integrating monitoring/observability tools and frameworks (log/alert/incident management): Prometheus, Grafana, OpsRamp, New Relic, Datadog, AppDynamics, OpenTelemetry. Experience in Docker, Kubernetes, VMware, virtual networking, databases. Experienced in implementing CI/CD pipelines and GitOps. Experience in supporting deployment and bring-up of Integration and Production setups (operational/DevOps). Working knowledge of distributed computing and microservices. Should be able to work in a fast-paced, iterative development methodology (Agile-SCRUM). Working knowledge of one public cloud such as AWS, Azure or GCP.
Role Description: Position: Technical Lead. Develop/maintain monitoring microservices. Integrate monitoring services into the software delivery framework. Automate integration of multiple monitoring and observability tools. Adhere to software development best practices and align to the defined CI/CD process. Identify and resolve security vulnerabilities and compliance gaps.

Posted 3 months ago

5 - 10 years

7 - 12 Lacs

Bengaluru

Work from Office

Job Title: Senior Java Technical Manager & Cloud Architect
Corporate Title: VP
Location: Bangalore, India
Role Description: Deutsche Bank has set for itself ambitious goals in the areas of Sustainable Finance, ESG Risk Mitigation as well as Corporate Sustainability. As climate change throws up new challenges and opportunities, the Bank has set out to invest in developing a Sustainability Technology Platform, sustainability data products and various sustainability applications which will aid the Bank's goals. As part of this initiative, we are building an exciting global team of technologists who are passionate about climate change and want to contribute to the greater good by leveraging their technology skillset in multiple areas, predominantly in Cloud / Hybrid Architecture. As part of this role, we are seeking a highly experienced Senior Full Stack Subject Matter Expert (SME) to join our team. In this senior role, you will be a trusted advisor, providing comprehensive technical guidance and driving innovation as we leverage GCP alongside our on-premise infrastructure.
What we'll offer you: As part of our flexible scheme, here are just some of the benefits that you'll enjoy. Best in class leave policy. Gender neutral parental leaves. 100% reimbursement under childcare assistance benefit (gender neutral). Sponsorship for industry relevant certifications and education. Employee Assistance Program for you and your family members. Comprehensive Hospitalization Insurance for you and your dependents. Accident and Term life Insurance. Complimentary health screening for 35 yrs. and above.
Your key responsibilities
Technical Leadership: Technically managing and delivering an enterprise-level hybrid application built on OpenShift and GCP.
Mentoring and Individual Contribution: Mentor teams in technology, design, and architecture. Contribute individually to project success in terms of Java backend development involving micro frontends, microservices, event driven system integrations and SQL/NoSQL databases.
API Design: Design APIs for an API-first platform, ensuring seamless support for primary UI implementation(s).
Hybrid Cloud Adaptation: You will work with a hybrid cloud architecture, necessitating flexibility in learning new technologies. Hybrid GCP solutions will replace or coexist with some technologies currently available exclusively on-premise.
GCP Solutions: Provide solutions for OLTP applications that leverage GCP, as it is crucial to fully exploit the benefits of our proposed cloud strategy.
Code Reviews and Best Practices: Participate in code reviews and contribute to evolving best practices for better maintainability, security, observability, reuse and modular development. Assist in tool and platform documentation from both technology and operations perspectives. Stay current with emerging trends and innovations in GCP services, application development frameworks, and programming languages.
Strategic Direction: Partner with business stakeholders to understand their requirements and translate them into robust technical solutions, leveraging GCP's potential where appropriate. Develop and implement a long-term technology roadmap that aligns with our business goals, considering both cloud and on-premise options. Analyze the feasibility and cost-effectiveness of migrating suitable on-premise systems to GCP.
Mentorship & Collaboration: Mentor and guide junior developers on full-stack development best practices, focusing on GCP expertise and effective integration with on-premise systems. Foster a culture of knowledge sharing and collaboration within the engineering team. Effectively communicate complex technical concepts, including GCP considerations, to both technical and non-technical audiences.
Problem-Solving & Innovation: Troubleshoot and resolve complex technical issues across cloud (GCP) and on-premise environments. Continuously evaluate and recommend improvements to our development processes and infrastructure, considering the optimal use of GCP.
Lead the exploration and implementation of innovative solutions using GCP services to optimize our technology stack.
Your skills and experience: 15+ years of experience in full-stack software development. Proven track record of leading and delivering successful software projects using GCP alongside on-premise environments. Experience with containerization technologies (Docker, Kubernetes) for GCP deployments. Excellent communication, collaboration, and problem-solving skills. Ability to think strategically and translate business needs into technical solutions, considering the optimal use of GCP. Leadership presence and the ability to mentor and inspire others. Must-have technologies/frameworks: Microservices and related design patterns, Spring Cloud, Spring Security, Concurrency, Enterprise Integration and related design patterns, JDK 11+, SpringBoot Middleware, MyBatis, Mockito, JUnit, SQL, Oracle/Postgres database, Event Driven Architecture, TeamCity/Jenkins, Git, SSH, Prometheus/Grafana, Splunk. Knowledge of Sustainable Finance / ESG Risk / CSRD / Regulatory Reporting will be a plus. Experience in infrastructure automation and DevOps principles on GCP will be a plus. Knowledge of a frontend framework like ReactJS is a plus.
How we'll support you: Training and development to help you excel in your career. Coaching and support from experts in your team. A culture of continuous learning to aid progression. A range of flexible benefits that you can tailor to suit your needs.
About us and our teams: Please visit our company website for further information: https://www.db.com/company/company.htm We strive for a culture in which we are empowered to excel together every day. This includes acting responsibly, thinking commercially, taking initiative and working collaboratively. Together we share and celebrate the successes of our people. Together we are Deutsche Bank Group. We welcome applications from all people and promote a positive, fair and inclusive work environment.

Posted 3 months ago

1 - 3 years

3 - 5 Lacs

Bengaluru

Work from Office

As a DevOps Developer for the IBM Privileged Access Gateway Service, you will play a pivotal role in enhancing the developer experience, productivity, and satisfaction within the organization. Your primary responsibilities include: Collaborating with development teams to understand their needs and provide tailored solutions that align with the organization's goals and objectives. Designing and implementing Continuous Integration and Continuous Deployment (CI/CD) pipelines using tools like Jenkins, Tekton, etc. Designing and implementing tools for automated deployment and monitoring of multiple environments, ensuring seamless integration and scalability. Staying updated with the latest trends and best practices in DevOps and related technologies and incorporating them into the development platform. Ensuring security and compliance of the platforms, including patching, vulnerability detection, and threat mitigation. Providing on-call IT support and monitoring technical operations to maintain the stability and reliability of the developer platform. Collaborating with other teams to introduce best automation practices and tools, fostering a culture of innovation and continuous improvement. Embracing an Agile culture and employing relevant fit-for-purpose methodologies and tools such as GitHub, Jira, etc. Maintaining good communication skills and the ability to lead global teams remotely, ensuring effective collaboration and knowledge sharing. Implement and automate infrastructure solutions that support IBM Cloud products and infrastructure. Implement and maintain state-of-the-art CI/CD pipelines, ensuring full compliance with industry standards and regulatory frameworks. Administer automated CI/CD systems and tools. Partner with other teams, managers and program managers to develop alerting and monitoring for mission-critical services. Provide technical escalation support for other Infrastructure Operations teams. Orchestrate and manage infrastructure as code (IaC) implementations using cutting-edge tools like Terraform.
Required education: Bachelor's Degree
Preferred education: Master's Degree
Required technical and professional expertise: 1-3 years of experience delivering code and debugging problems. 1-3 years of experience in a DevOps, SRE or similar role. A strong preference for collaborative teamwork. A rigorous approach to problem-solving. Experience with cloud computing technologies. Programming skills: scripting, Go, Python, or similar. Hands-on experience with container technologies: Kubernetes (IKS), Red Hat, OpenShift, Docker, Rancher, Podman. Proficient with automation tools and CI/CD systems like Jenkins, Tekton, Travis, etc.
Preferred technical and professional experience: Strongly preferred experience in working with production Kubernetes/OpenShift environments. Excellent Git skills (merges, rebase, branching, forking, submodules). Experience with Ansible, Terraform. Experience with C/C++ or Java. Experience using, configuring and troubleshooting CI/CD systems. Excellent record of improving solutions through automation. Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Kibana, Sysdig, LogDNA). SQL or NoSQL experience.

Posted 3 months ago

6 - 11 years

20 - 35 Lacs

Gurgaon

Work from Office

Set up and maintain observability tools (Grafana, Prometheus, Instana) for monitoring, logging, and alerting. Write and maintain infrastructure as code using Terraform.
CI/CD, Jenkins, Git, Docker, Kubernetes, Grafana, Prometheus, Instana, ELK.

Posted 3 months ago

5 - 9 years

15 - 20 Lacs

Gurgaon

Work from Office

Extensive monitoring tools experience (Grafana, Prometheus, ELK)

Posted 3 months ago

2 - 5 years

20 - 35 Lacs

Bengaluru

Work from Office

We are seeking a skilled and passionate DevOps Engineer to join our dynamic team in Bangalore. The ideal candidate will have experience in deploying, managing, and optimizing cloud infrastructure and automation tools. You will be instrumental in enhancing the overall operational efficiency of our platform and ensuring seamless collaboration between development and operations. Key Responsibilities: Design, implement, and manage infrastructure automation and orchestration tools. Collaborate with development teams to ensure seamless CI/CD pipelines and continuous integration practices. Manage cloud infrastructure (AWS, Azure, etc.) with a focus on cost-efficiency, scalability, and security. Troubleshoot and resolve issues related to system performance, availability, and security. Implement monitoring solutions to ensure real-time visibility of production systems and to proactively identify and address any system anomalies. Improve and maintain system reliability and uptime. Optimize deployment processes to improve speed and efficiency. Stay current on industry trends, technologies, and best practices. Required Skills and Qualifications: 2-5 years of hands-on experience in DevOps or related fields. Strong experience with cloud platforms (AWS, Azure, or GCP). Proficiency in automation tools (e.g., Terraform, Ansible, Jenkins). Solid experience with containerization (Docker, Kubernetes). Familiarity with infrastructure as code (IaC) principles. Experience with CI/CD pipeline setup and management. Strong scripting skills (Bash, Python, etc.). Good knowledge of monitoring and logging tools (Prometheus, Grafana, ELK Stack). Experience in troubleshooting and maintaining production environments. Excellent communication and collaboration skills. Preferred Qualifications: Experience with Agile methodologies. Knowledge of security best practices in cloud infrastructure. Familiarity with microservices architecture.

Posted 3 months ago

5 - 8 years

7 - 11 Lacs

Bengaluru

Work from Office

Responsibilities: Run the production environment by monitoring availability and taking a holistic view of system health. Provide primary operational support and engineering for IBM infrastructure. Create sustainable systems and services through automation and uplifts. Improve reliability, quality, and time-to-market of our suite of cloud solutions. Provide support for production escalations and problem resolution for customers. Proactively identify issues and improvement opportunities. Diagnose and resolve complex system, application software, security and related problems that impact system availability. Gather and analyze metrics from production systems to assist in performance tuning and fault finding. Partner with development teams to improve services through rigorous testing and release procedures. Understand business needs to define automation requirements and product architectural solutions. Develop high-level product specifications with attention to system integration and feasibility. Define all aspects of development, from appropriate technology and workflow to coding standards. Collaborate with other professionals to determine functional and non-functional requirements for automation software. Participate in technical reviews of requirements, specifications, designs, code and other artifacts. Learn new skills and adopt new practices readily in order to develop innovative and cutting-edge software products that maintain the Company's technical leadership position.
Required education: Bachelor's Degree
Required technical and professional expertise: 5-8 years of experience in the software industry. Experience in Linux and Unix-like operating systems. Proficiency in one or more high-level languages, such as Python. Ability to write shell scripts for automation. Experience in cloud services and technologies like VPC, gateways, NACLs, security groups. Experience in network debugging and network routing protocols such as BGP, IS-IS and others. Experience in DevOps and Site Reliability Engineering. Understanding of microservice architecture, Docker, Kubernetes, and other cloud native technologies. Debugging/monitoring knowledge of cloud native applications using DevOps tools such as Prometheus, New Relic, Instana and others. Good to have: understanding of the DevOps lifecycle and associated tools such as Git and CI/CD tools like Jenkins, Tekton, Travis and others. Understanding of cloud computing (IaaS, PaaS, SaaS) and security principles. Understanding of software quality assurance principles. A technical mindset with great attention to detail. A proactive approach to spotting problems, areas for improvement, and performance bottlenecks. Outstanding communication and presentation abilities.

Posted 3 months ago

Apply

4 - 9 years

11 - 20 Lacs

Bengaluru

Remote

Naukri logo

SRE profile with strong AWS expertise (CDK, serverless architecture, Lambda). Extensive experience with monitoring tools (Grafana, Prometheus, ELK).

Posted 3 months ago

Apply

6 - 10 years

13 - 16 Lacs

Chennai, Pune, Delhi

Work from Office

Naukri logo

Design, implement, and maintain scalable data pipelines and infrastructure using Databricks, Redshift, and AWS services. Set up and manage Big Data environments, ensuring high availability and reliability of data processing systems. Develop and optimize ETL processes to transfer data between various sources, including S3, Redshift, and Databricks. Utilize AWS EMR for processing large datasets efficiently, leveraging Spark for distributed data processing. Implement monitoring solutions to track the performance and reliability of data pipelines and storage solutions. Use tools like Prometheus and Grafana to visualize metrics and identify bottlenecks in data workflows. Ensure data integrity and security across all platforms, implementing best practices for data access and management. Collaborate with data governance teams to establish policies for data quality and compliance. Work closely with software development teams to integrate data solutions into applications, ensuring minimal disruption and high performance. Provide insights on data architecture and best practices for leveraging data in applications. Respond to incidents related to data processing and storage, performing root cause analysis and implementing solutions to prevent recurrence. Facilitate blameless post-mortems to improve processes and systems continuously. Who you are: Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience. 4-8 years of experience in Data, Site Reliability Engineering, or a related field with a focus on data engineering within AWS. Proficiency in Databricks and Redshift, with experience in data warehousing and analytics. Strong knowledge of AWS services, particularly S3, Athena, and EMR, for data storage and processing. Experience with programming languages such as Python or Scala for data manipulation and automation. Familiarity with SQL for querying databases and performing data transformations. 
Experience with distributed computing frameworks, particularly Apache Spark, for processing large datasets. Knowledge of data lake and data warehouse architectures, including the use of Delta Lake for managing data in Databricks. Proficiency in using tools like Terraform or AWS CloudFormation for provisioning and managing infrastructure. Familiarity with monitoring tools and practices to ensure system reliability and performance, including the use of AWS CloudWatch. Tools and Technologies: Data Platforms: Databricks, Amazon Redshift, AWS EMR, AWS S3, AWS Athena. Big Data Frameworks: Apache Spark, Delta Lake. Monitoring Tools: Prometheus, Grafana, AWS CloudWatch. Infrastructure Management: Terraform, AWS CloudFormation. Programming Languages: Python, Scala, SQL.

Posted 3 months ago

Apply

10 - 18 years

19 - 34 Lacs

Pune, Bengaluru, Hyderabad

Work from Office

Naukri logo

Role & responsibilities: As a Senior Site Reliability Engineer, you will play a critical role in providing expert guidance on application and infrastructure best practices from a reliability perspective. Your role covers the entire life cycle of a product/application. Your primary focus will be automation, observability, reliability and release management, with an emphasis on solving operations issues. Must have at least 5 years of SRE experience in large programs with a focus on release engineering, observability tasks and reliability. Must have a good understanding of Site Reliability Engineering (SRE) and release management processes. Should possess strong analytical and troubleshooting skills. Should be a strong team player who enjoys collaborating with different people and profiles, shares knowledge and strives for continuous development and learning. Excellent communication skills along with leadership skills. Preferred candidate profile: Reliability practices. Chaos engineering. Strong experience with one or more observability tools like New Relic, AppDynamics, Prometheus, Dynatrace, DataDog, Splunk etc. Experience in event correlation using observability tools like Dynatrace or other tools like BigPanda. Experience in observability dashboard creation, custom metrics, Synthetic Monitoring and Real User Monitoring (RUM). Experience in defining SLIs, SLOs and error budgets and their measurement. Experience in infrastructure automation tools like Terraform, CloudFormation, Ansible, and Puppet (any one). Experience in automation of infra scalability, infra failover, infra availability and performance management. 
Experience in container orchestration and practices, including Kubernetes and Docker Swarm. Understanding of automation avenues. Good experience in scripting or development languages, including expertise in Python, Ruby, JSON, Java, Node.js or PHP (any one). Experience with scripting in PowerShell(M) and Bash/Shell/Perl (any one). Experience with cloud platforms such as AWS, Azure, and Google Cloud. Good communication skills. Advantage with additional skills: AIOps and related tools. Experience in CI/CD tooling and best practices. Systems administration and operating system experience on Linux and Windows, including an understanding of networking. Experience working with ITSM tools like Remedy, ServiceNow, Confluence, Jira. Experience with cloud cost optimization / FinOps.

Posted 3 months ago

Apply

2 - 7 years

4 - 9 Lacs

Bengaluru

Work from Office

Naukri logo

IBM Software Labs are research and development centers that create products, technology, and solutions for IBM. As a member of zSW you will have the opportunity to work on emerging technologies to drive digital transformation for IBM and its clients. You will be part of the worldwide zSW development organization that develops products and offerings to support customers in their hybrid cloud and AI journey. As a full-stack developer you will work within a squad model, responsible for development and test cases. Collaborate closely with the team, ensuring holistic and organic changes to projects. Operate across the entire technical stack, switching contexts as needed between pipeline activities, Ansible development, and infrastructure automation. Design, implement, and adapt solutions based on project requirements. Create clear and concise documentation using playbooks, Markdown (MD), and reStructuredText (RST). Engage in open-source development and interact with Fortune 500 clients, requiring strong communication and adaptability. Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: 3+ years of software development experience in Python and shell programming (including regex). 2+ years of experience in Ansible development (Jinja, Playbooks, Rulebooks). 3+ years of experience in developing applications for embedded, enterprise or operating system development (knowledge of IBM Z/LinuxONE is advantageous). Comfortable with a Linux/Unix development environment (command-line development, configuration, libraries, etc.). Familiar with Linux system administration (Linux/Unix fundamentals, administration experience). Strong analytical, debugging and problem-solving skills to analyse issues and defects reported by customer-facing and test teams. Proficient in source control management tools (GitHub) and with Agile life cycle management tools. 
Preferred technical and professional experience: Experience in containerisation and orchestration technologies like Docker, Podman, Kubernetes, OpenShift. Familiar with cloud computing: IBM Cloud, Azure, AWS. Experience creating scalable, reusable deployments. Understanding of URLs, endpoints, authentication, authorization, and parsing JSON/YAML structures. General skills and work environment: Ability to adapt to a dynamic DevOps environment with evolving technologies. Familiarity with open-source contributions and community engagement. Exposure to new-to-z/OS technologies such as Terraform, Prometheus, and other automation tools.

Posted 3 months ago

Apply

12 - 16 years

35 - 45 Lacs

Bengaluru

Work from Office

Naukri logo

Responsibilities: The Site Reliability Engineer is a critical role in cloud-based projects. An SRE works with the development squads to build platform and infrastructure management/provisioning automation and service monitoring, using the same methods used in software development to support application development. SREs create a bridge between development and operations by applying a software engineering mindset to system administration topics. They split their time between operations/on-call duties and developing systems and software that help increase site reliability and performance. Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: Overall 12+ years of experience required. Good exposure to operational aspects (monitoring, automation, remediation), including monitoring tools such as New Relic, Prometheus, ELK, distributed tracing, APM, AppDynamics, etc. Troubleshooting, documenting root cause analysis, and automating incident response. Understands the architecture, the SRE mindset, and the data model. Platform architecture and engineering: ability to design and architect a cloud platform that can meet client SLAs/NFRs such as availability and system performance. The SRE will define the environment provisioning framework, identify potential performance bottlenecks and design a cloud platform. Preferred technical and professional experience: Effectively communicate with business and technical team members. Creative problem-solving skills and superb communication skills. Telecom domain experience is an added plus.

Posted 3 months ago

Apply

8 - 12 years

9 - 14 Lacs

Bengaluru

Work from Office

Naukri logo

Responsibilities: Your role: Software developer in the cloud storage area, implementing and consuming APIs in the IBM Cloud infrastructure environment (IaaS). Motivated self-starter who loves to solve challenging problems and feels comfortable managing multiple and changing priorities and meeting deadlines in an entrepreneurial environment. Highly organized and detail-oriented, with excellent time management skills, and able to effectively prioritize tasks in a fast-paced, high-volume, and evolving work environment. Responsibilities are: Designing and developing storage integrations to enable and support cloud platform business efforts. Participating in troubleshooting and fixing issues in the existing cloud storage environment. Required to produce code that is secure, scalable, and reliable, supported by unit tests, functional tests, and technical documentation. Required to participate in code reviews for your peers' development work, triage and solve live customer issues, and participate in all scrum activities. Additionally, monitor, measure, and improve code and data performance for the application you help to develop. Available for occasional on-call shifts during weekdays and weekends. All of this will take place in a strong team environment, which necessitates strong communication. Preferred location for this position is in Austin, TX. Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: 8-12 years of industry experience. Strong systems management experience in Linux/UNIX systems (RHEL preferred). Expert in Linux networking technologies and routing protocols (BGP, FRR). Expert in Docker and containerization technologies. Experience with cloud computing technologies such as AWS, VMware, Azure. Experience with application deployment using CI/CD. Experience with monitoring tools such as Prometheus, Sysdig, Grafana, etc. 
Preferred technical and professional experience: Experience with Linux virtualization technologies such as KVM, Xen and QEMU. Experience with Ceph, NFS, iSCSI, or object storage technologies. Excellent Git skills (merges, rebase, branching, forking, submodules). Excellent with Python, Ansible, Terraform, Jenkins. Microservices design and development in Kubernetes and Go (preferably). Experience with k8s CRDs and k8s controller programming with the watcher/informer model.

Posted 3 months ago

Apply

2 - 6 years

5 - 9 Lacs

Bengaluru

Work from Office

Naukri logo

Responsibilities: We are looking for a skilled DevOps Engineer to join our team and help streamline our development and deployment processes. The ideal candidate will have expertise in automation, CI/CD pipelines, cloud platforms, infrastructure as code (IaC), and monitoring tools. This role involves working closely with developers, system administrators, and IT teams to enhance reliability, scalability, and security. Key Responsibilities: Design, implement, and maintain CI/CD pipelines using tools like Jenkins, GitLab CI/CD, or Azure DevOps. Manage and automate cloud infrastructure using AWS, Azure, or Google Cloud (GCP). Implement Infrastructure as Code (IaC) using Terraform, CloudFormation, or Ansible. Monitor system performance, troubleshoot issues, and optimize system availability using tools like Prometheus, Grafana, or ELK Stack. Enhance security by implementing DevSecOps best practices, vulnerability scanning, and compliance monitoring. Work closely with development teams to improve deployment speed, reliability, and automation. Manage containerized applications using Docker and orchestration tools like Kubernetes. Implement logging and monitoring solutions to ensure system health and quick issue resolution. Automate repetitive tasks using scripting languages such as Bash, Python, or PowerShell. Participate in incident management and on-call rotations to ensure system uptime. Required education: Bachelor's Degree. Required technical and professional expertise: Bachelor's degree in Computer Science, IT, or a related field (or equivalent experience). Strong experience with CI/CD tools like Jenkins, GitLab CI/CD, or GitHub Actions. Hands-on experience with cloud services (AWS, Azure, or GCP). Proficiency in Infrastructure as Code (IaC) tools like Terraform or CloudFormation. Experience with containerization and orchestration (Docker, Kubernetes). Knowledge of monitoring/logging tools like Prometheus, Grafana, ELK Stack, or Datadog. 
Scripting experience in Python, Bash, or PowerShell for automation. Understanding of networking, security best practices, and DevSecOps principles. Experience with version control systems like Git and GitOps methodologies. Strong problem-solving skills and ability to work in an Agile environment. Preferred technical and professional experience: Certification in AWS, Azure, or Kubernetes (CKA, CKAD). Experience with serverless computing and cloud-native development. Exposure to configuration management tools like Ansible, Chef, or Puppet. Experience in setting up service mesh architectures using Istio or Linkerd.

Posted 3 months ago

Apply

6 - 10 years

12 - 15 Lacs

Chennai

Work from Office

Naukri logo

Manage and maintain cloud infrastructure on platforms such as AWS, Azure, or Google Cloud. Monitor cloud resources to ensure availability and scalability. Monitor and optimize cloud resource utilization. Required candidate profile: Bachelor's degree in Computer Science or a related technology field, or equivalent practical experience. 3-4 years of experience in cloud operations and infrastructure management in AWS, Azure, GCP.

Posted 3 months ago

Apply

5 - 9 years

12 - 17 Lacs

Bengaluru

Work from Office

Naukri logo

Your role and responsibilities: Job Summary: We are seeking a talented and motivated DevOps Engineer with a minimum of 8 years of experience to join our team. The ideal candidate will have hands-on experience with DevOps practices, supporting diverse platforms, and developing robust solutions using Python. The role requires expertise in OpenShift, a solid understanding of microservices architecture, and API orchestration. Familiarity with platforms like Watsonx.data, Watsonx.ai, Milvus, and/or Cloud Pak for Data is highly desirable. Experience with Presto and Spark is welcomed and will be considered a strong asset. The candidate must also have a strong focus on writing quality code, automation testing, and ensuring reliability. Exceptional problem determination skills, timeline management, and the ability to thrive in a fast-paced, dynamic environment are essential. Key Responsibilities: Design, deploy, and manage highly scalable and reliable DevOps solutions across multiple platforms. Develop and maintain microservices-based architectures and ensure seamless API orchestration. Automate infrastructure and application deployments using tools and scripts, primarily in Python. Support and optimize OpenShift-based environments for high availability and performance. Collaborate with cross-functional teams to implement and support platforms like Watsonx.data, Watsonx.ai, Milvus, and Cloud Pak for Data. Work with Presto and Spark to support scalable, high-performance data processing. Write, maintain, and improve high-quality, reusable code that adheres to best practices. Implement and maintain automation testing frameworks to ensure code reliability and minimize defects. Develop and maintain CI/CD pipelines to streamline application delivery. Monitor system performance, conduct root cause analysis, and provide resolutions for production issues. Ensure compliance with industry standards and security best practices. 
Required education: Bachelor's Degree. Preferred education: Bachelor's Degree. Required technical and professional expertise: Proven experience in DevOps engineering and managing complex infrastructures. Strong proficiency in Python for scripting, automation, and development. Hands-on expertise with OpenShift and container orchestration tools. Solid understanding of microservices architecture and API orchestration. Deep experience in CI/CD pipelines, automation tools, and infrastructure as code (IaC). Strong focus on code quality and experience with automation testing frameworks (e.g., pytest, Selenium, or similar). Demonstrated ability in problem determination and solving complex technical issues. Exceptional skills in managing timelines and delivering projects on schedule. Preferred technical and professional experience: Experience working with Watsonx.data, Watsonx.ai, Milvus, or Cloud Pak for Data. Knowledge or hands-on experience with Presto and Spark for data processing and querying. Knowledge of cloud technologies (e.g., AWS, Azure, IBM Cloud). Familiarity with machine learning workflows and data pipeline management. Experience with monitoring tools (e.g., Prometheus, Grafana) and log aggregation systems (e.g., ELK stack). Strong communication skills and ability to work effectively in a team environment.

Posted 3 months ago

Apply

3 - 7 years

10 - 15 Lacs

Bengaluru

Work from Office

Naukri logo

Responsibilities: The Site Reliability Engineer is a critical role in cloud-based projects. An SRE works with the development squads to build platform and infrastructure management/provisioning automation and service monitoring, using the same methods used in software development to support application development. SREs create a bridge between development and operations by applying a software engineering mindset to system administration topics. They split their time between operations/on-call duties and developing systems and software that help increase site reliability and performance. Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: Overall 12+ years of experience required. Good exposure to operational aspects (monitoring, automation, remediation), including monitoring tools such as New Relic, Prometheus, ELK, distributed tracing, APM, AppDynamics, etc. Troubleshooting, documenting root cause analysis, and automating incident response. Understands the architecture, the SRE mindset, and the data model. Platform architecture and engineering: ability to design and architect a cloud platform that can meet client SLAs/NFRs such as availability and system performance. The SRE will define the environment provisioning framework, identify potential performance bottlenecks and design a cloud platform. Preferred technical and professional experience: Effectively communicate with business and technical team members. Creative problem-solving skills and superb communication skills. Telecom domain experience is an added plus.

Posted 3 months ago

Apply

Featured Companies