1154 Prometheus Jobs - Page 11

Set up a Job Alert
JobPe aggregates listings for easy access, but you apply directly on the original job portal.

7.0 - 11.0 years

0 Lacs

Karnataka

On-site

As a Hybrid Cloud Delivery Solution Architect specializing in Infrastructure as Code (IaC), DevSecOps, and Observability, you will be responsible for architecting and delivering hybrid cloud solutions across private and public cloud environments. Your role will involve implementing IaC using tools such as Terraform, Ansible, Python, Golang, or CloudFormation. You will lead DevSecOps practices across CI/CD pipelines, incorporating automated security testing. Additionally, you will design and manage observability platforms utilizing tools like Prometheus, Grafana, ELK, and OpsRamp. Collaboration with stakeholders to define architectures and provide guidance to delivery teams will be a crucial aspect of your role. You will drive container orchestration using Kubernetes, OpenShift, or Rancher and ensure the creation of technical documentation aligned with security and compliance standards. Engaging with senior-level stakeholders, including C-level executives, and leading customer interactions will also form part of your responsibilities.

To excel in this role, you should have over 10 years of experience in solution architecture within hybrid and multi-cloud environments. Strong hands-on experience with IaC tools like Terraform, Ansible, and CloudFormation is required. Proficiency in DevSecOps practices, observability tools, Kubernetes, and cloud security is essential, as is a solid understanding of identity, networking, and compliance standards such as ISO, SOC 2, and GDPR. Excellent communication skills, stakeholder management abilities, and team leadership qualities are significant assets for this position. Preferred qualifications include experience in multi-cloud architecture and edge-to-cloud integration, familiarity with serverless architecture and microservices, and exposure to Site Reliability Engineering (SRE) practices.

If you are looking to leverage your expertise in Kubernetes, automated security testing, observability platforms, IaC tools, team leadership, and other relevant skills within a dynamic hybrid cloud environment, this role offers an exciting opportunity for professional growth and impact.
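The architect role above leans heavily on observability platforms built around Prometheus. As a minimal illustration of the kind of data such platforms work with, here is a Python sketch (function names and the reset-handling rule are our own illustration, not any employer's code) that parses Prometheus's text exposition format and computes a per-second counter rate between two scrapes, similar in spirit to PromQL's `rate()`:

```python
def parse_exposition(text):
    """Parse Prometheus text exposition format into {metric: value}.

    Handles only simple `name{labels} value` / `name value` lines;
    comment lines (#) and histogram machinery are skipped for brevity.
    """
    samples = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def counter_rate(prev, curr, interval_seconds):
    """Per-second increase of each counter between two scrapes.

    Counters only go up; a decrease signals a process restart, in which
    case the current value is taken as the whole increase (Prometheus's
    own rate() applies the same counter-reset logic).
    """
    rates = {}
    for metric, now in curr.items():
        before = prev.get(metric, 0.0)
        increase = now - before if now >= before else now
        rates[metric] = increase / interval_seconds
    return rates
```

For example, two scrapes of `http_requests_total` taken 15 seconds apart with values 100 and 160 yield a rate of 4 requests per second.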

Posted 2 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Thiruvananthapuram, Kerala

On-site

You should have a minimum of 5 years of experience in DevOps, SRE, or Infrastructure Engineering. Your expertise should include a strong command of Azure Cloud and Infrastructure-as-Code using tools such as Terraform and CloudFormation. Proficiency in Docker and Kubernetes is essential. You should be hands-on with CI/CD tools and scripting languages like Bash, Python, or Go. A solid knowledge of Linux, networking, and security best practices is required. Experience with monitoring and logging tools such as ELK, Prometheus, and Grafana is expected. Familiarity with GitOps, Helm charts, and automation will be an advantage.

Your key responsibilities will involve designing and managing CI/CD pipelines using tools like Jenkins, GitLab CI/CD, and GitHub Actions. You will be responsible for automating infrastructure provisioning with tools like Terraform, Ansible, and Pulumi. Monitoring and optimizing cloud environments, implementing containerization and orchestration with Docker and Kubernetes (EKS/GKE/AKS), and maintaining logging, monitoring, and alerting systems (ELK, Prometheus, Grafana, Datadog) are crucial aspects of the role. Ensuring system security, availability, and performance tuning, managing secrets and credentials using tools like Vault and Secrets Manager, troubleshooting infrastructure and deployment issues, and implementing blue-green and canary deployments will also be part of your responsibilities. Collaboration with developers to enhance system reliability and productivity is key.

Preferred skills include certification as an Azure DevOps Engineer; experience with multi-cloud environments, microservices, and event-driven systems; and exposure to AI/ML pipelines and data engineering workflows.
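The listing above asks for blue-green and canary deployments. The decision at the heart of a canary rollout can be sketched in a few lines of Python; the thresholds and function name below are illustrative assumptions, not any specific tool's API:

```python
def canary_verdict(baseline_errors, baseline_total,
                   canary_errors, canary_total,
                   max_error_rate=0.05, max_regression=2.0):
    """Decide whether to promote a canary based on observed error rates.

    Rolls back if the canary's error rate exceeds an absolute ceiling,
    or if it is worse than the baseline by more than `max_regression`x.
    Returns "promote" or "rollback".
    """
    if canary_total == 0:
        return "rollback"  # no traffic reached the canary: something is wrong
    canary_rate = canary_errors / canary_total
    baseline_rate = baseline_errors / baseline_total if baseline_total else 0.0
    if canary_rate > max_error_rate:
        return "rollback"
    if baseline_rate > 0 and canary_rate > baseline_rate * max_regression:
        return "rollback"
    return "promote"
```

Real tools (Argo Rollouts, Flagger, and the like) make the same kind of comparison against metrics pulled from Prometheus; the sketch only shows the shape of the decision.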

Posted 2 weeks ago

Apply

8.0 - 12.0 years

0 Lacs

Hyderabad, Telangana

On-site

The role of DevOps Specialist at our organization requires a seasoned professional with over 8 years of experience in DevOps and infrastructure practices, specifically focusing on AWS, Azure, and GCP cloud environments. As a DevOps Specialist, you will be responsible for orchestrating and managing Docker and Kubernetes environments to ensure scalable application deployment. Your key responsibilities will include designing and implementing microservices-based architecture, supporting and optimizing CI/CD pipeline architecture, and managing infrastructure using Infrastructure-as-Code tools like Terraform, Pulumi, or CloudFormation. Additionally, you will be expected to maintain version control practices using Git and related tools.

In this role, you will also build, monitor, and maintain Development, Staging, and Production environments, develop automation scripts using Python, Bash, or similar scripting languages, and implement and support CI/CD pipelines using tools like Jenkins, GitHub Actions, or AWS CodePipeline. Monitoring and logging tools such as Prometheus and Grafana will be utilized to ensure system performance and reliability. Incident response, root cause analysis, and preventive measures will also be part of your responsibilities as you collaborate with Agile teams and follow DevOps best practices to drive continuous improvement.

The primary skills required for this role include strong hands-on experience with Docker and Kubernetes, proficiency in configuring Kubernetes resources using YAML or GitOps, a solid understanding of microservices architecture and the SDLC, experience with Ansible for configuration management, proficiency with Infrastructure-as-Code tools, and a working knowledge of Git and version control systems. Secondary skills that will be beneficial include experience with CI/CD tools, strong scripting skills, familiarity with monitoring and alerting tools, experience managing multiple environments and deployment pipelines, excellent analytical and problem-solving skills, and familiarity with Agile and DevOps methodologies.

To qualify for this position, you should hold a Bachelor's degree in computer science, information technology, or a related field. If you are looking to be part of a dynamic team and lead the development and implementation of robust DevOps and infrastructure practices, this role could be the perfect fit for you.
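Several of the responsibilities above revolve around Infrastructure-as-Code tools like Terraform and Pulumi. At their core, those tools compute a plan by diffing desired state against actual state; the toy sketch below (our own simplification, not any tool's real algorithm) shows that idea:

```python
def plan(desired, actual):
    """Diff two {resource_name: attributes} maps into a Terraform-style plan.

    Returns the actions an IaC tool would take: resources to create,
    update (with each changed attribute as an (actual, desired) pair),
    and destroy.
    """
    create = {name: attrs for name, attrs in desired.items() if name not in actual}
    destroy = sorted(name for name in actual if name not in desired)
    update = {}
    for name in desired.keys() & actual.keys():
        changed = {k: (actual[name].get(k), v)
                   for k, v in desired[name].items()
                   if actual[name].get(k) != v}
        if changed:
            update[name] = changed
    return {"create": create, "update": update, "destroy": destroy}
```

Given a desired "web" instance of size `m5.large` and an actual one of `m5.xlarge`, the plan reports an update for "web", creates anything only in `desired`, and destroys anything only in `actual`.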

Posted 2 weeks ago

Apply

8.0 - 12.0 years

0 Lacs

Maharashtra

On-site

The Staff Engineer - Data role in SonyLIV's Digital Business leads the data engineering strategy: architecting scalable data infrastructure, driving innovation in data processing, ensuring operational excellence, and building a high-performance team to enable data-driven insights for OTT content and user engagement. This position is based in Mumbai and requires a minimum of 8 years of experience in the field.

Responsibilities include defining the technical vision for scalable data infrastructure using modern technologies like Spark, Kafka, Snowflake, and cloud services; leading innovation in data processing and architecture through real-time data processing and streaming analytics; ensuring operational excellence in data systems by setting and enforcing standards for data reliability and privacy; building and mentoring a high-caliber data engineering team; collaborating with cross-functional teams; and driving data quality and business insights through automated quality frameworks and BI dashboards.

The successful candidate should have 8+ years of experience in data engineering, business intelligence, and data warehousing, with expertise in high-volume, real-time data environments. They should have a proven track record of building and managing large data engineering teams and of designing and implementing scalable data architectures, proficiency in SQL, experience with object-oriented programming languages, and knowledge of A/B testing methodologies and statistical analysis. Preferred qualifications include a degree in a related technical field, experience managing the end-to-end data engineering lifecycle, experience with large-scale infrastructure, familiarity with automated data lineage and auditing tools, and expertise with BI and visualization tools and advanced processing frameworks.

Joining SonyLIV offers the opportunity to drive the future of data-driven entertainment by collaborating with industry professionals, working with comprehensive data sets, leveraging cutting-edge technology, and making a tangible impact on product delivery and user engagement. The ideal candidate will bring a strong foundation in data infrastructure, experience leading and scaling data teams, and a focus on operational excellence to enhance efficiency.
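The streaming-analytics work this role describes (Spark, Kafka, real-time processing) is ultimately built from primitives like windowed aggregation. Below is a minimal Python sketch of a tumbling-window distinct-user count; it is illustrative only — a real job would express the same logic on a streaming engine rather than over an in-memory list:

```python
def tumbling_window_counts(events, window_seconds):
    """Aggregate (timestamp, user_id) view events into fixed tumbling windows.

    Each event lands in the window containing its timestamp; windows are
    aligned to multiples of `window_seconds`. Returns
    {window_start: distinct_user_count} -- the shape of a concurrent-viewers
    metric in an OTT analytics pipeline.
    """
    windows = {}
    for ts, user in events:
        start = ts - (ts % window_seconds)      # align to window boundary
        windows.setdefault(start, set()).add(user)
    return {start: len(users) for start, users in sorted(windows.items())}
```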

Posted 2 weeks ago

Apply

6.0 - 10.0 years

12 - 22 Lacs

Bhubaneswar, Hyderabad

Work from Office

Greetings from Finix! We are looking for Senior and Mid-level Full Stack Developers with a strong backend focus to work on enterprise applications using Spring Boot and React.js in an Agile environment. Experience with Prometheus and Grafana is required.

Posted 2 weeks ago

Apply

3.0 - 5.0 years

5 - 7 Lacs

Bengaluru

Work from Office

Job Summary: We are seeking a detail-oriented and experienced Stonebranch Universal Automation Center (UAC) Scheduler to manage, configure, and support enterprise job scheduling and automation solutions. The ideal candidate will have hands-on expertise in designing and maintaining workflows within the Stonebranch UAC platform, ensuring efficient, reliable, and scalable job execution across on-prem and cloud environments.

Key Responsibilities:
- Develop, schedule, monitor, and maintain batch jobs, workflows, and tasks using Stonebranch UAC.
- Design and implement automation solutions for file transfers, ETL processes, application integrations, and system jobs.
- Collaborate with application, infrastructure, and DevOps teams to understand requirements and build automation accordingly.
- Configure UAC agents, task types, triggers, and user access controls.
- Monitor job execution, resolve failures, and provide root cause analysis.
- Maintain comprehensive documentation for workflows, processes, and configuration.
- Assist in upgrade planning, testing, and deployment of Stonebranch UAC components.
- Participate in disaster recovery planning, auditing, and compliance activities.
- Provide on-call or after-hours support for critical job schedules.

Required Qualifications:
- 4+ years of hands-on experience with Stonebranch Universal Automation Center (UAC).
- Strong understanding of job scheduling concepts and enterprise automation best practices.
- Proficiency in scripting languages (e.g., Bash, PowerShell, Python).
- Experience with cloud platforms (AWS, Azure, GCP) and hybrid environments.
- Familiarity with ITIL processes, incident management, and change control.
- Good analytical, problem-solving, and troubleshooting skills.
- Excellent communication and collaboration abilities.

Preferred Qualifications:
- Experience integrating UAC with tools like SAP, ServiceNow, Oracle, or Kubernetes.
- Familiarity with REST APIs and developing custom UAC task plugins.
- Knowledge of other job schedulers (e.g., IBM Workload Automation, Control-M, AutoSys, UC4) is a plus.

Work Environment & Tools:
- Stonebranch UAC Web UI
- Command-line and agent-based scheduling
- Monitoring tools (e.g., Splunk, Prometheus, Grafana)
- Collaboration tools (e.g., Jira, Confluence, Slack)

Mandatory Skills: BMC Control-M. Experience: 3-5 years.
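Schedulers like Stonebranch UAC and Control-M execute jobs subject to dependency chains, and computing a valid run order is the core of that. A minimal sketch using Kahn's topological sort (the function and job names are our own illustration, not any scheduler's API):

```python
from collections import deque

def run_order(deps):
    """Topologically order jobs given {job: [prerequisite jobs]}.

    Raises ValueError on a dependency cycle (a mis-configured workflow
    that no scheduler could execute).
    """
    jobs = set(deps) | {p for pres in deps.values() for p in pres}
    remaining = {j: set(deps.get(j, ())) for j in jobs}
    ready = deque(sorted(j for j, pres in remaining.items() if not pres))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for other, pres in remaining.items():
            if job in pres:             # `job` finished: unblock dependents
                pres.remove(job)
                if not pres:
                    ready.append(other)
    if len(order) != len(jobs):
        raise ValueError("dependency cycle detected")
    return order
```

For an ETL workflow where `load` and `transform` both wait on `extract` and `report` waits on both, the order always starts with `extract` and ends with `report`.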

Posted 2 weeks ago

Apply

1.0 - 5.0 years

3 - 7 Lacs

Chennai

Work from Office

Job Title: ELK & Grafana Architect

Responsibilities:
- Design, implement, and optimize ELK solutions to meet data analytics and search requirements.
- Collaborate with development and operations teams to enhance logging capabilities.
- Implement and configure components of the Elastic Stack, including Filebeat, Metricbeat, Winlogbeat, Logstash, and Kibana.
- Create and maintain comprehensive documentation for Elastic Stack configurations and processes.
- Ensure seamless integration between the various Elastic Stack components.
- Develop and maintain advanced Kibana dashboards and visualizations.
- Design and implement solutions for centralized logs, infrastructure health metrics, and distributed tracing for different applications.
- Implement Grafana for visualization and monitoring, including Prometheus and Loki for metrics and logs management.
- Build detailed technical designs related to monitoring as part of complex projects.
- Ensure engagement with customers and deliver business value.

Requirements:
- 6+ years of experience as an ELK/Elasticsearch Architect.
- Hands-on experience with Prometheus, Loki, OpenTelemetry, and Azure Monitor.
- Experience with data pipelines and redirecting Prometheus metrics.
- Proficiency in scripting and programming languages such as Python, Ansible, and Bash.
- Familiarity with CI/CD deployment pipelines (Ansible, Git).
- Strong knowledge of performance monitoring, metrics, capacity planning, and management.
- Excellent communication skills with the ability to articulate technical details to different audiences.
- Experience with application onboarding, capturing requirements, understanding data sources, and architecture diagrams.
- Experience with OpenTelemetry monitoring and logging solutions.

Competency Building and Branding:
- Ensure completion of necessary trainings and certifications.
- Develop Proofs of Concept (POCs), case studies, demos, etc. for new growth areas based on market and customer research.
- Develop and present Wipro's point of view on solution design and architecture by writing white papers, blogs, etc.
- Attain market referenceability and recognition through top analyst rankings, client testimonials, and partner credits.
- Be the voice of Wipro's thought leadership by speaking in internal and external forums.
- Mentor developers, designers, and junior architects in the project for their further career development and enhancement.
- Contribute to the architecture practice by conducting selection interviews, etc.
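The ELK pipelines this role designs begin by parsing raw log lines into structured fields — the job Logstash's grok filter performs before documents reach Elasticsearch. A simplified Python equivalent for a common access-log format (the pattern and field names are illustrative assumptions, not a production grok pattern):

```python
import re

# A simplified Apache/Nginx-style access-log pattern (illustrative only).
ACCESS_LOG = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_access_line(line):
    """Parse one access-log line into a dict of typed fields.

    Returns None for lines that do not match, which a real pipeline
    would tag for a dead-letter queue rather than drop silently.
    """
    m = ACCESS_LOG.match(line)
    if not m:
        return None
    doc = m.groupdict()
    doc["status"] = int(doc["status"])
    doc["bytes"] = int(doc["bytes"])
    return doc
```

Once lines are structured this way, status codes and byte counts become aggregatable fields for Kibana or Grafana dashboards instead of opaque text.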

Posted 2 weeks ago

Apply

5.0 - 8.0 years

6 - 10 Lacs

Mumbai, Coimbatore

Work from Office

Job Summary: We are seeking a detail-oriented and experienced Stonebranch Universal Automation Center (UAC) Scheduler to manage, configure, and support enterprise job scheduling and automation solutions. The ideal candidate will have hands-on expertise in designing and maintaining workflows within the Stonebranch UAC platform, ensuring efficient, reliable, and scalable job execution across on-prem and cloud environments.

Key Responsibilities:
- Develop, schedule, monitor, and maintain batch jobs, workflows, and tasks using Stonebranch UAC.
- Design and implement automation solutions for file transfers, ETL processes, application integrations, and system jobs.
- Collaborate with application, infrastructure, and DevOps teams to understand requirements and build automation accordingly.
- Configure UAC agents, task types, triggers, and user access controls.
- Monitor job execution, resolve failures, and provide root cause analysis.
- Maintain comprehensive documentation for workflows, processes, and configuration.
- Assist in upgrade planning, testing, and deployment of Stonebranch UAC components.
- Participate in disaster recovery planning, auditing, and compliance activities.
- Provide on-call or after-hours support for critical job schedules.

Required Qualifications:
- 4+ years of hands-on experience with Stonebranch Universal Automation Center (UAC).
- Strong understanding of job scheduling concepts and enterprise automation best practices.
- Proficiency in scripting languages (e.g., Bash, PowerShell, Python).
- Experience with cloud platforms (AWS, Azure, GCP) and hybrid environments.
- Familiarity with ITIL processes, incident management, and change control.
- Good analytical, problem-solving, and troubleshooting skills.
- Excellent communication and collaboration abilities.

Preferred Qualifications:
- Experience integrating UAC with tools like SAP, ServiceNow, Oracle, or Kubernetes.
- Familiarity with REST APIs and developing custom UAC task plugins.
- Knowledge of other job schedulers (e.g., IBM Workload Automation, Control-M, AutoSys, UC4) is a plus.

Work Environment & Tools:
- Stonebranch UAC Web UI
- Command-line and agent-based scheduling
- Monitoring tools (e.g., Splunk, Prometheus, Grafana)
- Collaboration tools (e.g., Jira, Confluence, Slack)

Do:
1. Provide tool design, development, and deployment support for project delivery:
   a. Interact with the internal project or client to understand the project requirements from a tool perspective.
   b. Design the solution keeping in mind the tool requirements, the tools currently available, and details on licenses required, etc.
   c. Provide budget and timeline estimates for the tool development/deployment as required.
   d. For any new tool development, identify sources for development (internal or third-party) and work with the project managers on the development of the tool, keeping in mind the production rollout timelines.
   e. Conduct commercial discussions with third-party vendors for licenses or tool development.
   f. Conduct appropriate testing to ensure error-free deployment of the tool on the project.
   g. Ensure deployment of the tool on time and within the estimated budget.

Mandatory Skills: BMC Control-M. Experience: 5-8 years.

Posted 2 weeks ago

Apply

1.0 - 4.0 years

3 - 6 Lacs

Noida, Gurugram, Delhi / NCR

Work from Office

1+ years of experience in the skills below.

Must Have:
- Troubleshoot platform issues
- Manage configs via YAML/Helm/Kustomize
- Upgrade and maintain OpenShift clusters and Operators
- Support CI/CD pipelines and DevOps teams
- Monitoring with Prometheus/EFK/Grafana

Required Candidate Profile:
- Participate in CR/patch planning
- Automate provisioning (namespaces, RBAC, NetworkPolicies)
- Open to 24x7 rotational coverage/on-call support
- Immediate joiners a plus
- Excellent communication

Posted 2 weeks ago

Apply

6.0 - 11.0 years

9 - 15 Lacs

Noida, Gurugram, Delhi / NCR

Work from Office

6+ years of experience in the skills below.

Must Have:
- Troubleshoot platform issues
- Manage configs via YAML/Helm/Kustomize
- Upgrade and maintain OpenShift clusters and Operators
- Support CI/CD pipelines and DevOps teams
- Monitoring with Prometheus/EFK/Grafana

Required Candidate Profile:
- Participate in CR/patch planning
- Automate provisioning (namespaces, RBAC, NetworkPolicies)
- Open to 24x7 rotational coverage/on-call support
- Immediate joiners a plus
- Excellent communication

Posted 2 weeks ago

Apply

5.0 - 10.0 years

25 - 40 Lacs

Noida

Work from Office

Description: Hiring a Golang Developer for the Noida location.

Job Title: Go Developer with Kubernetes Experience
Location: Noida
Experience Level: 8+ Years
Team: Engineering / Platform Team

About the Role:
We are looking for a skilled Go (Golang) developer with working knowledge of Kubernetes. The ideal candidate will be proficient in building scalable backend systems in Go and comfortable working with cloud-native technologies and Kubernetes for deployment, monitoring, and management. This is a hands-on engineering role that bridges application development and infrastructure orchestration, ideal for someone who enjoys both writing clean code and understanding how it runs in modern containerized environments. You will be involved in ensuring reliable, highly available, scalable, maintainable, and highly secure systems. Candidates who fit this role come from both systems and software development backgrounds: your development background will help you design large-scale, highly distributed, and fault-tolerant applications, while your systems background will help you ensure uptime and reliability by monitoring deep system parameters and remediating issues at the systems level.

Skills: Golang, Kubernetes, Docker, CI/CD, cloud platforms (Azure, AWS, Google Cloud, etc.), microservices, Git, Linux, system monitoring and logging.

Responsibilities:
- Design, develop, and maintain scalable and efficient applications in Go, with a strong grasp of idiomatic Go, interfaces, channels, and goroutines.
- Develop scalable backend services (microservices, APIs), with a sound understanding of REST and distributed systems.
- Work hands-on with public clouds (Azure, AWS, etc.).
- Use Docker and deploy containerized applications to Kubernetes, applying concepts such as pods, services, deployments, config maps, secrets, and health checks.
- Collaborate on the design and implementation of CI/CD pipelines for automated testing and deployment.
- Implement best practices for software development and infrastructure management.
- Monitor system performance and troubleshoot issues.
- Write and maintain technical documentation.
- Work comfortably with logging/monitoring tools like Prometheus, Grafana, the ELK stack, New Relic, Splunk, etc.
- Keep abreast of the latest advancements in Kubernetes, Go, and cloud-native technologies.
- Bring good communication and teamwork skills, excellent problem-solving skills, and attention to detail; management and leadership experience is very helpful.

What We Offer:
- Exciting Projects: We focus on industries like high-tech, communication, media, healthcare, retail, and telecom. Our customer list is full of fantastic global brands and leaders who love what we build for them.
- Collaborative Environment: You can expand your skills by collaborating with a diverse team of highly talented people in an open, laid-back environment, or even abroad in one of our global centers or client facilities.
- Work-Life Balance: GlobalLogic prioritizes work-life balance, which is why we offer flexible work schedules, opportunities to work from home, and paid time off and holidays.
- Professional Development: Our dedicated Learning & Development team regularly organizes communication-skills training (GL Vantage, Toastmasters), a stress-management program, professional certifications, and technical and soft-skill trainings.
- Excellent Benefits: We provide our employees with competitive salaries, family medical insurance, Group Term Life Insurance, Group Personal Accident Insurance, NPS (National Pension Scheme), periodic health-awareness programs, extended maternity leave, annual performance bonuses, and referral bonuses.
- Fun Perks: We want you to love where you work, which is why we host sports events and cultural activities, offer food at subsidized rates, and throw corporate parties. Our vibrant offices also include dedicated GL Zones, rooftop decks, and a GL Club where you can drink coffee or tea with your colleagues over a game, plus discounts at popular stores and restaurants.
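The Go role above emphasizes goroutines and channels, whose most common use is the fan-out worker pool. Since this page has no code of its own, here is the same pattern sketched with Python's `queue` and `threading` standard-library modules — a rough analogue for illustration, not idiomatic Go:

```python
import queue
import threading

def worker_pool(items, handler, workers=4):
    """Fan work out to a fixed pool of workers and collect the results.

    A rough Python analogue of the goroutines-plus-channels worker pool:
    `tasks` plays the role of the input channel, `results` the output
    channel, and None is the close-the-channel sentinel.
    """
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            item = tasks.get()
            if item is None:   # sentinel: channel closed, worker exits
                return
            results.put(handler(item))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)        # one sentinel per worker
    for t in threads:
        t.join()

    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)  # completion order is nondeterministic; sort for stability
```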

Posted 2 weeks ago

Apply

3.0 - 6.0 years

6 - 9 Lacs

Noida, Gurugram, Delhi / NCR

Work from Office

3+ years of experience in the skills below.

Must Have:
- Troubleshoot platform issues
- Manage configs via YAML/Helm/Kustomize
- Upgrade and maintain OpenShift clusters and Operators
- Support CI/CD pipelines and DevOps teams
- Monitoring with Prometheus/EFK/Grafana

Required Candidate Profile:
- Participate in CR/patch planning
- Automate provisioning (namespaces, RBAC, NetworkPolicies)
- Open to 24x7 rotational coverage/on-call support
- Immediate joiners a plus
- Excellent communication

Posted 2 weeks ago

Apply

5.0 - 7.0 years

15 - 25 Lacs

Pune

Work from Office

Required Skills and Qualifications:
- 3+ years of backend development experience in Java (Java 8+) and Spring Boot
- Strong understanding of REST APIs, JPA/Hibernate, and SQL databases (e.g., PostgreSQL, MySQL)
- Knowledge of software engineering principles and design patterns
- Experience with testing frameworks like JUnit and Mockito
- Familiarity with Docker and CI/CD tools
- Good communication and team collaboration skills

Key Responsibilities:
- Develop and maintain backend systems using Java (Spring Boot)
- Build RESTful APIs and integrate with databases and third-party services
- Write unit and integration tests to ensure code quality
- Participate in code reviews and collaborate with peers and senior engineers
- Follow clean-code principles and best practices in microservices design
- Support CI/CD deployment pipelines and container-based workflows
- Continuously learn and stay updated with backend technologies

Nice to Have:
- Exposure to Kubernetes and cloud platforms (AWS, GCP, etc.)
- Familiarity with messaging systems like Kafka or RabbitMQ
- Awareness of security standards and authentication protocols (OAuth2, JWT)
- Interest in DevOps practices and monitoring tools (Prometheus, ELK, etc.)

Posted 2 weeks ago

Apply

5.0 - 8.0 years

7 - 10 Lacs

Chennai, Bengaluru

Work from Office

Skills:
- Automated build/deployment pipelines (Jenkins, GitHub Actions, ArgoCD) for Python-based projects
- Automation using tools like Ansible
- Containerization (Docker or Podman)
- Python programming experience (required)
- Data or machine-learning CI/CD pipelines
- Linux OS, including shell scripting
- Container management (Kubernetes/OpenShift) and Helm-based pipelines
- Automating security/regulatory checks
- Automated patching
- Production support (typically done on rotation)
- Agile engineering and practices
- Networking fundamentals
- Security engineering practices and tools

Responsibilities:
- Experienced in DevOps methodologies, with sound infrastructure and security knowledge.
- Communicate clearly internally and externally, improving transparency across teams and stakeholders.
- Automate deployment and configuration-management tasks to ensure consistent and scalable environments.
- Troubleshoot and resolve infrastructure and application issues in collaboration with development and operations teams.
- Implement and manage CI/CD pipelines for deployments on container technologies like native Kubernetes and OpenShift, or on public clouds like Microsoft Azure/AWS/OCI.
- Sound knowledge of microservice authentication and authorization processes.
- Collaborate with cross-functional teams to design, deploy, and manage different environments using Terraform, Ansible, and other IaC tools.
- Sound knowledge of public cloud providers like Microsoft Azure/AWS/OCI.
- Strong understanding of GitHub Actions or Jenkins-based Continuous Integration / Continuous Deployment (CI/CD), with automation using shared libraries, pipeline code, and Groovy or Python scripting.
- Understanding of GitOps tools like Argo CD.
- Strong understanding of logging and monitoring solutions like ELK, Prometheus, and Grafana.
- Python programming experience preferred.
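Deployment automation of the kind this listing describes usually gates a pipeline stage on a health check with retries. A small Python sketch with exponential backoff and an injectable sleep so pipelines and tests can fake the clock (the name and defaults are our own assumptions, not any tool's API):

```python
import time

def wait_until_healthy(check, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Poll a deployment health check, backing off exponentially.

    `check` is any zero-argument callable returning True once the service
    is up; `sleep` is injectable so callers (and tests) can fake the clock.
    Returns the attempt number that succeeded, or raises TimeoutError
    after exhausting all attempts.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise TimeoutError(f"service not healthy after {attempts} attempts")
```

Injecting `sleep` is the design choice that keeps this testable: a pipeline passes the real `time.sleep`, while a unit test passes a recorder and verifies the backoff schedule without waiting.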

Posted 2 weeks ago

Apply

12.0 - 22.0 years

45 - 65 Lacs

Hyderabad

Work from Office

Role & responsibilities As a Senior Manager, you will work with and manage the engineering team in the Hyderabad Development Centre to deliver the goals and objectives of the business. As a leader, you must be capable of working in a matrixed organization and coordinating the delivery of multiple outcomes. You will be hands-on in terms of design, architecture, and development and should be able to lead the team from front in any critical situation. As a people leader first and a delivery manager second, you must build, inspire, and lead the technical teams. In this role, you are expected to work with stakeholders and internal customers across the different GAP tech locations. You will be managing the Engineering Platform Observe team that set modern architecture principles to promote innovation, flexibility, and reuse. Our team support the engineering teams in building automation to help enable developer success across all our brands and markets. You'll play a key role in building, maintaining, and supporting GAPs next-generation Observability platform enabling innovation, solutioning and exceptional developer experience. We have a sharp technical team, and you will be working with many high-performing software development professionals in a friendly, open-minded, and diverse environment. What Youll Do: Lead DevOps best practices and mentor a team of Observability engineers working towards optimizing our monitoring solutions. Develop the roadmap and strategy of seamlessly onboarding the Product teams on our Observability solutions. Architecture and enhance implementation of Observability platforms across the organization. Present possible updates, recommendations, strategic opportunities to local & US leadership. Develop relationships with local business leaders. Strong desire to simplify the developers debug experience by adopting and on boarding the right tools across the enterprise. 
Develop an understanding of GAP's Observability Pipelines to automate and enhance user experience. Participate in the design of new or changing monitoring needs. Build, operationalize, and maintain Observability solutions for our technology customers Participate in problem solving and troubleshooting for the assigned applications, functional areas or projects Stay current with changes in the technical area of expertise Build, maintain, and support enterprise production systems with a business mindset, keeping an eye towards simplicity, reliability, maintainability, scalability, extensibility and performance Drive resolution of operational and production issues in a timely manner. Support internal customers in adopting our Next Generation Observability pipelines. Work with the team to develop features and improvements. Identifies opportunities to eliminate or automate remediation through RCA for recurring issues to improve overall operational stability of software applications and systems Preferred candidate profile Minimum 5 years experience in Engineering Leadership position, overall 12+ years of work experience. Hands on experience and managing operations of large-scale internet-centric production environments for application or infrastructure services serving tens to millions of end users. Excellent decision-making, problem-solving and time management skills. 
Demonstrated ability to innovate and operate outside the comfort zone of established methods and procedures Demonstrated ability to gain immediate credibility at all levels both inside and outside the organization and develop lasting, productive and collaborative relationships Excellent communication and influencing skills including the ability to simplify key messages, present compelling stories and promote technical and personal credibility with internal and external executives, and both technical and non-technical audiences Willingly shares relevant technical and/or industry knowledge and expertise in order to mentor team members. Strong hands on experience with latest Observability trends. Asses new Observability technologies and their potential fit within our current ecosystem. Support the team's technical growth through code reviews, architecture discussions, and knowledge sharing. Drive the development of tools to streamline developer workflows, in collaboration with other teams. Efficiently collaborate with other cross-functional teams in driving initiatives. Participate in an on-call rotation as needed by the business. Retail/Ecommerce industry experience preferred Strong considerable hands-on experience with monitoring tools like Grafana, Prometheus, OpenTelemetry (OTEL), NewRelic, Nagios & Splunk or similar tools. Proficiency with Infrastructure as Code patterns & tools (e.g. ARM, Terraform, GitOps) Proficiency with Multi cloud platforms Observability solutions like Azure Monitor, Google Cloud Observability or AWS Cloudwatch, Working on at least one Kubernetes cloud offering (AKS/GKE) or on-prem Kubernetes (native Kubernetes) Experience with Unix platforms, system administration skills in UNIX Appreciation and preference for open-source solutions like OTEL or eBPF. Ability to maintain and manage observability tools to look at logs, metrics & traces to diagnose issues within that system. 
Experience in scaling infrastructure to support high-throughput data-intensive applications. Experience working on projects following Agile methodologies. Proficiency in at least one programming language (e.g., Python, Java, Go) and comfort working across different types of languages as needed. Working knowledge of collaboration tools like Slack, JIRA, and Confluence, and service management tools like ServiceNow and PagerDuty. About Us: Hyderabad Development Center (HDC): Launched in March 2017 with a small pilot team, Gap Inc.’s Hyderabad Development Center has grown into India’s largest fashion retail technology hub with 800+ employees today. HDC plays a pivotal role in driving innovation across digital technology, engineering, employee enablement, cybersecurity, data science, product management and customer experience. Home to 40% of Gap Inc.’s global tech workforce, this young and diverse team is powering cutting-edge e-commerce and enterprise solutions for our people and iconic brands. Our growth is powered by a strong focus on nurturing talent and shaping the next generation of innovators in fashion retail technology. About Gap Inc.: Gap Inc., a house of iconic brands, is the largest specialty apparel company in America. Its Old Navy, Gap, Banana Republic, and Athleta brands offer clothing, accessories, and lifestyle products for men, women and children. Since 1969, Gap Inc. has created products and experiences that shape culture, while doing right by employees, communities and the planet. Gap Inc. products are available worldwide through company-operated stores, franchise stores, and e-commerce sites. Fiscal year 2024 net sales were $15.1 billion. For more information, please visit www.gapinc.com.
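The SRE-flavored parts of this role (availability targets, operational stability) rest on simple error-budget arithmetic. A minimal illustrative sketch in Python; the 99.9% SLO and 30-day window below are hypothetical examples, not figures from the posting:

```python
# Illustrative sketch: computing an SLO error budget, a routine
# calculation in observability/SRE work. SLO target and window
# are hypothetical examples.

def error_budget_minutes(slo: float, window_days: int) -> float:
    """Allowed downtime (minutes) for a given availability SLO."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo)

def budget_remaining(slo: float, window_days: int, downtime_minutes: float) -> float:
    """Fraction of the error budget still unspent (can go negative)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget

# A 99.9% SLO over 30 days allows 43.2 minutes of downtime.
```

Spending half the budget early in the window is the usual trigger for slowing releases in favor of reliability work.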

Posted 2 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

vadodara, gujarat

On-site

As a Senior Software Engineer (Java Developer) at our organization, you will play a crucial role in designing, developing, and deploying high-performance Java-based microservices. Your expertise in Core Java, Spring Boot, and Microservices Architecture will be essential in implementing REST APIs following OpenAPI/Swagger standards. Your responsibilities will focus on ensuring the quality, automation, testing, performance optimization, and monitoring of our systems. In terms of design and development, you will be required to adhere to API-first and Cloud-native design principles while driving the adoption of automated unit tests, integration tests, and contract tests. Your role will involve developing and extending automation frameworks for API and integration-level testing, as well as supporting BDD/TDD practices across development teams. Furthermore, you will contribute to performance tuning, scalability, asynchronous processing, and fault tolerance aspects of the system. Your collaboration with DevOps, Product Owners, and QA teams will be crucial for feature delivery. Additionally, mentoring junior developers, conducting code walkthroughs, and leading design discussions will be part of your responsibilities. The ideal candidate should have at least 5 years of hands-on Java development experience and a deep understanding of Microservices design patterns, API Gateways, and service discovery. Exposure to Cloud deployment models like AWS ECS/EKS, Azure AKS, or GCP GKE is preferred. Proficiency with Git, Jenkins, SonarQube, and containerization (Docker/Kubernetes), along with experience working in Agile/Scrum teams, is highly desired. Experience with API security standards (OAuth2, JWT), event-driven architecture using Kafka or RabbitMQ, Infrastructure as Code (IaC) tools like Terraform or CloudFormation, and performance testing tools like JMeter or Gatling would be considered a plus. 
Your ownership-driven mindset, strong communication skills, and ability to solve technical problems under tight deadlines will be valuable assets in this role. It is essential for every individual working with or on behalf of our organization to prioritize information security. This includes abiding by security policies, ensuring confidentiality and integrity of information, reporting any security violations or breaches, and completing mandatory security trainings as per company guidelines. If you are a passionate and skilled Senior Software Engineer with expertise in Java development and a desire to contribute to scalable backend systems, we encourage you to apply for this role and join our dynamic team.
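Since the role lists API security standards (OAuth2, JWT), here is a small hedged sketch of inspecting JWT claims. The token built here is a toy example, and real services must verify the signature rather than merely decode the payload:

```python
# Illustrative sketch, not the employer's code: decoding the claims
# segment of a JWT. Production code must verify the signature first;
# this only shows the base64url payload structure.
import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    # Base64url may omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a toy token to demonstrate (header.payload.signature).
header = base64.urlsafe_b64encode(b'{"alg":"none"}').rstrip(b"=").decode()
claims = base64.urlsafe_b64encode(b'{"sub":"svc-a","scope":"read"}').rstrip(b"=").decode()
token = f"{header}.{claims}.sig"
```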

Posted 2 weeks ago

Apply

2.0 - 6.0 years

0 Lacs

chennai, tamil nadu

On-site

The job is located in Chennai, Tamil Nadu, India with the company Hitachi Energy India Development Centre (IDC). As part of the Engineering & Science profession, the job is full-time and not remote. The primary focus of the India Development Centre is on research and development, with around 500 R&D engineers, specialists, and experts dedicated to creating and sustaining digital solutions, new products, and technology. The centre collaborates with Hitachi Energy's R&D and Research centres across more than 15 locations in 12 countries. The mission of Hitachi Energy is to advance the world's energy system to be more sustainable, flexible, and secure while considering social, environmental, and economic aspects. The company has a strong global presence with installations in over 140 countries. As a potential candidate for this role, your responsibilities include: - Meeting milestones and deadlines while staying on scope - Providing suggestions for improvements and being open to new ideas - Collaborating with a diverse team across different time zones - Enhancing processes for continuous integration, deployment, testing, and release management - Ensuring the highest standards of security - Developing, maintaining, and supporting Azure infrastructure and system software components - Providing guidance to developers on building solutions using Azure technologies - Owning the overall architecture in Azure - Ensuring application performance, uptime, and scalability - Leading CI/CD processes design and implementation - Defining best practices for application deployment and infrastructure maintenance - Monitoring and reporting on compute/storage costs - Managing deployment of a .NET microservices based solution - Upholding Hitachi Energy's core values of safety and integrity Your background should ideally include: - 3+ years of experience in Azure DevOps, CI/CD, configuration management, and test automation - 2+ years of experience in various Azure technologies such as IaC, ARM, YAML, Azure PaaS, Azure Active Directory, Kubernetes, and Application Insights - Proficiency in Bash scripting - Hands-on experience with Azure components and services - Building and maintaining large-scale SaaS solutions - Familiarity with SQL, PostgreSQL, NoSQL, and Redis databases - Expertise in infrastructure-as-code automation and monitoring - Understanding of security concepts and best practices - Experience with deployment tools like Helm charts and docker-compose - Proficiency in at least one programming language (e.g., Python, C#) - Experience with system management in a Linux environment - Knowledge of logging and visualization tools like the ELK stack, Prometheus, and Grafana - Experience in Azure Data Factory, WAF, streaming data, and big data/analytics Proficiency in spoken and written English is essential for this role. If you have a disability and require accommodations during the job application process, you can request reasonable accommodations through Hitachi Energy's website by completing a general inquiry form. This assistance is specifically for individuals with disabilities needing accessibility support during the application process.
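One listed duty is monitoring and reporting on compute/storage costs. As a rough illustration only (the record schema below is invented, not Azure's billing export format), such a report boils down to a group-and-sum:

```python
# Hedged sketch of a cost-reporting step. The record format is a
# made-up illustration, not the Azure billing export schema.
from collections import defaultdict

def cost_by_group(records):
    """Sum daily cost records into a per-resource-group total."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["resource_group"]] += rec["cost"]
    return dict(totals)

usage = [
    {"resource_group": "rg-app", "cost": 12.50},
    {"resource_group": "rg-data", "cost": 40.00},
    {"resource_group": "rg-app", "cost": 7.50},
]
```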

Posted 2 weeks ago

Apply

1.0 - 5.0 years

0 Lacs

kochi, kerala

On-site

The Software DevOps Engineer (1-3 Years Experience) position requires a Bachelor's degree in Computer Science, Information Technology, or a related field along with 1-3 years of experience in a DevOps or related role. As a Software DevOps Engineer, your responsibilities will include designing, implementing, and maintaining CI/CD pipelines to ensure efficient and reliable software delivery. You will collaborate with Development, QA, and Operations teams to streamline the deployment and operation of applications. Monitoring system performance, identifying bottlenecks, and troubleshooting issues to ensure high availability and reliability are also part of your role. Furthermore, you will automate repetitive tasks and processes to improve efficiency and reduce manual intervention. Participating in code reviews, contributing to the improvement of best practices and standards, and implementing and managing infrastructure as code (IaC) using Terraform are essential duties. Documentation of processes, configurations, and procedures for future reference is required. Staying updated with the latest industry trends and technologies to continuously improve DevOps processes, as well as creating POCs for the latest tools and technologies, are part of the job. The mandatory skills for this position include proficiency in Azure Cloud, Azure DevOps, CI/CD pipelines, version control (git), Linux commands, Bash scripting, Docker, Kubernetes, Helm charts, monitoring tools like Grafana, Prometheus, the ELK Stack, and Azure Monitor, Azure, AKS, Azure Storage, Virtual Machines, an understanding of microservices architecture and orchestration, and SQL Server. Optional skills that are beneficial for this role include Ansible scripting, Kafka, MongoDB, Key Vault, and Azure CLI. 
Overall, the ideal candidate for this role should possess a strong understanding of CI/CD concepts and tools, experience with cloud platforms and containerization technologies, a basic understanding of networking and security principles, strong problem-solving skills, attention to detail, excellent communication and teamwork skills, and the ability to learn and adapt to new technologies and methodologies. Additionally, being ready to work with clients directly is a key requirement for this position.
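Release tagging is a typical small automation inside the CI/CD pipelines this role maintains. A hedged sketch, assuming a vMAJOR.MINOR.PATCH tag convention (an assumption for illustration, not stated in the posting):

```python
# Illustrative only: a release-tagging helper of the kind CI/CD
# pipelines often script. The tag format is an assumed convention.

def bump_version(tag: str, part: str = "patch") -> str:
    """Bump a vMAJOR.MINOR.PATCH tag; minor/major bumps reset lower parts."""
    major, minor, patch = (int(x) for x in tag.lstrip("v").split("."))
    if part == "major":
        major, minor, patch = major + 1, 0, 0
    elif part == "minor":
        minor, patch = minor + 1, 0
    else:
        patch += 1
    return f"v{major}.{minor}.{patch}"
```

For example, bump_version("v1.4.2", "minor") yields "v1.5.0", which a pipeline stage would then push as a git tag.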

Posted 2 weeks ago

Apply

4.0 - 8.0 years

5 - 9 Lacs

Noida

Work from Office

Architecture, Lifecycle Management, Platform Governance, Linux, OpenShift, Prometheus, Grafana, Helm, EFK Stack, Ansible, Vault, SCCs, RBAC, NetworkPolicies, CI/CD: Jenkins, GitLab CI, ArgoCD, Tekton, Lead SEV1 issue resolution

Posted 2 weeks ago

Apply

3.0 - 6.0 years

3 - 6 Lacs

Noida

Work from Office

Advanced Troubleshooting, Change Management, Automation, Linux, YAML/Helm/Kustomize, Maintain Operators, upgrade OpenShift clusters, Work with CI/CD pipelines and DevOps teams, Maintain logs, monitoring, and alerting tools (Prometheus, EFK, Grafana)

Posted 2 weeks ago

Apply

6.0 - 8.0 years

18 - 30 Lacs

Hyderabad

Work from Office

Key Skills: Hadoop, Cloudera, HDFS, YARN, Spark, Delta Lake, Linux, Docker, Kubernetes, Jenkins, REST API, Prometheus, Grafana, Splunk, PySpark, Python, Terraform, Ansible, GCP, DevOps, CI/CD, SRE, Agile, Infrastructure Automation Roles & Responsibilities: Lead and support technology teams in designing, developing, and managing data engineering and CI/CD pipelines, and infrastructure. Act as an Infrastructure/DevOps SME in designing and implementing solutions for risk analytics systems transformation, both tactical and strategic, aligned with regulatory and business initiatives. Collaborate with other technology teams, IT support teams, and architects to drive improvements in product delivery. Manage daily interactions with IT and central DevOps/infrastructure teams to ensure continuous support and delivery. Grow the technical expertise within the engineering community by mentoring and sharing knowledge. Design, maintain, and improve the full software delivery lifecycle. Enforce process discipline and improvements in areas like agile software delivery, production support, and DevOps pipeline development. Experience Requirement: 6-8 years of experience in platform engineering, SRE roles, and managing distributed/big data infrastructures. Strong hands-on experience with the Hadoop ecosystem, big data pipelines, and Delta Lake. Proven expertise in Cloudera Hadoop cluster management including HDFS, YARN, and Spark. In-depth knowledge of networking, Linux, HDFS, and DevSecOps tools like Docker, Kubernetes, and Jenkins. Skilled in containerization with Docker and orchestration using Kubernetes. Hands-on experience with designing and managing large-scale tech projects, including REST API standards. Experience with monitoring and logging tools such as Prometheus, Grafana, and Splunk. Global collaboration experience with IT and support teams across geographies. Strong coding skills in Spark (PySpark) and Python with at least 3 years of experience. 
Expertise in Infrastructure as Code (IaC) tools such as Terraform and Ansible. Working knowledge of GCP or other cloud platforms and their data engineering products is preferred. Familiarity with agile methodologies, with strong problem-solving and team collaboration skills. Education: B.Tech + M.Tech (Dual), B.Tech, or M.Tech.
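Capacity planning in Hadoop platform work like this often starts with block arithmetic. A minimal sketch, assuming HDFS's common 128 MB default block size and 3x replication (typical defaults, not requirements taken from the posting):

```python
# Minimal sketch of HDFS capacity arithmetic. 128 MB block size and
# replication factor 3 are common defaults, assumed for illustration.
import math

def hdfs_blocks(file_bytes: int, block_bytes: int = 128 * 1024 * 1024) -> int:
    """Number of HDFS blocks a file of the given size occupies."""
    return math.ceil(file_bytes / block_bytes)

def replicated_blocks(file_bytes: int, replication: int = 3, **kw) -> int:
    """Total block replicas stored across the cluster for one file."""
    return hdfs_blocks(file_bytes, **kw) * replication

# A 1 GiB file splits into 8 blocks, stored as 24 replicas at 3x.
```

This kind of arithmetic also drives NameNode memory estimates, since each block replica costs metadata.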

Posted 2 weeks ago

Apply

8.0 - 12.0 years

25 - 35 Lacs

Bengaluru

Remote

Job Title : Sr. DevOps SRE Location State : Karnataka Location City : Bangalore (Hybrid/Remote) Experience Required : 8 to 12 Year(s) CTC Range : 25 to 38 LPA Shift: Day Shift Work Mode: Hybrid/Remote Position Type: Contract (with possible extension) Openings: 6 Company Name: VARITE INDIA PRIVATE LIMITED About The Client: An American multinational digital communications technology conglomerate corporation headquartered in San Jose, California. The Client develops, manufactures, and sells networking hardware, software, telecommunications equipment, and other high-technology services and products. The Client specializes in specific tech markets, such as the Internet of Things (IoT), domain security, videoconferencing, and energy management. It is one of the largest technology companies in the world, ranking 82nd on the Fortune 100 with over $51 billion in revenue and nearly 83,300 employees. About The Job: Hiring for Sr. DevOps SRE Essential Job Functions: Key Responsibilities: Help build a new platform to support business transformation. Focus on automation within DevOps (tools, processes). Operate in production environments (Amazon cloud or on-prem datacenters). Strong exposure to Kubernetes clusters and observability tools. Top 3 skills needed: 1) Kubernetes (highest priority: hands-on in production cluster setup & management); 2) Observability & monitoring tools: Grafana, Splunk for logging, Prometheus; 3) DevOps tools & practices. Must Have Skills: Git (code repository), Python (basic to intermediate scripting), Docker, Pipelines (CI/CD). Qualifications: Any Graduate How to Apply: Interested candidates are invited to submit their resume using the apply online button on this job post. Equal Opportunity Employer: VARITE is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. 
We do not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, marital status, veteran status, or disability status. Unlock Rewards: Refer Candidates and Earn. If you're not available or interested in this opportunity, please pass this along to anyone in your network who might be a good fit and interested in our open positions. VARITE offers a Candidate Referral program, where you'll receive a one-time referral bonus based on the following scale if the referred candidate completes a three-month assignment with VARITE: 0-2 yrs. of experience: INR 5,000; 2-6 yrs.: INR 7,500; 6+ yrs.: INR 10,000. About VARITE: VARITE is a global staffing and IT consulting company providing technical consulting and team augmentation services to Fortune 500 companies in the USA, UK, Canada, and India. VARITE is currently a primary and direct vendor to the leading corporations in the verticals of Networking, Cloud Infrastructure, Hardware and Software, Digital Marketing and Media Solutions, Clinical Diagnostics, Utilities, Gaming and Entertainment, and Financial Services.

Posted 2 weeks ago

Apply

4.0 - 8.0 years

8 - 18 Lacs

Hyderabad

Work from Office

Key Skills: Elastic Cloud on Kubernetes (ECK), Elasticsearch, Kibana, Logstash, Kafka, Cloud Infrastructure, Data Engineering, Prometheus, Grafana, Docker, Kubernetes, CI/CD, Jenkins, High Availability, Disaster Recovery, Networking. Roles & Responsibilities: Manage and configure Elastic Cloud on Kubernetes (ECK) clusters in Kubernetes (K8s). Work with Elastic Stack components such as Elasticsearch, Kibana, Logstash, Fleet, and integrate with other tools. Design and implement application services with a focus on OSS integration, availability management, event/incident management, and patch management. Design and develop data pipelines to ingest data into Elastic. Manage log collection tools and technologies such as Beats, Elastic Agent, and Logstash. Implement log management and monitoring tools such as Prometheus and Grafana. Manage cloud infrastructure, including networking concepts like load balancers, firewalls, and VPCs. Ensure high availability and disaster recovery planning for critical infrastructure. Collaborate on continuous integration and continuous delivery (CI/CD) using tools like Jenkins. Experience Requirement: 4-8 years of experience in Elastic Stack (Elasticsearch, Kibana, Logstash), Kafka integration, and cloud technologies. Strong expertise in Kubernetes (K8s), Docker, Linux, and managing cloud infrastructure. Experience with monitoring tools such as Prometheus and Grafana. Hands-on experience with log collection and management technologies (Beats, Logstash, Elastic Agent). Experience with CI/CD tools like Jenkins is a plus. Education: Any Graduation.
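The log-pipeline side of this role (Logstash, Beats, ingest into Elastic) centers on turning raw lines into structured documents. A plain-Python illustration with an invented log format, not an actual Logstash configuration:

```python
# Hedged sketch: the kind of parse step a Logstash/ingest pipeline
# applies, rewritten in plain Python. The log format is invented.
import re

LOG_RE = re.compile(
    r"(?P<ts>\S+) (?P<level>[A-Z]+) (?P<service>\S+) - (?P<message>.*)"
)

def parse_line(line: str):
    """Turn one log line into a structured document, or None if unmatched."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

doc = parse_line("2024-05-01T10:00:00Z ERROR checkout - payment timeout")
```

The resulting dict maps directly onto the fields a Kibana query or an alert rule would filter on.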

Posted 2 weeks ago

Apply

6.0 - 11.0 years

14 - 19 Lacs

Pune

Work from Office

What You'll Do We are looking for accomplished Machine Learning Engineers with a background in software development and a deep enthusiasm for solving complex problems. You will lead a dynamic team dedicated to designing and implementing a large language model framework to power diverse applications across Avalara. Your responsibilities will span the entire development lifecycle, including conceptualization, prototyping, development, and delivery of the LLM platform features. You will build core agent infrastructure (A2A orchestration and MCP-driven tool discovery) so teams can launch secure, scalable agent workflows. You will be reporting to the Senior Manager, ML Engineering. What Your Responsibilities Will Be We are looking for engineers who can think quickly and have a background in implementation. Your responsibilities will include: Build on top of the foundational framework for supporting Large Language Model Applications at Avalara. Experience with LLMs like GPT, Claude, Llama, and other Bedrock models. Leverage best practices in software development, including Continuous Integration/Continuous Deployment (CI/CD) along with appropriate functional and unit testing in place. Drive innovation by researching and applying the latest technologies and methodologies in machine learning and software development. Write, review, and maintain high-quality code that meets industry standards, contributing to the project's technical expertise. Lead code review sessions, ensuring good code quality and documentation. Mentor junior engineers, promoting a culture of collaboration and engineering expertise. Proficiency in developing and debugging software with a preference for Python, though familiarity with additional programming languages is valued and encouraged. What You'll Need to be Successful 6+ years of experience building Machine Learning models and deploying them in production environments as part of creating solutions to complex customer problems. 
Proficiency working in cloud computing environments (AWS, Azure, GCP), Machine Learning frameworks, and software development best practices. Demonstrated experience staying current with breakthroughs in AI/ML, with a focus on GenAI. Experience with design patterns and data structures. Technologies you will work with: Python, LLMs, Agents, A2A, MCP, MLFlow, Docker, Kubernetes, Terraform, AWS, GitLab, Postgres, Prometheus, and Grafana We are the AI & ML enablement group in Avalara. We empower Avalara's Product and Engineering teams with the latest AI & ML capabilities, driving easy-to-use, automated compliance solutions that position Avalara as the industry AI technology leader and the go-to choice for all compliance needs.
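The agent-infrastructure work described (tool discovery and dispatch) can be illustrated, very loosely, with a registry pattern. This is a generic sketch, not Avalara's framework or the MCP protocol itself; the tax_rate tool and its lookup table are hypothetical:

```python
# Purely illustrative registry/dispatch pattern in the spirit of
# agent tool discovery. Not a real framework; tax_rate is invented.

TOOLS = {}

def tool(name):
    """Register a function under a discoverable tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("tax_rate")
def tax_rate(region: str) -> float:
    # Hypothetical lookup table for the example.
    return {"CA": 0.0725, "TX": 0.0625}.get(region, 0.0)

def dispatch(call: dict):
    """Route an agent's tool call to the registered implementation."""
    return TOOLS[call["name"]](**call["arguments"])
```

An agent runtime would enumerate TOOLS to advertise capabilities, then feed model-emitted calls through dispatch.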

Posted 2 weeks ago

Apply

6.0 - 11.0 years

10 - 15 Lacs

Pune

Work from Office

What You'll Do We are looking for experienced Machine Learning Engineers with a background in software development and a deep enthusiasm for solving complex problems. You will lead a dynamic team dedicated to designing and implementing a large language model framework to power diverse applications across Avalara. Your responsibilities will span the entire development lifecycle, including conceptualization, prototyping, and delivery of the LLM platform features. You will build core agent infrastructure (A2A orchestration and MCP-driven tool discovery) so teams can launch secure, scalable agent workflows. You will be reporting to the Senior Manager, Machine Learning. What Your Responsibilities Will Be We are looking for engineers who can think quickly and have a background in implementation. Your responsibilities will include: Build on top of the foundational framework for supporting Large Language Model Applications at Avalara. Experience with LLMs like GPT, Claude, Llama, and other Bedrock models. Leverage best practices in software development, including Continuous Integration/Continuous Deployment (CI/CD) along with appropriate functional and unit testing in place. Promote innovation by researching and applying the latest technologies and methodologies in machine learning and software development. Write, review, and maintain high-quality code that meets industry standards, contributing to the project's technical expertise. Lead code review sessions, ensuring good code quality and documentation. Mentor junior engineers, encouraging a culture of collaboration. Proficiency in developing and debugging software with a preference for Python, though familiarity with additional programming languages is valued and encouraged. What You'll Need to be Successful 6+ years of experience building Machine Learning models and deploying them in production environments as part of creating solutions to complex customer problems. 
Proficiency working in cloud computing environments (AWS, Azure, GCP), Machine Learning frameworks, and software development best practices. Experience working with technological innovations in AI & ML (especially GenAI) and applying them. Experience with design patterns and data structures. Good analytical, design, and debugging skills. Technologies you will work with: Python, LLMs, Agents, A2A, MCP, MLFlow, Docker, Kubernetes, Terraform, AWS, GitLab, Postgres, Prometheus, and Grafana. We are the AI & ML enablement group in Avalara.

Posted 2 weeks ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies