4.0 - 7.0 years
9 - 13 Lacs
Pune
Hybrid
So, what’s the role all about?
We are seeking a skilled and experienced DevOps Engineer to design, produce, and test high-quality software that meets specified functional and non-functional requirements within the given time and resource constraints.

How will you make an impact?
- Design, implement, and maintain CI/CD pipelines using Jenkins to support automated builds, testing, and deployments.
- Manage and optimize AWS infrastructure for scalability, reliability, and cost-effectiveness.
- Develop automation scripts and tools using shell scripting and other programming languages to streamline operational workflows.
- Collaborate with cross-functional teams (Development, QA, Operations) to ensure seamless software delivery and deployment.
- Monitor and troubleshoot infrastructure, build failures, and deployment issues to ensure high availability and performance.
- Implement and maintain robust configuration management practices and infrastructure-as-code principles.
- Document processes, systems, and configurations to ensure knowledge sharing and maintain operational consistency.
- Perform ongoing maintenance and upgrades (production and non-production), with occasional weekend or after-hours work as needed.

Have you got what it takes?
- Experience: 4-7 years in DevOps or a similar role.
- Cloud expertise: proficient in AWS services such as EC2, S3, RDS, Lambda, IAM, CloudFormation, or similar.
- CI/CD tools: hands-on experience with Jenkins pipelines (declarative and scripted).
- Scripting skills: proficiency in shell scripting or PowerShell.
- Programming knowledge: familiarity with at least one programming language (e.g., Python, Java, or Go). Important: scripting/programming is integral to this role and will be a key focus in the interview process (see the illustrative sketch after this listing).
- Version control: experience with Git and Git-based workflows.
- Monitoring tools: familiarity with tools like CloudWatch, Prometheus, or similar.
- Problem-solving: strong analytical and troubleshooting skills in a fast-paced environment.
- AWS CDK knowledge.

You will have an advantage if you also have:
- Prior experience in development or automation (a significant advantage).
- Windows system administration experience (a significant advantage).
- Experience with monitoring and log analysis tools.
- Jenkins pipeline knowledge.

What’s in it for you?
Join an ever-growing, market-disrupting global company where the teams, comprised of the best of the best, work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NICE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NICEr!

Enjoy NICE-FLEX! At NICE, we work according to the NICE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 6119
Reporting into: Tech Manager
Role Type: Individual Contributor
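Since scripting is called out as a key interview focus, here is a minimal sketch of the kind of AWS automation this posting describes: flagging unattached EBS volumes as cost-cleanup candidates with boto3. The region and the cleanup use case are assumptions for illustration, not part of the listing.

```python
# Sketch: list unattached EBS volumes as cost-optimization candidates.
# Hypothetical example; region and output format are assumptions.
import boto3

def unattached_volumes(region="ap-south-1"):
    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_volumes")
    # 'available' status means the volume is not attached to any instance.
    filters = [{"Name": "status", "Values": ["available"]}]
    for page in paginator.paginate(Filters=filters):
        for vol in page["Volumes"]:
            yield vol["VolumeId"], vol["Size"]

if __name__ == "__main__":
    for vol_id, size_gib in unattached_volumes():
        print(f"{vol_id}: {size_gib} GiB, unattached")
```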
Posted 1 month ago
2.0 - 3.0 years
4 - 5 Lacs
Rajkot
Work from Office
Technical Requirements:
- Excellent understanding of Linux commands.
- Thorough knowledge of CI/CD pipelines, automation, and debugging, particularly with Jenkins (illustrated in the sketch after this listing).
- Intermediate to advanced understanding of Docker and container orchestration platforms.
- Hands-on experience with web servers (Apache, Nginx), database servers (MongoDB, MySQL, PostgreSQL), and application servers (PHP, Node.js). Knowledge of proxies and reverse proxies is required.
- Good understanding and hands-on experience with site reliability tools such as Prometheus, Grafana, New Relic, Datadog, and Splunk (hands-on experience with at least one tool is highly desirable).
- Ability to identify and fix security vulnerabilities at the OS, database, and application levels.
- Knowledge of cloud platforms, specifically AWS and DigitalOcean, and their commonly used services.

Other Requirements:
- Good communication skills.
- Out-of-the-box problem-solving capabilities, especially in the context of technology automation and application architecture reviews.
- Hands-on experience with GKE, AKS, EKS, or ECS is a plus.
- Excellent understanding of how to craft effective AI prompts to solve specific issues.
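As an illustration of the debugging and reliability work listed above, a minimal health-check sketch using the Docker SDK for Python and requests. The container names, endpoint URL, and timeout are assumptions; adjust them to the actual stack.

```python
# Sketch: verify that key containers are running and the web tier answers.
# Hypothetical names and URLs; not specific to any environment.
import docker     # pip install docker
import requests

REQUIRED = {"nginx", "app", "mongodb"}      # assumed container names
HEALTH_URL = "http://localhost/healthz"     # assumed health endpoint

def check():
    client = docker.from_env()
    # containers.list() returns running containers only.
    running = {c.name for c in client.containers.list()}
    missing = REQUIRED - running
    if missing:
        print(f"DOWN: containers not running: {sorted(missing)}")
    resp = requests.get(HEALTH_URL, timeout=5)
    print(f"{HEALTH_URL} -> {resp.status_code}")

if __name__ == "__main__":
    check()
```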
Posted 1 month ago
5.0 - 7.0 years
15 - 27 Lacs
Bangalore Rural, Bengaluru
Work from Office
DevOps, Site Reliability Engineering, cloud platforms, GCP, Infrastructure as Code tools (Terraform, Ansible, CloudFormation), Prometheus, Grafana, ELK stack, Python, Bash, Go, Istio, Linkerd
Posted 1 month ago
9.0 - 14.0 years
35 - 40 Lacs
Pune
Work from Office
Role Description
Our organization within Deutsche Bank is AFC Production Services. We are responsible for providing technical L2 application support for business applications. The AFC (Anti-Financial Crime) line of business has a current portfolio of 25+ applications, and the organization is transforming itself using Google Cloud and many new technology offerings. As an Assistant Vice President, your role will include hands-on production support, and you will be actively involved in resolving technical issues across multiple applications. You will also work as an application lead, responsible for the technical and operational processes of all applications you support.

Deutsche Bank's Corporate Bank division is a leading provider of cash management, trade finance, and securities finance. We complete green-field projects that deliver the best Corporate Bank - Securities Services products in the world. Our team is diverse, international, and driven by a shared focus on clean code and valued delivery. At every level, agile minds are rewarded with competitive pay, support, and opportunities to excel. You will work as part of a cross-functional agile delivery team, bringing an innovative approach to software development and focusing on the latest technologies and practices as part of a relentless focus on business value. You will be someone who sees engineering as a team activity, with a predisposition to open code, open discussion, and creating a supportive, collaborative environment. You will be ready to contribute to all stages of software delivery, from initial analysis right through to production support.

Your key responsibilities
- Provide technical support by handling and consulting on BAU, incidents, emails, and alerts for the respective applications.
- Perform post-mortem and root cause analysis using ITIL standards of Incident Management, Service Request fulfillment, Change Management, Knowledge Management, and Problem Management.
- Manage the regional L2 team and vendor teams supporting the application, ensuring the team is up to speed and picks up the support duties.
- Build up technical subject matter expertise on the applications being supported, including business flows, application architecture, and hardware configuration.
- Define and track KPIs, SLAs, and operational metrics to measure and improve application stability and performance.
- Conduct real-time monitoring using an array of monitoring tools to ensure application SLAs are achieved and application availability (uptime) is maximized.
- Build and maintain effective and productive relationships with stakeholders in business, development, infrastructure, and third-party systems/data providers and vendors.
- Assist in the process to approve application code releases as well as tasks assigned to support.
- Keep key stakeholders informed using communication templates.
- Approach support with a proactive attitude and a desire to seek root cause through in-depth analysis, striving to reduce inefficiencies and manual effort.
- Mentor and guide junior team members, fostering technical upskilling and knowledge sharing.
- Provide strategic input into disaster recovery planning, failover strategies, and business continuity procedures.
- Collaborate and deliver on initiatives that drive stability in the environment.
- Perform reviews of all open production items with the development team and push for updates and resolutions to outstanding tasks and recurring issues.
- Drive service resilience by implementing SRE (site reliability engineering) principles, ensuring proactive monitoring, automation, and operational efficiency.
- Ensure regulatory and compliance adherence, managing audits, access reviews, and security controls in line with organizational policies.

The candidate will have to work in shifts as part of a rota covering APAC and EMEA hours, between 07:00 AM IST and 09:00 PM IST (2 shifts). In the event of major outages or issues we may ask for flexibility to help provide appropriate cover. Weekend on-call coverage needs to be provided on a rotational/need basis.

Your skills and experience
- 9-15 years of experience in providing hands-on IT application support.
- Experience in managing vendor teams providing 24x7 support.
- Preferred: team lead experience; experience in an investment bank or financial institution.
- Bachelor's degree from an accredited college or university with a concentration in Computer Science or an IT-related discipline (or equivalent work experience/diploma/certification).
- Preferred: ITIL v3 Foundation certification or higher.
- Knowledgeable in cloud products like Google Cloud Platform (GCP) and hybrid applications.
- Strong understanding of ITIL/SRE/DevOps best practices for supporting a production environment.
- Understanding of KPIs, SLOs, SLAs, and SLIs.
- Monitoring tools: knowledge of Elasticsearch, Control-M, Grafana, Geneos, OpenShift, Prometheus, Google Cloud Monitoring, Airflow, Splunk.
- Working knowledge of creating dashboards and reports for senior management.
- Professional Red Hat Enterprise Linux (RHEL) skills: searching logs, managing processes (start/stop), and using OS commands to investigate and resolve issues. Shell scripting knowledge is a plus.
- Understanding of database concepts and exposure to working with Oracle, MS SQL, BigQuery, and similar databases.
- Ability to work across countries, regions, and time zones with a broad range of cultures and technical capability.

Skills That Will Help You Excel
- Strong written and oral communication skills, including the ability to communicate technical information to a non-technical audience, plus good analytical and problem-solving skills.
- Proven experience in leading L2 support teams, including managing vendor teams and offshore resources.
- Able to train, coach, and mentor, and know where each technique is best applied.
- Experience with GCP or another public cloud provider to build applications.
- Experience in an investment bank, financial institution, or large corporation using enterprise hardware and software.
- Knowledge of Actimize, Mantas, and case management software is good to have.
- Working knowledge of Big Data (Hadoop/Secure Data Lake) is a plus.
- Prior experience in automation projects is great to have.
- Exposure to Python, shell, Ansible, or another scripting language for automation and process improvement.
- Strong stakeholder management skills, ensuring seamless coordination between business, development, and infrastructure teams.
- Ability to manage high-pressure issues, coordinating across teams to drive swift resolution.
- Strong negotiation skills with interface teams to drive process improvements and efficiency gains.
Posted 1 month ago
5.0 - 8.0 years
7 - 10 Lacs
Chennai
Work from Office
What you'll be doing...
As a DevOps engineer, you will:
- Design, implement, and manage Kubernetes clusters for our telecom/networking applications.
- Develop and maintain CI/CD pipelines for automated build, testing, and deployment.
- Monitor and optimize the performance and scalability of our Kubernetes infrastructure.
- Implement and maintain monitoring and alerting systems to proactively identify and resolve issues.
- Lead incident response and troubleshooting efforts, including root cause analysis.
- Automate operational tasks and processes to improve efficiency.
- Collaborate with development teams to integrate and deploy applications to Kubernetes.
- Contribute to the development and maintenance of our platform's security posture.
- Participate in on-call rotations to provide support for production systems.
- Leverage network/telecom domain knowledge to effectively triage and resolve network-related issues (see the sketch after this listing).
- Contribute to development efforts by writing code and implementing new features (an added advantage).
- Stay up to date with the latest Kubernetes and DevOps technologies and best practices.

What we're looking for:
We are seeking a highly motivated and experienced engineer with a strong background in Kubernetes and DevOps practices to join our team. This role will focus on building, maintaining, and scaling our network/telecom infrastructure and services in a Kubernetes/OpenShift-based environment. You will play a key role in ensuring the reliability, performance, and security of our platform, working closely with development, operations, and other engineering teams. Experience with triaging and troubleshooting complex issues is essential, as is a willingness to contribute to development efforts.

You'll need to have:
- A Bachelor's degree or four or more years of work experience.
- Four or more years of relevant work experience, including four or more years in DevOps engineering or a related role.
- Proven experience with Kubernetes and containerization technologies (e.g., Docker).
- Experience with CI/CD tools (e.g., Jenkins, GitLab).
- Strong understanding of networking concepts and protocols (e.g., TCP/IP, BGP, MPLS).
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Experience with cloud computing platforms (e.g., AWS, Azure, GCP) is an added advantage.
- Excellent problem-solving and troubleshooting skills.
- Strong communication and collaboration skills.
- Experience in the telecom/networking domain (essential).
- Experience with scripting languages (e.g., Python, Bash) is highly desirable.
- Experience with development and coding is a significant advantage.

Even better if you have one or more of the following:
- Experience with a high-performance, high-availability environment.
- Experience with network technologies like SDN/NFV.
- Strong analytical and debugging skills.
- Good communication and presentation skills.
- Relevant certifications.
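To make the triage work above concrete, here is a minimal sketch using the official Kubernetes Python client to surface pods needing attention. It assumes kubeconfig access to some cluster; nothing here is specific to this employer's environment.

```python
# Sketch: surface pods that need triage (Pending, Failed, CrashLoopBackOff).
# Assumes a reachable cluster via local kubeconfig.
from kubernetes import client, config   # pip install kubernetes

def unhealthy_pods():
    config.load_kube_config()           # or config.load_incluster_config()
    v1 = client.CoreV1Api()
    for pod in v1.list_pod_for_all_namespaces(watch=False).items:
        name = f"{pod.metadata.namespace}/{pod.metadata.name}"
        phase = pod.status.phase
        if phase not in ("Running", "Succeeded"):
            print(f"{name}: {phase}")
            continue
        # Running pods can still be crash-looping; check container states.
        for cs in (pod.status.container_statuses or []):
            waiting = cs.state.waiting
            if waiting and waiting.reason == "CrashLoopBackOff":
                print(f"{name}: CrashLoopBackOff")

if __name__ == "__main__":
    unhealthy_pods()
```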
Posted 1 month ago
5.0 - 10.0 years
10 - 20 Lacs
Noida
Work from Office
JOB DESCRIPTION
Experience level desired: 5+ years. Compensation: salary commensurate with experience. Reports to: Team Lead.

RESPONSIBILITIES
- Application performance monitoring: use Dynatrace APM tools to optimize application performance, identify performance bottlenecks in web applications, and provide solutions.
- Install and troubleshoot Dynatrace OneAgent on all types of platforms, both cloud (Azure infrastructure, AKS, App Services) and on-premises.
- Integrate Dynatrace with third-party tools (see the sketch after this listing).
- Demonstrate thorough knowledge and awareness of application performance issues in a complex multi-tiered environment.
- Knowledge of customer experience management, application performance monitoring, and log analytics tools like Splunk, Dynatrace Synthetic, Dynatrace AppMon, CA APM, Prometheus, etc., is highly desired.
- Onboard new applications into Dynatrace: profile configuration, agent setup, and instrumentation.
- Ability to gather requirements and analyze the target environment from an APM perspective.
- Hands-on implementation experience with Dynatrace on-premises solutions, including configuration and customization of the Dynatrace solution.
- Excellent communication skills (both verbal and written).
- Knowledge of Azure is preferred.
- Hands-on experience with APM and other tools such as Datadog, Glassbox, Splunk, Grafana, Prometheus, New Relic, Postman, Azure App Insights, Azure Log Analytics, Jenkins, and Docker.
- Power BI (good to have, for reporting and extracting data from tools).

QUALIFICATIONS
- B.Tech or MCA preferred.
- At least 5 years of work experience.
- Good communication skills.
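For flavor, a hypothetical sketch of pulling a built-in host metric from the Dynatrace Metrics v2 API with plain requests. The environment URL is a placeholder, the token is assumed to have a metrics-read scope, and the response handling may need adjusting to your tenant.

```python
# Sketch: query host CPU usage from the Dynatrace Metrics v2 API.
# Environment URL and token are placeholders, not real credentials.
import os
import requests

ENV = "https://abc12345.live.dynatrace.com"   # hypothetical environment
TOKEN = os.environ["DT_API_TOKEN"]

resp = requests.get(
    f"{ENV}/api/v2/metrics/query",
    headers={"Authorization": f"Api-Token {TOKEN}"},
    params={
        "metricSelector": "builtin:host.cpu.usage",
        "from": "now-2h",
        "resolution": "10m",
    },
    timeout=30,
)
resp.raise_for_status()
# Each series carries dimensions (e.g., the host) plus timestamp/value arrays.
for series in resp.json()["result"][0]["data"]:
    print(series["dimensions"], series["values"][-5:])
```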
Posted 1 month ago
5.0 - 7.0 years
5 - 9 Lacs
Mumbai, Bengaluru, Delhi / NCR
Work from Office
Key Responsibilities :

Chaos Engineering :
- Design and implement chaos engineering experiments to identify weaknesses in systems and applications.
- Develop and execute strategies to improve system resilience and reliability.
- Analyze experiment results, provide actionable insights, and drive remediation efforts.
- Collaborate with development, operations, and infrastructure teams to integrate chaos engineering practices (see the sketch after this listing for the flavor of such an experiment).

Operational Acceptance :
- Develop and maintain comprehensive operational acceptance criteria for new and existing systems.
- Conduct thorough operational acceptance testing, ensuring systems meet all predefined criteria before go-live.
- Work closely with project managers, developers, and QA teams to align operational acceptance processes with project timelines and objectives.
- Document and communicate operational readiness findings, providing recommendations for improvement.

System Resilience and Reliability :
- Implement and manage strategies for continuous improvement of system resilience and reliability.
- Monitor and assess system performance, identifying potential risks and areas for enhancement.
- Lead initiatives to improve disaster recovery and business continuity plans.
- Stay updated with the latest industry trends and best practices in chaos engineering and operational acceptance.

Collaboration and Training :
- Educate and mentor team members on chaos engineering and operational acceptance methodologies.
- Foster a culture of resilience and reliability within the organization.
- Engage with external communities, attending conferences and participating in knowledge-sharing events.

Requirements :
- Extensive experience in chaos engineering, operational acceptance testing, and system resilience.
- Strong understanding of cloud platforms (AWS, Azure, GCP) and their resilience features.
- Proficiency in scripting and automation tools (Python, Bash, Terraform, etc.).
- Experience with monitoring and observability tools (Prometheus, Grafana, Splunk, etc.).
- Experience with chaos engineering tools such as Gremlin, Chaos Monkey, etc.
- Excellent analytical and problem-solving skills.
- Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Certifications in relevant fields (e.g., AWS Certified Solutions Architect, Azure DevOps Engineer) are a plus.

Location: Delhi NCR, Bangalore, Chennai, Pune, Kolkata, Ahmedabad, Mumbai, Hyderabad
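For the flavor of such an experiment, a toy sketch in Python: terminate one random pod behind a service and watch whether the steady state (a responding health endpoint) holds while the Deployment recovers. The namespace, label selector, and probe URL are illustrative assumptions; purpose-built tools like Gremlin or Chaos Monkey do this with far more safety controls.

```python
# Sketch: terminate one random pod behind a service, then verify recovery.
# Namespace, selector, and URL are illustrative assumptions.
import random
import time

import requests
from kubernetes import client, config

NAMESPACE = "demo"                                # assumed
SELECTOR = "app=checkout"                         # assumed
PROBE_URL = "http://checkout.demo.svc/healthz"    # assumed

config.load_kube_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(NAMESPACE, label_selector=SELECTOR).items
victim = random.choice(pods)
print(f"Killing {victim.metadata.name}")
v1.delete_namespaced_pod(victim.metadata.name, NAMESPACE)

# Steady-state check: the service should keep answering while the
# Deployment replaces the lost pod.
for _ in range(12):
    time.sleep(5)
    try:
        code = requests.get(PROBE_URL, timeout=2).status_code
    except requests.RequestException:
        code = None
    print(f"probe -> {code}")
```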
Posted 1 month ago
3.0 - 8.0 years
5 - 10 Lacs
Pune
Work from Office
BMC's SaaS Ops team is looking for a DevOps Engineer to join us and design, develop, and implement complex applications using the latest technologies. Here is how, through this exciting role, YOU will contribute to BMC's and your own success:
- Participate in all aspects of SaaS product development, from requirements analysis to product release and sustaining.
- Drive the adoption of the DevOps process and tools across the organization.
- Learn and implement cutting-edge technologies and tools to build best-in-class enterprise SaaS solutions.
- Deliver high-quality enterprise SaaS offerings on schedule.
- Develop the continuous delivery pipeline.

Required Skills:
- 3+ years of working experience in a software engineering function.
- Hands-on experience with CI/CD pipelines and maintenance of containerized deployments.
- Fundamental knowledge of at least one automation/scripting language: Python, Groovy, Ansible, or shell scripting.
- Hands-on experience in creating and maintaining Jenkins pipelines.
- Hands-on experience working with web service protocols (REST, JSON).
- Hands-on experience working with DevOps and automation tools like Git, Docker, Helm, Terraform, Jira, and Harbor Registry.
- Proficient working on Windows and Linux operating system platforms.
- Good exposure to and fundamental knowledge of relational DBs (PostgreSQL, MS SQL).
- Good exposure to and fundamental knowledge of container deployments, persistent storage, pods, ingress, routes, and Kubernetes objects.
- Good exposure to and fundamental knowledge of tools like Elasticsearch, Kibana, Grafana, and Prometheus.
- Good exposure to and fundamental knowledge of public, private, and hybrid cloud deployments.
- Good exposure to and fundamental knowledge of Site Reliability Engineering (SRE) principles and their implementation for SaaS services.
- Experience working in an Agile methodology with cross-functional teams (R&D, DevOps, Operations, Support, etc.).
- Able to design and document Standard Operating Procedures (SOPs), design documents, and architecture artifacts.
- Good troubleshooting skills; knowledge of BMC Helix products including ITSM, Digital Workplace, and Helix Platform will be an add-on.
- Ability to work with time-bound deadlines.
- Hardworking and dedicated, with effective communication skills.
- Bachelor's degree in IT or equivalent professional experience.

This position is part of the BMC SaaS DevOps team. This can include weekend work during scheduled production activities and after-hours work as needed.
Posted 1 month ago
4.0 - 9.0 years
17 - 22 Lacs
Bengaluru
Work from Office
Job Summary:
We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our team in Bangalore. The ideal candidate will excel in implementing SRE principles to foster a culture of reliability, automation, and monitoring across our software engineering projects. This role is pivotal in ensuring the effective design, development, testing, and support of applications and systems, particularly within cloud environments.

Software Requirements:
Required proficiency:
- Programming languages: TypeScript, Node.js
- Cloud environments: AWS (ECS Fargate, Vault, Lambda services, Artifactory)
- CI/CD tools: GitHub Actions, JFrog Artifactory, Sysdig, Octopus, Terraform
- Observability tools: ObStack, Prometheus, Grafana, PagerDuty, Observe
- Infrastructure as Code (IaC) tools: CloudFormation, Terraform

Preferred proficiency:
- Familiarity with additional programming languages or frameworks
- Experience with cloud platforms other than AWS

Overall Responsibilities:
- Partner with senior stakeholders to lead a culture focused on data-driven reliability, monitoring, and automation in alignment with SRE principles.
- Design, develop, test, and support applications and systems, emphasizing managing and scaling distributed systems across cloud environments.
- Create and develop tools essential for the operational management and security of software applications and systems.
- Identify technology limitations and deficiencies in existing systems and implement scalable improvements.
- Drive automation efforts and enhance application monitoring capabilities.
- Review code developed by other engineers to ensure adherence to best practices.
- Thrive in incident response environments, conducting post-mortem analyses and designing secure solutions.
- Measure and optimize system performance, addressing customer needs and innovating for continuous improvement.

Experience Requirements:
- 7 to 10 years of experience in software engineering and SRE practices.
- Experience applying SRE practices in large organizations.
- Familiarity with modern software development practices and DevSecOps environments.

Day-to-Day Activities:
- Collaborate with stakeholders to understand business needs and implement SRE practices.
- Lead cross-functional teams in enhancing system reliability and performance.
- Develop and maintain operational management tools for applications.
- Conduct regular code reviews and ensure adherence to best practices.
- Participate in incident response and post-mortem analysis to improve system resilience.

Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- Commitment to continuous professional development through industry certifications and training.

Professional Competencies:
- Strong critical thinking and problem-solving skills.
- Excellent leadership and teamwork abilities.
- Effective communication and stakeholder management skills.
- Adaptability and a learning-oriented mindset.
- Innovative thinking to drive continuous improvement.
- Strong time and priority management skills.
Posted 1 month ago
1.0 - 5.0 years
8 - 15 Lacs
Bengaluru
Work from Office
Junior DevOps Engineer / DevOps Engineer
Location: Bengaluru South, Karnataka, India
Experience: 1.5-3 years
Compensation: 8-15 LPA
Employment Type: Full-Time | Work From Office Only

Are you an aspiring DevOps professional ready to work on a transformative platform? Join a purpose-led team building India's most disruptive ecosystem at the intersection of technology, property, and sustainability. This role is ideal for engineers who are eager to learn, automate, and contribute to building reliable, scalable, and secure infrastructure.

Key Responsibilities
- Assist in designing, implementing, and managing CI/CD pipelines using tools like Jenkins or GitLab CI to automate build, test, and deployment processes.
- Support the deployment and management of cloud infrastructure, primarily on AWS, with exposure to Azure or GCP.
- Contribute to infrastructure-as-code practices using Terraform, CloudFormation, or Ansible.
- Participate in maintaining and operating containerized applications using Docker and Kubernetes.
- Implement and manage monitoring and logging solutions using Grafana, Loki, Prometheus, or the ELK stack.
- Collaborate with engineering and QA teams to streamline release pipelines, ensuring high availability and performance.
- Develop basic automation scripts in Python or Bash to optimize and streamline operational tasks.
- Gain exposure to serverless and event-driven architectures under guidance from senior engineers.
- Troubleshoot infrastructure issues and contribute to system security and performance optimization.

Requirements
- 1.5 to 3 years of experience in DevOps, SRE, or related infrastructure roles.
- Solid understanding of cloud environments (AWS preferred; Azure/GCP a plus).
- Basic to intermediate scripting knowledge in Python or Bash.
- Familiarity with CI/CD concepts and tools such as Jenkins, GitLab CI, etc.
- Working knowledge of Docker and introductory experience with Kubernetes.
- Exposure to monitoring and logging stacks (Grafana, Loki, Prometheus, ELK).
- Understanding of infrastructure as code using tools like Terraform or Ansible.
- Familiarity with networking, DNS, firewalls, and system security practices.
- Strong problem-solving skills and a learning mindset.

Preferred Qualifications
- Certifications in AWS, Azure, or GCP.
- Exposure to serverless architectures and event-driven systems.
- Experience with additional monitoring tools or scripting languages.
- Familiarity with geospatial systems, virtual mapping, or sustainability-oriented platforms.
- Passion for eco-conscious technology and impact-driven development.

Why You Should Join
- Contribute to a next-gen PropTech platform promoting sustainable and inclusive land ownership.
- Work closely with senior engineers committed to mentorship and ecosystem building.
- Join a team where your ideas are valued, your skills are sharpened, and your work has real-world impact.
- Be part of a vibrant, office-first culture that encourages innovation, collaboration, and growth.
Posted 1 month ago
5.0 - 7.0 years
3 - 7 Lacs
Pune
Remote
We are seeking a Grafana Implementation Expert with deep expertise in Grafana and Prometheus, focusing on core development and customization rather than SRE or DevOps responsibilities. This role requires a specialist in monitoring tools, responsible for designing, developing, and optimizing Grafana dashboards, plugins, and data sources to provide real-time observability and analytics.

Key Responsibilities :
- Develop, customize, and optimize Grafana dashboards with advanced visualizations, queries, and alerting mechanisms.
- Integrate Grafana with Prometheus and other data sources (e.g., Loki, InfluxDB, Elasticsearch, MySQL, PostgreSQL, OpenTelemetry).
- Extend Grafana capabilities by developing custom plugins, panels, and data sources using JavaScript, TypeScript, React, and Go.
- Optimize Prometheus queries (PromQL) and storage solutions to ensure efficient data retrieval and visualization.
- Automate dashboard provisioning using JSON, Terraform, or Grafana APIs for seamless deployment across environments (see the sketch after this listing).
- Work closely with engineering teams to translate monitoring requirements into scalable and maintainable solutions.
- Troubleshoot and enhance Grafana performance, including load balancing, scaling, and security hardening.
- Implement advanced alerting mechanisms using Alertmanager, Grafana Alerts, and webhook integrations.
- Stay updated on Grafana ecosystem advancements and contribute to best practices in observability tooling.
- Document configurations, implementation guidelines, and best practices for internal stakeholders.

Required Skills & Experience :
- 5+ years of experience in monitoring and observability tools with a strong focus on Grafana and Prometheus.
- Expertise in Grafana internals, including API usage, dashboard templating, and custom plugin development.
- Strong hands-on experience with Prometheus, including metric collection, relabeling, and PromQL queries.
- Proficiency in JavaScript, TypeScript, React, and Go for Grafana plugin and dashboard development.
- Familiarity with infrastructure monitoring, including Kubernetes, cloud services (AWS, GCP, Azure), and system-level metrics.
- Experience with time-series databases and log aggregation tools (e.g., Loki, Elasticsearch, InfluxDB).
- Knowledge of security best practices in Grafana, including authentication, RBAC, and API security.
- Experience with automation and infrastructure-as-code (IaC) for monitoring stack deployment.
- Strong problem-solving skills with the ability to debug and optimize dashboards and alerting configurations.
- Excellent communication and documentation skills to collaborate with cross-functional teams.

Preferred Qualifications :
- Grafana Certified Observability Engineer or equivalent certifications.
- Experience contributing to open-source Grafana projects or plugin development.
- Knowledge of distributed tracing tools like Jaeger or Zipkin.
- Familiarity with service meshes (Istio, Linkerd) and their monitoring strategies.

This is a high-impact role focused on developing and enhancing Grafana-based monitoring solutions for enterprise-grade observability.
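Here is a minimal sketch of API-driven dashboard provisioning as described above, using Python requests against Grafana's HTTP API. The Grafana URL, service-account token, and the metric name inside the PromQL query are assumptions for illustration.

```python
# Sketch: provision a minimal dashboard through the Grafana HTTP API.
# URL, token, and metric name are placeholders/assumptions.
import os
import requests

GRAFANA = "https://grafana.example.com"   # hypothetical instance
HEADERS = {"Authorization": f"Bearer {os.environ['GRAFANA_TOKEN']}"}

dashboard = {
    "dashboard": {
        "uid": None,   # let Grafana assign one
        "title": "Service latency (sketch)",
        "panels": [{
            "type": "timeseries",
            "title": "p95 latency",
            "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
            "targets": [{
                # PromQL: 95th percentile from a histogram; metric name assumed.
                "expr": ("histogram_quantile(0.95, sum(rate("
                         "http_request_duration_seconds_bucket[5m])) by (le))"),
            }],
        }],
    },
    "overwrite": True,
}

resp = requests.post(f"{GRAFANA}/api/dashboards/db",
                     json=dashboard, headers=HEADERS, timeout=30)
resp.raise_for_status()
print(resp.json())
```

The same JSON model can live in Git and be applied per environment, which is what makes API provisioning preferable to hand-edited dashboards.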
Posted 1 month ago
3.0 - 8.0 years
16 - 20 Lacs
Mumbai
Work from Office
What will you do at Fynd?
- Run the production environment by monitoring availability and taking a holistic view of system health.
- Improve reliability, quality, and time-to-market of our suite of software solutions.
- Be the first person to report an incident.
- Debug production issues across services and levels of the stack.
- Envision the overall solution for defined functional and non-functional requirements, and define the technologies, patterns, and frameworks to realize it.
- Build automated tools in Python, Java, GoLang, Ruby, etc.
- Help Platform and Engineering teams gain visibility into our infrastructure.
- Lead the design of software components and systems to ensure availability, scalability, latency, and efficiency of our services.
- Participate actively in detecting, remediating, and reporting on production incidents, ensuring SLAs are met and driving problem management for permanent remediation.
- Participate in an on-call rotation to ensure coverage for planned/unplanned events.
- Perform other tasks such as load tests and generating system health reports.
- Periodically check all dashboards for readiness.
- Engage with other engineering organizations to implement processes, identify improvements, and drive consistent results.
- Work with your SRE and engineering counterparts to drive game days, training, and other response-readiness efforts.
- Participate in 24x7 support coverage as needed, troubleshooting and problem-solving complex issues with thorough root cause analysis on customer and SRE production environments.
- Collaborate with service engineering organizations to build and automate tooling, implement best practices to observe and manage the services in production, and consistently achieve our market-leading SLA.
- Improve the scalability and reliability of our systems in production.
- Evaluate, design, and implement new system architectures.

Some specific requirements:
- B.Tech in Engineering, Computer Science, a technical degree, or equivalent work experience.
- At least 3 years of managing production infrastructure; leading/managing a team is a huge plus.
- Experience with cloud platforms like AWS and GCP.
- Experience developing and operating large-scale distributed systems with Kubernetes, Docker, and serverless (Lambdas).
- Experience running real-time, low-latency, highly available applications (Kafka, gRPC, RTP).
- Comfortable with Python, Go, or any relevant programming language.
- Experience with monitoring and alerting using technologies like New Relic, Zabbix, Prometheus, Grafana, CloudWatch, Kafka, PagerDuty, etc.
- Experience with one or more orchestration/deployment tools, e.g., CloudFormation, Terraform, Ansible, Packer, or Chef.
- Experience with configuration management systems such as Ansible, Chef, or Puppet.
- Knowledge of load-testing methodologies and tools like Gatling and Apache JMeter.
- Can work your way around a Unix shell.
- Experience running hybrid clouds and on-prem infrastructures on Red Hat Enterprise Linux/CentOS.
- A focus on delivering high-quality code through strong testing practices.
Posted 1 month ago
3.0 - 6.0 years
4 - 8 Lacs
Karnataka
Work from Office
Key Responsibilities :
- Design, develop, and maintain backend services and automation tools using Golang.
- Build scalable and efficient microservices, RESTful APIs, and background jobs.
- Automate repetitive tasks and system processes across CI/CD, deployments, and data pipelines.
- Optimize code and systems for performance, reliability, and scalability.
- Collaborate with DevOps, QA, and other engineering teams to streamline operations and workflows.
- Write scripts and automation for provisioning, monitoring, and self-healing infrastructure.
- Maintain technical documentation for developed services, APIs, and scripts.
- Debug and troubleshoot issues across services and systems.
- Participate in code reviews, testing, and continuous integration activities.
- Research and implement tools and frameworks to improve development and automation efficiency.

Required Technical Skills :

Programming Languages :
- Strong proficiency in Go (Golang).
- Familiarity with Python, Bash, or shell scripting is a plus.

Automation & DevOps :
- Hands-on experience with CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins).
- Proficiency in writing automation scripts and job schedulers.
- Familiarity with Ansible, Terraform, or other automation tools is a plus.

API Development :
- RESTful API design, development, testing, and documentation.
- JSON, gRPC, and protocol buffers experience is a bonus.

Database Technologies :
- Experience with both SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Redis) databases.
- Understanding of database schema design and query optimization.

Cloud & Containers :
- Hands-on experience with Docker, Kubernetes, or other container orchestration tools.
- Familiarity with cloud platforms like AWS, GCP, or Azure.

Monitoring & Logging :
- Working knowledge of tools like Prometheus, Grafana, the ELK Stack, or Splunk.

Version Control :
- Proficient in Git and Git workflows.

Preferred Qualifications :
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 3-6 years of backend development experience, with 2+ years in Golang.
- Experience working in Agile/Scrum teams.
- Exposure to event-driven architecture and message brokers like Kafka, RabbitMQ, or NATS.

Soft Skills :
- Strong problem-solving and debugging skills.
- Good communication and documentation habits.
- Ability to work independently and within a team.
- Strong attention to detail and a proactive attitude.
Posted 1 month ago
4.0 - 8.0 years
13 - 17 Lacs
Bengaluru
Work from Office
Roles & Responsibilities :
- Work closely with the CTO and members of technical staff to meet deadlines.
- Work with an agile team to set up and configure GitOps (CI/CD) based pipelines on GitLab.
- Create and deploy Edge AIoT pipelines using AWS Greengrass or Azure IoT.
- Design and develop secure cloud system architectures in accordance with enterprise standards.
- Package and automate deployment of releases using Helm charts.
- Analyze and optimize resource consumption of deployments.
- Integrate with Prometheus, Grafana, Kibana, etc. for application monitoring.
- Adhere to best practices to deliver secure and robust solutions.

Requirements :
- Experience with Kubernetes and AWS.
- Knowledge of cloud architecture concepts (IaaS, PaaS, SaaS).
- Knowledge of Docker and Linux bash scripting.
- Strong desire to expand knowledge in modern cloud architectures.
- Knowledge of system security concepts (SAST, DAST, penetration testing, vulnerability analysis).
- Familiarity with version control concepts (Git).
Posted 1 month ago
4.0 - 7.0 years
10 - 15 Lacs
Noida
Work from Office
As a Consultant in the Automation domain, you will be responsible for delivering automation use cases enabled by AI and cloud technologies. In this role, you play a crucial part in building the next generation of autonomous networks. You will develop efficient and scalable automation solutions, leveraging your technical expertise, problem-solving abilities, and domain knowledge to drive innovation and efficiency.

You have:
- A Bachelor's degree in Computer Science, Engineering, or a related field (preferred), with 8-10+ years of experience in automation or telecommunications.
- An understanding of telecom network architecture, including core networks, OSS, and BSS ecosystems, along with industry frameworks like TM Forum Open APIs and eTOM.
- Practical experience in programming and scripting languages such as Python, Go, Java, or Bash, and automation tools like Terraform, Ansible, and Helm.
- Hands-on experience with CI/CD pipelines using Jenkins, GitLab CI, or ArgoCD, as well as containerization (Docker) and orchestration (Kubernetes, OpenShift).

It would be nice if you also had:
- Exposure to agile development methodologies and cross-functional collaboration.
- Experience with real-time monitoring tools (Prometheus, ELK Stack, OpenTelemetry, Grafana); AI/ML experience for predictive automation and network optimization is a plus.
- Familiarity with GitOps methodologies and automation best practices for telecom environments.

In this role you will:
- Design, develop, test, and deploy automation scripts using languages such as Python, Go, Bash, or YAML.
- Automate the provisioning, configuration, and lifecycle management of network and cloud infrastructure.
- Design and maintain CI/CD pipelines using tools like Jenkins, GitLab CI/CD, ArgoCD, or Tekton.
- Automate continuous integration, testing, deployment, and rollback mechanisms for cloud-native services.
- Implement real-time monitoring, logging, and tracing using tools such as Prometheus, Grafana, ELK, and OpenTelemetry.
- Develop AI/ML-driven observability solutions for predictive analytics and proactive fault resolution, integrating AI/ML models to enable predictive scaling.
- Automate self-healing mechanisms to remediate network and application failures (a sketch follows below).
- Collaborate with DevOps and network engineers to align automation with business goals.
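As a sketch of the self-healing idea, here is a small Python loop that polls Prometheus for a firing alert and restarts the affected Deployment by patching its pod-template annotation (the same mechanism `kubectl rollout restart` uses). The alert name, namespace, and deployment are illustrative assumptions.

```python
# Sketch: tiny self-healing loop. Poll Prometheus for a firing alert,
# then trigger a rolling restart of the mapped Deployment.
# Alert name and target are illustrative assumptions.
import datetime
import time

import requests
from kubernetes import client, config

PROM = "http://prometheus.monitoring.svc:9090"   # assumed address
ALERT = "ServiceUnresponsive"                    # assumed alert name
TARGET = ("telecom-apps", "session-manager")     # assumed (namespace, deployment)

config.load_kube_config()
apps = client.AppsV1Api()

def firing(alert_name):
    alerts = requests.get(f"{PROM}/api/v1/alerts", timeout=10).json()["data"]["alerts"]
    return any(a["labels"].get("alertname") == alert_name and a["state"] == "firing"
               for a in alerts)

def rollout_restart(namespace, name):
    now = datetime.datetime.utcnow().isoformat() + "Z"
    patch = {"spec": {"template": {"metadata": {"annotations": {
        "kubectl.kubernetes.io/restartedAt": now}}}}}
    apps.patch_namespaced_deployment(name, namespace, patch)

while True:
    if firing(ALERT):
        print(f"{ALERT} firing; restarting {TARGET[1]}")
        rollout_restart(*TARGET)
        time.sleep(300)   # back off while the rollout settles
    time.sleep(30)
```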
Posted 1 month ago
3.0 - 6.0 years
8 - 13 Lacs
Bengaluru
Work from Office
As a vLab R&D Cloud Engineer, your job requires expertise in cloud computing platforms, Linux, and networking.

You have:
- 8+ years of relevant experience in deployment and troubleshooting of infrastructure/platforms, especially OpenShift and ACM.
- Red Hat Certified OpenShift Administrator certification (a must).
- Prior troubleshooting experience on OpenStack, Kubernetes, and OpenShift platforms.
- Expertise in software engineering practices like DevOps, agile methodologies, continuous integration, and test automation.
- Practical experience with Kubernetes (K8s), Podman, and containerized infrastructure management.
- Expertise in Git, Gerrit, Jenkins, ArgoCD, Ansible, and Python scripting for automation, and in deploying and maintaining common services like Kafka, Redis, Prometheus, Grafana, etc.
- Expertise in Layer 2/Layer 3 data networking.

It would be nice if you also had:
- A BE/BTech/MTech engineering degree.
- Good knowledge of troubleshooting Ceph and OpenShift Data Foundation (ODF) issues, and good knowledge of HP/Airframe/Dell NFVI x.x hardware.
- Good communication, organizational, and problem-solving skills.
- The ability to identify and implement platform/process improvements, create new procedures, and work with a global team.
- Red Hat Certified Specialist in Multicluster Management certification.

In this role you will:
- Learn to deploy and maintain common cloud services platforms for Cloud and Network Services to meet security, performance, scalability, and reliability requirements.
- Collaborate with global cross-functional teams to design and implement solutions in a microservices architecture.
- Explore and implement best practices for continuous integration and continuous deployment (CI/CD).
- Contribute to short/mid-term decisions in your own area and be part of a high-performance team.
- Learn new platforms as they evolve.
Posted 1 month ago
3.0 - 5.0 years
60 - 65 Lacs
Mumbai, Delhi / NCR, Bengaluru
Work from Office
We are seeking a talented and passionate Engineer to design, develop, and enhance our SaaS platform. As a key member of the team, you will work to create the best developer tools, collaborate with designers and engineers, and ensure our platform scales as it grows. The ideal candidate will have strong expertise in backend development, cloud infrastructure, and a commitment to delivering reliable systems. Location: Remote, Delhi NCR, Bangalore, Chennai, Pune, Kolkata, Ahmedabad, Mumbai, Hyderabad
Posted 1 month ago
8.0 - 12.0 years
35 - 60 Lacs
Pune
Work from Office
About the Role:
We are seeking a skilled Site Reliability Engineer (SRE) / DevOps Engineer to join our infrastructure team. In this role, you will design, build, and maintain scalable infrastructure, CI/CD pipelines, and observability systems to ensure high availability, reliability, and security of our services. You will work cross-functionally with development, QA, and security teams to automate operations, reduce toil, and enforce best practices in cloud-native environments.

Key Responsibilities:
- Design, implement, and manage cloud infrastructure (GCP/AWS/Azure) using Infrastructure as Code (Terraform).
- Maintain and improve CI/CD pipelines using tools like CircleCI, GitLab CI, or ArgoCD.
- Ensure high availability and performance of services using Kubernetes (GKE/EKS/AKS) and container orchestration.
- Implement monitoring, logging, and alerting using Prometheus, Grafana, ELK, or similar tools.
- Collaborate with developers to optimize application performance and deployment processes.
- Manage and automate security controls such as IAM, RBAC, network policies, and vulnerability scanning.

Basic Qualifications:
- Strong knowledge of Linux.
- Experience with scripting languages such as Python, Bash, or Go.
- Experience with cloud platforms (GCP preferred; AWS or Azure acceptable).
- Proficient in Kubernetes operations, including Helm, operators, and service meshes.
- Experience with Infrastructure as Code (Terraform).
- Solid experience with CI/CD pipelines (GitLab CI, CircleCI, ArgoCD, or similar).
- Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK, etc.).
- Knowledge of networking concepts (TCP/IP, DNS, load balancers, firewalls).

Preferred Qualifications
- Experience with advanced networking solutions.
- Familiarity with SRE principles such as SLOs, SLIs, and error budgets.
- Exposure to multi-cluster or hybrid-cloud environments.
- Knowledge of service meshes (Istio).
- Experience participating in incident management and postmortem processes.
Posted 1 month ago
1.0 - 6.0 years
3 - 8 Lacs
Bengaluru
Work from Office
We are seeking an experienced OpenShift Engineer to design, deploy, and manage containerized applications on Red Hat OpenShift. Key Responsibilities: Deploy, configure, and manage OpenShift clusters in hybrid/multi-cloud environments. Automate deployments using CI/CD pipelines (Jenkins, GitLab CI/CD, ArgoCD). Troubleshoot Kubernetes/OpenShift-related issues and optimize performance. Implement security policies and best practices for containerized workloads. Work with developers to containerize applications and manage microservices. Monitor and manage OpenShift clusters using Prometheus, Grafana, and logging tools.
Posted 1 month ago
1.0 - 5.0 years
3 - 8 Lacs
Bengaluru
Work from Office
We are seeking an experienced OpenShift Engineer to design, deploy, and manage containerized applications on Red Hat OpenShift. Key Responsibilities: Deploy, configure, and manage OpenShift clusters in hybrid/multi-cloud environments. Automate deployments using CI/CD pipelines (Jenkins, GitLab CI/CD, ArgoCD). Troubleshoot Kubernetes/OpenShift-related issues and optimize performance. Implement security policies and best practices for containerized workloads. Work with developers to containerize applications and manage microservices. Monitor and manage OpenShift clusters using Prometheus, Grafana, and logging tools.
Posted 1 month ago
6.0 - 11.0 years
16 - 31 Lacs
Noida, Hyderabad, Gurugram
Hybrid
Sr Site Reliability Engineer

Role Overview:
Our team supports a range of critical functions, including:
- Resiliency and reliability initiatives: partnering with teams on various improvement projects.
- Observability: ensuring comprehensive visibility into our systems.
- Alert analysis and optimization: refining alert mechanisms to minimize disruptions.
- Automation and self-healing: implementing automated solutions to proactively address issues.
- Incident and problem management: supporting priority incidents, assisting with restoration, root cause analysis, and preventative actions.
- Release management: streamlining the release process for seamless updates.
- Monitor engineering: supporting installation, configuration, and review of monitoring instrumentation.
- Cloud operations support: supporting and augmenting activities handled by the OptumRx Public Cloud team.

Responsibilities:
- Support priority incidents.
- Gather requirements for automation opportunities and instrumentation improvements.
- Recommend improvements and changes to monitoring configurations and service architecture design.
- Summarize and provide updates to key stakeholders.
- Assist with incident/problem root cause analysis (RCA) and identification of trends.
- Help teams define service level objectives and build views within monitoring tools.
- Conduct analysis on alerts and incident data and recommend changes and improvements (a sketch of this kind of analysis follows below).
- Drive improvements to monitoring and instrumentation for services.
- Assess and monitor overall application stability and performance, providing insights for potential improvements.
- Build automation and self-healing capabilities to improve efficiency, stability, and reliability of services.
- Participate in rotational on-call support.

Technical Skills:
- Proficiency in monitoring and instrumentation tools.
- Understanding of application performance monitoring (APM) and log management tools, with a preference for Dynatrace and Splunk.
- Experience with automation and scripting languages (e.g., Python, Bash, PowerShell), with a preference for Python.
- Experience implementing comprehensive monitoring for services to detect anomalies and trigger timely alerts.
- Understanding of cloud platforms (e.g., AWS, Azure, Google Cloud).
- Knowledge of incident management and root cause analysis (RCA) processes.
- Familiarity with service level objectives (SLOs) and service level agreements (SLAs).

Analytical Skills:
- Ability to analyze alerts and incident data to identify trends and recommend improvements.
- Strong problem-solving skills and attention to detail.

Communication Skills:
- Excellent verbal and written communication skills.
- Ability to summarize and present updates to key stakeholders effectively.

Collaboration Skills:
- Experience working in cross-functional teams and across geographies.
- Ability to collaborate with different teams to define service level objectives, gather requirements, discuss opportunities, and recommend improvements.

Operational Skills:
- Ability to participate in rotational on-call support.
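As a sketch of the alert-analysis work described above, a few lines of pandas that rank alerts by volume and by how often they map to real incidents. The CSV export and its column names are assumptions about what the monitoring tooling can produce.

```python
# Sketch: rank noisy alerts from exported incident data to target tuning.
# File name and column names are assumptions about an export format.
import pandas as pd

df = pd.read_csv("alerts_export.csv")          # assumed export
df["opened"] = pd.to_datetime(df["opened"])

summary = (
    df.groupby("alert_name")
      .agg(count=("alert_name", "size"),
           actionable=("led_to_incident", "mean"))  # fraction tied to incidents
      .sort_values("count", ascending=False)
)
# Alerts that fire often but rarely map to incidents are tuning candidates.
print(summary.head(15))
```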
Posted 1 month ago
5.0 - 7.0 years
30 - 40 Lacs
Bengaluru
Hybrid
Senior Software Developer (Python)
Experience: 5-7 years
Salary: up to USD 40,000/year
Preferred Notice Period: within 60 days
Shift: 11:00 AM to 8:00 PM IST
Opportunity Type: Hybrid (Bengaluru)
Placement Type: Permanent
(*Note: This is a requirement for one of Uplers' clients.)

Must-have skills: Apache Airflow, Astronomer, Pandas/PySpark/Dask, RESTful APIs, Snowflake, Docker, Python, SQL
Good-to-have skills: CI/CD, data visualization, Matplotlib, Prometheus, AWS, Kubernetes

A Single Platform for Loans/Securities & Finance (one of Uplers' clients) is looking for a Senior Software Developer (Python) who is passionate about their work, eager to learn and grow, and committed to delivering exceptional results. If you are a team player with a positive attitude and a desire to make a difference, we want to hear from you.

Job Summary
We are seeking a highly skilled Senior Python Developer with expertise in large-scale data processing and Apache Airflow. The ideal candidate will be responsible for designing, developing, and maintaining scalable data applications and optimizing data pipelines. You will be an integral part of our R&D and Technical Operations team, focusing on data engineering, workflow automation, and advanced analytics.

Key Responsibilities
- Design and develop sophisticated Python applications for processing and analyzing large datasets.
- Implement efficient and scalable data pipelines using Apache Airflow and Astronomer (a sketch of a typical DAG follows below).
- Create, optimize, and maintain Airflow DAGs for complex workflow orchestration.
- Work with data scientists to implement and scale machine learning models.
- Develop robust APIs and integrate various data sources and systems.
- Optimize application performance for handling petabyte-scale data operations.
- Debug, troubleshoot, and enhance existing Python applications.
- Write clean, maintainable, and well-tested code following best practices.
- Participate in code reviews and mentor junior developers.
- Collaborate with cross-functional teams to translate business requirements into technical solutions.

Required Skills & Qualifications
- Strong programming skills in Python with 5+ years of hands-on experience.
- Proven experience working with large-scale data processing frameworks (e.g., Pandas, PySpark, Dask).
- Extensive hands-on experience with Apache Airflow for workflow orchestration.
- Experience with the Astronomer platform for Airflow deployment and management.
- Proficiency in SQL and experience with the Snowflake database.
- Expertise in designing and implementing RESTful APIs.
- Basic knowledge of Java programming.
- Experience with containerization technologies (Docker).
- Strong problem-solving skills and the ability to work independently.

Preferred Skills
- Experience with cloud platforms (AWS).
- Knowledge of CI/CD pipelines and DevOps practices.
- Familiarity with Kubernetes for container orchestration.
- Experience with data visualization libraries (Matplotlib, Seaborn, Plotly).
- Background in financial services or experience with financial data.
- Proficiency in monitoring tools like Prometheus, Grafana, and the ELK stack.

Engagement Type: Full-time direct hire on RiskSpan payroll
Job Type: Permanent
Location: Hybrid (Bengaluru)
Working time: 11:00 AM to 8:00 PM IST
Interview Process: 3-4 rounds

How to apply for this opportunity (easy 3-step process):
1. Click on Apply and register or log in on our portal.
2. Upload your updated resume and complete the screening form.
3. Increase your chances of getting shortlisted and meeting the client for the interview!

About Our Client:
RiskSpan uncovers insights and mitigates risk for mortgage loans and structured products. The Edge Platform provides data and predictive models to run forecasts under a range of scenarios and analyze Agency and non-Agency MBS, loans, and MSRs. Leverage our bleeding-edge cloud, machine learning, and AI capabilities to scale faster, optimize model builds, and manage information more efficiently.

About Uplers:
Our goal is to make hiring and getting hired reliable, simple, and fast. Our role will be to help all our talents find and apply for relevant product and engineering job opportunities and progress in their career. (Note: There are many more opportunities apart from this on the portal.) So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!
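For context, here is a minimal sketch of the kind of Airflow DAG this role maintains, written with the Airflow 2.x TaskFlow API. The dataset, file paths, and schedule are illustrative assumptions, not details from the listing.

```python
# Sketch: a small extract/transform/load DAG using the TaskFlow API.
# Paths and schedule are assumptions; Airflow 2.4+ uses `schedule`
# (older versions use `schedule_interval`).
from datetime import datetime

import pandas as pd
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_positions():
    @task
    def extract() -> str:
        # Pretend-extract; in practice this might pull from Snowflake or S3.
        path = "/tmp/positions.parquet"
        pd.DataFrame({"cusip": ["x1"], "balance": [100.0]}).to_parquet(path)
        return path

    @task
    def transform(path: str) -> str:
        df = pd.read_parquet(path)
        df["balance_mm"] = df["balance"] / 1e6   # express balances in millions
        out = "/tmp/positions_clean.parquet"
        df.to_parquet(out)
        return out

    @task
    def load(path: str) -> None:
        print(f"would load {path} into the warehouse")

    load(transform(extract()))

daily_positions()
```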
Posted 2 months ago
4.0 - 6.0 years
25 - 30 Lacs
Hyderabad
Hybrid
Senior AI Developer
Experience: 4 - 6 Years
Salary: INR 20-30 Lacs per annum
Preferred Notice Period: Within 60 Days
Shift: 2:30 PM to 11:30 PM IST
Opportunity Type: Hybrid (Hyderabad)
Placement Type: Permanent (Note: This is a requirement for one of Uplers' clients.)
Must-have skills: AI RAG, FastAPI or Flask, Vector DB, PostgreSQL, Python, LangChain or Llama
Good-to-have skills: Grafana or Prometheus, Docker, Kubernetes
K3-Innovations (one of Uplers' clients) is looking for a Senior AI Developer who is passionate about their work, eager to learn and grow, and committed to delivering exceptional results. If you are a team player with a positive attitude and a desire to make a difference, we want to hear from you.
Role Overview
About K3-Innovations
K3-Innovations, Inc. is building a cutting-edge, AI-driven SaaS platform that automates critical workflows in the biopharma industry. We are expanding our team with an AI RAG Engineer who will help us design and optimize retrieval-augmented generation (RAG) pipelines for knowledge workflows. This role combines deep expertise in database design, vector search optimization, backend architecture, and LLM (Large Language Model) integration. You will play a key role in building a scalable AI platform, bridging structured and unstructured biopharma data with next-generation AI. If you're passionate about building intelligent retrieval systems, fine-tuning prompt pipelines, and optimizing LLM-based applications for real-world datasets, we want to hear from you!
Key Responsibilities
1. RAG Pipeline Design and Optimization (Priority #1)
- Architect and implement retrieval-augmented generation pipelines that integrate document retrieval with LLM response generation (a minimal sketch follows this posting)
- Design and maintain knowledge bases and vector stores using tools like FAISS, Weaviate, or PostgreSQL PGVector
- Optimize retrieval mechanisms (chunking, indexing strategies, reranking) to maximize response accuracy and efficiency
- Integrate context-aware querying from structured (Postgres) and unstructured (text/PDF) sources
2. Database and Embedding Management
- Design relational schemas to support knowledge-base metadata and chunk-level indexing
- Manage embedding pipelines using open-source models (e.g., HuggingFace sentence transformers) or custom embedding services
- Optimize large-scale vector search performance (indexing, sharding, partitioning)
3. LLM and Prompt Engineering
- Develop prompt engineering strategies for retrieval-augmented LLM pipelines
- Experiment with prompt chaining, memory-augmented generation, and adaptive prompting techniques
- Fine-tune lightweight LLMs or integrate APIs from OpenAI, Anthropic, or open-source frameworks (e.g., LlamaIndex, LangChain)
4. Backend API and Workflow Orchestration
- Build scalable, secure backend services (FastAPI/Flask) to serve RAG outputs to applications
- Design orchestration workflows that integrate retrieval, generation, reranking, and response streaming
- Implement system monitoring for LLM-based applications using observability tools (Prometheus, OpenTelemetry)
5. Collaboration and Platform Ownership
- Work closely with platform architects, AI scientists, and domain experts to evolve the knowledge workflows
- Take ownership from system design through model integration and continuous improvement of RAG performance
Required Skills
AI RAG Engineering (Most Critical)
- Knowledge Retrieval:
  - Experience building RAG architectures in production environments
  - Expertise with vector stores (e.g., FAISS, Weaviate, Pinecone, PGVector)
  - Experience with embedding models and retrieval optimization strategies
- Prompt Engineering:
  - Deep understanding of prompt construction for factuality, context augmentation, and reasoning
  - Familiarity with frameworks like LangChain, LlamaIndex, or Haystack
Database and Backend Development (Essential)
- PostgreSQL Expertise:
  - Strong proficiency in relational and vector-extension design (PGVector preferred)
  - SQL optimization and indexing strategies for large datasets
- Python Development:
  - Experience building backend services using FastAPI or Flask
  - Proficiency with async programming and API integrations
Observability and DevOps (Supportive)
- System monitoring for AI workflows using Prometheus, Grafana, OpenTelemetry
- Familiarity with Docker and Kubernetes-based deployment pipelines
Preferred Experience (Bonus, but not Required)
- Working with large-scale scientific or healthcare datasets
- Exposure to clinical standards like SDTM and ADaM (advantageous for biopharma workflows)
- Experience integrating domain-specific ontologies into retrieval systems
- Familiarity with fine-tuning LLMs on private knowledge bases
What We're Looking For
- AI Problem Solver: You are excited by combining retrieval, reasoning, and generative capabilities to solve real-world problems.
- Backend and Data Specialist: You understand database performance and scalable architectures for retrieval and serving.
- Builder's Mindset: You thrive in dynamic, evolving environments where you can architect and implement end-to-end solutions.
What We Offer
- Meaningful Impact: Build AI systems that accelerate workflows in the critical biopharma space.
- Technical Growth: Deepen your expertise in retrieval-augmented generation and scalable AI systems.
- Remote Flexibility: Results-driven work culture with location flexibility.
- Competitive Compensation: Attractive salary, benefits, and learning opportunities.
Join Us
Help us revolutionize how biopharma manages and accesses knowledge through the power of AI.
How to apply for this opportunity (easy 3-step process):
1. Click on Apply and register or log in on our portal.
2. Upload your updated resume and complete the screening form.
3. Increase your chances of getting shortlisted and meet the client for the interview!
About Our Client: K3-Innovations is redefining clinical research with a strategic scaling approach, blending AI-powered automation, adaptive clinical resourcing, and advanced data science. As a next-generation CRO, we provide flexible FSP models, regulatory-compliant statistical programming, and AI-driven analytics to accelerate clinical trial execution and regulatory submissions.
About Uplers: Our goal is to make hiring and getting hired reliable, simple, and fast. Our role is to help our talents find and apply for relevant product and engineering job opportunities and progress in their careers. (Note: There are many more opportunities apart from this one on the portal.) So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!
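For candidates mapping these requirements to practice, the retrieve-then-generate loop at the heart of the role might look like the minimal sketch below, assuming FAISS with a HuggingFace sentence transformer as named in the requirements. The model name, documents, and prompt template are illustrative assumptions, not the client's actual stack:

```python
# Minimal retrieve-then-generate sketch: FAISS + a HuggingFace sentence
# transformer. Model choice, documents, and prompt are illustrative only.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Protocol K3-101 enrolled 120 subjects across 4 sites.",
    "Adverse events are coded with MedDRA before SDTM mapping.",
    "The ADaM ADSL dataset carries one record per subject.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical embedding model
emb = model.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(emb.shape[1])  # inner product == cosine on unit vectors
index.add(np.asarray(emb, dtype="float32"))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Embed the query and return the k nearest document chunks."""
    q = model.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [docs[i] for i in ids[0]]

question = "How many subjects were enrolled?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` would then go to the LLM of choice (OpenAI, Anthropic, etc.)
print(prompt)
```

In a production pipeline of the kind the posting describes, the in-memory list would be replaced by a PGVector- or Weaviate-backed store, and the function would sit behind a FastAPI endpoint with reranking before prompt assembly.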
Posted 2 months ago
3.0 - 5.0 years
15 - 20 Lacs
Pune
Work from Office
About the job
Sarvaha would like to welcome a Kafka Platform Engineer (or a seasoned backend engineer aspiring to move into platform architecture) with a minimum of 4 years of solid experience in building, deploying, and managing Kafka infrastructure on Kubernetes platforms. Sarvaha is a niche software development company that works with some of the best-funded startups and established companies across the globe. Please visit our website at
What You'll Do
- Deploy and manage scalable Kafka clusters on Kubernetes using Strimzi, Helm, Terraform, and StatefulSets
- Tune Kafka for performance, reliability, and cost-efficiency
- Implement Kafka security: TLS, SASL, ACLs, Kubernetes Secrets, and RBAC (a hedged client-side sketch follows this posting)
- Automate deployments across AWS, GCP, or Azure
- Set up monitoring and alerting with Prometheus, Grafana, and JMX Exporter
- Integrate Kafka ecosystem components: Connect, Streams, Schema Registry
- Define autoscaling, resource limits, and network policies for Kubernetes workloads
- Maintain CI/CD pipelines (ArgoCD, Jenkins) and container workflows
You Bring
- BE/BTech/MTech (CS/IT or MCA), with an emphasis on Software Engineering
- Strong foundation in the Apache Kafka ecosystem and internals (brokers, ZooKeeper/KRaft, partitions, storage)
- Proficiency in Kafka setup, tuning, scaling, and topic/partition management
- Skill in managing Kafka on Kubernetes using Strimzi, Helm, and Terraform
- Experience with CI/CD, containerization, and GitOps workflows
- Monitoring expertise using Prometheus, Grafana, and JMX
- Experience with EKS, GKE, or AKS preferred
- Strong troubleshooting and incident-response mindset
- High sense of ownership and automation-first thinking
- Excellent collaboration with SREs, developers, and platform teams
- Clear communicator, documentation-driven, and eager to mentor and share knowledge
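As a concrete illustration of the TLS/SASL security bullet above, a client connecting to a Strimzi-managed cluster might be configured as in this minimal Python sketch. The bootstrap address, listener port, credentials, and CA path are assumptions for illustration, not Sarvaha's actual setup:

```python
# Minimal sketch: a producer talking to a Strimzi-managed Kafka cluster over
# TLS with SASL/SCRAM. Bootstrap address, credentials, and CA path are
# illustrative assumptions only.
from confluent_kafka import Producer

conf = {
    # Strimzi commonly exposes its TLS listener on port 9093
    "bootstrap.servers": "my-cluster-kafka-bootstrap:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "SCRAM-SHA-512",
    # In Kubernetes these would be mounted from Secrets, never hard-coded
    "sasl.username": "app-user",
    "sasl.password": "CHANGE_ME",
    # Cluster CA cert, e.g. extracted from the <cluster>-cluster-ca-cert Secret
    "ssl.ca.location": "/etc/kafka/ca.crt",
}

producer = Producer(conf)
producer.produce("orders", key=b"order-42", value=b'{"status":"created"}')
producer.flush()  # block until delivery (or failure) of queued messages
```

Broker-side, the same concerns (listeners, authentication, ACLs) would live in the Strimzi Kafka custom resource rather than in client code.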
Posted 2 months ago
3.0 - 5.0 years
4 - 8 Lacs
Bengaluru
Work from Office
Role Purpose
The purpose of the role is to resolve, maintain, and manage the client's software/hardware/network based on service requests raised by end users, as per the defined SLAs, ensuring client satisfaction.
Do
- Ensure timely response to all tickets raised by the client end user
- Solution service requests while maintaining quality parameters
- Act as a custodian of the client's network/server/system/storage/platform/infrastructure and other equipment, keeping track of their proper functioning and upkeep
- Keep a check on the number of tickets raised (dial home/email/chat/IMS), ensuring right solutioning as per the defined resolution timeframe
- Perform root cause analysis of the tickets raised and create an action plan to resolve the problem, ensuring client satisfaction
- Provide acceptance and immediate resolution for high-priority tickets/service requests
- Install and configure software/hardware as per service requests
- Adhere 100% to timelines per the priority of each issue, to manage client expectations and ensure zero escalations
- Provide application/user access as per client requirements and requests to ensure timely solutioning
- Track all tickets from acceptance to resolution stage, as per the resolution time defined by the customer
- Maintain timely backups of important data/logs and management resources to ensure the solution is of acceptable quality and maintains client satisfaction
- Coordinate with the on-site team for complex problem resolution and ensure timely client servicing
- Review the logs gathered by chat BOTs and ensure all service requests/issues are resolved in a timely manner
Deliver
No. | Performance Parameter | Measure
1. | 100% adherence to SLA/timelines | Multiple cases of red time; zero customer escalations; client appreciation emails
Mandatory Skills: AIOPS Grafana Observability
Experience: 3-5 Years
Posted 2 months ago