5.0 - 10.0 years
14 - 18 Lacs
Bengaluru
Work from Office
As a Senior IT Systems & DevOps Engineer, you will be responsible for IT systems, incident, change, and release management. You will ensure the seamless deployment of new releases, maintaining system stability, security, and compliance in a regulated environment. You will collaborate with cross-functional teams, manage stakeholder communications, and drive automation and optimization across cloud-based platforms supporting drug discovery and development.
Key Responsibilities:
• Oversee incident, change, and release management for IT systems, ensuring compliance with regulatory standards.
• Manage Azure AD (Entra ID) for identity and access management, including authentication flows for internal and third-party services.
• Implement and advocate DevOps best practices, including CI/CD, automation, and observability across platforms.
• Collaborate with cross-functional teams to influence architectural decisions and align infrastructure with business goals.
• Ensure compliance and security within a regulated (GxP) environment, implementing RBAC, secrets management, and monitoring frameworks.
• Design, develop, test, and document business requirements related to IT systems and infrastructure.
• Coordinate and perform system management tasks, ensuring alignment with quality and compliance standards.
• Autonomously review and clarify business requirements, create technical designs, and align stakeholders.
• Manage and optimize cloud infrastructure (AWS & Azure), including cost management and performance tuning.
• Deploy and manage containerized applications using Docker, Kubernetes, Helm, and ArgoCD (GitOps).
• Implement Infrastructure as Code (IaC) using Terraform and AWS CloudFormation.
• Automate workflows and integrations using Python and configuration management tools like Ansible.
• Ensure observability, including logging, monitoring, and tracing with tools like Prometheus, Grafana, ELK Stack, and AWS-native solutions.
• Participate in compliance activities, including audits, patch management, and cybersecurity initiatives.
• Provide technical guidance and support for IT systems, assisting users and resolving incidents efficiently.
Your Skills & Experience:
Must-Have:
• 5+ years of experience in IT systems management, DevOps, cloud infrastructure, and automation.
• Strong expertise in Change, Release, Incident, and Problem Management.
• Hands-on experience with Azure DevOps (project configurations, repositories, pipelines, environments).
• Strong knowledge of AWS services (VPC, IAM, STS, EKS, RDS, EC2, ECS, Route53, CloudWatch, CloudTrail, Secrets Manager, S3, API Gateway, Lambda, MWAA).
• Experience with Linux, Windows servers, and Oracle PL/SQL.
• Strong understanding of IT landscapes and a willingness to work on diverse IT systems.
• Hands-on experience with containerization and Kubernetes (Docker, Helm, ArgoCD).
• Proficiency in Python scripting for automation and workflow integration.
• Experience with Infrastructure as Code (IaC) using Terraform and AWS CloudFormation.
• Strong experience in observability, security, and compliance frameworks, including RBAC, secrets management, and monitoring tools.
• Global stakeholder management experience with excellent English communication skills (written & oral).
Good to Have:
• Experience in a regulated industry (GxP, Pharma, Life Sciences).
• Familiarity with Agile ways of working.
• Knowledge of Next.js, Storybook, Tailwind, and TypeScript for front-end development.
• Experience with PostgreSQL and OpenSearch/Elasticsearch for data management.
• Familiarity with R and SAS for data analysis and statistical modeling.
• Understanding of AWS billing practices and cost optimization tools.
Why Join Us?
• Work in a high-impact role contributing to cutting-edge R&D in drug discovery and development.
• Be part of a multicultural, agile team with high autonomy in decision-making.
• Exposure to a diverse tech stack combining Azure, AWS, Kubernetes, Python, and CI/CD tools.
• Opportunities for career growth and skill development in cloud computing, security, and automation.
• Work in a collaborative and innovative environment with global teams in the US, Europe, and India.
Posted 1 month ago
5.0 - 9.0 years
14 - 16 Lacs
Chennai
Work from Office
• Cloud networking and security
• Linux and Windows experience
• Scripting and automation skills (e.g., Python, PowerShell)
• Proficiency in infrastructure as code (IaC) and configuration management tools (e.g., Terraform, Ansible)
Required Candidate profile
• Containerization and orchestration technologies (e.g., Docker, Kubernetes, Rancher)
• Monitoring and logging tools (e.g., Nagios, Prometheus, ELK stack)
• Virtualization, VPN, RDP, SSO, Kafka
Posted 1 month ago
7.0 - 9.0 years
25 - 30 Lacs
Chennai
Work from Office
Responsibilities: Java, Spring, ELK, Java Multithreading. 7 to 10 years of experience in Java / Spring Boot development. Solid understanding of Java multithreading. Good exposure to ELK usage and ELK APIs. Exposure to CI/CD infrastructure, preferably Concourse. Ability to lead teams and liaise with customers directly.
Posted 1 month ago
5.0 - 10.0 years
40 - 100 Lacs
Pune, Bengaluru, Delhi / NCR
Hybrid
Experience in Site Reliability Engineering and DevOps, including managing teams, mentoring, and developing engineers. Prometheus, Grafana, ELK Stack, Splunk, Datadog, New Relic, AWS, GCP, Azure, Docker, Kubernetes, Python, Go, Bash, or similar.
Posted 1 month ago
5.0 - 10.0 years
6 - 16 Lacs
Hyderabad, Bengaluru, Delhi / NCR
Work from Office
Role: Senior Test Automation Lead/Manager
Experience: 5 years and above
Location: Pan India
Job Description: Senior Test Automation Lead/Manager
Responsible for project leadership, client communication, requirement gathering, and change management.
Required: experience with ELK Stack and Prometheus; Linux; scripting knowledge (preferably Python). Desired: basic networking (familiarity with OpenStack and device connectivity).
LMA (Logging, Monitoring, and Alerting) is a framework used to track system health, diagnose issues, and ensure observability across infrastructure and services.
Logging: Collecting and storing logs for analysis and troubleshooting.
Monitoring: Continuously tracking system performance, resource usage, and metrics.
Alerting: Notifying teams when predefined thresholds or anomalies are detected.
The goal is to enhance the reliability and operational efficiency of different sites across both Pre-prod and Prod environments, where the underlying infrastructure is based on OpenStack.
Thanks & Regards, Md Shahid, TCS
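The alerting leg of an LMA stack described above boils down to comparing incoming metric samples against predefined thresholds. A minimal Python sketch of that idea (metric names and limits are illustrative, not tied to Prometheus or any specific tool):

```python
# Minimal threshold-based alerting sketch: evaluate metric samples
# against static thresholds, as the alerting stage of an LMA stack would.
# Metric names and limits are illustrative only.

THRESHOLDS = {
    "cpu_percent": 90.0,   # alert when CPU usage exceeds 90%
    "disk_percent": 80.0,  # alert when disk usage exceeds 80%
}

def evaluate(samples):
    """Return alert messages for samples that breach a threshold."""
    alerts = []
    for metric, value in samples.items():
        limit = THRESHOLDS.get(metric)
        if limit is not None and value > limit:
            alerts.append(f"ALERT: {metric}={value} exceeds {limit}")
    return alerts

print(evaluate({"cpu_percent": 95.2, "disk_percent": 40.0}))
```

Real deployments layer duration windows and anomaly detection on top of this, but the core check is the same comparison.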
Posted 1 month ago
2.0 - 4.0 years
4 - 7 Lacs
Chennai
Work from Office
Hi All, PFB opening.
Position: ELK Support Engineer
Experience: 2 to 4 years
CTC: up to 7 LPA
Location: Chennai (Beach railway station). 24x7 support, immediate joiners only.
Job Description:
• 2+ years hands-on experience with Elasticsearch, Logstash, and Kibana
• Experience with Beats (Filebeat, Metricbeat)
• Hands-on experience deploying applications on OpenShift (or Kubernetes)
• Familiarity with container logs, pod metrics, and application tracing
• Good understanding of JSON, regex, and log parsing techniques
• Experience writing Elasticsearch queries (DSL) and Kibana visualizations
• Basic knowledge of Linux shell scripting, YAML, and Git
Interested candidates can share their profile at monika.salvi@nusummit.com
Regards, Monika
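The log parsing skill listed above (regex plus JSON) usually means turning an unstructured log line into a structured document ready for Elasticsearch indexing. A minimal sketch, assuming a made-up nginx-style access-log format (field names are illustrative):

```python
import json
import re

# Parse an access-log-style line into a structured dict, the kind of
# document one would index into Elasticsearch. The log format and
# field names here are illustrative assumptions.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) - - \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    doc = match.groupdict()
    doc["status"] = int(doc["status"])   # numeric fields enable range queries
    doc["bytes"] = int(doc["bytes"])
    return doc

line = '10.0.0.5 - - [12/May/2025:10:01:22 +0000] "GET /health HTTP/1.1" 200 512'
print(json.dumps(parse_line(line)))
```

Logstash's grok filters express the same idea declaratively; the named regex groups here play the role of grok's field captures.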
Posted 1 month ago
0.0 - 5.0 years
2 - 7 Lacs
Bengaluru
Work from Office
Ensemble Energy is an exciting startup in the industrial IoT space focused on energy. Our mission is to accelerate the clean energy revolution by making it more competitive using the power of data. Ensemble's AI-enabled SaaS platform provides prescriptive analytics to power plant operators by combining the power of machine learning, big data, and deep domain expertise. As a Full Stack/IoT Intern, you will participate in developing and deploying frontend/backend applications, creating visualization dashboards, and developing ways to integrate high-frequency data from devices onto our platform.
Required Skills & Experience:
• React/Redux, HTML5, CSS3, JavaScript, Python, Django, and REST APIs.
• BS or MS in Computer Science or related field.
• Strong foundation in Computer Science, with deep knowledge of data structures, algorithms, and software design.
• Experience with Git, CI/CD tools, Sentry, Atlassian software, and AWS CodeDeploy a plus.
• Contribute ideas to overall product strategy and roadmap.
• Improve the codebase with continuous refactoring.
• Self-starter able to take ownership of platform engineering and application development.
• Work on multiple projects simultaneously and get things done.
• Take products from prototype to production.
• Collaborate with the team in Sunnyvale, CA to lead 24x7 product development.
Bonus: if you have worked on one or more of the below, highlight those projects when applying:
• Experience with time series DBs: M3DB, Prometheus, InfluxDB, OpenTSDB, ELK Stack
• Experience with visualization tools like Tableau, KeplerGL, etc.
• Experience with MQTT or other IoT communication protocols a plus
Posted 1 month ago
11.0 - 21.0 years
30 - 45 Lacs
Mumbai Suburban, Navi Mumbai, Mumbai (All Areas)
Work from Office
Minimum 11 to 20 years' experience with tools like Azure DevOps, Jenkins, GitLab, GitHub, Docker, Kubernetes, Terraform, and Ansible. Experience with Dockerfiles and pipeline code. Experience automating tasks using Shell, Bash, PowerShell, and YAML. Exposure to .NET, Java, Pro*C, PL/SQL, Oracle/SQL, and Redis. Required Candidate profile: Experience building a DevOps platform from the ground up using such tools on at least 2 projects; implementing platform support for requirement tracking, code management, and release management. Experience with tools such as AppDynamics, Prometheus, Grafana, and ELK Stack. Perks and benefits: additional 40% variable pay + mediclaim.
Posted 1 month ago
0.0 - 1.0 years
0 Lacs
Ahmedabad
Work from Office
Job Title: DevOps Intern
Location: Ahmedabad (Work from Office)
Duration: 3 to 6 Months
Start Date: Immediate or As per Availability
Company: FX31 Labs
Role Overview: We are looking for a motivated and detail-oriented DevOps Intern to join our engineering team. As a DevOps Intern, you will assist in designing, implementing, and maintaining CI/CD pipelines, automating workflows, and supporting infrastructure deployments across development and production environments.
Key Responsibilities:
• Assist in building and maintaining CI/CD pipelines using tools like GitHub Actions, Jenkins, or GitLab CI.
• Help in provisioning and managing cloud infrastructure (AWS, Azure, or GCP).
• Collaborate with developers to automate software deployment processes.
• Monitor and optimize system performance, availability, and reliability.
• Write basic scripts to automate repetitive DevOps tasks.
• Document internal processes, tools, and workflows.
• Support containerization (Docker) and orchestration (Kubernetes) initiatives.
Required Skills:
• Basic understanding of Linux/Unix systems and shell scripting.
• Familiarity with version control systems like Git.
• Knowledge of DevOps concepts like CI/CD, Infrastructure as Code (IaC), and automation.
• Exposure to tools like Docker, Jenkins, Kubernetes (even theoretical understanding is a plus).
• Awareness of at least one cloud platform (AWS, Azure, or GCP).
• Strong problem-solving attitude and willingness to learn.
Good to Have:
• Hands-on project or academic experience related to DevOps.
• Knowledge of Infrastructure as Code tools like Terraform or Ansible.
• Familiarity with monitoring tools (Grafana, Prometheus) or logging tools (ELK, Fluentd).
Eligibility Criteria:
• Pursuing or recently completed a degree in Computer Science, IT, or related field.
• Available to work full-time from the Ahmedabad office for the duration of the internship.
Perks:
• Certificate of Internship & Letter of Recommendation (on successful completion).
• Opportunity to work on real-time projects with mentorship.
• PPO opportunity for high-performing candidates.
• Hands-on exposure to industry-level DevOps tools and cloud platforms.
About FX31 Labs: FX31 Labs is a fast-growing tech company focused on building innovative solutions in AI, data engineering, and product development. We foster a learning-rich environment and aim to empower individuals through hands-on experience in real-world projects.
Posted 1 month ago
7.0 - 9.0 years
20 - 22 Lacs
Chennai
Work from Office
Java, Spring, ELK, Java Multithreading. 7 to 10 years of experience in Java / Spring Boot development. Solid understanding of Java multithreading. Good exposure to ELK usage and ELK APIs. Exposure to CI/CD infrastructure, preferably Concourse. Ability to lead teams and liaise with customers directly.
Posted 1 month ago
6.0 - 8.0 years
6 - 15 Lacs
Hyderabad, Secunderabad
Work from Office
Hands-on experience with CI/CD pipelines (e.g., Jenkins, GitLab CI, Azure DevOps). Knowledge of Terraform, CloudFormation, or other infrastructure automation tools. Experience with Docker, and basic knowledge of Kubernetes. Familiarity with monitoring/logging tools such as CloudWatch, Prometheus, Grafana, ELK.
Posted 1 month ago
1.0 - 6.0 years
6 - 13 Lacs
Bengaluru
Work from Office
Position Summary: We are seeking an experienced and highly skilled Lead LogicMonitor Administrator to architect, deploy, and manage scalable observability solutions across hybrid IT environments. This role demands deep expertise in LogicMonitor and a strong understanding of modern IT infrastructure and application ecosystems, including on-premises, cloud-native, and hybrid environments. The ideal candidate will play a critical role in designing real-time service availability dashboards, optimizing performance visibility, and ensuring comprehensive monitoring coverage for business-critical services.
Role & Responsibilities:
Monitoring Architecture & Implementation
• Serve as the subject matter expert (SME) for LogicMonitor, overseeing design, implementation, and continuous optimization.
• Lead the development and deployment of monitoring solutions that integrate on-premises infrastructure, public cloud (AWS, Azure, GCP), and hybrid environments.
• Develop and maintain monitoring templates, escalation chains, and alerting policies that align with business service SLAs.
• Ensure monitoring solutions adhere to industry standards and compliance requirements.
Real-Time Dashboards & Visualization
• Design and build real-time service availability dashboards to provide actionable insights for operations and leadership teams.
• Leverage LogicMonitor's APIs and data sources to develop custom visualizations, ensuring a single-pane-of-glass view for multi-layered service components.
• Collaborate with application and service owners to define KPIs, thresholds, and health metrics.
• Proficiency in interpreting monitoring data and metrics related to uptime and performance.
Automation & Integration
• Automate onboarding/offboarding of monitored resources using LogicMonitor's REST API, Groovy scripts, and Configuration Modules.
• Integrate LogicMonitor with ITSM tools (e.g., ServiceNow, Jira), collaboration platforms (e.g., Slack, Teams), and CI/CD pipelines.
• Enable proactive monitoring through synthetic transactions and anomaly detection capabilities.
• Streamline processes through automation and integrate monitoring with DevOps practices.
Operations & Optimization
• Perform ongoing health checks, capacity planning, tool version upgrades, and tuning of monitoring thresholds to reduce alert fatigue.
• Establish and enforce monitoring standards, best practices, and governance models across the organization.
• Lead incident response investigations, root cause analysis, and post-mortem reviews from a monitoring perspective.
• Optimize monitoring strategies for effective resource utilization and cost efficiency.
Qualification
Minimum Educational Qualifications: Bachelor's degree in computer science, information technology, engineering, or a related field.
Required Skills & Qualifications:
• 8+ years of total experience.
• 5+ years of hands-on experience with LogicMonitor, including custom DataSources, PropertySources, dashboards, and alert tuning.
• Proven expertise in IT infrastructure monitoring: networks, servers, storage, virtualization (VMware, Nutanix), and containerization (Kubernetes, Docker).
• Strong understanding of cloud platforms (AWS, Azure, GCP) and their native monitoring tools (e.g., CloudWatch, Azure Monitor).
• Experience in scripting and automation (e.g., Python, PowerShell, Groovy, Bash).
• Familiarity with observability stacks (ELK, Grafana) is a strong plus.
• Proficient with ITSM and incident management processes, including integrations with ServiceNow.
• Excellent problem-solving, communication, and documentation skills.
• Ability to work collaboratively in cross-functional teams and lead initiatives.
Preferred Qualifications:
• LogicMonitor certification (LMCA, LMCP) or similar.
• Experience with APM tools (e.g., SolarWinds, AppDynamics, Dynatrace, Datadog), log analytics platforms, and LogicMonitor observability.
• Knowledge of DevOps practices and CI/CD pipelines.
• Exposure to regulatory/compliance monitoring (e.g., HIPAA, PCI, SOC 2).
• Experience with machine learning or AI-based monitoring solutions.
Additional Information
Intuitive is an Equal Employment Opportunity Employer. We provide equal employment opportunities to all qualified applicants and employees, and prohibit discrimination and harassment of any type, without regard to race, sex, pregnancy, sexual orientation, gender identity, national origin, color, age, religion, protected veteran or disability status, genetic information or any other status protected under federal, state, or local applicable laws. We will consider for employment qualified applicants with arrest and conviction records in accordance with fair chance laws.
Posted 1 month ago
8.0 - 13.0 years
20 - 30 Lacs
Bangalore Rural, Bengaluru
Work from Office
Immediate Hiring: Java + Observability Engineer (Apache Storm)
Location: Bengaluru | Architect Level | Only Immediate Joiners
We are looking for a skilled and experienced Java + Observability Engineer with expertise in Apache Storm to join our team in Bengaluru. This is an exciting opportunity for professionals passionate about modern observability stacks and distributed systems.
Key Skills Required:
• Java (version 8/11 or higher)
• Observability tools: Prometheus, Grafana, OpenTelemetry, ELK, Jaeger, Zipkin, New Relic
• Containerization: Docker, Kubernetes
• CI/CD pipelines
• Experience designing and building scalable systems as an architect
• Hands-on experience with Apache Storm
Note: This role is open to immediate joiners only. If you're ready to take on a challenging architect-level role and make an impact, send your resume to sushil@saisservices.com
Posted 1 month ago
1.0 - 4.0 years
1 - 4 Lacs
Hyderabad
Work from Office
Working knowledge of Jenkins, Terraform, Puppet & JIRA. Basic knowledge of AWS or Azure. Linux administration & shell scripting. Moderate knowledge of Grafana and ELK. Good communication skills required.
Posted 1 month ago
1.0 - 4.0 years
1 - 4 Lacs
Hyderabad
Work from Office
Working knowledge of Jenkins, Terraform, Puppet & JIRA. Basic knowledge of AWS or Azure. Linux administration & shell scripting. Moderate knowledge of Grafana and ELK. Good communication skills required.
Posted 1 month ago
5.0 - 10.0 years
0 - 1 Lacs
Noida
Work from Office
Location: Noida only. Primary Skills: Terraform, Jenkins, Artifactory, Kubernetes, GCP, ELK stack for monitoring. Experience: 5 to 12 years.
Posted 1 month ago
7.0 - 12.0 years
14 - 24 Lacs
Chennai
Work from Office
Job Description:
• Bachelor's degree in computer science, computer engineering, or related technologies.
• Seven years of experience in systems engineering within the networking industry.
• Expertise in Linux deployment, scripting, and configuration.
• Expertise in TCP/IP communications stacks and optimizations.
• Experience with ELK (Elasticsearch, Logstash, Kibana), Grafana, data streaming (e.g., Kafka), and software visualization.
• Experience in analyzing and debugging code defects in the production environment.
• Proficiency in version control systems such as Git.
• Ability to design comprehensive test scenarios for systems usability, execute tests, and prepare detailed reports on effectiveness and defects for production teams.
• Full-cycle systems engineering experience covering requirements capture, architecture, design, development, and system testing.
• Demonstrated ability to work independently and collaboratively within cross-functional teams.
• Proficient in installing, configuring, debugging, and interpreting performance analytics to monitor, aggregate, and visualize key performance indicators over time.
• Proven track record of directly interfacing with customers to address concerns and resolve issues effectively.
• Strong problem-solving skills, capable of driving resolutions autonomously without senior engineer support.
• Experience in configuring MySQL and PostgreSQL, including setup of replication, troubleshooting, and performance improvement.
• Proficiency in networking concepts such as network architecture, protocols (TCP/IP, UDP), routing, and VLANs, essential for deploying new system servers effectively.
• Proficiency in shell/Bash scripting on Linux systems.
• Proficient in utilizing, modifying, troubleshooting, and updating Python scripts and tools to refine code.
• Excellent written and verbal communication skills, including the ability to document processes, procedures, and system configurations effectively.
• Ability to handle stress and maintain quality, including resilience to effectively manage pressure and a demonstrated ability to make informed decisions in high-pressure situations.
• This role requires being on call 24/7 to address service-affecting issues in production, and working the business hours of Chicago, aligning with local time for effective coordination and responsiveness to business operations and stakeholders in the region.
Posted 1 month ago
10.0 - 15.0 years
22 - 37 Lacs
Bengaluru
Work from Office
Who We Are
At Kyndryl, we design, build, manage and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward – always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our employees, our customers and our communities.
The Role
As an ELK (Elasticsearch, Logstash & Kibana) Data Engineer, you will be responsible for developing, implementing, and maintaining ELK stack-based solutions for Kyndryl's clients. This role is responsible for developing efficient and effective data & log ingestion, processing, indexing, and visualization for monitoring, troubleshooting, and analysis purposes.
Responsibilities:
• Design, implement, and maintain scalable data pipelines using the ELK Stack (Elasticsearch, Logstash, Kibana) and Beats for monitoring and analytics.
• Develop data processing workflows to handle real-time and batch data ingestion, transformation, and visualization.
• Implement techniques like grok patterns, regular expressions, and plugins to handle complex log formats and structures.
• Configure and optimize Elasticsearch clusters for efficient indexing, searching, and performance tuning.
• Collaborate with business users to understand their data integration & visualization needs and translate them into technical solutions.
• Create dynamic and interactive dashboards in Kibana for data visualization and insights that help detect the root cause of issues.
• Leverage open-source tools such as Beats and Python to integrate and process data from multiple sources.
• Collaborate with cross-functional teams to implement ITSM solutions integrating ELK with tools like ServiceNow and other ITSM platforms.
• Detect anomalies using Elastic ML and create alerts using Watcher functionality.
• Extract data via APIs using Python programming.
• Build and deploy solutions in containerized environments using Kubernetes.
• Monitor Elasticsearch clusters for health, performance, and resource utilization.
• Automate routine tasks and data workflows using scripting languages such as Python or shell scripting.
• Provide technical expertise in troubleshooting, debugging, and resolving complex data and system issues.
• Create and maintain technical documentation, including system diagrams, deployment procedures, and troubleshooting guides.
If you're ready to embrace the power of data to transform our business and embark on an epic data adventure, then join us at Kyndryl. Together, let's redefine what's possible and unleash your potential.
Your Future at Kyndryl
Every position at Kyndryl offers a way forward to grow your career. We have opportunities that you won't find anywhere else, including hands-on experience, learning opportunities, and the chance to certify in all four major platforms. Whether you want to broaden your knowledge base or narrow your scope and specialize in a specific sector, you can find your opportunity here.
Who You Are
You're good at what you do and possess the required experience to prove it. However, equally as important – you have a growth mindset; keen to drive your own personal and professional development. You are customer-focused – someone who prioritizes customer success in their work. And finally, you're open and borderless – naturally inclusive in how you work with others.
Required Technical and Professional Experience:
• Minimum of 5 years of experience in the ELK Stack and Python programming.
• Graduate/postgraduate in computer science, computer engineering, or equivalent, with a minimum of 10 years of experience in the IT industry.
• ELK Stack: deep expertise in Elasticsearch, Logstash, Kibana, and Beats.
• Programming: proficiency in Python for scripting and automation.
• ITSM Platforms: hands-on experience with ServiceNow or similar ITSM tools.
• Containerization: experience with Kubernetes and containerized applications.
• Operating Systems: strong working knowledge of Windows, Linux, and AIX environments.
• Open-Source Tools: familiarity with various open-source data integration and monitoring tools.
• Knowledge of network protocols, log management, and system performance optimization.
• Experience in integrating ELK solutions with enterprise IT environments.
• Strong analytical and problem-solving skills with attention to detail.
• Knowledge of MySQL or NoSQL databases is an added advantage.
• Fluent in English (written and spoken).
Preferred Technical and Professional Experience
• "Elastic Certified Analyst" or "Elastic Certified Engineer" certification is preferable.
• Familiarity with additional monitoring tools like Prometheus, Grafana, or Splunk.
• Knowledge of cloud platforms (AWS, Azure, or GCP).
• Experience with DevOps methodologies and tools.
Being You
Diversity is a whole lot more than what we look like or where we come from, it's how we think and who we are. We welcome people of all cultures, backgrounds, and experiences. But we're not doing it single-handedly: our Kyndryl Inclusion Networks are only one of many ways we create a workplace where all Kyndryls can find and provide support and advice. This dedication to welcoming everyone into our company means that Kyndryl gives you – and everyone next to you – the ability to bring your whole self to work, individually and collectively, and support the activation of our equitable culture. That's the Kyndryl Way.
What You Can Expect
With state-of-the-art resources and Fortune 100 clients, every day is an opportunity to innovate, build new capabilities, new relationships, new processes, and new value. Kyndryl cares about your well-being and prides itself on offering benefits that give you choice, reflect the diversity of our employees and support you and your family through the moments that matter – wherever you are in your life journey.
Our employee learning programs give you access to the best learning in the industry to receive certifications, including Microsoft, Google, Amazon, Skillsoft, and many more. Through our company-wide volunteering and giving platform, you can donate, start fundraisers, volunteer, and search over 2 million non-profit organizations. At Kyndryl, we invest heavily in you, we want you to succeed so that together, we will all succeed. Get Referred! If you know someone that works at Kyndryl, when asked ‘How Did You Hear About Us’ during the application process, select ‘Employee Referral’ and enter your contact's Kyndryl email address.
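The ingestion work this role describes ultimately produces documents for Elasticsearch, whose `_bulk` API expects newline-delimited JSON that alternates an action line with a document line. A minimal Python sketch of assembling such a payload (the index name and document fields are illustrative assumptions):

```python
import json

# Build a payload for the Elasticsearch _bulk API: newline-delimited
# JSON alternating an action line with its document line.
# The index name and document fields below are illustrative.
def build_bulk_payload(index, docs):
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk API requires a trailing newline

docs = [
    {"service": "web", "level": "ERROR", "message": "timeout"},
    {"service": "db", "level": "INFO", "message": "replica synced"},
]
payload = build_bulk_payload("app-logs", docs)
print(payload)
```

In practice, Logstash or Beats emit this format for you; building it by hand is mainly useful when extracting data via custom Python scripts, as the posting mentions.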
Posted 1 month ago
4.0 - 5.0 years
20 - 25 Lacs
Bengaluru
Work from Office
• Bachelor's degree in a technical field: Computer Science, Engineering, or similar
• Experience working in an external customer-facing technical support role
• Experience troubleshooting complex technical issues
• Excellent written and verbal communication skills in English
• Fundamental L2/L3 networking knowledge: network stacks, switching, routing, firewalls, etc.
• Experience working with Linux
• Ability to work in a dynamic, high-pressure customer-facing environment
• Ability to manage and prioritize numerous customer issues simultaneously
Additional Skills Considered a Plus:
• Experience with Docker and Kubernetes
• Experience with the ELK stack
• Experience working with Telecommunication Providers
• Experience with scripting: Ansible, Bash, Python
Posted 1 month ago
8.0 - 13.0 years
10 - 15 Lacs
Bengaluru
Work from Office
We are seeking a Senior DevOps Engineer to build pipeline automation, integrating DevSecOps principles into the operations of product builds and releases. You will mentor and guide DevOps teams, fostering a culture of technical excellence and continuous learning.
What You'll Do
• Design & Architecture: Architect and implement scalable, resilient, and secure Kubernetes-based solutions on Amazon EKS.
• Deployment & Management: Deploy and manage containerized applications, ensuring high availability, performance, and security.
• Infrastructure as Code (IaC): Develop and maintain Terraform scripts for provisioning cloud infrastructure and Kubernetes resources.
• CI/CD Pipelines: Design and optimize CI/CD pipelines using tools like Jenkins, GitHub Actions, GitLab CI/CD, or ArgoCD, along with automated builds, tests (unit, regression), and deployments.
• Monitoring & Logging: Implement monitoring, logging, and alerting solutions using Prometheus, Grafana, the ELK stack, or CloudWatch.
• Security & Compliance: Ensure security best practices in Kubernetes, including RBAC, IAM policies, network policies, and vulnerability scanning.
• Automation & Scripting: Automate operational tasks using Bash, Python, or Go for improved efficiency.
• Performance Optimization: Tune Kubernetes workloads and optimize cost/performance of Amazon EKS clusters.
• Test Automation & Regression Pipelines: Integrate automated regression testing and build sanity checks into pipelines to ensure high-quality releases.
• Security & Resource Optimization: Manage Kubernetes security (RBAC, network policies) and optimize resource usage with Horizontal Pod Autoscalers (HPA) and Vertical Pod Autoscalers (VPA).
• Collaboration: Work closely with development, security, and infrastructure teams to enhance DevOps processes.
Minimum Qualifications
• Bachelor's degree (or above) in Engineering/Computer Science.
• 8+ years of experience in DevOps, cloud, and infrastructure automation in a DevOps engineer role.
• Expertise with Helm charts, Kubernetes Operators, and service mesh (Istio, Linkerd, etc.).
• Strong expertise in Amazon EKS and Kubernetes (design, deployment, and management).
• Expertise in Terraform, Jenkins, and Ansible.
• Expertise with CI/CD tools (Jenkins, GitHub Actions, GitLab CI/CD, ArgoCD, etc.).
• Strong experience with monitoring and logging tools (Prometheus, Grafana, ELK, CloudWatch).
• Proficiency in Bash and Python for automation and scripting.
Posted 1 month ago
7.0 - 10.0 years
11 - 16 Lacs
Mumbai, Hyderabad, Pune
Work from Office
Key Responsibilities:
• Design, build, and maintain CI/CD pipelines for ML model training, validation, and deployment
• Automate and optimize ML workflows, including data ingestion, feature engineering, model training, and monitoring
• Deploy, monitor, and manage LLMs and other ML models in production (on-premises and/or cloud)
• Implement model versioning, reproducibility, and governance best practices
• Collaborate with data scientists, ML engineers, and software engineers to streamline the end-to-end ML lifecycle
• Ensure security, compliance, and scalability of ML/LLM infrastructure
• Troubleshoot and resolve issues related to ML model deployment and serving
• Evaluate and integrate new MLOps/LLMOps tools and technologies
• Mentor junior engineers and contribute to best practices documentation
Required Skills & Qualifications:
• 8+ years of experience in DevOps, with at least 3 years in MLOps/LLMOps
• Strong experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes, Docker)
• Proficient in CI/CD tools (Jenkins, GitHub Actions, GitLab CI, etc.)
• Hands-on experience deploying and managing different types of AI models (e.g., OpenAI, HuggingFace, custom models) used for developing solutions
• Experience with model serving tools such as TGI, vLLM, BentoML, etc.
• Solid scripting and programming skills (Python, Bash, etc.)
• Familiarity with monitoring/logging tools (Prometheus, Grafana, ELK stack)
• Strong understanding of security and compliance in ML environments
Preferred Skills:
• Knowledge of model explainability, drift detection, and model monitoring
• Familiarity with data engineering tools (Spark, Kafka, etc.)
• Knowledge of data privacy, security, and compliance in AI systems
• Strong communication skills to effectively collaborate with various stakeholders
• Critical thinking and problem-solving skills are essential
• Proven ability to lead and manage projects with cross-functional teams
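Of the preferred skills above, drift detection is the most algorithmic: one common, simple approach compares a feature's live distribution against its training baseline, for example via a mean-shift check in baseline standard deviations. A minimal Python sketch (the threshold and data are illustrative, not from any specific MLOps tool):

```python
import statistics

# Simple drift check: flag a feature when the live mean drifts more than
# `z_limit` baseline standard deviations from the training-time mean.
# The threshold and sample data below are illustrative assumptions.
def mean_shift_drift(baseline, live, z_limit=3.0):
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    z = abs(statistics.mean(live) - mu) / sigma
    return z > z_limit

baseline = [10.0, 11.0, 9.0, 10.5, 9.5, 10.2, 9.8, 10.1]  # training-time values
steady = [10.1, 9.9, 10.3]    # close to the training distribution
shifted = [14.0, 15.2, 14.6]  # clearly drifted upward

print(mean_shift_drift(baseline, steady))   # expect False
print(mean_shift_drift(baseline, shifted))  # expect True
```

Production monitors typically use distribution-level tests (e.g., population stability index or Kolmogorov-Smirnov) over sliding windows, but the alert-on-threshold structure is the same.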
Posted 1 month ago
7.0 - 10.0 years
8 - 13 Lacs
Mumbai, Hyderabad, Pune
Work from Office
Key Responsibilities:
- Design, build, and maintain CI/CD pipelines for ML model training, validation, and deployment
- Automate and optimize ML workflows, including data ingestion, feature engineering, model training, and monitoring
- Deploy, monitor, and manage LLMs and other ML models in production (on-premises and/or cloud)
- Implement model versioning, reproducibility, and governance best practices
- Collaborate with data scientists, ML engineers, and software engineers to streamline the end-to-end ML lifecycle
- Ensure security, compliance, and scalability of ML/LLM infrastructure
- Troubleshoot and resolve issues related to ML model deployment and serving
- Evaluate and integrate new MLOps/LLMOps tools and technologies
- Mentor junior engineers and contribute to best-practices documentation
Required Skills & Qualifications:
- 8+ years of experience in DevOps, with at least 3 years in MLOps/LLMOps
- Strong experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes, Docker)
- Proficiency with CI/CD tools (Jenkins, GitHub Actions, GitLab CI, etc.)
- Hands-on experience deploying and managing different types of AI models (e.g., OpenAI, Hugging Face, custom models) used for developing solutions
- Experience with model serving tools such as TGI, vLLM, BentoML, etc.
- Solid scripting and programming skills (Python, Bash, etc.)
- Familiarity with monitoring/logging tools (Prometheus, Grafana, ELK stack)
- Strong understanding of security and compliance in ML environments
Preferred Skills:
- Knowledge of model explainability, drift detection, and model monitoring
- Familiarity with data engineering tools (Spark, Kafka, etc.)
- Knowledge of data privacy, security, and compliance in AI systems
- Strong communication skills for effective collaboration with various stakeholders
- Critical thinking and problem-solving skills
- Proven ability to lead and manage projects with cross-functional teams
Posted 1 month ago
1.0 - 3.0 years
10 - 15 Lacs
Bengaluru
Work from Office
SRE 1 (Cloud Ops)
Locations: Bengaluru & Pune
Experience: 1 to 3 years
Candidates only from B2C product companies
Experience with GCP, Prometheus, Grafana, ELK, New Relic, Pingdom, or PagerDuty; Kubernetes
Experience with CI/CD tools
5-day week, rotational shift
Posted 1 month ago
4.0 - 9.0 years
0 - 1 Lacs
Bengaluru
Hybrid
- Design and implement highly scalable ELK (Elasticsearch, Logstash, and Kibana) stack and ElastiCache solutions
- Grafana: create visualizations and dashboards according to client needs
- Experience with scripting languages such as JavaScript, Python, PowerShell, etc.
- Able to work with APIs, shards, etc. in Elasticsearch
- Architecting data structures using Elasticsearch and ElastiCache
- Query languages and writing complex queries with joins over large amounts of data
- End-to-end low-level design, development, administration, and delivery of ELK-based reporting solutions
- Strong exposure to writing Talend queries; Elastic queries for data analysis
- Creating Elasticsearch index templates; index lifecycle management
- Managing and monitoring Elasticsearch clusters
- Experience with analyzers and shards
- Experience solving performance issues on large sets of data indexes
- Strong expertise in Python scripting
- Strong experience installing and configuring ELK on bare metal and clouds (GCP, AWS & Azure)
- Strong experience using Elasticsearch indices, Elasticsearch APIs, Kibana dashboards, Logstash, and Beats
- Good experience using or creating plugins for ELK, such as authentication and authorization plugins
- Good experience enhancing open-source ELK for custom capabilities
- Experience with provisioning/automation frameworks such as Kubernetes or Docker
- Experience working with JSON
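For context on the index templates and index lifecycle management items above: a composable Elasticsearch index template pairs index patterns with settings (including the name of an ILM policy) and field mappings. The sketch below only builds the request body as a plain dict, the way it would be sent to the `_index_template` API; the field names and policy name are illustrative assumptions, not requirements from the posting.

```python
def build_index_template(pattern: str, ilm_policy: str) -> dict:
    """Build a composable index template body that routes matching
    indices through the given ILM policy and maps common log fields."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "number_of_shards": 1,
                # new indices matching the pattern pick up this lifecycle policy
                "index.lifecycle.name": ilm_policy,
            },
            "mappings": {
                "properties": {
                    "@timestamp": {"type": "date"},
                    "message": {"type": "text"},
                }
            },
        },
    }
```

In practice this body would be PUT to the cluster (e.g., via the official Python client) once per log family, so that daily or rollover indices inherit consistent mappings and retention behavior automatically.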
Posted 1 month ago
5.0 - 6.0 years
12 - 18 Lacs
Gurugram
Hybrid
Job Summary: We are seeking a skilled and proactive Technical Support Engineer (Java & Cloud Applications) to join our dynamic team. The ideal candidate will bring extensive experience in supporting enterprise-grade applications across Java, cloud platforms, and DevOps tools. You will play a critical role in ensuring application stability, tuning performance, troubleshooting complex issues, and working closely with cross-functional teams globally.
Key Responsibilities:
- Provide L2/L3 support for Java/J2EE-based applications, including Spring, Spring Boot, and RESTful services
- Troubleshoot and resolve issues in applications built using React.js, Node.js, Python, and Angular
- Support cloud-hosted platforms on AWS, PaaS, and Heroku environments
- Manage API gateways such as Apigee and Layer7 for smooth service operations
- Perform UNIX and shell scripting for automation and operational tasks
- Monitor and optimize system performance using tools like ELK and New Relic
- Administer and tune Oracle databases; write and optimize SQL queries, stored procedures, and functions
- Handle job scheduling via Control-M, ensuring timely and efficient batch processing
- Manage CI/CD pipelines with tools like GitHub/Bitbucket, Jenkins, and UrbanCode Deploy
- Apply data warehousing principles in troubleshooting and optimizing data-related operations
- Follow ITIL practices for incident, change, and problem management processes
- Collaborate with development, infrastructure, and business teams for smooth delivery and support
Personal Attributes:
- Proactive, self-motivated, and results-driven
- Strong problem-solving and analytical abilities
- Excellent communication and collaboration skills
- Adaptable to cross-cultural and cross-functional teams
- Willing to accommodate occasional UK/Canada meeting timings
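Much of the L2/L3 monitoring and troubleshooting work described above reduces to log triage: identifying which component is producing errors before escalating. A minimal sketch in Python follows; the log format and component names are hypothetical, chosen only to show the shape of the task.

```python
import re
from collections import Counter

# Hypothetical log format:
# "2024-01-01 12:00:00 ERROR payment-service timeout calling gateway"
LOG_LINE = re.compile(r"^\S+ \S+ (?P<level>\w+) (?P<component>\S+) (?P<message>.*)$")

def count_errors_by_component(lines):
    """Tally ERROR-level lines per component to spot the noisiest service."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("level") == "ERROR":
            counts[m.group("component")] += 1
    return counts
```

The same counting is typically done in Kibana or New Relic with a saved query; a small script like this is useful when only raw log files are at hand on a host.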
Posted 1 month ago