209 ELK Jobs - Page 3

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

7.0 - 12.0 years

10 - 17 Lacs

Navi Mumbai

Work from Office

We are looking for a skilled Application Support and Production Support Engineer with 7+ years of experience in the BFSI domain, specializing in Production Support and Application Maintenance. The ideal candidate will have strong expertise in Unix/Linux, Oracle SQL, WebLogic administration basics, and enterprise-scale system monitoring using tools like Dynatrace and ELK. Experience in supporting banking clients and a solid understanding of DC-DR activities and card-based transaction systems are essential.

Core Responsibilities
- Application support and maintenance for a large enterprise B2C application.
- SQL development for data extraction, reporting, and issue resolution for banking applications.
- UNIX/Linux server administration tasks such as file encryption, decryption, log movement, and basic troubleshooting.
- Testing and hardening activities in UAT and production server deployments.
- Monitoring web service requests/responses and application performance using Dynatrace and ELK monitoring tools.
- Hands-on experience with WebLogic deployment, server health checks, and web application troubleshooting.
- Conducting and participating in DC-DR (Disaster Recovery) planning, simulation, and execution activities.
- Engaging in incident management, change management, release management, and problem management following ITIL best practices.
- Working closely with L1/L2/L3 teams and development/testing teams for RCA (Root Cause Analysis) and resolution.

Preferred Skills
- Strong knowledge of SQL (Oracle SQL Developer, Toad) for database queries and reporting.
- Proficient in Unix/Linux scripting and server management.
- Basic administration skills in WebLogic Server deployments and configurations.
- Familiarity with Autosys Scheduler, cron jobs, and FTP/SFTP tools.
- Exposure to monitoring and analysis using Dynatrace and ELK Stack tools.
- Good understanding of ITIL-based processes for support operations.

Desired Skills
- Bachelor's or Master's Degree in Computers or IT.
- Strong background in Production Support for BFSI clients.
- Knowledge of application deployment practices (WAR/JAR deployments) and server performance analysis (AWR Reports).
- Basic understanding of Kafka or other message queuing systems (MQ servers).
- Ability to participate in business continuity and disaster recovery exercises.

Posted 2 weeks ago

5.0 - 7.0 years

7 - 11 Lacs

Jaipur, Bengaluru

Work from Office

In Time Tec is an award-winning IT & software company. In Time Tec offers progressive software development services, enabling its clients to keep their brightest and most valuable talent focused on innovation. In Time Tec has a leadership team averaging 15 years in software/firmware R&D, and 20 years building onshore/offshore R&D teams. We are looking for rare talent to join us. People with a positive mindset and great organizational skills will be drawn to the position. Your capacity to take initiative and solve problems as they emerge, your flexibility, and your honesty will be key factors for your success at In Time Tec.

We're looking for an Interactive Backend Engineer (Python & DevOps) who will be responsible for managing the release pipeline. This person will be involved not just in scripting but also in development, and will directly support the development and content teams that create and publish content on the most trafficked websites. The ideal candidate has worked in a build/release role previously, has strong communication skills, and knows how to handle unexpected scenarios.

Roles and Responsibilities

Backend Engineer, Python & DevOps

Skills:
- Strong programming experience in Python (real development, not just scripting).
- Experience with CI/CD tools like Jenkins.
- Proficient in Git and source control workflows.
- Experience with Docker, Kubernetes, and Linux environments.
- Familiarity with scripting languages like Bash, optionally Groovy or Go.
- Knowledge of web application servers and deployment processes.
- Good understanding of DevOps principles, cloud environments, and automation.

Nice to Have:
- Experience with monitoring/logging tools (e.g., Prometheus, Grafana, ELK stack).
- Exposure to configuration management tools like Ansible.
- Experience in performance tuning and scaling backend systems.

Posted 2 weeks ago

5.0 - 6.0 years

15 - 16 Lacs

Chennai

Work from Office

Job Description: We are looking for a highly skilled DevOps Engineer with strong experience in Red Hat OpenShift Container Platform (v4.x) and related DevOps tools like Argo CD, Jenkins, and Red Hat Data Grid. The ideal candidate will be responsible for automation, managing containerized environments, and ensuring robust CI/CD pipelines across hybrid cloud infrastructure supporting our fintech solutions.

Key Responsibilities:
- OpenShift Platform Engineering: Deploy, manage, and maintain apps on OpenShift v4.x. Manage Operators, Helm charts, and OpenShift GitOps (Argo CD). Handle Red Hat Data Grid deployments. Perform OCP upgrades, patching, and troubleshooting.
- CI/CD & Automation: Implement CI/CD pipelines using Jenkins, Argo CD, GitHub Actions. Ensure seamless code integration and automated deployment.
- Infrastructure as Code (IaC): Automate infrastructure using Terraform, Ansible, CloudFormation. Manage infrastructure on AWS, Azure, or GCP.
- Monitoring & Optimization: Set up observability stacks (Prometheus, Grafana, ELK, Splunk). Troubleshoot and optimize system performance.
- Security & Collaboration: Apply DevSecOps best practices and ensure compliance. Collaborate with development and DevOps teams for solution implementation.

Desired Candidate Profile:

Technical Skills:
- Red Hat OpenShift (v4.x) administration & operations.
- CI/CD tools: Jenkins, Argo CD, GitHub Actions, GitLab CI/CD.
- Kubernetes, Docker, Helm, GitOps.
- Red Hat Data Grid or other in-memory data grids.
- IaC tools: Terraform, Ansible, CloudFormation.
- Monitoring tools: Prometheus, Grafana, ELK, Splunk.
- Scripting: Bash, Python, or Shell.

Soft Skills:
- Excellent analytical and problem-solving skills.
- Strong communication and collaboration abilities.
- Ability to work independently and with customer DevOps teams.

Education: BE / B.Tech / MCA or equivalent in Computer Science or related fields.

Work Location: Chennai

Posted 2 weeks ago

5.0 - 7.0 years

15 - 27 Lacs

Bangalore Rural, Bengaluru

Work from Office

DevOps, Site Reliability Engineering, Cloud platforms, GCP, Infrastructure as Code tools (Terraform, Ansible, CloudFormation), Prometheus, Grafana, ELK stack, Python, Bash, Go, Istio, Linkerd

Posted 2 weeks ago

1.0 - 5.0 years

8 - 15 Lacs

Bengaluru

Work from Office

Junior DevOps Engineer / DevOps Engineer
Location: Bengaluru South, Karnataka, India
Experience: 1.5 - 3 Years
Compensation: 8 - 15 LPA
Employment Type: Full-Time | Work From Office Only

Are you an aspiring DevOps professional ready to work on a transformative platform? Join a purpose-led team building India's most disruptive ecosystem at the intersection of technology, property, and sustainability. This role is ideal for engineers who are eager to learn, automate, and contribute to building reliable, scalable, and secure infrastructure.

Key Responsibilities
- Assist in designing, implementing, and managing CI/CD pipelines using tools like Jenkins or GitLab CI to automate build, test, and deployment processes.
- Support the deployment and management of cloud infrastructure, primarily on AWS, with exposure to Azure or GCP.
- Contribute to infrastructure as code practices using Terraform, CloudFormation, or Ansible.
- Participate in maintaining and operating containerized applications using Docker and Kubernetes.
- Implement and manage monitoring and logging solutions using Grafana, Loki, Prometheus, or ELK stack.
- Collaborate with engineering and QA teams to streamline release pipelines, ensuring high availability and performance.
- Develop basic automation scripts in Python or Bash to optimize and streamline operational tasks.
- Gain exposure to serverless and event-driven architectures under guidance from senior engineers.
- Troubleshoot infrastructure issues and contribute to system security and performance optimization.

Requirements
- 1.5 to 3 years of experience in DevOps, SRE, or related infrastructure roles.
- Solid understanding of cloud environments (AWS preferred; Azure/GCP a plus).
- Basic to intermediate scripting knowledge in Python or Bash.
- Familiarity with CI/CD concepts and tools such as Jenkins, GitLab CI, etc.
- Working knowledge of Docker and introductory experience with Kubernetes.
- Exposure to monitoring and logging stacks (Grafana, Loki, Prometheus, ELK).
- Understanding of infrastructure as code using tools like Terraform or Ansible.
- Familiarity with networking, DNS, firewalls, and system security practices.
- Strong problem-solving skills and a learning mindset.

Preferred Qualifications
- Certifications in AWS, Azure, or GCP.
- Exposure to serverless architectures and event-driven systems.
- Experience with additional monitoring tools or scripting languages.
- Familiarity with geospatial systems, virtual mapping, or sustainability-oriented platforms.
- Passion for eco-conscious technology and impact-driven development.

Why You Should Join
- Contribute to a next-gen PropTech platform promoting sustainable and inclusive land ownership.
- Work closely with senior engineers committed to mentorship and ecosystem building.
- Join a team where your ideas are valued, your skills are sharpened, and your work has real-world impact.
- Be part of a vibrant, office-first culture that encourages innovation, collaboration, and growth.

Posted 2 weeks ago

8.0 - 13.0 years

10 - 20 Lacs

Hyderabad, Bengaluru, Thiruvananthapuram

Work from Office

Job Requirements
Software development and enhancements on applications primarily using Java/JEE and ELK Stack (Elasticsearch, Logstash, and Kibana). The team will be making changes for the core product related to data analysis and reporting.

Key Responsibilities
- Software Enhancement: Engage in application enhancement using Java/J2EE and ELK stack (Elasticsearch, Logstash, and Kibana) to support data analysis and reporting. Understand the current architecture and design of various modules, and identify modules/functions to be modified. Implement solutions focusing on reuse and industry standards at a program, enterprise, or operational scope.
- Leadership: Lead design/development efforts across multiple functions, ensuring adherence to established architecture, design patterns, policies, standards, and best practices. Implement solutions focusing on reuse and industry standards at a program, enterprise, or operational scope.
- Design: Generate detailed design of enhancements and participate in code reviews. Expected to be a self-starter who can implement very complex systems with no supervision.
- Team Working: Work closely with the core development teams, coordinate with them to understand the requirements, and take guidance to complete development tasks. Communicate and work effectively with all team members.

Core Tasks
- Perform software development work on applications.
- Participate in and lead efforts in requirements gathering, estimating, and system analysis.
- Generate system designs, both at high and low levels.
- Participate in code reviews.
- Provide the required support to post-development phases of projects, such as acceptance testing and integration with other software applications.
- Liaise with members of other teams, both internal and external.
- Provide technical leadership.

Work Experience
Application Requirements: What you need to succeed
- Degree in Software Engineering, Computer Science, or an equivalent Engineering degree.
- Substantial experience in development of Warehouse Management Systems.
- 8+ years of experience in Java application development.
- Experience in design and integration of applications across multiple enterprise and third-party software systems.
- Proficiency in Core Java, J2EE, OO design, and Java architectures.
- Experience in Elasticsearch, Logstash, and Kibana.
- Experience in writing unit tests and integration tests.
- Experience in DevOps tools like git, maven, ssh.
- Experience in Agile development practices.
- Strong verbal and written communication skills, and ability to work well across teams.
- Strong organizational skills.
- Ability to work with all levels of management.
- Good to have: Experience in data analysis.

Posted 3 weeks ago

5.0 - 8.0 years

15 - 20 Lacs

Chennai, Bengaluru

Work from Office

Key Responsibilities:
- Design, implementation, and maintenance of technology infrastructure. This includes the software development platform, servers, and applications.
- Design continuous integration and delivery (CI/CD) pipelines to allow teams to collaborate on building, testing, and releasing new features.
- Participate in the day-to-day handling of incidents, problems, and issues related to the Platform Engineering tooling and processes.
- Execute the maintenance schedule for platform engineering tooling and processes, aligning with the company goals and objectives.
- Be part of the team covering the required support hours for tools and services.
- Collaborate with cross-functional teams when issues arise on tooling and processes.
- Enforce best practices for automation, deployment, monitoring, and operations to improve efficiency and reliability.
- Drive innovation and continuous improvement within the platform engineering team, fostering a culture of learning, experimentation, and knowledge sharing.
- Foster a collaborative and inclusive work environment where team members feel empowered to contribute ideas, challenge assumptions, and drive positive change.
- Keep abreast of the latest industry trends and best practices in software delivery.

- Knowledge of modern, end-to-end systems development life cycles.
- 5+ years of experience as a DevOps Engineer.
- Good understanding of CI/CD concepts for the SDLC lifecycle.
- Good understanding of Infrastructure as Code (Terraform Cloud).
- Good understanding of building and maintaining an internal developer platform (IDP).
- Experience of working with containerization (Docker/Kubernetes), with Kubernetes on-premise and/or in the cloud.
- Good understanding of:
  - JFrog Platform
  - GitHub Enterprise
  - TeamCity
  - Octopus Deploy
  - Chef
  - Sonar Cloud
  - ELK
  - Linux OS (Red Hat, CentOS)
  - Identity Management (AD, ADFS, AAD)
  - Storage platforms (Pure Flash Array, Flash Blades, Cohesity, MDS Fabric switches, HP Tape Libraries)
  - Enterprise server virtualization (VMware 6.7/7.0, vCenter, Host profiles, Auto Deployment)
  - Cisco UCS Platforms (Blade and Standalone C Series)
  - HP blade
  - Citrix Netscaler and Cloud Apps
  - Cloud IaaS and PaaS (Azure Compute, Azure VMs, AAD, Exchange Online, SharePoint Online & M365)
  - Application delivery (Citrix, Wide Area Optimizers)
  - Monitoring solutions (AppDynamics, Grafana, ControlUp, SolarWinds, SCOM)
  - .NET, Java, and MSSQL database development
- Excellent written and verbal communication skills.
- Broad familiarity with delivery practices and management methodologies, including ITIL, and proficiency in ITIL tools such as ServiceNow.
- Hands-on experience leveraging Agile methodologies to achieve goals and outcomes, including collaborative work in Agile teams.
- Proven experience in development, infrastructure, IT operations, platform engineering, DevOps, or a related role, with a track record of successfully leading high-performing teams.
- Proven track record of driving innovation and continuous improvement, with a passion for staying current with emerging technologies and industry trends.

Posted 3 weeks ago

8.0 - 12.0 years

35 - 60 Lacs

Pune

Work from Office

About the Role: We are seeking a skilled Site Reliability Engineer (SRE) / DevOps Engineer to join our infrastructure team. In this role, you will design, build, and maintain scalable infrastructure, CI/CD pipelines, and observability systems to ensure high availability, reliability, and security of our services. You will work cross-functionally with development, QA, and security teams to automate operations, reduce toil, and enforce best practices in cloud-native environments.

Key Responsibilities:
- Design, implement, and manage cloud infrastructure (GCP/AWS/Azure) using Infrastructure as Code (Terraform).
- Maintain and improve CI/CD pipelines using tools like CircleCI, GitLab CI, or ArgoCD.
- Ensure high availability and performance of services using Kubernetes (GKE/EKS/AKS) and container orchestration.
- Implement monitoring, logging, and alerting using Prometheus, Grafana, ELK, or similar tools.
- Collaborate with developers to optimize application performance and deployment processes.
- Manage and automate security controls such as IAM, RBAC, network policies, and vulnerability scanning.

Basic Qualifications:
- Strong knowledge of Linux.
- Experience with scripting languages such as Python, Bash, or Go.
- Experience with cloud platforms (GCP preferred, AWS or Azure acceptable).
- Proficient in Kubernetes operations, including Helm, operators, and service meshes.
- Experience with Infrastructure as Code (Terraform).
- Solid experience with CI/CD pipelines (GitLab CI, CircleCI, ArgoCD, or similar).
- Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK, etc.).
- Knowledge of networking concepts (TCP/IP, DNS, Load Balancers, Firewalls).

Preferred Qualifications:
- Experience with advanced networking solutions.
- Familiarity with SRE principles such as SLOs, SLIs, and error budgets.
- Exposure to multi-cluster or hybrid-cloud environments.
- Knowledge of service meshes (Istio).
- Experience participating in incident management and postmortem processes.

Posted 3 weeks ago

3.0 - 5.0 years

7 - 10 Lacs

Kolkata

Hybrid

Intermediate understanding of Docker & Kubernetes. Fundamental understanding of Python & Java. Experience working with Ansible. Good knowledge of shell scripting. Experience working with Linux-based architecture, RDBMS, Spark, Elasticsearch, and NoSQL.

Posted 3 weeks ago

5.0 - 10.0 years

6 - 16 Lacs

Chennai

Work from Office

Roles and responsibilities:

Design & Implementation:
- Understand the customer requirements; architect, design, and implement scalable ELK solutions.
- Develop design documentation (HLD and LLD).
- Install ELK components.
- Configure ELK components as per best practices.

ELK Operations:
- Lead log onboarding activities.
- Configure Logstash, Filebeat, Metricbeat, Elastic Agent, etc., to collect and process data efficiently.
- Configure Elasticsearch components to store various kinds of data efficiently, optimizing performance and ensuring high availability.
- Configure Kibana visualizations as required.
- Configuration management.
- User management activities.
- Build integrations with upstream and downstream applications as necessary.
- Platform troubleshooting activities; work with the OEM to fix product-level issues.
- Continuously document lessons learnt as part of troubleshooting activities.
- Health monitoring.

Preferred Qualifications
- 5+ years of experience deploying and managing large-scale ELK solutions for enterprise customers.
- Experience working in SOC analysis / incident response teams.
- Strong understanding of cybersecurity technologies, protocols, and applications.
- ELK certifications.
- Knowledge of Python scripting, Docker, Kubernetes, and Ansible for runbook automation.

Posted 3 weeks ago

8.0 - 12.0 years

25 - 40 Lacs

Kolkata, Hyderabad, Bengaluru

Hybrid

Job Title: ELK Developer
Experience Required: 8 - 12 Years
Location: Hyderabad, Bangalore (Preferred). Also open to Chennai, Mumbai, Pune, Kolkata, Gurgaon
Work Mode: On-site / Hybrid

Job Summary: We are seeking a highly experienced ELK Developer with a strong background in designing and implementing monitoring, logging, and visualization solutions using the ELK Stack (Elasticsearch, Logstash, Kibana). The ideal candidate should also have hands-on expertise with Linux/Solaris administration, scripting for automation, and performance testing. Additional experience with modern DevOps tools and monitoring platforms like Grafana and Prometheus is a plus.

Primary Responsibilities:
- Design, implement, and maintain solutions using the ELK Stack: Elasticsearch, Logstash, Kibana, and Beats.
- Create dashboards and visualizations in Kibana to support real-time data analysis and operational monitoring.
- Define and apply indexing strategies, configure log forwarding, and manage log parsing with Regex.
- Set up and manage data aggregation, pipeline testing, and performance evaluation.
- Develop and maintain custom rules for alerting, anomaly detection, and reporting.
- Troubleshoot log ingestion, parsing, and query performance issues.
- Automate jobs and notifications through scripts (Bash, PowerShell, Python, etc.).
- Perform Linux/Solaris system administration tasks: monitor services and system health; manage memory and disk usage; schedule jobs, update packages, and maintain uptime.
- Work closely with DevOps, Infrastructure, and Application teams to ensure system integrity and availability.

Must-Have Skills:
- Strong hands-on experience with the ELK Stack (Elasticsearch, Logstash, Kibana).
- Proficient in Regex, SQL, JSON, YAML, XML.
- Deep understanding of indexing, aggregation, and log parsing.
- Experience in AppDynamics and related observability platforms.
- Proven skills in Linux/Solaris system administration.
- Proficiency in scripting (Shell, Python, PowerShell, Bash) for log handling, jobs, and notifications.
- Experience in performance testing and optimization.

Good-to-Have / Secondary Skills:
- Experience with Grafana and Prometheus for metrics and visualization.
- Knowledge of web and middleware components: HTTP server, HAProxy, Keepalived, Tomcat, NGINX.
- Familiarity with DevOps tools: Git, Bitbucket, GitHub, Helm charts, Terraform, JMeter.
- Programming/scripting experience in Perl, Java, JavaScript.
- Hands-on with CI/CD tools: TeamCity, Octopus, Nexus.
- Working knowledge of Agile methodologies and JIRA.

Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Posted 3 weeks ago

10.0 - 13.0 years

35 - 50 Lacs

Chennai

Work from Office

Cognizant Hiring Payments BA!
Location: Chennai, Bangalore, Hyderabad

JD:

Job Summary
- At least 10 years of experience in the BA role, including a couple of years in a BA lead role.
- Good domain knowledge in SWIFT/ISO 20022.
- Payments background and stakeholder management.
- Java microservices and Spring Boot.
- Technical knowledge: Java / Spring Boot, Kafka Streams, REST, JSON, Netflix microservices suite (Zuul, Eureka, Hystrix, etc.), 12 Factor Apps, Oracle, PostgreSQL, Cassandra & ELK.
- Ability to work with geographically dispersed and highly varied stakeholders.

Responsibilities

Strategy
- Develop the strategic direction and roadmap for our flagship payments platform, aligning with Business Strategy, Tech and Ops Strategy, and investment priorities.
- Tap into the latest industry trends and innovative products & solutions to deliver effective and faster product capabilities.
- Support Cash Management Operations, leveraging technology to streamline processes, enhance productivity, reduce risk, and improve controls.

Business
- Work hand in hand with the Payments Business, taking product programs from investment decisions into design, specifications, solutioning, development, implementation, and hand-over to operations, securing support and collaboration from other teams.
- Ensure delivery to business meeting time, cost, and high quality constraints.
- Support respective businesses in growing return on investment, commercialization of capabilities, bid teams, monitoring of usage, improving client experience, enhancing operations, addressing defects, and continuous improvement of systems.
- Foster an ecosystem of innovation and enable business through technology.

Processes
- Responsible for the end-to-end deliveries of the technology portfolio comprising key business product areas such as Payments, Clearing, etc.
- Own technology delivery of projects and programs across global markets that (a) develop/enhance core product capabilities, (b) ensure compliance with regulatory mandates, (c) support operational improvements, process efficiencies, and the zero-touch agenda, and (d) build the payments platform to align with the latest technology and architecture trends, with improved stability and scale.
- Interface with business & technology leaders of other systems for collaborative delivery.

Posted 3 weeks ago

6.0 - 11.0 years

8 - 13 Lacs

Hyderabad

Work from Office

Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster. As one of the fastest-growing SaaS companies in history, we surpassed $2B in revenue in our last fiscal year with extensive growth potential ahead.

At the heart of Veeva are our values: Do the Right Thing, Customer Success, Employee Success, and Speed. We're not just any public company: we made history in 2021 by becoming a public benefit corporation (PBC), legally bound to balancing the interests of customers, employees, society, and investors.

As a Work Anywhere company, we support your flexibility to work from home or in the office, so you can thrive in your ideal environment. Join us in transforming the life sciences industry, committed to making a positive impact on its customers, employees, and communities.

The Role
Do you want to be part of an engineering team that strives to build simple solutions to complex problems? Veeva is looking for a passionate engineering manager for the Vault Automation Platform & Tools team. This is a great opportunity to put your creativity and problem-solving skills to the test. You would be working as part of a team that constantly strives to turn innovative ideas into reality using bleeding-edge technology and a bouquet of programming languages.

What You'll Do
- Be responsible for the timely and quality delivery of projects related to the Automation platform.
- Contribute to the operational excellence of Veeva Hyderabad.
- Manage a team of engineers and ensure successful and timely deliverables.
- Act as the single point of contact for QA management across various global offices of Veeva with respect to tools and framework.
- Ensure communication is quick and timely across time zones.
- Help the engineering team in Hyderabad integrate well into the global team.
- Facilitate collaboration between various automation teams and key stakeholders to ensure effective issue resolution and technical solutions.
- Work closely with the engineers to identify opportunities to simplify and scale up the test automation architecture.
- Support engineering sprints for product releases (planning, grooming, etc.).
- Contribute to the hiring and onboarding efforts of Veeva Hyderabad.
- Collaborate on and contribute to a state-of-the-art automation framework and cloud-based test infrastructure that can operate at scale with 24/7 availability.
- Participate in code review and promote good coding practices.

Requirements
- Experience (at least 2+ years) managing a team of engineers and leads involved in test automation and development projects.
- Total experience of 12+ years.
- Lead by example: be a hands-on, technical leader with effective communication skills.
- Experience in agile and scrum processes, and understanding of the role of a scrum master.
- Work with a team of 5 to 10 members to ensure quality deliverables.
- Experience with KPIs that measure the success of the team and the projects.
- Experience in creating, documenting, and refining the SW engineering process.
- Bachelor's / Master's degree in Computer Science or a related field.
- Relevant experience building tools and/or test automation frameworks.
- Solid programming skills in Java.
- Curious to learn and adapt to a fast-paced environment.
- Excellent written and verbal communication skills.

Nice to Have
Experience with the following tools/technologies: Test Automation: TestNG/Cucumber; Infrastructure: AWS; Reporting: ELK Stack; Orchestration: Jenkins; Build: Maven; Other Tools: GitLab/Jira.

Veeva's headquarters is located in the San Francisco Bay Area, with offices in more than 15 countries around the world.

Veeva is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity or expression, religion, national origin or ancestry, age, disability, marital status, pregnancy, protected veteran status, protected genetic information, political affiliation, or any other characteristics protected by local laws, regulations, or ordinances.

Posted 3 weeks ago

2.0 - 6.0 years

4 - 8 Lacs

Pune

Work from Office

About The Role
The Ad Server and RTB Production Infrastructure is pivotal to ensuring our software applications' reliability, availability, and overall excellence. As an SRE Engineer, you will be responsible for the Ad Server and RTB Production Infrastructure. Your essential duties encompass ensuring the seamless operation and optimal performance of large-scale distributed software applications. Your role revolves around maintaining a robust and high-performing environment, contributing to the reliability of our services, and innovating solutions to guarantee 24/7 availability. By leveraging your technical expertise and dedication, you contribute to maintaining a seamless experience for our users while upholding the highest standards of operational excellence. Your specific responsibilities include:

What You'll Do

Operational Support
- Be a primary point of contact for operational support of multiple large-scale distributed software applications in the Ad Server environment.
- Monitor availability of applications, promptly detect anomalies, analyze the impact, debug the problems in production, and follow up for the resolution by working closely with the engineering team.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Diligently work with the engineering team to expedite the resolution of incidents and ensure a swift return to normal operations.
- Be innovative in building dashboards, adding metrics, writing automation scripts to reduce operational toil, and streamlining processes to enhance system reliability and stability.
- Design and construct software and systems to effectively manage the Ad Serving platform, its underlying infrastructure, and applications.

On Call Availability and Support
- Work in shifts to provide continuous on-call support for the production systems and resolve issues on your own by using predefined handbooks.
- Show a sense of urgency for high-priority issues and arrange war rooms to resolve the problems.
- Provide timely updates for high-priority issues and do handovers when a problem needs to be worked on 24x7.
- Conduct post-incident reviews to identify root causes, recommend preventive measures, and contribute to a culture of learning and improvement.

We'd Love for You to Have
- Three-plus years of experience in software development.
- Ability to program using programming languages like C or C++, and scripting languages like Shell or Python.
- Good to have: prior experience in technical engineering.
- A proactive approach to identifying problems, performance bottlenecks, and areas of improvement.
- Must know: networking, database (MySQL), and Linux system concepts.
- Debugging and analyzing core dumps.
- Hands-on experience with monitoring and observability tools like Grafana, Nagios, Influx, ELK, etc.
- Familiarity with orchestration tools like Docker and Grafana and incident management systems like Zenduty.
- Excellent communication and collaboration skills, with the ability to work effectively across teams.
- Self-motivated and positive mindset to examine any incidents.
- Excellent interpersonal, written, and verbal communication skills.
- A bachelor's degree in engineering (CS / IT) or an equivalent degree from well-known institutes / universities.

Additional Information

Return to Office: PubMatic employees throughout the globe have returned to our offices via a hybrid work schedule (3 days "in office" and 2 days "working remotely") that is intended to maximize collaboration, innovation, and productivity among teams and across functions.

Benefits: Our benefits package includes the best of what leading organizations provide, such as paternity/maternity leave, healthcare insurance, and broadband reimbursement. As well, when we're back in the office, we all benefit from a kitchen loaded with healthy snacks and drinks, catered lunches, and much more!

Diversity and Inclusion: PubMatic is proud to be an equal opportunity employer; we don't just value diversity, we promote and celebrate it. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

About PubMatic
PubMatic is one of the world's leading scaled digital advertising platforms, offering more transparent advertising solutions to publishers, media buyers, commerce companies, and data owners, allowing them to harness the power and potential of the open internet to drive better business outcomes. Founded in 2006 with the vision that data-driven decisioning would be the future of digital advertising, we enable content creators to run a more profitable advertising business, which in turn allows them to invest back into the multi-screen and multi-format content that consumers demand.

Posted 3 weeks ago

4.0 - 9.0 years

3 - 8 Lacs

Noida, Gurugram, Delhi / NCR

Work from Office

Role & responsibilities

Site Reliability Engineer

Requirements: We are seeking a proactive and technically strong Site Reliability Engineer (SRE) to ensure the stability, performance, and scalability of our Data Engineering Platform. You will work on cutting-edge technologies including Cloudera Hadoop, Spark, Airflow, NiFi, and Kubernetes, ensuring high availability and driving automation to support massive-scale data workloads, especially in the telecom domain.

Key Responsibilities
• Ensure platform uptime and application health as per SLOs/KPIs
• Monitor infrastructure and applications using ELK, Prometheus, Zabbix, etc.
• Debug and resolve complex production issues, performing root cause analysis
• Automate routine tasks and implement self-healing systems
• Design and maintain dashboards, alerts, and operational playbooks
• Participate in incident management, problem resolution, and RCA documentation
• Own and update SOPs for repeatable processes
• Collaborate with L3 and Product teams for deeper issue resolution
• Support and guide L1 operations team
• Conduct periodic system maintenance and performance tuning
• Respond to user data requests and ensure timely resolution
• Address and mitigate security vulnerabilities and compliance issues

Technical Skillset
• Hands-on with Spark, Hive, Cloudera Hadoop, Kafka, Ranger
• Strong Linux fundamentals and scripting (Python, Shell)
• Experience with Apache NiFi, Airflow, Yarn, and Zookeeper
• Proficient in monitoring and observability tools: ELK Stack, Prometheus, Loki
• Working knowledge of Kubernetes, Docker, Jenkins CI/CD pipelines
• Strong SQL skills (Oracle/Exadata preferred)
• Familiarity with DataHub, DataMesh, and security best practices is a plus
• Strong problem-solving and debugging mindset
• Ability to work under pressure in a fast-paced environment
• Excellent communication and collaboration skills
• Ownership, customer orientation, and a bias for action

Preferred candidate profile: Immediate Joiner

Posted 3 weeks ago

3.0 - 5.0 years

15 - 17 Lacs

Bengaluru

Work from Office

About the Role: Own the deployment, scaling and hardening of our Kubernetes-based infrastructure. Automate end-to-end provisioning, ensure security and high availability, and troubleshoot production incidents.

Key Responsibilities
Kubernetes: Deploy, manage & optimize clusters (on-prem, EKS/GKE/AKS)
IaC & GitOps: Automate with Terraform, Helm charts & Argo CD (or similar)
CI/CD: Build/maintain pipelines (Jenkins, GitHub Actions, etc.)
Monitoring: Implement Prometheus, Grafana & ELK for metrics, logs & alerts
Troubleshooting: Diagnose container networking, storage & performance issues
Security: Enforce RBAC, network policies & image-scanning best practices
DR & Optimization: Define backup/restore strategies and cost-control measures
Collaboration: Partner with dev teams on containerization and CI/CD workflows

Required Qualifications
3-5 yrs in infrastructure, SRE or DevOps roles
Hands-on Kubernetes (cluster lifecycle, Helm, CRDs)
Linux administration & Bash scripting; networking tools (ip, netstat, tcpdump)
IaC with Terraform/Ansible; deep Docker knowledge
Monitoring with Prometheus/Grafana & ELK
Automation scripting in Bash, Python or Go; Git proficiency; production debugging

Preferred Skills
Managed K8s services (EKS/GKE/AKS)
Advanced IaC/GitOps (Argo CD, Terraform, Helm)
Service mesh (Istio, Linkerd)
Container security (Trivy, Clair)
Custom tooling via Bash/Python automation

Posted 3 weeks ago

4 - 8 years

16 - 25 Lacs

Bengaluru

Work from Office

Position: Java / Spring Boot with Python. Location: Bangalore (Onsite). Key Skills: Core Java, Spring Boot, Elasticsearch and Python. Good experience with Core Java/Python. Experience with any web framework like Spring Boot/FastAPI. Experience indexing and querying data in Elasticsearch from a variety of data sources. Experience writing data discovery, analytics and visualization applications using Elasticsearch. Experience implementing search solutions using Elasticsearch. Experience implementing AI/ML use cases. Experience migrating from Solr to Elasticsearch preferred. Basic knowledge of big data systems like Databricks and Snowflake will be an added advantage. Self-motivated, fast learner, individual contributor preferred.

Posted 1 month ago

12 - 19 years

37 - 60 Lacs

Chennai, Bengaluru, Mumbai (All Areas)

Hybrid

Location: NCR, Mumbai, Bangalore, Chennai

Job Description: SRE Architect

The SRE Architect will play a critical role in designing and implementing observable, scalable, reliable, and resilient systems and applications that ensure the highest levels of availability and performance for the applications and services. This role requires a deep understanding of software engineering, system architecture, and operations, along with a passion to automate repetitive tasks with GenAI tools and scripts.

Key Responsibilities
System Design and Architecture: Lead the design and architecture of scalable and reliable systems that meet the needs of our growing user base and business requirements.
Automation and Tooling: Develop and maintain automation tools and frameworks that streamline operations and improve system reliability.
Monitoring and Observability: Implement and enhance monitoring, logging, and alerting systems to ensure proactive detection and resolution of issues.
Capacity Planning: Conduct capacity planning and performance tuning to ensure systems can handle current and future demands.
Incident Management: Lead incident response efforts, perform root cause analysis, and implement corrective actions to prevent recurrence.
Collaboration and Mentorship: Work closely with software engineers, DevOps, and other stakeholders to promote best practices in reliability engineering and provide mentorship to junior team members.
Continuous Improvement: Identify areas for improvement in existing systems and processes and drive initiatives to enhance system reliability and performance.

Skillset
Experience: Overall 14 years of experience, with a minimum of 7 years of experience in site reliability engineering, DevOps, or a related field, and a proven track record of designing and implementing reliable systems at scale.
Technical Skills: Strong programming skills in languages such as Python, Go, or Java/.NET. In-depth knowledge of cloud platforms (AWS, GCP, Azure) and container orchestration (Kubernetes, Docker). Experience with infrastructure as code (Terraform, Ansible, Puppet). Proficiency in monitoring and observability tools (Prometheus, Grafana, Splunk, ELK stack). Solid understanding of networking, security, and system performance tuning.
Soft Skills: Strong problem-solving and analytical skills. Excellent communication and collaboration abilities. Ability to work in a fast-paced environment and manage multiple priorities. Passion for continuous learning and staying up to date with industry trends and technologies.

Preferred Skillset: Experience with chaos engineering and resilience testing. Familiarity with service mesh architectures (Istio, Linkerd). Certifications in cloud platforms (Azure Certified Architect, AWS Certified Architect, Google Cloud Professional Architect, etc.).

Location - Chennai/Bangalore/Hyderabad/Mumbai/Pune/Kolkata/Delhi/Noida

Posted 1 month ago

5 - 10 years

13 - 19 Lacs

Chennai, Bengaluru, Hyderabad

Hybrid

Work Package Description: To onboard and integrate new data sources into the chosen SIEM platform, collaborating on design, delivery, and onboarding, with documentation produced as required.

Deliverables: Lead the onboarding process of new data sources into the SIEM platform, ensuring proper data normalization and correlation. Continuously improve SIEM performance, efficiency, and scalability. Maintain detailed documentation of SIEM configurations, onboarding procedures, and incident response playbooks. Collaborate with cross-functional teams to identify security requirements and integrate new security technologies into the SIEM. Stay informed about emerging threats, vulnerabilities, and security best practices, and incorporate this knowledge into SIEM operations. Ensure that SIEM configurations and operations comply with relevant industry regulations and standards. Be accountable for the implementation and continuous improvement of the operational monitoring framework. Be accountable for the integration of platforms into the Elastic Stack infrastructure, following industry best practices. Support the evaluation of new design methods and technologies to protect against existing and emerging security threats. Engage with Security Architecture and Security Design teams to define and develop automated capabilities. Work cooperatively within the Security Operations Centre and other Cyber Security teams to establish and maintain a strong and supportive relationship with customers.

Reporting Requirements: Status and project deliverable updates to be provided in a single global monthly report. Please see list of requirements for Onboarding Engineer.

Good to have: Cribl Edge/Stream

Strong preference: Logstash Parsing/Grok Regex Azure/Log Analytics/Log

Posted 2 months ago

4 - 5 years

9 - 15 Lacs

Pune, Mumbai (All Areas)

Work from Office

Elasticsearch (APIs, Queries, Index, Shards); Logstash pipelines & connectors; Kibana components & layouts

Posted 2 months ago

5 - 9 years

16 - 20 Lacs

Pune

Work from Office

The Site Reliability Engineer will support and maintain various Cloud Infrastructure Technology Tools in our hosted production/DR environments. He/she will be the subject matter expert for specific tool(s) or monitoring solution(s) and will be responsible for testing, verifying and implementing upgrades, patches and implementations. He/she will also partner with other service functions to investigate and/or improve monitoring solutions. May mentor one or more tools team members or provide training to other cross-functional teams as required. May motivate, develop, and manage performance of individuals and teams while on shift. May be assigned to produce regular and ad hoc management reports in a timely manner.

Proficient in Splunk/ELK and Datadog. Experience with observability tools such as Prometheus/InfluxDB and Grafana. Possesses strong knowledge of at least one scripting language such as Python, Bash, PowerShell or any other relevant language. Design, develop, and maintain observability tools and infrastructure. Collaborate with other teams to ensure observability best practices are followed. Develop and maintain dashboards and alerts for monitoring system health. Troubleshoot and resolve issues related to observability tools and infrastructure.

Bachelor's degree in Information Systems, Computer Science or a related discipline, with 5-8 years of relevant experience. Experience with enterprise software implementations for large-scale organizations. Extensive knowledge of the technology trends prevalent in the market, such as SaaS, Cloud, Hosting Services and Application Management Service. Monitoring tools such as Grafana, Prometheus, Datadog. Experience in deployment of application and infrastructure clusters within a public cloud environment utilizing a Cloud Management Platform. Professional and positive, with outstanding customer-facing practices. Can-do attitude, willing to go the extra mile. Consistently follows up and follows through on delegated tasks and actions.

Posted 2 months ago

10 - 14 years

8 - 12 Lacs

Pune

Work from Office

Site Reliability Engineers at UKG are team members who have a breadth of knowledge encompassing all aspects of service delivery. They develop software solutions to enhance, harden and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering and auto-remediation. Site Reliability Engineers must have a passion for learning and evolving with current technology trends. They strive to innovate and are relentless in their pursuit of a flawless customer experience. They have an automate-everything mindset, helping us bring value to our customers by deploying services with incredible speed, consistency and availability.

Primary/Essential Duties and Key Responsibilities: Proficient in Splunk/ELK and Datadog. Experience with observability tools such as Prometheus/InfluxDB and Grafana. Possesses strong knowledge of at least one scripting language such as Python, Bash, PowerShell or any other relevant language. Design, develop, and maintain observability tools and infrastructure. Collaborate with other teams to ensure observability best practices are followed. Develop and maintain dashboards and alerts for monitoring system health. Troubleshoot and resolve issues related to observability tools and infrastructure. Engage in and improve the lifecycle of services from conception to EOL, including system design consulting and capacity planning. Define and implement standards and best practices related to system architecture, service delivery, metrics and the automation of operational tasks. Support services, product and engineering teams by providing common tooling and frameworks to deliver increased availability and improved incident response. Improve system performance, application delivery and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis. Collaborate closely with engineering professionals within the organization to deliver reliable services. Identify and eliminate operational toil by treating operational challenges as a software engineering problem. Actively participate in incident response, including on-call responsibilities. Partner with stakeholders to influence and help drive the best possible technical and business outcomes. Guide junior team members and serve as a champion for Site Reliability Engineering.

Engineering degree, or a related technical discipline, and 10+ years of experience in SRE. Experience coding in higher-level languages (e.g., Python, JavaScript, C++, or Java). Knowledge of cloud-based applications and containerization technologies. Demonstrated understanding of best practices in metric generation and collection, log aggregation pipelines, time-series databases, and distributed tracing. Ability to analyze the technology and engineering practices currently used within the company and develop steps and processes to improve and expand upon them. Working experience with industry-standard tools like Terraform and Ansible.

Experience, Education, Certification, License and Training: Must have hands-on experience working within Engineering or Cloud. Experience with public cloud platforms (e.g. GCP, AWS, Azure). Experience in configuration and maintenance of applications and systems infrastructure. Experience with distributed system design and architecture. Experience building and managing CI/CD pipelines.

Posted 2 months ago

3 - 5 years

5 - 10 Lacs

Hyderabad

Remote

Job Summary: We are seeking a skilled DevOps Engineer with 3-5+ years of experience in CI/CD, cloud infrastructure, automation, and SQL database management. The ideal candidate should have expertise in DevOps practices, cloud platforms, and database optimization.

Key Responsibilities:

DevOps Responsibilities: Develop, implement, and maintain CI/CD pipelines for seamless deployment. Automate infrastructure provisioning using Terraform, Ansible, or similar tools. Manage and monitor cloud-based environments (AWS, Azure, or GCP). Implement containerization using Docker and Kubernetes. Optimize logging, monitoring, and alerting using tools like Prometheus, Grafana, or ELK Stack. Collaborate with development teams to improve deployment efficiency and system reliability.

SQL & Database Responsibilities: Design, maintain, and optimize SQL databases for performance and scalability. Write, troubleshoot, and optimize complex SQL queries and stored procedures. Ensure database security, backups, and disaster recovery strategies. Monitor database performance and resolve performance bottlenecks.

Qualifications & Skills:
Experience: 3-5+ years in DevOps and SQL database management.
Cloud Platforms: AWS, Azure, or GCP.
Automation & CI/CD: Jenkins, GitHub Actions, GitLab CI/CD, or similar tools.
Configuration Management: Ansible, Terraform, or Chef.
Containerization: Docker & Kubernetes.
Database Expertise: MySQL, PostgreSQL, SQL Server, or similar databases.
Monitoring & Logging: Prometheus, Grafana, ELK Stack, or CloudWatch.
Scripting: Python, Bash, or PowerShell for automation.
Security & Compliance: Understanding of best practices in cloud security & database access control.

Preferred Qualifications: Certification in AWS, Azure, or Google Cloud is a plus. Experience in NoSQL databases (MongoDB, Redis) is an added advantage. Knowledge of high-availability architectures and disaster recovery strategies.

Work Mode: Remote

Posted 2 months ago

5 - 10 years

10 - 18 Lacs

Pune, Mumbai (All Areas)

Work from Office

Role & responsibilities: Minimum of 5 years of experience in software development and project management roles. Strong experience in implementing and managing DevOps practices, including 3 years in a Project Manager role. Proficiency with Dynatrace or the ELK stack (Elasticsearch, Logstash, Kibana) and APM tools. In-depth knowledge of system administration, Linux operating systems, and networking. Experience with cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes). Hands-on experience with scripting languages (e.g., Bash, Python, Perl) and configuration management tools (e.g., Ansible, Puppet, Chef). Familiarity with cloud platforms such as AWS, Azure, or Google Cloud Platform. Excellent communication and interpersonal skills. Strong problem-solving and analytical thinking abilities. Experience with agile project management methodologies (e.g., Scrum, Kanban, PMP, PRINCE2). Experience with other monitoring and log management tools (e.g., Grafana, Prometheus, Splunk). Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation.

Posted 2 months ago

8 - 13 years

45 - 50 Lacs

Chennai, Pune, Delhi

Work from Office

While we build our expertise in the latest cloud SW technologies, we are even more focused on building our understanding and awareness of product aspects like performance/footprint, ease of use, robustness, security and the like. We are pragmatic in our approach: we believe that technologies should, in the end, subserve the cause of building great products. We are looking for passionate Developers/Architects who are good at finding solutions in innovative and effective ways, who drive code refactoring and improvements for these products in functional and non-functional areas, and who are accountable for code quality.

You have:
Minimum 8+ years of relevant R&D development experience on Telco-grade products.
Expert Core Java or C++ programmer.
Strong analytical and debugging skills.
Good hands-on experience in Linux, Kubernetes, VMWare, OpenStack and working with CSF Assets Blueprints like BVNF/BCMT/BELK.
Hands-on experience in designing and architecting solutions.
Hands-on experience in the Signalling portfolio, especially Policy, PCRF, Diameter protocols, etc.
Experience with Cloud Native, Microservices, Containers and Virtualization technologies like Docker/Container/Pod, HTTP/2, JSON, Kubernetes (K8s), OAuth, Helm, Envoy, Consul, Redis, gRPC and open source integration.
Experience with Container Management, Component Life Cycle Management, Elastic Stack, Logstash, ETCD, KeyCloak, Kafka Messaging.
Experience in designing and developing applications using Core Java.
Experience with Linux, Linux Containers, Linux Namespaces, Linux CGroups and large-scale production systems.
Experience with open source PaaS environments such as OpenShift and Kubernetes.
Deep understanding of 4G/5G, Signalling, 3GPP, RFCs, and hyperscale cloud capabilities: Azure, AWS, GCP, OCP, RedHat.

It would be nice if you also had:
Innovative, creative and built-to-scale mindset.
Inspirational leadership.
Influential communication and networking.
Obsessively plans and aligns.
Resource utilization and efficiency.
Customer focused.
Ability to drive engagement and collaboration with key Services Practices and Business partners.
Balanced under stress, fearless of ambiguity, adaptable.
Ambitious to keep building and adapting for the future while delivering.
Experience in real-time, high-performance, multi-threaded system programming, fault-tolerant systems, HA concepts and knowledge of distributed architecture.

Lead and perform development activities for high-complexity features. Lead technically and support a small team / multiple features in the completion of a project/stream. Lead technical discussions with peers about enhancements/improvements in own area(s) of expertise. Create parts of the architecture (small/basic) with a focus on performance and scale. Drive non-functional requirements within the team. Effectively handle complex customer issues. Improve the code base with measurable outcomes in product behavior. Own complex features and ensure delivery completion with quality. Willing to work in a lean and extremely agile environment / start-up work culture to achieve stiff and challenging targets. Responsible for requirement analysis, component design based on cloud native principles, and leading development. Ownership of SW/HW architecture at system component level. Work independently, deploying, testing and troubleshooting cloud native applications. Lead the end-to-end development of Features and EPICs. Communicate competence development needs to the line organization. Should have good experience with cloud native architecture, cloud security and cloud patterns. Strong skills in containerization using Docker and Kubernetes. Familiarity with design patterns, domain-driven design, component-based architecture, and evolutionary architecture.

Posted 2 months ago

Featured Companies