
1087 Monitoring Tools Jobs - Page 8

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

2.0 - 4.0 years

4 - 6 Lacs

pune

Work from Office

The Prompt Engineer optimizes prompts to generative AI models across NiCE's Illuminate applications. As part of the Illuminate Research team, the Prompt Engineer works with several groups in the business to help our applications deliver the highest quality customer experience. The Prompt Engineer partners with global development teams to help diagnose and resolve prompt-based issues. This includes helping to define and execute tests for LLM-based systems that are difficult to evaluate with traditional test automation tools. The Prompt Engineer also helps educate the development teams on advances in prompt engineering and helps update production prompts to match evolving industry best practices.

How will you make an impact?
- Regularly review production metrics and specific problem cases to find opportunities for improvement.
- Help diagnose and resolve issues with production prompts in English.
- Refine prompts to generative AI systems to achieve customer goals.
- Collect and present quantitative analysis on solution success.
- Work with application developers to implement new production monitoring tools and metrics.
- Work with architects and Product Managers to implement prompts to support new features.
- Meet regularly with teams working in United States Mountain and Pacific time zones (UTC-7:00 and UTC-8:00).
- Review new prompts and prompt changes with Machine Learning Engineers.
- Consult with Machine Learning Engineers for more challenging problems.
- Stay informed about new advances in prompt engineering.

Have you got what it takes?
- Fluent in written and spoken English.
- BS in a technology-related field such as computer science, business intelligence/analytics, or finance.
- 2-4 years of work experience in a technology-related industry or position.
- Familiarity with best practices in prompt engineering, including differences in prompts between major LLM vendors.
- Ability to develop and maintain good working relationships with cross-functional teams.
- Ability to clearly communicate and present to internal and external stakeholders.
- Experience with Python and at least one web app framework for prototyping, e.g., Streamlit or Flask.

You will have an advantage if you also have:
- Basic AWS resource management, including microservice deployment.
- Containerization via Docker.
- Experience with both standard and AI-based testing frameworks such as PyTest and DeepEval.
- Exposure to generative AI application frameworks like LangChain, LlamaIndex, and Griptape.

Posted 1 week ago

6.0 - 8.0 years

12 - 16 Lacs

bengaluru

Work from Office

- Design, implement, and manage scalable and highly available cloud infrastructure on AWS or GCP.
- Containerize applications using Docker, and manage orchestration with Kubernetes.
- Collaborate with developers and QA teams to integrate CI/CD pipelines and automate deployment processes.
- Ensure system reliability, uptime, and performance by leveraging industry-leading monitoring tools such as Grafana, Dynatrace, etc.
- Troubleshoot system failures, conduct root cause analysis, and provide long-term solutions to prevent recurrence.
- Script and automate operational tasks using Python or Java to improve system efficiency.
- Maintain documentation of system architecture, procedures, and configurations.
- Participate in incident response and on-call support rotation if required.

Required Skills & Qualifications:
- Minimum 5 years of hands-on experience in a DevOps/SRE role.
- Strong expertise in AWS or Google Cloud Platform (GCP).
- Deep understanding and practical experience with Docker and Kubernetes in production environments.
- Proficient in Java or Python for scripting, automation, and integrations.
- Experience with monitoring tools such as Grafana, Dynatrace, Prometheus, etc.
- Strong problem-solving skills and ability to work in a fast-paced environment.
- Excellent communication and documentation skills.

Preferred Attributes:
- Prior experience in large-scale enterprise systems.
- Ability to work independently and take ownership of DevOps processes.
- Exposure to Agile/Scrum methodologies.

Location: Hyderabad, Bangalore, Trivandrum, Pune

Posted 1 week ago

5.0 - 10.0 years

7 - 12 Lacs

bengaluru

Work from Office

Educational Requirements: MCA, MTech, Master of Business Administration, Bachelor of Engineering, BCA, BTech

Service Line: Cloud & Infrastructure Services

Responsibilities:
As a Tools SME in SolarWinds, Splunk, Dynatrace, or DevOps tools, you will work on the design, setup, and configuration of observability platforms, covering correlation, anomaly detection, visualization and dashboards, AIOps, DevOps, and tool integration:
- Collaborate with DevOps architects, development teams, and operations teams to understand their tool requirements and identify opportunities for optimizing the DevOps toolchain.
- Evaluate and recommend new tools and technologies that can enhance our DevOps capabilities, considering factors like cost, integration, and local support.
- Lead the implementation, configuration, and integration of various DevOps tools, including CI/CD platforms (e.g., Jenkins, GitLab CI, Azure DevOps), infrastructure-as-code (IaC) tools (e.g., Terraform, Ansible), containerization and orchestration tools (e.g., Docker, Kubernetes), monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack), and testing frameworks.
- Establish standards and best practices for the usage and management of the DevOps toolset.
- Ensure the availability, performance, and stability of the DevOps toolchain.
- Perform regular maintenance tasks, including upgrades, patching, and backups of the DevOps tools.
- Provide technical support and troubleshooting assistance to development and operations teams regarding the usage of the DevOps tools.
- Monitor the health and performance of the toolset and implement proactive measures to prevent issues.
- Design and implement integrations between different tools in the DevOps pipeline to create seamless and automated workflows.
- Develop automation scripts and utilities to streamline tool provisioning, configuration, and management within the environment.
- Work with development teams to integrate testing and security tools into the CI/CD pipeline.

Additional Responsibilities:
Besides candidates' professional qualifications, we place great importance on their personality profile. This includes:
- High analytical skills
- A high degree of initiative and flexibility
- High customer orientation
- High quality awareness
- Excellent verbal and written communication skills

Technical and Professional Requirements:
- 6+ years of experience in SolarWinds, Splunk, Dynatrace, or a DevOps toolset
- Proven experience with several key DevOps tools, including CI/CD platforms (e.g., Jenkins, GitLab CI, Azure DevOps), IaC tools (e.g., Terraform, Ansible), containerization (Docker, Kubernetes), and monitoring tools (e.g., Prometheus, Grafana, ELK stack)
- Good knowledge of the Linux environment
- Good working knowledge of YAML and Python
- Good working knowledge of event correlation and observability
- Good communication skills
- Good analytical and problem-solving skills

Preferred Skills:
- Technology->Infra_ToolAdministration-Others->Solarwinds
- Technology->Infra_ToolAdministration-Others->Splunk Admin
- Technology->DevOps->DevOps Architecture Consultancy
- Technology->Dynatrace->Digital Performance Management Tool

Posted 1 week ago

5.0 - 9.0 years

15 - 20 Lacs

noida, bengaluru

Hybrid

Job Summary:
We are seeking a highly skilled and certified professional with deep expertise in Zabbix, ELK stack, modern monitoring & observability tools, and DevOps practices. The ideal candidate will have strong experience in infrastructure automation, third-party integrations, multi-database environments, and cloud-native deployments. This role demands a proactive problem-solver with a passion for performance, reliability, and scalable solutions in hybrid and cloud environments.

Key Responsibilities:
- Design, implement, and maintain robust monitoring solutions using Zabbix, Grafana, and the ELK stack (Elasticsearch, Logstash, Kibana).
- Develop automation scripts and CI/CD pipelines using Python, Golang, Jenkins, and GitOps practices.
- Collaborate with development and operations teams to implement observability standards across environments (on-prem & cloud).
- Maintain, optimize, and troubleshoot integrations with BMC Remedy, ServiceNow, or other ticketing systems.
- Manage and support diverse database environments: MariaDB, MongoDB, MySQL, PostgreSQL, Oracle.
- Provide support for both Linux and Windows based systems in production.
- Design and support microservice deployment workflows using GCP, GKE, and container orchestration tools.
- Work closely with platform and application teams to ensure end-to-end visibility and service reliability.
- Drive best practices for infrastructure-as-code, environment provisioning, and configuration management.
- Lead efforts on third-party integrations to enhance platform functionality and automation.

Required Skills & Experience:
- Proven experience with Zabbix, ELK, and Grafana for real-time monitoring and logging.
- Hands-on experience with DevOps tools and CI/CD pipelines (Jenkins, Git, etc.).
- Strong scripting and automation experience in Python, Shell, Golang, or Java Spring Boot.
- Solid understanding of GCP services and GKE orchestration; GCP certification is highly preferred.
- In-depth experience working with databases like MongoDB, MySQL, MariaDB, PostgreSQL, Oracle.
- Working knowledge of BMC ITSM, Remedy, or similar ITSM/ticketing platforms.
- Proficiency in administering Linux and Windows environments.
- Experience with infrastructure monitoring, alerting, log management, and performance tuning.
- Familiarity with third-party integrations (REST APIs, Webhooks, etc.) and system interoperability.
- Strong understanding of cloud-native architectures, service reliability, and infrastructure scaling.

Preferred Qualifications:
- GCP Professional or GKE certification.
- Experience with Terraform, Ansible, or similar IaC tools.
- Familiarity with container security and compliance frameworks.
- Exposure to enterprise-grade SRE/Observability practices.
- GCP Professional Cloud Architect / GKE Certified.

Experience Level: 5+ Years

Posted 1 week ago

3.0 - 5.0 years

7 - 15 Lacs

navi mumbai

Work from Office

Key Responsibilities:
- Understanding of Kubernetes Fundamentals: Basic knowledge of Kubernetes concepts like pods, deployments, services, namespaces, and kubectl commands.
- Linux Fundamentals: Proficiency in Linux command-line operations.
- Scripting: Basic scripting skills (e.g., PowerShell, Bash) are helpful for automation.
- System Maintenance: Performing routine tasks like system updates, patching, and security checks.
- Escalation: Identifying and escalating complex problems to higher-level engineers.
- Monitoring and Alerting: Monitoring Kubernetes cluster health, resource utilization, and application performance, and responding to alerts. Familiarity with system monitoring tools. Keeping an eye on system performance, resource usage, and security logs.
- Initial Troubleshooting: Performing initial diagnosis and troubleshooting of common issues related to pods, deployments, services, and other Kubernetes resources. Strong analytical and problem-solving skills. Troubleshooting a Linux server experiencing high CPU usage. Monitoring server disk space usage and taking corrective action. Diagnosing and resolving hardware and software issues.
- Incident Management: Logging and tracking incidents, escalating complex issues to higher-level support (L2/L3) when necessary, and communicating with affected users.
- Basic Operations: Performing routine tasks such as checking logs, restarting pods, and verifying resource status.
- Documentation: Maintaining and updating documentation for common issues and resolution steps.
- Collaboration: Working with L2/L3 teams to resolve issues and contribute to continuous improvement.
- Problem-Solving Skills: Ability to identify and diagnose basic issues and follow established troubleshooting procedures.
- Communication Skills: Clear and effective communication for documenting issues and interacting with team members.

Relevant Experience: Entry-level experience in IT support, operations, or a related field, preferably with some exposure to cloud environments.

Specific Skills and Technologies:
- Operating Systems: Proficiency in both Windows and Linux (e.g., RHEL, CentOS).
- Networking: Basic understanding of networking concepts like TCP/IP and DNS.
- Hardware: Familiarity with server hardware components and troubleshooting.

Posted 1 week ago

3.0 - 7.0 years

0 Lacs

indore, madhya pradesh

On-site

As a Technical Support Engineer - NOC, your primary responsibility will be to monitor infrastructure and application alerts in ME Monitoring tools, as well as monitor mail alerts. You will be required to log incidents, run bridge calls, and execute end-of-day (EOD) jobs for the Bank. Your qualifications should include holding any degree and certifications in ITIL Foundation, AZ-900, MCSE, and CCNA. The ideal candidate for this role should possess 3-5 years of experience in Infrastructure Monitoring and have L1 knowledge in the IT infrastructure of a Bank. Your key responsibilities will include hands-on experience with Monitoring tools such as Manage Engine APM & OPM, as well as a good understanding of ITIL processes and ITSM Tools for managing incidents, changes, service requests, and work orders. You should have practical experience in a banking environment, particularly in EOD and SOD (Finacle Preferred) operations. Familiarity with AS400 operation (at least basic) and Mimix knowledge would be considered an added advantage. Additionally, fluency in written and verbal English communication is essential. You will be expected to perform incident analysis for recurring incidents, conduct backup monitoring, and oversee a wide range of information and network systems including telecommunications circuits, LAN/WAN systems, routers, switches, firewalls, VoIP systems, servers, storage, backup, operating systems, and core applications. Documenting all actions in accordance with company policies, notifying customers and third-party service providers of issues, outages, and remediation status, creating/updating knowledge base articles, and generating MIS reports are also part of your role. The role also requires previous experience working with Senior leadership team members, supporting multiple technical teams in 24/7 NOC operational environments with high uptime requirements, and being comfortable working day/night shifts. 
Mandatory requirements include ITIL Foundation certification, hands-on experience in incident & problem management, proficiency in using ME monitoring (APM & OPM) and ITSM (BMC Remedy) tools, a minimum of three years of experience supporting and monitoring network systems, servers, or storage in an enterprise environment, and understanding key network monitoring protocols. In summary, as a Technical Support Engineer - NOC, you will play a crucial role in ensuring the smooth operation of infrastructure monitoring and support functions within a banking environment, with a focus on incident management, network monitoring, and maintaining high uptime standards.

Posted 1 week ago

5.0 - 9.0 years

0 Lacs

maharashtra

On-site

As a Networking Managed Services Engineer (L4) at NTT DATA, your primary responsibility will be to ensure the operational efficiency of our clients' IT infrastructure and systems. This will involve managing, monitoring, investigating, and resolving escalated technical incidents and problems to ensure continuous functionality. Your expertise will be crucial in providing immediate and permanent solutions within our service level agreements (SLAs) by reviewing client requests and applying your technical knowledge effectively. Working independently, you will collaborate with clients, stakeholders, team leads, or senior managers to perform operational tasks and address escalated incidents and requests promptly. Your role will also entail offering high-level support for complex issues, documenting resolution procedures, contributing to the enhancement of client infrastructure, and sharing knowledge with other engineering teams. Your proactive approach will involve identifying, investigating, and analyzing issues before they arise to ensure timely resolution and proper documentation of incidents. You will be responsible for providing support for escalated incidents, troubleshooting, managing tickets from third-line engineers, and documenting and sharing solutions for future reference. Effective communication will be key in your role, as you will engage with teams and clients to provide support and act as an emergency contact for critical issues. During shift handovers, you will ensure a seamless transition by highlighting key escalated tickets and upcoming operational tasks for continuous support. In addition to daily troubleshooting, you will actively contribute to optimization efforts by collaborating with automation teams, developing diagnostic procedures for unique client environments, and coaching Service desk and Operations Center teams to enhance technical expertise. 
To excel in this role, you should possess deep technical skills in relevant networking functions, experience with network management tools, advanced Managed Services experience, proficiency in change management processes, strong client service orientation, excellent communication skills, adaptability to changing circumstances, and the ability to work both independently and collaboratively. Your active listening skills and positive outlook will help in understanding client requirements and creating a positive client experience throughout their journey. NTT DATA is a trusted global innovator in business and technology services, committed to helping clients innovate, optimize, and transform for long-term success. With a focus on R&D and a diverse global workforce, we aim to move confidently into the digital future. Being part of the NTT Group, headquartered in Tokyo, we offer a wide range of services including business and technology consulting, data and artificial intelligence solutions, and digital infrastructure services. As an Equal Opportunity Employer, we value diversity and inclusion in our workplace.

Posted 1 week ago

5.0 - 10.0 years

0 Lacs

haryana

On-site

The Customer Solutions Architect acts as a trusted advisor partnering with the customers on their needs. You understand and capture the critical inputs of stakeholders, translating them into effective requirements and solutions. You specify and design effective end-to-end solutions, including customer-specific adaptations by utilizing the Nokia portfolio of products, systems, and/or services, as well as third-party products where required. You apply solution architecture standards, processes, and principles to create and maintain a solution's (technical) integrity over time.
- Design and develop customer-specific solutions by capturing functional and non-functional requirements, translating stakeholder inputs into effective architectures.
- Create high-level and low-level designs to ensure end-to-end solutions meet customer needs, align with industry best practices, and maintain solution integrity over time.
- Collaborate across business groups and organizations to drive initial solution development, achieve workable outcomes, and support strategic decision-making.
- Solve complex problems using advanced analytical skills, contributing to innovation, professional direction, and long-term strategic goals.
- Provide technical leadership by guiding teams, managing resources, and serving as a trusted expert and best-practice reference in solution architecture.

Key Skills And Experience
You have:
- A Bachelor's or Master's degree in Engineering or equivalent degree and 5-10 years of experience with OpenShift/GKE (preferred), and other CaaS platforms.
- Understanding of cloud-native networking (CNI, SR-IOV, DPDK, Multus) and storage architectures.
- Awareness of resource policies (NUMA, CPU pinning, hugepages) relevant for telco-grade CNFs.
- Hands-on experience in Kubernetes/OpenShift deployment, Day-2 operations, and troubleshooting.
- Familiarity with logs, traces, monitoring tools (Prometheus, Grafana, ELK) to identify infra or CNF-level issues.
- Ability to work with R&D/PLM teams to drive fixes and customer-specific adaptations.
- Strong documentation and presentation skills for architecture HLD/LLD, CIQ, deployment guides, etc.
- Ability to influence decisions in strategic customer discussions.

It would be nice if you also have:
- Familiarity with MANO/Orchestration frameworks.

About Us
Come create the technology that helps the world act together. Nokia is committed to innovation and technology leadership across mobile, fixed, and cloud networks. Your career here will have a positive impact on people's lives and will help us build the capabilities needed for a more productive, sustainable, and inclusive world. We challenge ourselves to create an inclusive way of working where we are open to new ideas, empowered to take risks, and fearless to bring our authentic selves to work.

What we offer
Nokia offers continuous learning opportunities, well-being programs to support you mentally and physically, opportunities to join and get supported by employee resource groups, mentoring programs, and highly diverse teams with an inclusive culture where people thrive and are empowered. Nokia is committed to inclusion and is an equal opportunity employer. Nokia has received recognitions for its commitment to inclusion & equality including being recognized as one of the World's Most Ethical Companies by Ethisphere and in the Gender-Equality Index by Bloomberg. We are committed to a culture of inclusion built upon our core value of respect. Join us and be part of a company where you will feel included and empowered to succeed.

About The Team
As Nokia's growth engine, we create value for communication service providers and enterprise customers by leading the transition to cloud-native software and as-a-service delivery models. Our inclusive team of dreamers, doers, and disruptors push the limits from impossible to possible.

Posted 1 week ago

8.0 - 12.0 years

0 Lacs

karnataka

On-site

As an AWS DevOps Engineer at Autodesk, you will be responsible for designing, implementing, and optimizing CI/CD pipelines and automation workflows for our mission-critical cloud applications. Your role will involve collaborating with developers, QA engineers, and cloud architects to ensure fast, reliable, and secure delivery processes. This hands-on position will give you significant influence over release processes, deployment strategies, and DevOps tooling. Your responsibilities will include architecting and maintaining highly available CI/CD pipelines using tools such as AWS CodePipeline, Jenkins, GitHub Actions, or GitLab CI. You will automate infrastructure provisioning and configuration using Terraform or AWS CloudFormation and integrate pipelines with automated testing frameworks for quality assurance. Implementing security checks, compliance scans, and performance tests as part of the release workflow will also be a key part of your role. To succeed in this position, you should have a minimum of 8 years of experience in DevOps with a focus on pipeline creation and automation. Strong experience with AWS services such as EC2, ECS/EKS, S3, Lambda, CloudWatch, IAM, CodeBuild, and CodePipeline is essential. You should also have a deep understanding of CI/CD tools and build orchestration, strong scripting skills in Python, Bash, or Groovy, and proficiency in Infrastructure-as-Code with Terraform or AWS CloudFormation. Experience with Docker, container orchestration (Kubernetes or ECS), Git, branching/release strategies, DevSecOps, and integrating security into pipelines is also required. Preferred qualifications include experience with serverless architectures, monitoring/observability tools, and knowledge of advanced deployment strategies. Join Autodesk, where amazing things are created every day with innovative software. Our culture is at the core of everything we do, guiding our interactions with each other, our customers, and partners. 
If you are ready to do meaningful work that helps build a better world designed and made for all, come shape the world and your future with us. Autodesk offers a competitive compensation package based on experience and geographic location, including base salaries, annual bonuses, commissions, stock grants, and comprehensive benefits. We take pride in fostering a culture of belonging where everyone can thrive. Learn more about our commitment to diversity and belonging at Autodesk.

Posted 1 week ago

0.0 - 4.0 years

0 Lacs

pune, maharashtra

On-site

Job Description
As a member of the team at WNS (Holdings) Limited, you will be responsible for utilizing monitoring tools to track application performance, identify potential issues, and proactively communicate about them. Your role will involve writing reusable, testable, and efficient code to meet the specified requirements and deliver high-quality solutions. You will play a key part in identifying and resolving bugs through thorough testing and debugging processes. Managing and prioritizing support tickets to ensure timely resolution within established Service Level Agreements (SLAs) will also be a crucial aspect of your responsibilities. Additionally, you will be tasked with creating and maintaining technical documentation, including knowledge base articles to facilitate knowledge sharing within the team. Your role will involve identifying opportunities to enhance automation in monitoring and support activities, contributing to the overall efficiency of the team. Furthermore, you will assist with the deployment and configuration of new applications or application updates, actively participating in the growth and development of the technology landscape within the organization.

Qualifications
To excel in this role, you should hold a degree in computer science and possess a solid understanding of machine learning concepts, algorithms, and frameworks. Proficiency in programming languages like Python, along with hands-on experience in scripting, will be essential for success in this position. Your technical expertise and ability to collaborate effectively within a team setting will enable you to contribute significantly to the innovative and transformative solutions that we co-create with our clients at WNS (Holdings) Limited.

Posted 1 week ago

5.0 - 9.0 years

0 - 0 Lacs

noida, uttar pradesh

On-site

As a Site Reliability Engineer with over 8 years of experience, you will be responsible for ensuring the efficient and reliable operation of our systems. You should have a notice period of only 15-30 days and be willing to work in either Noida or Mumbai with a maximum package of 7 LPA.

Your essential skills should include:
- Having a minimum of 5 years of experience as an SRE or Application Support Lead.
- Proficiency in Agile methodology, Kanban, and tools like Jira.
- Ability to define key performance indicators for business performance.
- Hands-on experience in collecting and analyzing performance data, troubleshooting, and tuning.
- Quick mastery of new software and tools to conduct technical research analysis for incidents in the production environment.
- Identifying process gaps and implementing improvements or automations to enhance existing processes.
- Training new team members and establishing clear expectations regarding SOPs and SLAs.
- Proficiency in troubleshooting, debugging environmental issues, and providing day-to-day support for production.
- Proven track record in driving issue resolutions, documenting RCA & CAPA, and ensuring adherence to SLAs.
- Preventing any slippage or deviation from SLAs/KPIs.
- Sharing knowledge and educating team members to contribute effectively and suggest best practices.
- Strong leadership skills to collaborate with clients and other teams to address challenges, blockers, and new requirements.
- Excellent analytical skills with the ability to adapt to changing work priorities.
- Team player with self-motivation to work effectively both independently and within a team.
- Organizing and leading daily priorities for yourself and a team of 3-5 members.

In terms of technical expectations, you should have experience with monitoring and reporting tools like Grafana (Preferred), Coralogix, and Datadog. Additionally, you should be able to configure alerts and tracing for continuous monitoring using APM tools such as ELK, Splunk, and other log aggregation tools. Familiarity with using Postman for API troubleshooting is also required. Desirable skills include the ability to contribute to the development and maintenance of automation tools to streamline manual work and optimize performance, as well as knowledge of AWS cloud services.

Posted 1 week ago

10.0 - 14.0 years

0 Lacs

thane, maharashtra

On-site

As a Network Support Engineer, you will be responsible for a wide range of technical tasks related to network infrastructure. Your primary responsibilities will include:
- Demonstrating strong hands-on experience in Load balancer F5 LTM & APM, as well as Routing & Switching.
- Working with technologies and platforms such as VRFs, Nexus 9k, 7K and 5K switches.
- Possessing a deep knowledge and providing extensive support for Routing Protocols/Technologies like EIGRP, BGP, IS-IS, OSPF, Logical Overlay, IOS-XR, MPLS VPN, and Multicast.
- Participating in problem management processes with engineering or architectural elements.
- Contributing to network design and strategy forums, setting directions, and providing recommendations.
- Operating data communication systems, including LANs and WANs.
- Planning, designing, and implementing networked systems, configurations, and supporting/troubleshooting network problems.
- Developing and evaluating network performance criteria and measurement methods.
- Developing network technology roadmaps and industry best practices, and presenting recommendations to clients.
- Providing hands-on support for various Data Center technologies with a focus on Routing and Switching.
- Conducting TCP/IP Network traces/packet captures and interpreting results effectively.
- Demonstrating confidence and competence in communicating on bridges.
- Managing vendors effectively to drive incident resolution.
- Utilizing monitoring tools and strategies for network performance.
- Mentoring junior staff in areas of process, technology, and execution.
- Analyzing packet captures using tools like Wireshark.
- Acting as an extension of the leadership team for workload management and guidance.
- Participating in complex maintenance or deployment activities.
- Creating or modifying documentation based on new events and learnings.
- Providing constructive feedback for improvement opportunities.
- Assisting management in identifying suitable training and technical needs within the team.

You should possess excellent communication skills, including written, oral, and presentation abilities. Additionally, you will be responsible for ensuring new systems or processes introduced into production are effectively monitored, documented, and supported by the Operations team. For this role, you should have 10-12 years of network support experience in a mid to large-sized organization. Having two or more certifications like F5, CCNP, or CCIE is strongly preferred. Other essential soft skills include excellent communication, interpersonal skills, ownership, accountability, client relationship management, mentoring, motivation, independent decision-making, problem-solving, and conflict resolution.

Posted 1 week ago

Apply

4.0 - 8.0 years

0 Lacs

karnataka

On-site

You have 3.5-8 years of experience and hold a degree in BE/BTech/MCA in IT or Computer Science. As a Java Application & Production Support Analyst (L2/L3) at our client, a leading IT services company in Bangalore with over 20,000 employees, you will be responsible for providing L2/L3 production support for Java-based enterprise applications. Your role will involve monitoring system performance and application health, conducting root cause analysis, debugging code, executing SQL queries, and ensuring adherence to ITIL and ITSM best practices. Key Responsibilities - Providing L2/L3 production support for Java-based enterprise applications. - Monitoring system performance and application health using tools such as AppDynamics, Control-M, Splunk, Dynatrace, Kibana, and ServiceNow. - Conducting root cause analysis (RCA) and collaborating with development teams for permanent resolutions. - Performing code-level debugging and resolution of production issues. - Executing SQL queries for data analysis and issue resolution. - Working in Linux/Unix environments using CLI tools for diagnostics and support tasks. - Ensuring adherence to ITIL and ITSM best practices across incident, problem, and change management processes. - Maintaining and updating support runbooks, knowledge base documentation, and incident records. - Collaborating effectively with internal stakeholders and cross-functional teams. Required Skills - 3.5-8 years of hands-on experience in Java, specifically in production support use cases such as bug fixing and automation. - Proficiency in one or more of the following tools: AppDynamics, Control-M, Splunk, Dynatrace, Kibana, ServiceNow. - Strong skills in SQL including query writing, data validation, and troubleshooting. - Proven ability in code-level debugging and fixing production issues. - Competence in Linux/Unix command-line operations. - Familiarity with ITIL/ITSM process frameworks. - Strong communication skills (written and verbal). 
- Ability to analyze and resolve issues under pressure with a focus on long-term resolution. Additional Details This is a full-time, on-site role (5 days a week) at the Kadubeesanahalli location in Bangalore. Candidates are expected to demonstrate a problem-solving mindset and be comfortable in a fast-paced production support environment.

Posted 1 week ago

Apply

4.0 - 12.0 years

0 Lacs

hyderabad, telangana

On-site

Join our mission at Amgen to serve patients living with serious illnesses. Since 1980, we have been at the forefront of the biotech industry, focusing on Oncology, Inflammation, General Medicine, and Rare Diseases to reach millions of patients every year. As a member of the Amgen team, you will play a vital role in researching, manufacturing, and delivering innovative medicines that improve the quality of life for patients. Your responsibilities will include designing, developing, and maintaining software data and analytics applications that support Research operations. You will collaborate closely with business analysts, scientists, and engineers to create scalable solutions while automating operations and monitoring system health. The ideal candidate should have experience in the pharmaceutical or biotech industry, possess strong technical skills, and have full stack software engineering experience. Key responsibilities include taking ownership of software projects, managing delivery scope and timeline, providing technical guidance to junior developers, contributing to both front-end and back-end development, and utilizing generative AI technologies to develop innovative solutions. You will also conduct code reviews, maintain documentation on software architecture and operations, identify and resolve technical challenges, and stay updated with industry trends. Basic qualifications for this role include a Doctorate, Master's, or Bachelor's degree, or a Diploma, in Computer Science, IT, Computational Chemistry, Computational Biology/Bioinformatics, or a related field, along with relevant years of experience. Preferred qualifications include experience in implementing and supporting biopharma scientific research data analytics platforms. 
Essential skills for this role include hands-on experience with full stack software development, proficiency in languages and technologies such as Node.js, Python, React.js, SQL, and PostgreSQL, experience with automated testing tools, understanding of Agile and Scrum methodologies, and strong problem-solving and analytical skills. Additional beneficial skills include familiarity with cloud platforms, DevOps practices, big data technologies, API integration, serverless architecture, monitoring tools, infrastructure as code tools, version control systems, and Java development. Soft skills required for this role include excellent analytical and troubleshooting skills, strong communication skills, ability to work effectively in global teams, self-motivation, time management skills, team-oriented mindset, and presentation abilities. Amgen is an equal opportunity employer committed to providing reasonable accommodations for individuals with disabilities throughout the application and employment process. Contact us to request accommodation and be a part of our transformative team dedicated to serving patients and advancing healthcare.

Posted 1 week ago

Apply

6.0 - 10.0 years

0 Lacs

noida, uttar pradesh

On-site

We are looking for a talented Principal Cloud Operations Specialist to join our team at UKG, a leading U.S.-based private software company with a global presence. As part of the Cloud Operations team, you will play a crucial role in monitoring and supporting Kronos Private Cloud and hosted environments remotely. Your responsibilities will include monitoring system performance, SQL database health, application services, and server resource utilization for both Windows and Linux servers. In this role, you will be responsible for responding to alerts from monitoring tools, troubleshooting server and application performance issues, and handling Level 1 escalations as per the defined escalation matrix. You will collaborate with internal teams to ensure high availability and performance of hosted services, participate in incident response, and actively contribute to incident documentation and resolution. To be successful in this position, you should have at least 6-7 years of hands-on experience in the IT industry and a strong understanding of cloud infrastructure, virtualization, and hybrid environments. Additionally, you must possess excellent communication, analytical, and problem-solving skills. Familiarity with monitoring tools such as DataDog, Grafana, and Splunk, as well as Privileged Access Management (PAM) tools like CyberArk or Saviynt, is preferred. The ideal candidate will have a background in private cloud (VMware) or public cloud platforms like AWS, GCP, or Azure. Experience with incident, problem, and change management tools like JIRA and ServiceNow is a plus. You should be willing to work in rotational shifts, including nights and weekends, and be comfortable working on-site in an office at least 3 days per week. If you are ready to take on the challenge of working in a dynamic environment where your contributions will make a significant impact, we invite you to join us at UKG. 
As we continue to grow and innovate, you will have the opportunity to be part of something truly special and shape the future of workforce management and human capital management. UKG is an equal opportunity employer that values diversity and inclusion in the workplace. We are committed to providing accommodations for individuals with disabilities throughout the application and interview process. If you require assistance, please reach out to us at UKGCareers@ukg.com.

Posted 1 week ago

Apply

3.0 - 8.0 years

5 - 10 Lacs

chennai

Work from Office

We are looking for an IT Analyst with 3+ years of experience in handling inbound calls for international customers, preferably from the US or Canada. Requirements: Hands-on experience in diagnosing and resolving issues relating to general computer problems, printing, VPN connectivity, email clients, network drives, company websites, software applications, hardware, etc. Hands-on experience in resolving commonly reported issues with Office365 desktop and mobile applications. Experience in unlocking user accounts and performing password resets either directly via Active Directory or using an Enterprise Identity Management system. Hands-on experience in handling a minimum of 25-30 calls per day and documenting the incidents in a ticketing system. Experience in identifying and escalating major incidents to the appropriate support teams as defined in the major incident management process. Experience in monitoring the availability and uptime of production systems, infrastructure devices, and business-critical applications using a range of monitoring tools. Hands-on experience in IT monitoring tools like SolarWinds, Splunk, Dynatrace, etc. Hands-on experience in user access provisioning and SAP role provisioning. Experience in creating documentation with step-by-step instructions on how to resolve some of the most common computer problems. Flexible to work in a 24/7 rotational shift pattern (including weekends); shift rotation will be monthly. In-depth knowledge of and/or experience with the ServiceNow application will be an added advantage.

Posted 1 week ago

Apply

3.0 - 8.0 years

5 - 10 Lacs

hyderabad

Work from Office

ServiceNow is changing the way people work. With a service orientation toward the activities, tasks, and processes that make up day-to-day work life, we help the modern enterprise operate faster and be more scalable than ever before. We're disruptive. We work hard but try not to take ourselves too seriously. We are highly adaptable and constantly evolving. We are passionate about our product, and we live for our customers. We have high expectations, and a career at ServiceNow means challenging yourself to always be better.
What you get to do in this role as a Sr. Software Engineer in the ETG Product Operations and Innovation Sustaining Engineering (POISE) team
Responsibilities:
- As part of the ETG Product Ops Integrations team, proactively work on resolving L2/L3 support issues for Enterprise Integrations Applications.
- Ensure all incidents and requests are tracked and addressed in a timely manner with a sense of urgency, or escalated to the appropriate Engineering teams when needed.
- Track key performance metrics using the ETG and DT Ops dashboard (SLAs for response time, resolution time, customer satisfaction) and ensure all SLA metrics are met.
- Ensure smooth communication with other teams in DT or ServiceNow on product support needs and goals.
- Work closely with Engineering teams to understand new feature releases for Integrations, so the Operations Support team can handle issues from day 1 of a feature release.
- Gather and share customer feedback or recurring support issues with Engineering teams for potential feature improvements.
- Provide insights and analytics on metrics, recurring issues, and prioritization of new features.
- Drive continuous improvement in operational efficiency and effectiveness.
To be successful in this role you have:
- 7+ years of total experience in building and operationalizing enterprise-grade systems and services.
- Expert-level proficiency in Java or a similar OO language.
- Strong experience with RESTful API design, microservices architecture, and database technologies (SQL/NoSQL).
- Experience working across multiple technology stacks, including:
  - Integration platforms (strong experience): Boomi, SAP PI/PO
  - Cloud/containerization platforms: Azure, AWS, GCP, Docker, Kubernetes
  - Data streaming platforms: Kafka, Flink
  - Web infrastructure: API gateways (Kong, Azure APIM, Azure App Gateway)
- Prior experience integrating applications with the ServiceNow platform, or familiarity with the ServiceNow platform, is preferred.
- Experience with monitoring tools, dashboards, and analytics.
- Experience with AI/ML and automation in product operations is preferred.
- Ability to work in a fast-paced and dynamic environment with a sense of urgency toward resolving issues, a growth mindset, and an interest in learning and upskilling.
- Strong interpersonal skills, a customer-centric attitude, and the ability to work across cultures.
- Knowledge of industry best practices in product support and operations.

Posted 1 week ago

Apply

5.0 - 9.0 years

7 - 11 Lacs

bengaluru

Work from Office

Education Qualification: Engineer - B.E / B.Tech / MCA
Skills:
Primary -> Technology | Custom Automation Development | Developing automated workflows for alerting, healing, etc. | 5 - Expert
Primary -> Technology | Monitoring and Observability Implementation | Building and deploying observability tools and solutions | 5 - Expert
Secondary -> Technology | API Integration Such as ITSM, Monitoring Tools, Notification Mechanisms | Integrating with ITSM, notification tools, monitoring platforms | 5 - Expert
Secondary -> Technology | Gap Identification and Collaboration with Ops | Works with operational teams to find and fix observability gaps | 3 - Experienced
Tertiary -> Technology | Programming Python and JavaScript | Core development skills for integration, logic handling, and scripting | 5 - Expert
Tertiary -> Technology | Database Technologies Neo4J, MongoDB | Supports data-driven observability (graph relationships, log stores) | 3 - Experienced
Certification: Technology | DevOps Foundation (DOFD by DevOps Institute), CompTIA Server+, Red Hat Certified System Administrator (RHCSA), ITIL 4 Foundation
Responsibilities: Implements observability solutions. Develops integrations with monitoring tools, ITSM platforms, and notification mechanisms. Develops custom automations. Responsible for the accuracy and completeness of the observability environment. Works with operational teams to identify and address gaps in observability. Must be very proficient in Python and JavaScript, API integration, and multiple database technologies including Neo4J and MongoDB. Must have experience with stream processing pipelines. Must have experience working with multiple monitoring tools. Ideally should have worked on observability tools.

Posted 1 week ago

Apply

5.0 - 10.0 years

7 - 12 Lacs

kolkata, mumbai, new delhi

Work from Office

NOC Analyst (Technical Incident Management)
What you can expect: As a NOC Analyst at Zoom, you'll work in a 24/7 operational environment with rotating shifts, monitoring system alerts and managing incidents in real time. You'll collaborate with global teams using advanced monitoring tools, handle high-pressure situations, and make quick decisions to maintain service stability. The role demands multitasking and communication skills while offering growth opportunities through hands-on experience with enterprise systems and cloud technologies.
About the Team: As a member of the global NOC team, you will be responsible for keeping Zoom services operational and monitoring infrastructure and AWS cloud technologies. You will ensure the uptime of Zoom applications and use analytical tools to evaluate internal and external KPI/SLA metrics. Our monitoring tools are essential for detecting, identifying, and resolving issues to maintain smooth operations.
What we're looking for: 5+ years of proven incident management expertise in handling customer-impacting situations. Have a demonstrated ability to manage high-intensity incidents and drive effective resolutions. Have an understanding of infrastructure, including multi-cloud environments. Have solid knowledge of application deployments in DevOps environments. Have advanced proficiency with enterprise monitoring tools (Grafana, Splunk). Have experience with ITSM workflow and process implementation. Have practical experience in Jira report creation. Have a track record of managing change control and incident review meetings.

Posted 1 week ago

Apply

8.0 - 13.0 years

3 - 7 Lacs

nashik

Work from Office

- Seeking a Cloud Administrator with 8+ years of experience managing enterprise cloud infrastructure, strong hands-on experience with Azure environments, and some AWS experience. - Strong experience with Active Directory, Group Policy, logon scripts, SAML, OAuth, MFA management, domain trusts, and AD Sync. - Windows and Linux vulnerability remediation experience. - Experience developing methods for automated application installation and configuration. - Maintain various system health monitoring tools in Azure. - Demonstrated expertise in providing guidance and building highly available, fault-tolerant, enterprise-class infrastructure with multi-region and multi-AZ models. - Exceptional server administration and application installation experience. - Lead operational implementation of cloud identity and access management solutions enforcing security guidelines. - Serve as SME on building and implementing hybrid cloud architecture across AWS and Azure. - Experience administering RADIUS servers. - Deletion of unattached volumes, RI recommendations, and deletion of unused resources and unallocated IPs. - Server administration and application installation experience. - Deploying and managing Azure host pools and VDIs, installing apps, remote app configuration, etc. - Review, plan, and facilitate implementation of best practices recommended by the cloud provider. - Experience with and solid knowledge of Azure and AWS SG/NSG, route tables, VPN gateway, load balancer, DNS, application gateway, integration, VNet, and peering. - Good Azure network troubleshooting skills. - Configuring and maintaining adequate security parameters in Azure and AWS. - Operating, managing, and deploying Azure VM systems.

Posted 1 week ago

Apply

8.0 - 13.0 years

3 - 7 Lacs

hyderabad

Work from Office

- Seeking a Cloud Administrator with 8+ years of experience managing enterprise cloud infrastructure, strong hands-on experience with Azure environments, and some AWS experience. - Strong experience with Active Directory, Group Policy, logon scripts, SAML, OAuth, MFA management, domain trusts, and AD Sync. - Windows and Linux vulnerability remediation experience. - Experience developing methods for automated application installation and configuration. - Maintain various system health monitoring tools in Azure. - Demonstrated expertise in providing guidance and building highly available, fault-tolerant, enterprise-class infrastructure with multi-region and multi-AZ models. - Exceptional server administration and application installation experience. - Lead operational implementation of cloud identity and access management solutions enforcing security guidelines. - Serve as SME on building and implementing hybrid cloud architecture across AWS and Azure. - Experience administering RADIUS servers. - Deletion of unattached volumes, RI recommendations, and deletion of unused resources and unallocated IPs. - Server administration and application installation experience. - Deploying and managing Azure host pools and VDIs, installing apps, remote app configuration, etc. - Review, plan, and facilitate implementation of best practices recommended by the cloud provider. - Experience with and solid knowledge of Azure and AWS SG/NSG, route tables, VPN gateway, load balancer, DNS, application gateway, integration, VNet, and peering. - Good Azure network troubleshooting skills. - Configuring and maintaining adequate security parameters in Azure and AWS. - Operating, managing, and deploying Azure VM systems.

Posted 1 week ago

Apply

8.0 - 13.0 years

3 - 7 Lacs

bengaluru

Work from Office

- Seeking a Cloud Administrator with 8+ years of experience managing enterprise cloud infrastructure, strong hands-on experience with Azure environments, and some AWS experience. - Strong experience with Active Directory, Group Policy, logon scripts, SAML, OAuth, MFA management, domain trusts, and AD Sync. - Windows and Linux vulnerability remediation experience. - Experience developing methods for automated application installation and configuration. - Maintain various system health monitoring tools in Azure. - Demonstrated expertise in providing guidance and building highly available, fault-tolerant, enterprise-class infrastructure with multi-region and multi-AZ models. - Exceptional server administration and application installation experience. - Lead operational implementation of cloud identity and access management solutions enforcing security guidelines. - Serve as SME on building and implementing hybrid cloud architecture across AWS and Azure. - Experience administering RADIUS servers. - Deletion of unattached volumes, RI recommendations, and deletion of unused resources and unallocated IPs. - Server administration and application installation experience. - Deploying and managing Azure host pools and VDIs, installing apps, remote app configuration, etc. - Review, plan, and facilitate implementation of best practices recommended by the cloud provider. - Experience with and solid knowledge of Azure and AWS SG/NSG, route tables, VPN gateway, load balancer, DNS, application gateway, integration, VNet, and peering. - Good Azure network troubleshooting skills. - Configuring and maintaining adequate security parameters in Azure and AWS. - Operating, managing, and deploying Azure VM systems.

Posted 1 week ago

Apply

8.0 - 13.0 years

3 - 7 Lacs

mumbai

Work from Office

- Seeking a Cloud Administrator with 8+ years of experience managing enterprise cloud infrastructure, strong hands-on experience with Azure environments, and some AWS experience. - Strong experience with Active Directory, Group Policy, logon scripts, SAML, OAuth, MFA management, domain trusts, and AD Sync. - Windows and Linux vulnerability remediation experience. - Experience developing methods for automated application installation and configuration. - Maintain various system health monitoring tools in Azure. - Demonstrated expertise in providing guidance and building highly available, fault-tolerant, enterprise-class infrastructure with multi-region and multi-AZ models. - Exceptional server administration and application installation experience. - Lead operational implementation of cloud identity and access management solutions enforcing security guidelines. - Serve as SME on building and implementing hybrid cloud architecture across AWS and Azure. - Experience administering RADIUS servers. - Deletion of unattached volumes, RI recommendations, and deletion of unused resources and unallocated IPs. - Server administration and application installation experience. - Deploying and managing Azure host pools and VDIs, installing apps, remote app configuration, etc. - Review, plan, and facilitate implementation of best practices recommended by the cloud provider. - Experience with and solid knowledge of Azure and AWS SG/NSG, route tables, VPN gateway, load balancer, DNS, application gateway, integration, VNet, and peering. - Good Azure network troubleshooting skills. - Configuring and maintaining adequate security parameters in Azure and AWS. - Operating, managing, and deploying Azure VM systems.

Posted 1 week ago

Apply

1.0 - 5.0 years

4 - 9 Lacs

pune

Work from Office

Job Summary: We are seeking a proactive and detail-oriented Site Reliability Engineer (SRE) focused on Monitoring to join our observability team. The candidate will be responsible for ensuring the reliability, availability, and performance of our systems through robust monitoring, alerting, and incident response practices.
Key Responsibilities:
- Monitor the application and IT infrastructure environment.
- Drive end-to-end incident response and resolution.
- Design, implement, and maintain monitoring and alerting systems for infrastructure and applications.
- Continuously improve observability by integrating logs, metrics, and traces into a unified monitoring platform.
- Collaborate with development and operations teams to define and track SLIs, SLOs, and SLAs.
- Analyze system performance and reliability data to identify trends and potential issues.
- Participate in incident response, root cause analysis, and post-mortem documentation.
- Automate repetitive monitoring tasks and improve alert accuracy to reduce noise.
Required Skills & Qualifications:
- 2+ years of experience in application/system monitoring, SRE, or DevOps roles.
- Proficiency with monitoring tools such as Prometheus, Grafana, ELK, APM, Nagios, Zabbix, Datadog, or similar.
- Strong scripting skills (Python, Bash, or similar) for automation.
- Experience with cloud platforms (AWS, Azure) and container orchestration (Kubernetes).
- Solid understanding of Linux/Unix systems and networking fundamentals.
- Excellent problem-solving and communication skills.

Posted 1 week ago

Apply

6.0 - 11.0 years

5 - 9 Lacs

jaipur

Work from Office

Dynatrace Specialist, Banking Domain. Role Summary: We are looking for a skilled Dynatrace Specialist with strong experience in Application Performance Monitoring (APM), Dynatrace SaaS implementation, and cloud observability. The ideal candidate will have a solid background in banking domain environments, migration from legacy monitoring tools, and a strong understanding of DevOps, CI/CD, and Agile delivery practices. Key Responsibilities: - Implement and manage Dynatrace SaaS for application performance monitoring - Migrate legacy monitoring solutions to next-gen observability solutions - Implement logging services with Dynatrace and Grail Datalake - Diagnose and optimize application, middleware, and infrastructure performance - Monitor and report on business metrics, customer experience, and digital product optimization - Work with agile software engineering teams to integrate observability into CI/CD and DevOps pipelines - Configure and manage event management processes in alignment with ITIL - Develop and maintain automation scripts (Ansible, Shell, Bash, Perl, PowerShell) for monitoring requirements - Collaborate with stakeholders to design monitoring solutions for complex applications and architectures Mandatory Skills & Experience: - Bachelor's degree in IT, Computer Science, or related field - 5+ years of experience in Application Performance Monitoring using enterprise-standard tools - Proven Dynatrace SaaS implementation experience - Experience migrating from legacy monitoring solutions to modern observability platforms - Cloud observability experience - Logging services implementation with Dynatrace & Grail Datalake - 3+ years in Agile software engineering practices - 3+ years in CI/CD, automation, and DevOps - Strong knowledge of application architecture, OSI layers, software design methodologies - Proven performance tuning expertise across application, middleware, and infrastructure components - Familiarity with ADO, SharePoint, Confluence, MS Office tools - Event Management and ITIL Foundations (certification preferred) - Scripting experience: Ansible, Shell, Bash, Perl, PowerShell Preferred Skills: - Advanced Excel, Power BI, and reporting/analytics tools - Banking domain experience in digital product monitoring and optimization

Posted 1 week ago

Apply
Featured Companies