
1098 Monitoring Tools Jobs - Page 27

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

8.0 - 12.0 years

20 - 30 Lacs

Bengaluru

Hybrid

Key Responsibilities:

Monitoring Architecture & Implementation
- Serve as the subject matter expert (SME) for LogicMonitor, overseeing design, implementation, and continuous optimization.
- Lead the development and deployment of monitoring solutions that integrate on-premise infrastructure, public cloud (AWS, Azure, GCP), and hybrid environments.
- Develop and maintain monitoring templates, escalation chains, and alerting policies that align with business service SLAs.

Real-Time Dashboards & Visualization
- Design and build real-time service availability dashboards to provide actionable insights for operations and leadership teams.
- Leverage LogicMonitor's APIs and data sources to develop custom visualizations, ensuring a single-pane-of-glass view for multi-layered service components.
- Collaborate with application and service owners to define KPIs, thresholds, and health metrics.

Automation & Integration
- Automate onboarding/offboarding of monitored resources using LogicMonitor's REST API, Groovy scripts, and Configuration Modules.
- Integrate LogicMonitor with ITSM tools (e.g., ServiceNow, Jira), collaboration platforms (e.g., Slack, Teams), and CI/CD pipelines.
- Enable proactive monitoring through synthetic transactions and anomaly detection capabilities.

Operations & Optimization
- Perform ongoing health checks, capacity planning, and tuning of monitoring thresholds to reduce alert fatigue.
- Establish and enforce monitoring standards, best practices, and governance models across the organization.
- Lead incident response investigations, root cause analysis, and post-mortem reviews from a monitoring perspective.

Required Skills & Qualifications:
- 5+ years of hands-on experience with LogicMonitor, including custom DataSources, PropertySources, dashboards, and alert tuning.
- Proven expertise in IT infrastructure monitoring: networks, servers, storage, virtualization (VMware), and containerization (Kubernetes, Docker).
- Strong understanding of cloud platforms (AWS, Azure) and their native monitoring tools (e.g., CloudWatch, Azure Monitor).
- Experience in scripting and automation (e.g., Python, PowerShell, Groovy, Bash).
- Familiarity with observability stacks (ELK, Grafana) is a strong plus.
- Proficient with ITSM and incident management processes, including integrations with ServiceNow.
- Excellent problem-solving, communication, and documentation skills.

Preferred Qualifications:
- LogicMonitor Certified Professional (LMCP) or similar certification.
- Experience with APM tools (e.g., AppDynamics, Dynatrace, Datadog) and log analytics platforms.
- Knowledge of DevOps practices and CI/CD pipelines.
- Exposure to regulatory/compliance monitoring (e.g., HIPAA, PCI, SOC 2).

Posted 1 month ago


2.0 - 6.0 years

4 - 8 Lacs

Chennai

Work from Office

Eligibility Criteria: Bachelor's degree in Engineering (3-5 years of experience)

Technology: Huawei, Ericsson, Cisco, VAS (SMSC, MMSC, CRBT, etc.)

Job Description:
- Network monitoring for alarms in the Huawei Network NMS.
- VAS (SMSC, MMSC, Bulk Message Server (BMS), USSD, OTA (Over the Air), Messaging Gateway Server, CRBT, Voice Mail Server (VMS), Miss Call Alert): node-level alarm monitoring and troubleshooting.
- Hands-on experience in the VAS nodes listed below:
  1. SMSC
  2. MMSC
  3. MissU (Miss Call Alert)
  4. BMS (Bulk Message Server)
  5. CMS
  6. VOM (Voice Mail Server)
  7. OTA (Over the Air activation)
  8. Collectcall and USSD
  9. DOB (handled by SLA)
  10. RBT (handled by Huawei)
  11. BULKMAN
- Reacts at the right time to service-affecting issues.
- Manages and uses the Alarm Management processes and tools to detect and assign problems in line with Operator/other SLAs and internal OLAs.
- Opens trouble tickets from network alarms or external fault reports, and escalates trouble tickets, both technical and management.
- Liaises with Customer Care organisations regarding network outages.
- Supports end-to-end support, coordination, and control of assigned trouble tickets.
- Supports major service outage investigations and follow-up.
- Ensures planned outages are carried out or rolled back within the maintenance window.
- Ensures Operator Customer Care is fully updated on service-affecting outages.
- Alarm monitoring, fault localization/correction/verification.
- Corrective maintenance (centralized routines).
- Liaises with subcontractors and 3rd parties to resolve faults.
- Liaises with other service providers regarding network outages.
- Should be open to working night shifts (24/7 support).
- Should be ready to provide emergency support outside office hours.

Processes & Tools:
- ITIL Foundation recommended.
- Knowledge of processes (e.g., incident restoration, network change management, network optimization process, incident and problem management) preferred.
- Knowledge of tools: BMC Remedy/ITSM, NetCool, CANVAS Converged Service Platform, Siebel, and other platform tools preferred.

Key Skills:
- Good communication and interpersonal skills.
- Experience in Operations, preferably Managed Services.
- Analytical skills.
- Process orientation and time management.

Software Skills:
- SQL and Oracle databases (preferably DBA).
- Visual Basic, MS Access, Microsoft Office.

Posted 2 months ago


4.0 - 8.0 years

10 - 20 Lacs

Bengaluru, Mumbai (All Areas)

Hybrid

Each of Gracenote's verticals faces distinct challenges, from servicing billions of requests a day for the music business, to providing the data powering global TV programming information in the Video business, to dealing with the dynamics of Real-Time Sports Data. To meet the 24/7/365 requirements of our customers, and those of internal groups like Engineering and Customer Care, Gracenote has created the Service Operation Support (SOS) team. With a goal of meeting the company's internal and external needs, the SOS group is working to create a modern, scalable, and highly automated platform for comprehensive monitoring, alerting, and troubleshooting for our heterogeneous infrastructure and a large number of application components. Our customers demand accurate and up-to-date data, and the SOS group has been built to ensure that we deliver it.

Job Purpose: In Service Operation Support, the successful candidate will be responsible for ensuring reliable and on-time product data delivery, and supporting the overall operation of the SOS group.

Job Description: Resolving or escalating issues from submission/detection to fulfillment/resolution. You will be working a rotating schedule (including on-call shifts) within your team, and your primary duties include proactive and reactive service monitoring, technical triaging, incident management, and providing status updates and communications to management as required. This candidate should have a passion for automation and quality, an understanding of the interaction between Legacy and Current Systems and between On-Premises and Cloud, and most importantly the desire to constantly make things better.

Other responsibilities include:
- Identifying common issues and working toward long-term solutions.
- Excellent problem-solving and analytical skills.
- Maintaining and monitoring production applications and systems.
- Internal documentation.
- Working closely with management, DevOps, and Engineering teams to execute on tasks.
- Working to defined SLAs to make sure that our clients receive the best service.
- Identifying opportunities for process improvement.
- Must be able to manage and prioritize multiple work requirements.
- Ability to work independently and in a team environment.
- On-call duties as required.

Role Requirements / Desired Skills:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 6+ years of experience working as a Support Engineer / Product Support Engineer / Application Support Engineer.
- Ability to read and write various programming languages such as Java, .NET, and SQL.
- Experience with and knowledge of both relational and non-relational databases and data stores (MSSQL, MongoDB, Postgres, Kafka, etc.).
- Experience with AWS or other cloud platforms.
- Proficiency in networking in both physical and cloud environments.
- Superior written and oral communication.
- Ability to work independently and as part of a team.
- Problem-solving skills.
- Ability to plan and delegate incidents.
- Excellent time management skills.
- Passion for data, attention to detail, intellectual curiosity, and a love of problem-solving.
- Knowledge of other technologies such as orchestration tools, database optimisation, and server/application optimization.
- Experience with DevOps practices and software.

Additional skill set (good to have):
- Knowledge of Git, Jenkins, and other Continuous Integration services.
- Experience with knowledge base creation; technical documentation skills would be considered an asset.
- An affinity with the Video, Music, and Sports domains.
- An interest in understanding and brainstorming about architecture.
- A passion for exploring and understanding new programming languages.

Posted 2 months ago


2.0 - 7.0 years

6 - 7 Lacs

Bengaluru

Work from Office

Req ID: 332351

NTT DATA strives to hire exceptional, innovative, and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now. We are currently seeking a Site Reliability Engineer - Observability Focus to join our team in Bangalore, Karnataka (IN-KA), India (IN).

Key Responsibilities:
- Act as an observability specialist, responsible for implementing and managing a comprehensive observability platform across distributed systems.
- Utilize tools such as AppDynamics, Splunk SignalFX, DataDog, SiteScope, and ThousandEyes to monitor application performance, infrastructure health, and end-user experience.
- Develop and maintain real-time dashboards and alerts to proactively detect and respond to system anomalies and performance issues.
- Triage and resolve production incidents using ServiceNow, ensuring timely communication and root cause analysis.
- Collaborate with development, infrastructure, and operations teams to identify performance bottlenecks and drive continuous improvement.
- Define and implement observability strategies that align with business goals and enhance system reliability and scalability.
- Provide leadership and guidance on observability best practices, tool adoption, and performance monitoring frameworks.
- Contribute to the development of SRE standards, runbooks, and incident response protocols.

Required Skills:
- Proven experience with observability and monitoring tools such as AppDynamics, Splunk, DataDog, SignalFX, SiteScope, and ThousandEyes.
- Strong understanding of incident management and triage processes using ServiceNow or similar ITSM tools.
- Experience in building and maintaining dashboards, alerts, and monitoring configurations.
- Solid understanding of SRE principles, including SLAs, SLOs, and error budgets.
- Excellent problem-solving skills and the ability to work under pressure in a production support environment.

Posted 2 months ago


1.0 - 4.0 years

3 - 6 Lacs

Pune

Work from Office

We are seeking a skilled TIBCO Developer (Software Engineer) with 1-4 years of experience in developing and implementing enterprise integration solutions using TIBCO BusinessWorks Container Edition (BWCE). The ideal candidate should also have experience with TIBCO EMS and TIBCO ActiveSpaces. Additional experience with Docker, Kubernetes, OpenShift (OCP), and CI/CD is a plus.

Key Responsibilities:
- Develop and deploy integration solutions using TIBCO BWCE.
- Manage messaging solutions using TIBCO EMS.
- Implement and optimize distributed caching solutions using TIBCO ActiveSpaces.
- Work on Docker and Kubernetes for containerized deployments.
- Deploy and manage TIBCO-based solutions on OpenShift Container Platform (OCP).
- Automate deployment and monitoring using CI/CD pipelines.
- Debug and troubleshoot integration and performance issues.
- Collaborate with architects, business analysts, and other stakeholders to understand integration requirements.
- Ensure best practices, security, and compliance in integration solutions.

Required Skills & Experience:
- 1-4 years of hands-on experience in TIBCO BWCE, TIBCO EMS, TIBCO ActiveSpaces, and Boomi API Gateway.
- Strong understanding of integration patterns and enterprise messaging.
- Experience with Docker and Kubernetes for containerized applications.
- Exposure to OpenShift Container Platform (OCP).
- Familiarity with CI/CD tools (e.g., Jenkins, Git, ArgoCD, Tekton).
- Strong debugging and troubleshooting skills.
- Experience in API development and integration (REST/SOAP).
- Knowledge of cloud platforms (AWS, Azure, or GCP) is a plus.
- Ability to work in an agile development environment.

Good to Have:
- Experience in cloud-based integration solutions.
- Knowledge of monitoring tools like Prometheus, Grafana, or the ELK stack.
- Exposure to DevOps methodologies.
- Certification in TIBCO technologies or cloud platforms.

Education & Certifications:
- Bachelor's or master's degree in Computer Science, Information Technology, or an equivalent degree.
- TIBCO certification is a plus.

Posted 2 months ago


7.0 - 8.0 years

6 - 10 Lacs

Bengaluru

Work from Office

Total Years of Experience: 8 years
Relevant Years of Experience: 7-8 years
Detailed JD: Performance tester with 7 to 8 years of experience and knowledge of tools such as JMeter, LoadRunner, and NeoLoad. Monitoring professional skilled in identifying bottlenecks. Good communication and stakeholder management. Should be at lead/architect level.
Mandatory Skills: JMeter, LoadRunner, monitoring tools knowledge
Work Location: Hyderabad

Posted 2 months ago


4.0 - 8.0 years

8 - 14 Lacs

Bengaluru

Work from Office

We are actively seeking an experienced Server Management Administrator (Linux with RedHat & Networking Troubleshooting) to join our team.

Job Title: Server Management Administrator (Linux with RedHat & Networking Troubleshooting)
Experience: 4 to 8 Years
Location: Pan India
Notice Period: Immediate Joiners

Job Summary: We are hiring a Server Management Administrator with expertise in Linux (RedHat) and Networking Troubleshooting to manage and maintain IT infrastructure. The role involves server administration, networking, security, and performance monitoring to ensure smooth operations.

Key Responsibilities:
- Linux Server Administration: Install, configure, and maintain RedHat-based servers.
- Networking Troubleshooting: Resolve network issues (DNS, DHCP, TCP/IP, Firewalls, VPNs, Load Balancers).
- System Monitoring & Security: Monitor performance, apply patches, and enhance security.
- User & Access Management: Manage accounts, roles, and permissions.
- Backup & Recovery: Implement data backup and recovery strategies.
- Virtualization & Cloud Exposure (Preferred): Experience with AWS, Azure, or GCP is a plus.
- Incident & Problem Management: Collaborate with IT teams to resolve issues.

Required Skills:
- Strong experience in RedHat Linux administration.
- Hands-on experience in network troubleshooting.
- Knowledge of security and compliance standards.
- Familiarity with automation tools (Ansible, Puppet, Shell scripting preferred).
- Exposure to virtualization (VMware, KVM) is a plus.

Qualifications:
- Bachelor's degree in IT, Computer Science, or a related field.
- Preferred Certifications: RHCE, RHCSA, CCNA.
- Immediate joiners preferred.

Apply now if you have the required skills.

Posted 2 months ago


3.0 - 8.0 years

3 - 8 Lacs

Chennai

Work from Office

Job Description:

Shift Timing: 5 PM IST - 2 AM IST

We are seeking a System Administrator to join our team. The ideal candidate will have a solid foundation in networking, Linux and Windows experience, and strong English communication skills. This role offers the opportunity to gain hands-on training in advanced monitoring tools, firewall management, and SSL certificate platforms.

Key Responsibilities:
- Monitor network infrastructure and services to ensure uptime and performance.
- Respond to alerts and escalate issues as per defined procedures.
- Perform Linux and Windows system checks and log analysis.
- Work closely with senior engineers and customer teams for issue resolution.
- Maintain documentation related to network incidents, changes, and monitoring procedures.
- Participate in regular training sessions to develop skills in the following areas:
  - BEST Monitoring Platform
  - End User Troubleshooting
  - Firewall Management
  - IT Infra Security
  - Incident Management
  - DigiCert SSL Certificate Management

Required Skills & Qualifications:
- Proficient English communication skills (both verbal and written).
- Basic hands-on experience with Linux (any distribution).
- Basic hands-on experience with Windows operating systems.
- Strong understanding of networking fundamentals, including:
  - TCP/IP, DNS, DHCP, Subnetting
  - Routers, switches, firewalls (concepts)
  - Network troubleshooting tools (ping, traceroute, netstat, etc.)
- Strong analytical and problem-solving abilities.
- Willingness to learn and work in a customer-focused environment.
- Flexible to work in shifts (Eastern Time Zone).

Preferred (Good to Have):
- Experience in using or exposure to any network monitoring tools.
- Basic understanding of firewall configurations and SSL certificates.
- Any certification such as CompTIA Network+, CCNA, or RHCSA is a plus.

Interested candidates can reach us at careers.tag@techaffinity.com

Posted 2 months ago


1.0 - 6.0 years

3 - 8 Lacs

Bengaluru

Work from Office

At Amazon, we strive to be Earth's most customer-centric company where people can find and discover anything they want to buy online. We hire the world's brightest minds, offering them an environment in which they can relentlessly improve the experience for customers. Innovation and creativity are built into the DNA of the company and are encouraged at all levels of employment. Every day we solve complex technical and business problems with ingenuity and simplicity. We're making history and the good news is we've only just begun.

Amazon, one of the top 100 companies in the United States, has an immediate opening for an IT Support Associate. IT Service Desk (SD) is the centralized IT support organization within OTS Global IT Delivery, located across America, Europe/Prague, and India. The team utilizes an omni-channel contact center to provide efficient, streamlined 24x7 IT support to Worldwide (WW) Operations (Ops) associates and internal/external support for Amazon Lockers. Overall, SD plays a critical role in ensuring the smooth functioning of Amazon sites globally and thereby has a direct impact on Amazon's ability to serve its customers on time.

Responsibilities include, but are not limited to:
1. Effective Communication Skills: Demonstrating proficiency in clear and concise communication. This role requires interaction with Amazon internal customers, including APAC/EMEA/AMER Operations, IT teams, and Customer Support.
2. Adherence to standard operating procedures (SOPs), which is fundamental to maintaining consistency and efficiency in daily operations.
3. Basic Knowledge of IT Troubleshooting on End-User Devices: Competence in resolving issues on various client devices, including desktops, laptops, printers, and scanners.
4. Basic understanding and troubleshooting skills on various operating systems, specifically Windows and Linux.
5. This position requires a flexible work schedule involving rotational shifts, providing real-time customer experience in a 24x7 operating environment.
6. Adherence to OTS Service Desk Goals: Meeting targets for Response and Resolution SLA and CSAT, and effectively managing incidents.
7. This role will be working from the Bangalore (BLR18) corporate office.

- Bachelor's degree
- Minimum of 1+ year of work experience
- Good communication skills
- Basic understanding of ITIL-based ticketing tools and monitoring tools
- Basic understanding and troubleshooting skills on networks

Posted 2 months ago


4.0 - 9.0 years

15 - 27 Lacs

Gurugram

Hybrid

Dear Candidate,

We have urgent openings for the Site Reliability Engineer role in Gurgaon.

Experience: 4 to 9 years
Notice Period: 60 days (max)
Location: Gurgaon (only local candidates can apply)
Interview Mode: Face-to-face
Shift: Yes
Mandatory Skill Set: Production support and monitoring tools

Job Description:
In this role, you will focus on optimising existing systems, building infrastructure, and eliminating work through automation. Additionally, you will monitor our systems' capacity and performance, enhancing the reliability and availability of software systems within a DevOps/SRE framework. You will collaborate with key stakeholders to investigate issues, analyse defects, implement solutions, and drive improvements. Furthermore, you will provide guidance to build automated solutions that help reduce toil and reserve capacity.

We love hearing from anyone inspired to build a better future with us. If you're excited about the role or working at Macquarie, we encourage you to apply.

- 4+ years of experience.
- Basic knowledge of Java, C, and C++ is required, along with familiarity with cloud technologies such as AWS, GCP, and Azure.
- Understanding of data processing pipelines is valuable, and a good grasp of DevOps tools like Git, Bitbucket, Jenkins, etc., is preferred.
- Good knowledge of SLOs/SLIs and error budgets.
- Experience with APM tools like Dynatrace, AppDynamics, DataDog, etc., and log monitoring tools such as Sumo Logic, Splunk, etc., is desirable.

Posted 2 months ago


3.0 - 6.0 years

10 - 18 Lacs

Gurugram

Hybrid

Interested candidates can apply directly via the link below: https://jobs.amdocs.com/careers/job/563431001831462

Required Technical Competencies:
- Working knowledge of Microsoft tools like Office, Word, and Excel.
- Working knowledge of an incident management tool like Jira, and of monitoring and log analysis tools like Splunk, Argos, Grafana, and SOAP UI, will be an advantage.
- ITIL/ITSM knowledge and certification would be an added advantage.
- Exposure to the telecom domain.
- Excellent communication skills.
- Willingness to learn and drive issues towards resolution.

Infrastructure Background:
- Experience in managing server deployments, ensuring server health, and monitoring certificate validity.
- Proficiency in configuration management and troubleshooting infrastructure-related issues.
- Strong understanding of log analysis using tools like Splunk, Argos, Grafana, or similar logging solutions.
- Ability to perform advanced triaging by analyzing logs to identify root causes of infrastructure issues.
- Proficiency in manual testing for rapid issue verification and basic sanity flow checks, utilizing tools like Postman/curl for API testing.
- Experience in working in ambiguous situations, under pressure, and with flexible work hours (across multiple time zones).

Required Behavioral Competencies:
- Effective Communication & Stakeholder Management: Ability to independently lead war-room discussions with multiple stakeholders and provide rapid, clear responses to customer queries.
- Adaptability & Resilience: Ability to work effectively in ambiguous situations, under pressure, and with flexible work hours.
- Sense of Urgency & Ownership: Production-oriented with a strong sense of urgency and sensitivity to production requirements.
- Analytical Thinking: Good analytical skills, coupled with the ability to systematically approach and resolve complex problems.
- Collaboration & Teamwork: Ability to work effectively within a team environment, fostering cooperation and knowledge sharing. Incident management often requires coordinated efforts across multiple teams.
- Proactive Learning & Continuous Improvement: Demonstrated commitment to learning from incidents, identifying areas for improvement, and implementing changes to prevent recurrence.
- Decision-Making & Judgment: Ability to make sound decisions under pressure, often with limited information. This includes prioritizing tasks and determining the best course of action.

Posted 2 months ago


7.0 - 10.0 years

13 - 18 Lacs

Hyderabad

Work from Office

Location: Hyderabad
Mode: Work From Office (WFO)
Experience: 7-10 years

Responsibilities:

AWS Expertise
- AWS Infrastructure: Design and manage AWS infrastructure using Infrastructure as Code (IaC) tools such as Terraform or AWS CloudFormation.
- Cloud Services: Implement and optimize AWS services, ensuring high availability, scalability, and security.
- Networking: Develop and manage networking architecture within AWS, including VPCs, subnets, and security groups.

CI/CD Pipeline
- CI/CD Implementation: Build and maintain robust CI/CD pipelines for automated software delivery and deployment.
- Containerization: Containerize applications using Docker and orchestrate with tools like Kubernetes.
- Monitoring and Automation: Implement monitoring solutions and automation to ensure system performance and reliability.

Security and Compliance
- Security Best Practices: Enforce security best practices and compliance standards within AWS environments.
- Security Audits: Conduct security audits and vulnerability assessments to proactively address potential risks.

Collaboration and Documentation
- Team Collaboration: Collaborate with cross-functional teams, including developers and system administrators, to streamline development and operations processes.
- Documentation: Maintain comprehensive documentation of DevOps processes, configurations, and architecture.

Requirements and Technical Skills:
- Bachelor's degree in Computer Science, Information Technology, or a related field (Master's preferred).
- 7+ years of professional experience in DevOps, with a strong focus on AWS.
- Expertise in AWS services and solutions, including EC2, S3, RDS, and VPC.
- Proficiency in Infrastructure as Code (IaC) tools like Terraform or AWS CloudFormation.
- Strong knowledge of containerization technologies (Docker, Kubernetes).
- Experience with CI/CD tools such as Jenkins, Travis CI, or GitLab CI/CD.
- Networking and security architecture experience within AWS.
- Familiarity with monitoring tools (e.g., Prometheus, Grafana) and automation frameworks.
- Excellent problem-solving and troubleshooting skills.
- Strong communication and interpersonal skills.

Posted 2 months ago


4.0 - 9.0 years

15 - 16 Lacs

Bengaluru

Work from Office

Job Description: Senior Engineer (OpenShift, Kafka, DevOps, ML)

Job Location: Hyderabad / Bangalore / Chennai / Kolkata / Noida / Gurgaon / Pune / Indore / Mumbai

The candidate will design, deploy, and sustain machine learning solutions at the edge, integrated with robust cloud infrastructure. You will be responsible for operationalizing ML models on OpenShift with REST APIs, enabling real-time insights and monitoring through Azure and observability tools, while leveraging CI/CD automation and data integration services.

Key Responsibilities:

Model Deployment & Management:
- Deploy and manage ML models on OpenShift as REST APIs with real-time managed endpoints.
- Utilize Microsoft Azure ML for experimentation, model training, versioning, and lifecycle management.
- Configure private endpoints and security protocols for compliance and data protection.
- Work with Confluent Kafka for real-time streaming and event-driven architectures at the edge and cloud.

Infrastructure as Code:
- Automate infrastructure provisioning using Terraform for consistent deployments across environments.

Observability & Monitoring:
- Implement observability using Managed Prometheus, Grafana, Azure Application Insights, and AppInsights for edge health, performance, and usage metrics.

DevOps & MLOps:
- Maintain CI/CD pipelines using Azure DevOps and ArgoCD for seamless model and infrastructure delivery.
- Monitor model performance post-deployment and handle model retraining as needed.

Required Skills and Experience:
- 5+ years of experience in ML/AI/DevOps engineering, including Edge deployment.
- Strong proficiency in OpenShift, Azure ML, and Terraform.
- Hands-on experience with Kafka, Snowflake, and Function Apps.
- Proven experience with CI/CD pipelines, preferably Azure DevOps and Argo.
- Good understanding of monitoring tools (Prometheus, Grafana, AppInsights).
- Experience in secure deployments and managing private endpoints in Azure.

At DXC Technology, we believe strong connections and community are key to our success. Our work model prioritizes in-person collaboration while offering flexibility to support wellbeing, productivity, individual work styles, and life circumstances. We're committed to fostering an inclusive environment where everyone can thrive.

Recruitment fraud is a scheme in which fictitious job opportunities are offered to job seekers, typically through online services such as false websites, or through unsolicited emails claiming to be from the company. These emails may request recipients to provide personal information or to make payments as part of their illegitimate recruiting process. DXC does not make offers of employment via social media networks, and DXC never asks for any money or payments from applicants at any point in the recruitment process, nor asks a job seeker to purchase IT or other equipment on our behalf. More information on employment scams is available here.

Posted 2 months ago


1.0 - 7.0 years

6 - 7 Lacs

Bengaluru

Work from Office

Job Description: Senior Data Engineer (Azure, Snowflake, ADF)

Job Location: Hyderabad / Bangalore / Chennai / Kolkata / Noida / Gurgaon / Pune / Indore / Mumbai

Key Responsibilities:

Data Integration & Orchestration:
- Integrate with Snowflake for scalable data storage and retrieval.
- Use Azure Data Factory (ADF) and Function Apps for orchestrating and transforming data pipelines.

Streaming & Messaging:
- 5+ years of experience in ML/AI/DevOps engineering, including Edge deployment.
- Strong proficiency in OpenShift, Azure ML, and Terraform.
- Hands-on experience with Kafka, Snowflake, and Function Apps.
- Proven experience with CI/CD pipelines, preferably Azure DevOps and Argo.
- Good understanding of monitoring tools (Prometheus, Grafana, AppInsights).
- Experience in secure deployments and managing private endpoints in Azure.

Posted 2 months ago


6.0 - 10.0 years

6 - 10 Lacs

Tiruchirapalli

Work from Office

Role Overview: We are seeking a Technical Product Manager to lead and manage the entire software product development lifecycle from concept to delivery. This role is hands-on and requires a strong engineering background in backend development and modern data technologies, with demonstrated experience in building and delivering complex software products. You will work closely with internal stakeholders, developers, QA, and DevOps teams to ensure each product is planned, developed, tested, and released with precision.

Key Responsibilities:

Project & Product Lifecycle Management
- Lead and manage the full product development lifecycle: planning, requirement gathering, validation, estimation, development, testing, and release.
- Collaborate with stakeholders to define product scope, technical feasibility, and delivery timelines.
- Conduct technical validation of requirements, helping guide architecture and technology decisions.
- Own project budgeting, resource allocation, and delivery tracking.
- Establish and manage sprint plans and task assignments, and ensure timely execution across development teams.

Engineering Oversight & Technical Leadership
- Provide technical leadership to the software development team in:
  - Node.js, Express.js, React.js, MongoDB, Redis
  - Time-series databases (e.g., OpenSearch, ClickHouse, or Cassandra); experience with any one is required
  - RESTful API development, WebSocket-based communication
- Basic understanding of AI/ML concepts and how they integrate into modern applications.
- Assist in code reviews, technical issue resolution, and performance optimization.
- Ensure architectural alignment with business and scalability goals.

Process Governance & Delivery Assurance
- Manage task tracking, sprint velocity, QA cycles, and release planning.
- Implement robust bug tracking, test coverage reviews, and UAT readiness.
- Oversee the successful delivery of software builds, ensuring they meet quality and timeline expectations.
- Prepare and maintain project documentation and release notes.

Stakeholder Communication & Reporting
- Serve as the single point of contact between engineering and leadership for project progress, blockers, and releases.
- Provide weekly progress reports, metrics, and risk escalations.
- Facilitate cross-functional communication with QA, DevOps, design, and support teams.

Required Qualifications (Must-Have Skills):
- 6-10 years of experience in software product development, including 3+ years in a product/project management or technical lead role.
- Strong hands-on experience in Node.js, Express.js, React.js, and MongoDB.
- Experience with at least one time-series database (OpenSearch, ClickHouse, or Cassandra).
- Solid understanding of RESTful APIs, WebSocket protocols, and microservice development.
- Familiarity with core AI/ML concepts and integration patterns in modern applications.
- Proven success in delivering at least two software products end-to-end to enterprise or mid-market clients.
- Strong understanding of Agile/Scrum, sprint planning, backlog grooming, and release cycles.

Preferred Skills:
- Experience in building SaaS-based platforms, monitoring tools, or infrastructure management products.
- Familiarity with cloud hosting environments (AWS, GCP, Azure) and DevOps practices (CI/CD pipelines, Docker/K8s).
- Exposure to observability stacks, log monitoring, or AI/MLOps products.
- Working knowledge of QA automation and performance testing tools.

Key Attributes:
- Strong ownership and execution mindset.
- Ability to balance technical depth with product vision.
- Excellent communication, task management, and stakeholder coordination skills.
- Comfortable working in fast-paced, evolving product environments.

Posted 2 months ago

Apply

1.0 - 3.0 years

3 - 5 Lacs

Coimbatore

Work from Office

Position: L1 Support Engineer Experience : Knowledge about IT support, with a focus on ticket handling using Zoho or similar ticketing tools (e.g., ServiceNow, Jira). Practical experience with Datadog for monitoring applications, servers, databases, or networks. Familiarity with IT infrastructure components, including servers (Windows/Linux), databases (SQL/NoSQL), and networking (TCP/IP, DNS, etc.). Technical Skills : Proficiency in using Datadog to analyze logs, metrics, and alerts for issue identification and resolution. Hands-on experience with the Zoho ticketing tool for logging, tracking, and resolving support tickets. Basic understanding of IT infrastructure troubleshooting and monitoring.

Posted 2 months ago

Apply

3.0 - 8.0 years

5 - 10 Lacs

Kolkata, Mumbai, New Delhi

Work from Office

Job Summary: We are seeking a Monitoring and Observability Engineer (L1) to join our IT Operations team. The ideal candidate will be responsible for monitoring the health and performance of IT systems, responding to alerts, managing service queues, and assisting with the onboarding and offboarding processes using tools such as Zabbix, Nagios, and ManageEngine. This is a critical role in ensuring the availability, reliability, and smooth functioning of our IT infrastructure. Key Responsibilities: Monitoring & Alerting: Continuously monitor system performance, including servers, applications, and network devices using Zabbix, Nagios, and ManageEngine. Respond to system alerts and notifications promptly, prioritizing issues based on severity. Collaborate with other teams to escalate and resolve issues as necessary. Notification & Escalation: Notify appropriate stakeholders about service interruptions or performance degradation. Manage and escalate alerts based on predefined escalation procedures. Queue Management: Monitor and manage service request queues to ensure issues are logged and tracked. Ensure that all tickets are resolved within the agreed-upon Service Level Agreements (SLAs). Follow up on open tickets and provide timely updates on ticket statuses. Onboarding/Offboarding: Assist with the onboarding process for new users and systems, ensuring that they are properly configured for monitoring. Support offboarding activities by ensuring systems or accounts are properly decommissioned and removed from monitoring tools. Collaboration & Communication: Work closely with IT teams, network teams, and other departments to maintain system uptime and resolve performance issues. Provide regular updates to stakeholders regarding the status of incidents and requests. Contribute to continuous improvement of monitoring systems and processes. Skills & Qualifications: Technical Skills: Proficiency in Zabbix, Nagios, and ManageEngine monitoring tools. 
Basic understanding of IT infrastructure (servers, networking, applications, databases, etc.). Knowledge of monitoring metrics (CPU, memory, disk space, network traffic) and alerts. Familiarity with networking protocols (TCP/IP, HTTP, DNS, etc.). Experience: 3+ years of experience in a monitoring or IT support role is preferred. Experience in managing monitoring tools or observability platforms is a plus. Communication Skills: Strong verbal and written communication skills to notify stakeholders of issues and document incidents. Problem-Solving: Strong analytical skills to diagnose issues based on alerts and system performance. Ability to work in a fast-paced environment and manage multiple priorities effectively. Additional Skills: Familiarity with incident management tools and processes. Experience with scripting or automation tools for monitoring processes is a plus. Preferred Experience: Experience in IT operations, monitoring, or helpdesk support. Familiarity with additional monitoring or observability tools. Exposure to incident response and ITIL processes. Join the Cloud4c Talent Community: If you're looking for a place that elevates creativity with humanity, work that is as innovative as it is fun, and people who lead with both head and heart, you've found it and our doors are open for you. Click to register with our Talent Community. We'll keep your information and reach out to you when we post opportunities in the future that might be a fit.

Posted 2 months ago

Apply

1.0 - 4.0 years

4 - 7 Lacs

Bengaluru

Work from Office

We are hiring experienced Performance Testers with deep expertise in JMeter to evaluate and optimize the performance of web applications and APIs. If you're passionate about identifying bottlenecks and enhancing system scalability, we want you on our team. Responsibilities: Design and execute performance test plans using Apache JMeter. Simulate user loads and real-world usage scenarios. Analyze test outcomes and pinpoint performance bottlenecks. Collaborate with development and infrastructure teams for issue resolution. Integrate performance testing into CI/CD pipelines. Prepare clear performance analysis and test summary reports. Log and track performance issues using Jira. Qualifications: Hands-on experience with Apache JMeter for load and stress testing. Strong understanding of testing types: load, stress, soak, and spike. Familiarity with monitoring tools like Grafana, Dynatrace, or AppDynamics. Ability to analyze metrics like response time, throughput, and resource usage. Experience using Jira and test management tools like TestRail or Zephyr. Good communication, analytical thinking, and attention to detail.

Posted 2 months ago

Apply

10.0 - 20.0 years

5 - 13 Lacs

Pune

Work from Office

Role & responsibilities: We require a performance tester with 4+ years of experience. Conducting capacity planning. Preparing impact assessments for new project requirements. Workload modeling, Gatling scripting, and scenario creation. Stub creation and deployment. CI/CD using Harness and Jenkins. Performance testing (load and soak testing). Cloud services: Amazon ElastiCache, RDS, ECS, EKS, CloudFormation. Performance monitoring tools: AppDynamics, Grafana, Splunk, and AWS CloudWatch. Analysis and reporting. Conduct reviews and grant approval for impact assessments and test summary reports from other performance test engineers. Version control: Git, GitHub. Confluence, Rally, and Control center. Interested candidates can share their updated resume at recruiter.wtr26@walkingtree.in

Posted 2 months ago

Apply

1.0 - 4.0 years

3 - 6 Lacs

Bengaluru

Work from Office

About this opportunity: We are currently seeking an innovative and dedicated Automated Operations Engineer to join our team at Ericsson. The role carries significant responsibility as you will be leading the coordination, support, and execution of 1st Level proactive and reactive maintenance activities. This is integral to ensuring that services provided to our valued customers are consistently available and performing to the highest standards, in alignment with our Service Level Agreement (SLA). If you are passionate about continuous improvement and delivering superior service, we would love to hear from you. What you will do: - Engage in 1st Level Service Monitoring and Event Management. - Manage Service and Resource Alarm Handling. - Contribute to Resource and Service Performance Monitoring. - Oversee Security Event Monitoring. - Facilitate Incident Identification. - Support Capacity and Performance Investigations. The skills you bring: - Bachelor's degree in IT, Telecommunications, or a related engineering field. - 1-4 years of hands-on experience in network support, troubleshooting, and alarm monitoring. - Strong problem-solving skills with a customer-centric approach. - Good communication skills to coordinate with internal teams, vendors, and customers. - Experience with network monitoring tools and ticketing systems. - Understanding of telecom infrastructure and hardware components. - Ability to work under pressure and manage multiple incidents simultaneously.

Posted 2 months ago

Apply

5.0 - 10.0 years

13 - 15 Lacs

Chennai

Work from Office

Ford is seeking an experienced Site Reliability Engineer (SRE) to join our team and lead the development, enhancement, and extension of our global monitoring and observability platform. Our Site Reliability Engineering (SRE) team enables modernization by providing robust SRE standards, IaC, monitoring tools powered by AI, and easy-to-use dashboards. The resulting transparency of end-to-end performance provides a better view into how teams can proactively manage reliability and strategically apply automation. As an SRE, your role will combine software engineering and systems engineering disciplines to ensure that software systems are available, scalable, and maintainable. This individual will play a pivotal role in shaping the evolving needs of our customers, including development of Service Level Indicators and Objectives (SLI/SLO), best practices with associated templates, as well as automation to remove toil and facilitate adoption. Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field, or a combination of education and equivalent work experience. 5+ years of experience with Golang, Python, Java, NoSQL/SQL Datastore, Spring Boot. 5+ years of experience with any APM and other monitoring tools such as Grafana Cloud, Dynatrace, New Relic, ELK, Splunk, Prometheus, Kafka, DataDog, PagerDuty. 3+ years of GCP, AWS, or Azure experience. 3+ years of experience maintaining, developing, and supporting multi-tier production applications. Experience with automated testing, unit/integration/load and/or test-driven development. Understanding of gRPC & RESTful APIs, and microservices platforms. Strong experience with establishing error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and effectively driving the use of the budget to ensure maximum domain availability/uptime. 
Experience in solving complex architecture/design & business problems, working to simplify, optimize, remove bottlenecks, etc. Strong background in software development and systems administration, as well as excellent problem-solving and communication skills. Partner with and guide development teams in SRE best practices to improve reliability, MTTR/MTTD, quality, and time-to-market of our suite of software solutions across Ford. Collaborate with development teams to design, build, and operate scalable and resilient software systems. Guide partner teams in setting appropriate SLOs, leveraging distributed tracing, developing effective dashboards and custom metrics, etc. Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve our resilience as an enterprise. Identify, reduce, and eliminate toil via automation to maximize our partner development teams' time spent on engineering and innovation. Perform root cause analysis of production incidents and implement preventive measures. Enable/guide partner teams to regularly review key site technical metrics such as transaction errors, logging, response times, caching strategies, capacity & resource utilization. Maintain a knowledge repository that includes standard operating procedures, SRE best practices & guides, release checklists, etc.

Posted 2 months ago

Apply

1.0 - 4.0 years

2 - 6 Lacs

Bengaluru

Work from Office

We are currently seeking an innovative and dedicated Automated Operations Engineer to join our team at Ericsson. The role carries significant responsibility as you will be leading the coordination, support, and execution of 1st Level proactive and reactive maintenance activities. This is integral to ensuring that services provided to our valued customers are consistently available and performing to the highest standards, in alignment with our Service Level Agreement (SLA). If you are passionate about continuous improvement and delivering superior service, we would love to hear from you. What you will do: - Engage in 1st Level Service Monitoring and Event Management. - Manage Service and Resource Alarm Handling. - Contribute to Resource and Service Performance Monitoring. - Oversee Security Event Monitoring. - Facilitate Incident Identification. - Support Capacity and Performance Investigations. The skills you bring: - Bachelor's degree in IT, Telecommunications, or a related engineering field. - 1-4 years of hands-on experience in network support, troubleshooting, and alarm monitoring. - Strong problem-solving skills with a customer-centric approach. - Good communication skills to coordinate with internal teams, vendors, and customers. - Experience with network monitoring tools and ticketing systems. - Understanding of telecom infrastructure and hardware components. - Ability to work under pressure and manage multiple incidents simultaneously.

Posted 2 months ago

Apply

0.0 - 2.0 years

2 - 3 Lacs

Gurugram

Work from Office

Profile Summary: We are seeking a detail-oriented and proactive System Monitoring Executive to oversee employee system activities using tracking software, maintain daily system logs, and ensure compliance with IT and organizational policies. The role involves monitoring user behaviour, identifying unusual patterns, and preparing structured reports for management review. Key Roles and Responsibilities: Monitor employee system activities using a mobile-friendly tracking app. Maintain a structured database of daily system logs and user activity. Identify irregularities or non-compliance in system usage. Generate regular reports for management and escalate issues if needed. Ensure confidentiality and integrity of monitoring data. Collaborate with HR/IT to support compliance and productivity goals. Must have basic technical knowledge and familiarity with monitoring tools. Knowledge and Skills Required: Education: Bachelor's or Master's degree completed. Experience: 6 months to 2 years of hands-on experience with employee/system monitoring tools (e.g., Handy, etc.). Proficiency in MS Excel, report preparation, IT systems, basic troubleshooting, and user behaviour analytics. Excellent communication (written & verbal). Interpersonal and problem-solving skills. Strong analytical and observation skills with attention to detail. High level of discretion, integrity, and confidentiality. Ability to work independently and proactively.

Posted 2 months ago

Apply

0.0 - 8.0 years

13 - 15 Lacs

Chennai

Work from Office

Enterprise Technology plays a critical part in shaping the future of mobility. If you're looking for the chance to leverage advanced technology to redefine the transportation landscape, enhance the customer experience, and improve people's lives, this is the opportunity for you. Join us and challenge your IT expertise and analytical skills to help create vehicles that are as smart as you are. Ford is seeking an experienced Site Reliability Engineer (SRE) to join our team and lead the development, enhancement, and extension of our global monitoring and observability platform. Our Site Reliability Engineering (SRE) team enables modernization by providing robust SRE standards, IaC, monitoring tools powered by AI, and easy-to-use dashboards. The resulting transparency of end-to-end performance provides a better view into how teams can proactively manage reliability and strategically apply automation. As an SRE, your role will combine software engineering and systems engineering disciplines to ensure that software systems are available, scalable, and maintainable. This individual will play a pivotal role in shaping the evolving needs of our customers, including development of Service Level Indicators and Objectives (SLI/SLO), best practices with associated templates, as well as automation to remove toil and facilitate adoption. Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field, or a combination of education and equivalent work experience. 5+ years of experience with Golang, Python, Java, NoSQL/SQL Datastore, Spring Boot. 5+ years of experience with any APM and other monitoring tools such as Grafana Cloud, Dynatrace, New Relic, ELK, Splunk, Prometheus, Kafka, DataDog, PagerDuty. 3+ years of GCP, AWS, or Azure experience. 
3+ years of experience maintaining, developing, and supporting multi-tier production applications. Experience with automated testing, unit/integration/load and/or test-driven development. Understanding of gRPC & RESTful APIs, and microservices platforms. Strong experience with establishing error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and effectively driving the use of the budget to ensure maximum domain availability/uptime. Experience in solving complex architecture/design & business problems, working to simplify, optimize, remove bottlenecks, etc. Strong background in software development and systems administration, as well as excellent problem-solving and communication skills. Partner with and guide development teams in SRE best practices to improve reliability, MTTR/MTTD, quality, and time-to-market of our suite of software solutions across Ford. Collaborate with development teams to design, build, and operate scalable and resilient software systems. Guide partner teams in setting appropriate SLOs, leveraging distributed tracing, developing effective dashboards and custom metrics, etc. Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve our resilience as an enterprise. Identify, reduce, and eliminate toil via automation to maximize our partner development teams' time spent on engineering and innovation. Perform root cause analysis of production incidents and implement preventive measures. Enable/guide partner teams to regularly review key site technical metrics such as transaction errors, logging, response times, caching strategies, capacity & resource utilization. Maintain a knowledge repository that includes standard operating procedures, SRE best practices & guides, release checklists, etc.

Posted 2 months ago

Apply

8.0 - 12.0 years

14 - 16 Lacs

Bengaluru

Work from Office

Job Description: 8-12 years of experience in .NET technologies. Hands-on service design, schema design, and application integration design. Hands-on software development using C# and .NET Core. Use of multiple cloud-native database platforms including DynamoDB, SQL, ElastiCache, and others. Hands-on application design for high availability and resiliency. Hands-on problem resolution across a multi-vendor ecosystem. Conduct code reviews and peer reviews. Unit testing and unit test automation, defect resolution, and software optimization. Actively engaged with Client IT and Client Business during daily work sessions. Code deployment using CI/CD processes. Contribute to each step of the development process from ideation to implementation to release, including rapid prototyping, running A/B tests, continuous integration, automated testing, and continuous delivery. Understand business requirements and technical limitations. Ability to learn new technologies and influence the team and leadership to constantly implement modern solutions. Experience in using the Elasticsearch, Logstash, Kibana (ELK) stack for logging and analytics. Experience in container orchestration using Kubernetes. Knowledge of and experience working with public cloud AWS services. Knowledge of cloud architecture and design patterns. Ability to prepare documentation for microservices. Monitoring tools such as Datadog, Logstash. Excellent communication skills. Airline industry knowledge is preferred but not required. At DXC Technology, we believe strong connections and community are key to our success. Our work model prioritizes in-person collaboration while offering flexibility to support wellbeing, productivity, individual work styles, and life circumstances. We're committed to fostering an inclusive environment where everyone can thrive. 
Recruitment fraud is a scheme in which fictitious job opportunities are offered to job seekers, typically through online services such as false websites, or through unsolicited emails claiming to be from the company. These emails may request recipients to provide personal information or to make payments as part of their illegitimate recruiting process. DXC does not make offers of employment via social media networks, and DXC never asks for any money or payments from applicants at any point in the recruitment process, nor asks a job seeker to purchase IT or other equipment on our behalf. More information on employment scams is available here.

Posted 2 months ago

Apply
