8.0 - 13.0 years
12 - 22 Lacs
Noida, Gurugram, Delhi / NCR
Work from Office
Role & responsibilities

The position
We are seeking a highly skilled and motivated individual to join our team as Lead - Monitoring. As the Lead Monitoring, you will play a crucial role in overseeing and optimizing our systems and networks. Your responsibilities will include monitoring the performance metrics of our IT infrastructure. Additionally, you will lead troubleshooting efforts, identify and resolve system issues, and implement proactive measures to minimize downtime and disruptions. This role requires a keen eye for detail, strong analytical skills, and the ability to collaborate effectively with technical teams to implement solutions and improve overall system performance.

Roles and Responsibilities
Monitor System Performance: Oversee the monitoring of system performance metrics, including uptime, response times, and resource utilization, using monitoring tools such as Nagios, Microsoft SCOM, Site24X7, and other third-party tools, to ensure optimal performance and availability.
Troubleshooting and Issue Resolution: Lead the identification, troubleshooting, and resolution of system issues, working closely with technical teams to implement solutions and minimize downtime.
Capacity Planning: Develop and implement capacity planning strategies to forecast future resource needs and optimize system scalability and performance.
Incident Response: Develop and maintain incident response protocols and procedures, including escalation paths and response timelines, to address system outages and critical incidents promptly.
Monitoring Tools Management: Evaluate, select, and manage monitoring tools and technologies to support efficient and effective monitoring of systems, networks, and applications.
Performance Analysis: Conduct performance analysis and trend analysis to identify potential bottlenecks, areas for improvement, and optimization opportunities.
Cloud Monitoring: Implement and manage cloud monitoring solutions for platforms such as Azure and AWS, ensuring visibility into cloud-based resources, performance metrics, and cost optimization strategies. Monitor cloud infrastructure, services, and applications to identify and resolve issues proactively.
Synthetic Monitoring: Design and implement synthetic monitoring solutions to simulate user interactions and transactions across applications, websites, and services. Analyze synthetic monitoring data to identify performance bottlenecks and optimize user experience.
Documentation and Reporting: Maintain accurate documentation of monitoring processes, configurations, and incident reports. Generate regular reports on system performance, uptime, and incident resolution metrics.
KPIs and Dashboards: Develop and publish key performance indicators (KPIs), dashboards, and other reporting mechanisms to provide insights into system performance, trends, and areas for improvement. Present findings and recommendations to stakeholders and management.
Team Leadership: Provide leadership and guidance to monitoring team members, fostering a culture of collaboration, continuous improvement, and excellence in monitoring practices.

Skills
Bachelor's degree in Computer Science, Information Technology, or a related field. (Master's degree preferred)
Proven experience (5+ years) in system monitoring, performance analysis, and incident response, preferably in a lead or supervisory role.
Strong technical expertise in monitoring tools such as Nagios, Microsoft SCOM, Site24X7, and other third-party tools.
Solid understanding of network protocols, server infrastructure, and cloud environments (e.g., AWS, Azure), with experience in cloud monitoring, synthetic monitoring, and optimization.
Experience with scripting languages (e.g., Python, PowerShell) for automation and monitoring tasks.
Excellent analytical, problem-solving, and decision-making skills.
Strong leadership, communication, and team collaboration abilities.
Experience in publishing KPIs, dashboards, and other reporting mechanisms.
Good to have: Relevant certifications such as ITIL Foundation, Certified Monitoring Professional (CMP), Microsoft Certified: Azure Administrator Associate, and Microsoft Certified: SCOM (if available) are a plus.

Interested candidates please apply on the given link https://apply.workable.com/ezrecruiting/j/1217C832C7/
Posted 1 week ago
3.0 - 6.0 years
6 - 15 Lacs
Bengaluru
Hybrid
Observability Engineer:
Define and implement new monitoring definitions following best practices.
Focus on infrastructure monitoring (mandatory); application stack monitoring is a plus.
Tune monitoring definitions to reduce operational noise.
Experience working with SolarWinds and ServiceNow ITOM AIOps.

ITOM AIOps Event Management:
Tune monitoring policies and event rules.
Optimize Operational Intelligence configurations.
Optimize architecture and usage of MID Servers for monitoring.
Configure Health Log Analytics; recommend appropriate logging sources to enhance monitoring and detect change-related alerts.
Configure Agent Client Collector (ACC) monitoring, aiming to replace SolarWinds for server monitoring.
Develop and configure Service and Configuration Item (CI) binding rules for monitoring.

Develop automated alert response mechanisms:
Fully automated responses for automatic alert resolution.
Playbook actions in ServiceNow triggered manually by agents.
Automation can be implemented via scripts, ServiceNow OOTB/custom responses, or Swimlane workflows.

Duration: 3-6 months contract followed by C2H (Contract to Hire)
Note: Candidates unwilling to convert after 3-6 months will not be considered.
Shift Timings: 5 PM to 2 AM IST
Posted 1 week ago
2.0 - 5.0 years
3 - 6 Lacs
Kochi, Trivandrum
Work from Office
Administer and maintain Windows/Linux servers; perform installations, updates, and troubleshooting.
Manage virtualization platforms (VMware/Hyper-V) and support cloud environments (AWS, Azure, GCP).
Provide network and infrastructure support including routers, switches, firewalls, and VPNs.
Ensure IT security through regular audits, monitoring, and incident response using tools like Nagios/Zabbix.
Deliver Level 2/3 technical support and maintain detailed system documentation.
Collaborate with cross-functional teams for infrastructure planning and deployments.
Willing to travel to Cochin or Trivandrum 1-2 times per month for on-site support as required.

Candidate Profile
2-5 years of experience as a System Engineer.
Strong knowledge of system administration, troubleshooting, and network support.
Experience with Windows/Linux servers, virtualization, and cloud services.
Hands-on experience with IT security and infrastructure monitoring.
Excellent problem-solving skills and the ability to work in a fast-paced environment.
Willingness to travel to the other location (Cochin or Trivandrum) once or twice a month, depending on the work location.
Posted 3 weeks ago
1.0 - 6.0 years
3 - 8 Lacs
Bengaluru
Work from Office
We are looking for an Ansible Engineer to automate infrastructure deployment and configuration management. Design and implement automation solutions using Ansible playbooks and roles. Develop infrastructure-as-code (IaC) solutions for provisioning and managing servers. Automate cloud environments (AWS, Azure, GCP) using Ansible and Terraform. Manage configuration drift, security compliance, and infrastructure monitoring. Collaborate with DevOps and security teams for automation initiatives.
Posted 3 weeks ago
1 - 5 years
3 - 8 Lacs
Bengaluru
Work from Office
We are looking for an Ansible Engineer to automate infrastructure deployment and configuration management. Design and implement automation solutions using Ansible playbooks and roles. Develop infrastructure-as-code (IaC) solutions for provisioning and managing servers. Automate cloud environments (AWS, Azure, GCP) using Ansible and Terraform. Manage configuration drift, security compliance, and infrastructure monitoring. Collaborate with DevOps and security teams for automation initiatives.
Posted 1 month ago
1 - 6 years
3 - 8 Lacs
Bengaluru
Work from Office
We are looking for an Ansible Engineer to automate infrastructure deployment and configuration management. Design and implement automation solutions using Ansible playbooks and roles. Develop infrastructure-as-code (IaC) solutions for provisioning and managing servers. Automate cloud environments (AWS, Azure, GCP) using Ansible and Terraform. Manage configuration drift, security compliance, and infrastructure monitoring. Collaborate with DevOps and security teams for automation initiatives.
Posted 2 months ago
2 - 7 years
5 - 10 Lacs
Gurgaon
Work from Office
Title: Sr. EMS Analyst
Location: Gurgaon, India

Job Description
Who We Are: Fareportal is a travel technology company powering a next-generation travel concierge service. Utilizing its innovative technology and company owned and operated global contact centers, Fareportal has built strong industry partnerships providing customers access to over 600 airlines, a million lodgings, and hundreds of car rental companies around the globe. With a portfolio of consumer travel brands including CheapOair and OneTravel, Fareportal enables consumers to book online, on mobile apps for iOS and Android, by phone, or live chat. Fareportal provides its airline partners with access to a broad customer base that books high-yielding international travel and add-on ancillaries. Fareportal is one of the leading sellers of airline tickets in the United States. We are a progressive company that leverages technology and expertise to deliver optimal solutions for our suppliers, customers, and partners.

FAREPORTAL HIGHLIGHTS:
Fareportal is the number 1 privately held online travel company in flight volume.
Fareportal partners with over 600 airlines, 1 million lodgings, and hundreds of car rental companies worldwide.
2019 annual sales exceeded $5 billion.
Fareportal sees over 150 million unique visitors annually to our desktop and mobile sites.
Fareportal, with its global workforce of over 2,600 employees, is strategically positioned with 9 offices in 6 countries and headquartered in New York City.

Job Description and Responsibilities:
Monitor the health of servers, network, applications, websites, and APIs on a 24x7 basis and report alarms using network, systems, application, and website monitoring tools.
Hands-on experience with enterprise monitoring tools such as MS SCOM, SolarWinds, AppInsight, Elastic, Grafana, and Prometheus, along with SaaS-based monitoring solutions such as Rigor, WebSitePulse, and Catchpoint.
Should have a clear understanding of public cloud (AWS and Azure) and its monitoring solutions, such as Datadog, CloudWatch, and AppInsight.
Identify, diagnose, and resolve issues.
Create and maintain comprehensive documentation.
Strong problem-solving and troubleshooting skills.
Monitor performance, capacity, and availability of the IT components on an ongoing basis.
Recommend improvements in technologies and practices to increase uptime.
Collect and review performance reports for various systems, and report trends in network, server, application, and website performance.
Provide timely response to all incidents, outages, and performance alerts.
Categorize issues for escalation to appropriate technical teams.
Should have a clear understanding of Incident Management; experience with P1/P2 incidents is mandatory.
Should have a clear understanding of Change Management as well as Problem Management.
Ensure timely follow-up with cross-functional teams via e-mail, phone calls, and Slack.
Willing to do rotational shifts, 24x7.

Required Skills & Qualifications:
2+ years of experience in enterprise monitoring tools, application and server monitoring, and troubleshooting, or a similar role.
Work experience on Windows Servers, SQL, and network equipment.
Familiarity with scripting, network security, firewalls, or Linux environments.
Bachelor’s Degree/Diploma in Computer Science, Information Systems, Engineering, Business, or a technical discipline.

Preferred Skills & Qualifications:
Willing to do rotational shifts, 24x7.
Strong problem-solving and troubleshooting skills.
Aptitude for learning new technologies, interest in professional development.
Good technical skills in networks, firewalls, and servers.
Good communicator with a natural aptitude for dealing with people.
Should be a team player.
Quick learner and able to deal with a wide range of issues.
Good analytical skills and able to collate and interpret data from various sources.
Ability to assess and prioritize faults and respond or escalate accordingly.
Disclaimer This job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee. Fareportal reserves the right to change the job duties, responsibilities, expectations or requirements posted here at any time at the Company’s sole discretion, with or without notice.
Posted 2 months ago
5 - 8 years
12 - 15 Lacs
Pune, Mumbai (All Areas)
Work from Office
Experience in New Relic: strong experience with APM, Infrastructure Monitoring, Synthetics, and NRQL for custom dashboards and alerts. Experience installing and configuring New Relic agents (e.g., Java, .NET, Node.js) and integrating with cloud platforms such as Azure and others.
Required Candidate profile: 5+ years of relevant experience in application monitoring. Experience automating New Relic configurations using tools like Terraform and creating scripted Synthetic monitors. Experience in AP integration, Indexer, and Forwarder is a must.
Posted 3 months ago