1633 Grafana Jobs - Page 27

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

10.0 - 15.0 years

12 - 17 Lacs

Gurugram, India

Work from Office

Position Summary: A Senior Full Stack Software Developer designs, develops, and maintains both front-end and back-end systems for scalable, secure, and high-performance web applications. They lead technical projects, mentor junior developers, and ensure best practices across the development lifecycle.

How You’ll Make an Impact (responsibilities of role):
- Build front-end (React, Angular, Vue.js) and back-end (Node.js, Python, Java) systems.
- Design and optimize databases (SQL, NoSQL) and APIs (REST, GraphQL).
- Implement cloud solutions (AWS, Azure) and DevOps tools (Docker, Kubernetes).
- Write clean, maintainable code and ensure testing (unit, integration, CI/CD).
- Collaborate with teams and provide technical leadership.

What You Bring (required qualifications and skills)

Must-Have Qualifications:
- Education: Bachelor’s/master’s in computer science or related fields.
- 5–10+ years of professional experience in software development with a broad and deep understanding of modern systems
- Strong DevOps mindset and hands-on experience with Docker, VMs, and container orchestration
- At least one cloud platform (AWS, Azure, or GCP)
- CI/CD pipelines and Git-based workflows
- Infrastructure as Code (e.g., Terraform, Pulumi)
- Solid networking fundamentals (DNS, routing, firewalls, etc.)
- Proven experience in API design, data modeling, and authn/authz mechanisms such as OAuth2, OIDC, or similar
- Comfortable with backend development in at least one modern language (Go, Rust, C#, or similar)
- Strong frontend development skills using modern frameworks (React, Angular, Vue, or Web Components)
- Good understanding of design systems, CSS, and responsive UI
- Ability to learn new languages and tools quickly and independently
- Experience working in cross-functional teams and agile environments

Should-Have Qualifications:
- Contributions to or experience with open-source projects
- Cross-disciplinary understanding of UX/UI design principles
- Familiarity with testing frameworks and quality assurance practices
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana)

Nice-to-Have:
- Experience with hybrid or distributed architecture
- Exposure to WebAssembly, micro frontends, or edge computing
- Background in security best practices for web and cloud applications

Posted 3 weeks ago

9.0 - 14.0 years

37 - 45 Lacs

Pune

Work from Office

Job Title: IT Application Owner - AVP
Location: Pune, India

Role Description: At the Service, Solutions and AI Domain, our mission is to revolutionize our Private Bank process landscape by implementing holistic, front-to-back process automation. We are committed to enhancing efficiency, agility, and innovation, with a keen focus on aligning every step of our processes with the customers' needs and expectations. Our dedication extends to driving innovative technologies, such as AI & workflow services, to foster continuous improvement. We aim to deliver best-in-class solutions across products, channels, brands, and regions, thereby transforming the way we serve our customers and setting new benchmarks in the industry.

The IT Application Owner (ITAO) is the custodian of the application and is responsible for applying and enabling the IT policies and procedures throughout the life cycle of the application, with specific consideration for IT management and Information Security. The ITAO ensures a clear separation of responsibility within the project, aimed at achieving safe and secure running of the application and compliance with regulations, policies and standards. The ITAO is responsible for application documentation, application infrastructure reliability and compliance, and is usually the IT SPOC for audit initiatives. Join us in our journey to redefine banking with AI and service solutions into the future.

What we'll offer you:
- 100% reimbursement under childcare assistance benefit (gender neutral)
- Sponsorship for industry-relevant certifications and education
- Accident and term life insurance

Your key responsibilities:
- Ensure Application Stability and Performance: Oversee the structural availability, stability, and performance of applications in your domain (PRD, UAT, SIT).
- Organize Level 3 Support and align/organize Level 2 Support: Collaborate with development teams to organize Level 3 support for the application and align with Service Operations (SO) to organize Level 2 support (or set up L2 support if SO cannot provide it).
- Policy Compliance: Ensure policy compliance and take ownership of projects necessary for compliance, such as security monitoring.
- Technical Roadmap Management: Manage the technical roadmap, including technology compliance, and estimate/budget capacity needs.
- AI Risk Management: Identify and proactively manage risks generated from AI usage in the bank, ensuring responsible AI practices are followed.
- Project Participation: Participate in Domain Expert Meetings and contribute to project cost estimates, including "Run the Bank" and total application cost.
- Define Non-Functional Requirements: Ensure project teams incorporate non-functional requirements in their projects.
- Validate Deliverables: Validate deliverables for all projects/changes, such as test plans and analysis documents.
- Knowledge and Documentation: Ensure the availability of all necessary application/service knowledge and documentation.
- Audit Collaboration: Work closely with Audit Teams to avoid delays or escalations related to non-compliance.
- Infrastructure Responsibility: Take responsibility for access and other infrastructure-related topics.
- Stakeholder Engagement: Engage and manage multiple stakeholders to ensure regulatory compliance, smooth operations, and a sound development lifecycle. This includes Business, Security, Development, Test (QA), IT Support, Finance, external Vendors, and Architecture.
- Compliance with IT Policies: Ensure the application is compliant with the company's IT policies based on regulatory requirements.
- Service Availability and Stability: Be accountable for high service availability and stability, while managing projects for maintenance or enhancements.
- DevOps Facilitation: Facilitate a DevOps approach by setting up monitoring, configuring deployment-automation tools, preparing software packages, raising and implementing changes, and managing certificates and software licenses.
- Technical Performance Monitoring: Monitor the technical performance of applications (response times, error rates, memory/storage usage) and address issues.
- Strategic Planning: Conduct strategic planning for the applications in scope.
- Change Management: Ensure changes to applications are fully aligned with DB standards and regulations, guaranteeing system stability and smooth transitions to production.
- Technical Project Management: Manage technical projects to maintain required service levels.
- Go Live Transitions: Contribute to Go Live transitions.
- Operational Collaboration: Collaborate with Support entities to ensure proper operational levels for the application.
- Capacity Management: Follow up on infrastructure capacity management.

Your skills and experience:
- ITIL Framework: Knowledge of and certification in the ITIL framework.
- IT Service Management and Cloud Technologies: Experience in IT Service Management processes and cloud technologies.
- Educational Background: A bachelor's degree in computer science or equivalent.
- Distributed Development Teams: Experience working with distributed development teams, especially between Europe (Germany and Romania) and India, and familiarity with the Software Development Life Cycle (SDLC).
- Communication Skills: Excellent written and verbal communication skills at all levels, including senior management.
- Audit and Compliance: Experience with audit and compliance, AI/ML ethics & regulation, continuous integration, and DevOps tools.
- Cloud Knowledge: High-level knowledge of Cloud (IaaS, PaaS, SaaS) and the ability to work to tight deadlines.
- DevOps Tools: Skills in utilizing GitHub CI, Jenkins, TeamCity, Ansible, and experience building CI/CD pipelines.
- Source Control and Monitoring Tools: In-depth knowledge of source control (preferably GitHub, Bitbucket) and working knowledge of environment monitoring tools such as Prometheus, Grafana, Geneos, AppDynamics, and New Relic.
- Infrastructure as Code: Knowledge of Infrastructure as Code (Terraform), SQL, and relational databases.
- Enterprise-Scale Development: Basic exposure to delivering good quality code within enterprise-scale development and hands-on experience with cloud security and operations.
- Financial Services: Knowledge gained in Financial Services environments and practical knowledge of database systems and structures.
- AI and Data-Centric Applications: Experience in managing AI and data-centric applications.
- Unix/Linux: Strong knowledge of Unix/Linux, including commands and shell scripting.
- Analytical and Conceptual Thinking: Excellent analytical and conceptual thinking skills.
- Agile Delivery Teams: Strong independence and initiative, ability to work in agile delivery teams.

How we'll support you

Posted 3 weeks ago

10.0 - 16.0 years

45 - 50 Lacs

Bengaluru

Work from Office

Job Title: Senior Site Reliability Engineer - Channels, VP
Location: Bangalore, India

Role Description

DWS Technology in India: DWS Technology is a global team of technology specialists, spread across multiple trading hubs and tech centres. We have a strong focus on promoting technical excellence; our engineers work at the forefront of financial services innovation using cutting-edge technologies. Our India location is the most recent addition to our global network of tech centres and is growing strongly. We are committed to building a diverse workforce and to creating excellent opportunities for talented engineers and technologists. Our tech teams and business units use agile ways of working to create #GlobalHausbank solutions from our home market.

DWS Digital Products and Channels: The DWS Digital Products and Channels team orchestrates internal and external API products, portals, enabling services and embedded finance products at a global level. The team is a highly skilled and innovative group dedicated to developing cutting-edge solutions and services that leverage the power of APIs to drive digital transformation and enhance the asset management experience for clients worldwide. As a Senior Site Reliability Engineer, you will be responsible for the SRE activities across platforms, portals and enabling services together with other SREs and engineers.

What we'll offer you:
- 100% reimbursement under childcare assistance benefit (gender neutral)
- Sponsorship for industry-relevant certifications and education
- Accident and term life insurance

Your key responsibilities

As Senior Site Reliability Engineer you:
- Orchestrate and contribute to SRE activities across API Platforms and Integration services
- Introduce all engineering disciplines that combine software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems
- Implement the core of DevOps with specific principles and practices, focusing on what and how to improve reliability
- Establish and support capacity planning procedures and keep a close eye on SLIs and SLOs for production readiness and in the live environment
- Coordinate with the rest of the division and the teams working on different layers of the application and infrastructure, with full commitment to collaboration on problem solving

For Infrastructure & Service Management you:
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
- Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity
- Develop and enforce policies, standards and guidelines for site reliability
- Automate application and infrastructure deployment activities to production environments

For Incident & Problem Management you:
- Perform troubleshooting and emergency response
- Investigate root causes and suggest solutions
- Increase productivity by leading blameless post-mortems

For Application Maintenance you:
- Work collaboratively with Product Owners and Engineers to run reliable services
- Configure and maintain applications and monitoring
- Identify business objects for monitoring
- Track system performance and capacity, and use your experience to create effective strategies for maintaining and improving system performance and availability

For Operational Continuous Improvement you:
- Identify issues and optimization potential and introduce related user stories
- Support with automation know-how to reduce the risk of bad changes
- Identify, design, develop, and deploy tools and processes to monitor, maintain, and report site performance and availability

For Service Onboarding you:
- Support your Squad and your Chapter population in onboarding & promotions

Your skills and experience:
- Expert hands-on experience with on-premises
- Expert hands-on experience with cloud ecosystems run on Google Cloud
- Expert hands-on experience with Docker / Kubernetes operations with GKE or similar technology
- Expert experience with automated infrastructure provisioning based on Terraform/Terragrunt, Terraform Enterprise, Ansible
- Advanced hands-on experience with Continuous Integration / Continuous Deployment (GitHub) and patterns for CI/CD pipelines
- Advanced hands-on experience with monitoring tools like Prometheus, Grafana, Kibana and alerting tools like OpsGenie, New Relic, Datadog, Splunk, Google Operations Suite (Stackdriver)
- Very good knowledge of security capabilities (TLS, OAuth2, KMS, Vault, Admission Controllers, Let's Encrypt or similar technologies)
- Very good understanding of microservice architectures and experience with API Management with Apigee or WSO2
- Experience in software development in at least one language (Java, JavaScript, Python, Go)
- Good knowledge of Software Development Life Cycle processes based on related tools such as TeamCity, Bitbucket, Artifactory, SonarQube, Veracode, Crucible, JIRA, Confluence, ServiceNow

How we'll support you

About us and our teams: Please visit our company website for further information: https://www.db.com/company/company.htm

We at DWS are committed to creating a diverse and inclusive workplace, one that embraces dialogue and diverse views, and treats everyone fairly to drive a high-performance culture. The value we create for our clients and investors is based on our ability to bring together various perspectives from all over the world and from different backgrounds. It is our experience that teams perform better and deliver improved outcomes when they are able to incorporate a wide range of perspectives. We call this #ConnectingTheDots.

Posted 3 weeks ago

3.0 - 8.0 years

6 - 15 Lacs

Chennai

Work from Office

Key Responsibilities:

System Reliability & Performance:
- Design, implement, and maintain highly available, scalable, and resilient systems on Azure.
- Proactively monitor system health, performance, and availability using Azure Monitor, Application Insights, Log Analytics, and other monitoring tools (e.g., Grafana, Prometheus, Splunk).
- Define, track, and report on Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to ensure adherence to service availability and performance targets.
- Conduct root cause analysis (RCA) for incidents and implement preventive measures to avoid recurrence.
- Participate in on-call rotation to provide 24/7 support for production systems, diagnosing and resolving critical issues promptly.

Automation & Infrastructure as Code (IaC):
- Develop and maintain automation scripts and tools using PowerShell, Python, Bash, or Go to automate repetitive tasks, deployments, and infrastructure provisioning.
- Implement and manage infrastructure using IaC principles with tools like Terraform or Azure Bicep.
- Contribute to the design and implementation of robust CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools to ensure efficient and reliable application deployments.

Azure Ecosystem Management:
Hands-on experience deploying, configuring, and managing a wide range of Azure services, including:
- Compute: Azure Virtual Machines, Azure Kubernetes Service (AKS), Azure Functions, Azure App Service
- Networking: Azure Virtual Networks, Load Balancers, Azure Front Door, DNS
- Storage: Azure Storage Accounts (Blob, File, Queue, Table), Azure SQL Database, Azure Cosmos DB
- Monitoring & Logging: Azure Monitor, Application Insights, Log Analytics, Kusto Query Language (KQL)
- Security: Azure Active Directory (AAD), Azure Security Center, Azure Policy, Key Vault, Network Security Groups (NSGs)
Optimize Azure resource utilization for cost efficiency and performance.

Collaboration & Best Practices:
- Collaborate closely with development teams (DevOps culture) to integrate reliability practices into the software development lifecycle ("shift-left").
- Promote and implement SRE best practices, including error budgets, blameless post-mortems, and continuous improvement.
- Contribute to documentation of system architecture, operational procedures, and troubleshooting guides.
- Stay up-to-date with emerging Azure technologies and SRE trends, proposing and adopting relevant innovations.

Required Skills & Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
- 3-5 years of hands-on experience in a Site Reliability Engineering, DevOps, or similar role with a strong focus on Microsoft Azure.
- Proficiency in at least one scripting or programming language (e.g., Python, PowerShell, Go, Bash).
- Solid understanding of Infrastructure as Code (IaC) principles and experience with tools like Terraform or Azure Bicep.
- Demonstrated experience with CI/CD pipelines (Azure DevOps preferred).
- Strong experience with Azure monitoring and logging solutions (Azure Monitor, Application Insights, Log Analytics, KQL).
- Experience with containerization and orchestration technologies, particularly Azure Kubernetes Service (AKS).
- Good understanding of networking concepts (TCP/IP, DNS, Load Balancing).
- Familiarity with database systems (SQL and NoSQL).
- Strong problem-solving, analytical, and troubleshooting skills.
- Excellent communication and collaboration skills, with the ability to work effectively in a team environment.
- Ability to work independently and manage multiple priorities in a fast-paced environment.

Preferred Skills & Certifications:
- Microsoft Certified: Azure Administrator Associate (AZ-104)
- Microsoft Certified: Azure DevOps Engineer Expert (AZ-400)
- Certified Kubernetes Administrator (CKA)
- Experience with other monitoring tools like Grafana, Prometheus, Splunk, Datadog.
- Familiarity with security best practices in cloud environments.
- Experience with Git and version control systems.

Posted 3 weeks ago

3.0 - 8.0 years

5 - 9 Lacs

Gurugram

Work from Office

Role Purpose: The L3 Network Security Operations Engineer is a critical role within the Cybersecurity team, with the dual responsibilities of operational excellence and driving forward-looking engineering improvements. This role is designed for individuals who have strong network security operational experience and a proven track record of prior engineering delivery.
- Provide L3 operational support for complex operational issues; troubleshoot and resolve issues.
- Design, configure, and manage advanced network security solutions, including Firewalls, Zero Trust Network Access (ZTNA), Secure Web Gateways (SWGs), and Cloud Network Security capabilities.
- Continually refine and improve support methodologies, standardizing operational practices and creating detailed documentation.
- Employ infrastructure-as-code (IaC) and automation techniques, particularly Terraform, to streamline the provisioning, configuration, and management of network security tools and environments.
- Conduct in-depth analyses of network traffic patterns and security logs with SIEM tools (e.g., Splunk).
- Support Network Security Infrastructure, focusing on patch and lifecycle management.

Qualifications:
- A minimum of 7 years of direct, hands-on experience in Network Security Operations, with a significant focus on and exposure to engineering enhancements.
- Experience with Zscaler ZIA & ZPA, Palo Alto Firewalls
- Preferred experience (or similar) with: Cloudgenix (SD-WAN), Cloudflare (WAF), Forescout (NAC), and Tufin/Algosec (Firewall Orchestration)
- Hands-on experience with public cloud providers (AWS preferred) and cloud infrastructure management.
- Experience with infrastructure-as-code frameworks (e.g., Terraform Cloud).
- Ability to write automation scripts and web services (Python, Bash).
- Strong understanding of network protocols and information security best practices.
- Experience working with git source control and CI/CD systems (GitLab CI/CD).
- Good understanding of enterprise architecture, including endpoint, network, and cloud-based systems
- Experience with SIEM (Splunk) technologies, event correlations, query management, and custom detections
- Experience with observability platforms (Grafana)
- B.S. in Information Technology, Computer Science, or a similar technical program.

Soft Skills:
- Excellent communication skills, with the ability to explain technical concepts to non-technical stakeholders and collaborate effectively with cross-functional teams.
- Strong analytical, problem-solving, and excellent documentation and organization skills.
- Ability to self-organize, prioritize activities independently, and manage uncertainty effectively.
- Experience managing stakeholder expectations in the delivery of projects.
- Adaptability and continuous learning: proactive approach to self-education and flexibility to pivot strategies in response to new information or changing environments
- Attention to detail: able to thoroughly review configurations and policies, identifying gaps in solution designs prior to implementation

Posted 3 weeks ago

5.0 - 8.0 years

15 - 24 Lacs

Bengaluru

Work from Office

Consultant - Cloud Engineer with DevOps

Elevate Your Impact Through Innovation and Learning: Evalueserve is a global leader in delivering innovative and sustainable solutions to a diverse range of clients, including over 30% of Fortune 500 companies. With a presence in more than 45 countries across five continents, we excel in leveraging state-of-the-art technology, artificial intelligence, and unparalleled subject matter expertise to elevate our clients' business impact and strategic decision-making. Our team of over 4,500 talented professionals operates in countries such as India, China, Chile, Romania, the US, and Canada. Our global network also extends to emerging markets like Colombia, the Middle East, and the rest of Asia-Pacific. Recognized by Great Place to Work in India, Chile, Romania, the US, and the UK in 2022, we offer a dynamic, growth-oriented, and meritocracy-based culture that prioritizes continuous learning, skill development, and work-life balance.

About Data Analytics (DA): DA, one of the fastest-growing practices in Evalueserve, provides rewarding career opportunities. Established in 2014, our global DA team includes data science professionals across data engineering, business intelligence, digital marketing, advanced analytics, technology, and product engineering fields. Our tenured teammates, some of whom have been with Evalueserve since it was launched more than 20 years ago, hold leadership positions across our seven business lines in different parts of the world.

Role Overview: We are seeking a skilled and proactive GCP Cloud Engineer with DevOps expertise to design, implement, and manage scalable cloud infrastructure and CI/CD pipelines on Google Cloud Platform. The ideal candidate will have a strong background in cloud architecture, automation, and modern DevOps practices.

Responsibilities:
- Design and deploy secure, scalable, and resilient cloud infrastructure on GCP.
- Implement and manage CI/CD pipelines using tools like Jenkins, GitLab CI, or Cloud Build.
- Automate infrastructure provisioning using Terraform, Deployment Manager, or similar IaC tools.
- Monitor and optimize cloud resources for performance and cost efficiency.
- Collaborate with development teams to ensure smooth integration and deployment.
- Manage containerized applications using Kubernetes (GKE) and Docker.
- Ensure compliance with security and governance standards.
- Troubleshoot and resolve infrastructure and deployment issues.

What we are looking for:
- 3+ years of experience in cloud engineering and DevOps.
- Strong hands-on experience with Google Cloud Platform (GCP) services.
- Proficiency in Terraform, Ansible, or other IaC tools.
- Experience with CI/CD tools like Jenkins, GitLab CI, or GCP Cloud Build.
- Solid understanding of Kubernetes, Docker, and container orchestration.
- Familiarity with monitoring tools like Prometheus, Grafana, or Stackdriver.
- Scripting skills in Python, Shell, or Go.
- Knowledge of networking, security, and cloud architecture best practices.
- Good to have: Azure DevOps and AWS infra/DevOps experience

Preferred Qualifications:
- GCP Professional Cloud Architect or DevOps Engineer certification.
- Experience with multi-cloud or hybrid cloud environments.
- Familiarity with Agile and DevOps methodologies.

Follow us on https://www.linkedin.com/company/evalueserve/

Learn more about what our leaders are saying about our achievements:
- AI-powered supply chain optimization solution built on Google Cloud.
- How Evalueserve is now leveraging NVIDIA NIM to enhance our AI and digital transformation solutions and to accelerate AI capabilities.
- How Evalueserve has climbed 16 places on the 50 Best Firms for Data Scientists in 2024.

Want to learn more about our culture and what it's like to work with us? Write to us at: careers@evalueserve.com

Disclaimer: The following job description serves as an informative reference for the tasks you may be required to perform. However, it does not constitute an integral component of your employment agreement and is subject to periodic modifications to align with evolving circumstances.

Please Note: We appreciate the accuracy and authenticity of the information you provide, as it plays a key role in your candidacy. As part of the BGV process, we verify your employment, education, and personal details. Please ensure all information is factual and submitted on time. For any assistance, your TA SPOC is available to support you.

Posted 3 weeks ago

10.0 - 16.0 years

25 - 32 Lacs

Mumbai

Work from Office

Job Responsibilities:
- Engineer and automate various database platforms and services.
- Assist in the ongoing process of rationalizing the technology and usage of databases.
- Participate in the creation and implementation of operational policies, procedures & documentation.
- Database administration and production support for databases hosted on private cloud across all regions.
- Database version upgrades and security patching.
- Performance tuning.
- Database replication administration.
- Collaborate with development teams and utilize coding skills to design and implement database solutions for new and existing applications.
- Willingness to work on weekends and outside office hours as part of a wider scheduled support group.
- Willingness to learn and adapt to new technologies and methodologies.

Required Skills (Mandatory). The candidate must have the following skills and experience:
- At least 10+ years of experience in Sybase ASE. Sybase IQ is an add-on.
- Proven ability to navigate Linux operating systems and utilize command-line tools proficiently.
- Exposure to scripting languages like Python and automation tools like Ansible.
- A proven, effective, and efficient troubleshooting skill set.
- Ability to cope well under pressure.
- Strong organization skills and practical sense
- Quick and eager to learn and explore both technical and semi-technical work types
- Engineering mindset

Preferred Skills/Experience. Knowledge of the following will be an added advantage (but not mandatory):
- Experience in MSSQL, MySQL and Oracle
- Exposure to MSSQL availability groups
- Experience in infrastructure automation development
- Experience with monitoring systems and log management/reporting tools (e.g., Loki, Grafana, Splunk).

Posted 3 weeks ago

7.0 - 12.0 years

15 - 17 Lacs

Bengaluru

Work from Office

Infrastructure Management, CI/CD Implementation, Automation, Monitoring & Logging, Security Integration, Collaboration, Troubleshooting

Posted 3 weeks ago

10.0 - 15.0 years

22 - 27 Lacs

Mumbai, Powai

Work from Office

Notice period: Immediate to 30 days & currently serving NP

Mandatory. The candidate must have the following skills and experience:
- 10+ years of experience in MSSQL DBA administration
- Proven ability to navigate Linux operating systems and utilize command-line tools proficiently.
- Clear understanding of MSSQL availability groups
- Exposure to scripting languages like Python and automation tools like Ansible.
- A proven, effective, and efficient troubleshooting skill set.
- Ability to cope well under pressure.
- Strong organization skills and practical sense
- Quick and eager to learn and explore both technical and semi-technical work types
- Engineering mindset

Preferred Skills/Experience. Knowledge of the following will be an added advantage (but not mandatory):
- Experience in MySQL and Oracle
- Experience in infrastructure automation development
- Experience with monitoring systems and log management/reporting tools (e.g., Loki, Grafana, Splunk).

Posted 3 weeks ago

5.0 - 9.0 years

15 - 20 Lacs

Hyderabad

Work from Office

Role: DevOps Engineer
Location: Hyderabad (WFO - 5 Days)
Experience: 5+ Years
Key Skills: CI/CD pipelines, Kubernetes, Terraform, Ceph, observability stack (Elasticsearch, Kibana, Prometheus, Grafana), Shell scripting

Posted 3 weeks ago

7.0 - 12.0 years

25 - 37 Lacs

Bengaluru

Hybrid

Job Title : Sr. DevOps SRE Location State : Karnataka Location City : Bangalore Experience Required : 8+ Year(s) Shift: IST shift with some overlap with US shift Work Mode: Hybrid / Remote Position Type: Contract Company Name: VARITE INDIA PRIVATE LIMITED About The Client: An American multinational digital communications technology conglomerate corporation headquartered in San Jose, California. The Client develops, manufactures, and sells networking hardware, software, telecommunications equipment, and other high-technology services and products. The Client specializes in specific tech markets, such as the Internet of Things (IoT), domain security, videoconferencing, and energy management. It is one of the largest technology companies in the world, ranking 82nd on the Fortune 100 with over $51 billion in revenue and nearly 83,300 employees. Essential Job Functions: Develop ansible playbooks for configuring the Clients devices Design, configure, and maintain Grafana dashboards for real-time monitoring and visualization of infrastructure, application, and business metrics. Develop and optimize alerting rules to proactively detect and resolve issues. Create custom Splunk queries, dashboards, and reports for incident detection and troubleshooting. Build, deploy, and manage containers using Docker. Create, manage, and troubleshoot Kubernetes manifests (Deployments, Services, ConfigMaps, etc.). Develop, maintain, and optimize CI/CD pipelines for automated build, test, and deployment processes (using tools like Jenkins, GitLab CI, GitHub Actions, etc.). Implement best practices for infrastructure as code, automated testing, and continuous integration/delivery. Qualifications : Experience Required: 8+ years Relevant Experience: Strong DevOps/SRE with Automation focus How to Apply: Interested candidates are invited to submit their resume using the apply online button on this job post. Equal Opportunity Employer: VARITE is an equal opportunity employer. 
We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, marital status, veteran status, or disability status. Unlock Rewards: Refer Candidates and Earn. If you're not available or interested in this opportunity, please pass this along to anyone in your network who might be a good fit and interested in our open positions. VARITE offers a Candidate Referral program, where you'll receive a one-time referral bonus based on the following scale if the referred candidate completes a three-month assignment with VARITE.
Exp Req - Referral Bonus
0 - 2 Yrs. - INR 5,000
2 - 6 Yrs. - INR 7,500
6+ Yrs. - INR 10,000
About VARITE: VARITE is a global staffing and IT consulting company providing technical consulting and team augmentation services to Fortune 500 Companies in USA, UK, CANADA and INDIA. VARITE is currently a primary and direct vendor to the leading corporations in the verticals of Networking, Cloud Infrastructure, Hardware and Software, Digital Marketing and Media Solutions, Clinical Diagnostics, Utilities, Gaming and Entertainment, and Financial Services.

Posted 3 weeks ago

2.0 - 5.0 years

2 - 3 Lacs

Chennai

Work from Office

Monitoring entire infrastructure using various monitoring tools like Zabbix, SCOM, SolarWinds, Telegraf. Monitoring various types of alerts like CPU, Memory, Database, DR Replication, Backup Failure, Exchange Mail queue, Application URL Alerting. Required Candidate profile: Good communication skills. 24*7 rotational shift, 6 working days a week. Only male candidates can apply. Job Location: Saidapet (Chennai) Exp Level: 2 to 3 Yrs Salary: 3 LPA NP: Max 15 days

Posted 3 weeks ago

4.0 - 7.0 years

8 - 13 Lacs

Bengaluru

Work from Office

Your Impact: The ESM R&D team is seeking an experienced Performance Engineer to join our Global R&D team and deliver innovative enterprise software solutions in a fast-paced, challenging, and enriching environment. This is a high-growth business, and our solutions are used by highly demanding enterprise-class customers across the globe. We use a microservices-based architecture composed of multiple services running on Kubernetes using Docker containers. As a Senior Performance Engineer, you will lead efforts to optimize the performance and efficiency of our applications, services, and systems. You will work closely with development, quality assurance, and infrastructure teams to identify performance bottlenecks, define performance testing strategies, and implement solutions that enhance the overall user experience. You will contribute as a team member, take responsibility for your own work commitments, and take part in project functional problem-solving. You will make decisions based on established practices. You will work under general guidance with progress reviewed on a regular basis. What the role offers: Performance testing, profiling, and benchmarking efforts for new and existing products and services. Identify performance bottlenecks and work closely with the development team to propose and implement optimization solutions. Develop and execute load, stress, and scalability tests to ensure that systems can handle high traffic and user loads. Collaborate with cross-functional teams to define performance goals and ensure that performance is a key consideration throughout the development lifecycle. Build and maintain automated performance testing frameworks and tools to continuously monitor the performance of systems. Generate detailed performance reports, including recommendations and root cause analysis for performance issues. Conduct performance tuning for databases, servers, and cloud services.
Analyze system behavior, identify areas of improvement, and make proactive recommendations for performance enhancements. Stay up to date with emerging performance testing tools and methodologies and drive their adoption. What you need to succeed: Strong experience with performance testing tools such as LoadRunner, JMeter, Gatling, NeoLoad, or similar. Knowledge of monitoring tools (e.g., YourKit, VisualVM, Prometheus, Grafana, New Relic) and how to leverage them for performance insights. Proficiency in programming languages like Java, Python, Selenium or similar for automation and scripting purposes. Experience with CI/CD pipelines, GitLab CI, Jenkins and integrating performance testing into automated testing workflows. Excellent communication skills with the ability to explain complex technical concepts to non-technical stakeholders. Strong problem-solving abilities and the ability to thrive in a fast-paced, dynamic environment. Strong analytical skills with the ability to interpret complex data and provide actionable insights.

Posted 3 weeks ago

15.0 - 24.0 years

60 - 90 Lacs

Pune

Remote

Principal SRE Experience: 15 - 25 Years Exp Salary: Competitive Preferred Notice Period: Within 60 Days Shift: 10:00AM to 6:00PM IST Opportunity Type: Remote Placement Type: Permanent (*Note: This is a requirement for one of Uplers' Clients) Must have skills required: CI/CD OR RCA OR Performance Engineering and AWS OR Scaled Agile OR Python and Grafana OR Linux Netskope (One of Uplers' Clients) is Looking for: Principal SRE who is passionate about their work, eager to learn and grow, and who is committed to delivering exceptional results. If you are a team player, with a positive attitude and a desire to make a difference, then we want to hear from you. Role Overview Description Job Summary: Please note, this team is hiring across all levels and candidates are individually assessed and appropriately leveled based upon their skills and experience. The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of the engineering stacks. If you are passionate about solving complex problems and developing cloud services at scale, we would like to speak with you. What's in it for you: You will be part of a high-caliber engineering team in the exciting space of cloud tools and infrastructure management. You will have an opportunity to work on hybrid cloud (Google Cloud, on-prem cloud) and work with cutting-edge tooling like Spinnaker, Kubernetes, Docker and more. You will solve complex, exciting challenges and improve the depth and breadth of your technical and analytical skills. Your contributions to our market-leading product support will significantly impact our rapidly growing global customer base.
What you will be doing Partner closely with our development teams and product managers to architect and build features that are highly available, performant and secure Develop innovative ways to smartly measure, monitor & report application and infrastructure health Gain deep knowledge of our application stack Experience improving the performance of micro-services and solve scaling/performance issues Capacity management and planning Function well in a fast-paced and rapidly-changing environment Participate with the dev teams in a 24X7 on-call rotations. Ability to debug and optimize code and automate routine tasks. Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis. Required skills and experience 15+ years of experience troubleshooting Unix/Linux Experience in managing a large-scale web operations role Experience in one or more of the following: C, C++, Java, Python, Go, Perl or Ruby Experience with algorithms, data structures, complexity analysis, and software design Hands-on working with private or public cloud services in a highly available and scalable production environment. Experience with continuous integration and deployment automation tools such as Jenkins, Ansible etc. Knowledge of distributed systems a big plus Previous experience working with geographically-distributed coworkers. Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, developers, Product Managers, etc Should have led teams, collaborating cross-functionally to deliver complex software features and solutions. Education BSCS or equivalent required, MSCS or equivalent strongly preferred How to apply for this opportunity: Easy 3-Step Process: 1. Click On Apply! And Register or log in on our portal 2. Upload updated Resume & Complete the Screening Form 3. 
Increase your chances to get shortlisted & meet the client for the Interview! About Our Client: Netskope, a global SASE leader, helps organizations apply zero trust principles and AI/ML innovations to protect data and defend against cyber threats. Fast and easy to use, the Netskope platform provides optimized access and real-time security for people, devices, and data anywhere they go. Netskope helps customers reduce risk, accelerate performance, and get unrivaled visibility into any cloud, web, and private application activity. Thousands of customers trust Netskope and its powerful NewEdge network to address evolving threats, new risks, technology shifts, organizational and network changes, and new regulatory requirements About Uplers: Our goal is to make hiring and getting hired reliable, simple, and fast. Our role will be to help all our talents find and apply for relevant product and engineering job opportunities and progress in their career. (Note: There are many more opportunities apart from this on the portal.) So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!

Posted 3 weeks ago

15.0 - 24.0 years

60 - 90 Lacs

Hyderabad

Remote

Principal SRE Experience: 15 - 25 Years Exp Salary: Competitive Preferred Notice Period: Within 60 Days Shift: 10:00AM to 6:00PM IST Opportunity Type: Remote Placement Type: Permanent (*Note: This is a requirement for one of Uplers' Clients) Must have skills required: CI/CD OR RCA OR Performance Engineering and AWS OR Scaled Agile OR Python and Grafana OR Linux Netskope (One of Uplers' Clients) is Looking for: Principal SRE who is passionate about their work, eager to learn and grow, and who is committed to delivering exceptional results. If you are a team player, with a positive attitude and a desire to make a difference, then we want to hear from you. Role Overview Description Job Summary: Please note, this team is hiring across all levels and candidates are individually assessed and appropriately leveled based upon their skills and experience. The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of the engineering stacks. If you are passionate about solving complex problems and developing cloud services at scale, we would like to speak with you. What's in it for you: You will be part of a high-caliber engineering team in the exciting space of cloud tools and infrastructure management. You will have an opportunity to work on hybrid cloud (Google Cloud, on-prem cloud) and work with cutting-edge tooling like Spinnaker, Kubernetes, Docker and more. You will solve complex, exciting challenges and improve the depth and breadth of your technical and analytical skills. Your contributions to our market-leading product support will significantly impact our rapidly growing global customer base.
What you will be doing Partner closely with our development teams and product managers to architect and build features that are highly available, performant and secure Develop innovative ways to smartly measure, monitor & report application and infrastructure health Gain deep knowledge of our application stack Experience improving the performance of micro-services and solve scaling/performance issues Capacity management and planning Function well in a fast-paced and rapidly-changing environment Participate with the dev teams in a 24X7 on-call rotations. Ability to debug and optimize code and automate routine tasks. Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis. Required skills and experience 15+ years of experience troubleshooting Unix/Linux Experience in managing a large-scale web operations role Experience in one or more of the following: C, C++, Java, Python, Go, Perl or Ruby Experience with algorithms, data structures, complexity analysis, and software design Hands-on working with private or public cloud services in a highly available and scalable production environment. Experience with continuous integration and deployment automation tools such as Jenkins, Ansible etc. Knowledge of distributed systems a big plus Previous experience working with geographically-distributed coworkers. Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, developers, Product Managers, etc Should have led teams, collaborating cross-functionally to deliver complex software features and solutions. Education BSCS or equivalent required, MSCS or equivalent strongly preferred How to apply for this opportunity: Easy 3-Step Process: 1. Click On Apply! And Register or log in on our portal 2. Upload updated Resume & Complete the Screening Form 3. 
Increase your chances to get shortlisted & meet the client for the Interview! About Our Client: Netskope, a global SASE leader, helps organizations apply zero trust principles and AI/ML innovations to protect data and defend against cyber threats. Fast and easy to use, the Netskope platform provides optimized access and real-time security for people, devices, and data anywhere they go. Netskope helps customers reduce risk, accelerate performance, and get unrivaled visibility into any cloud, web, and private application activity. Thousands of customers trust Netskope and its powerful NewEdge network to address evolving threats, new risks, technology shifts, organizational and network changes, and new regulatory requirements About Uplers: Our goal is to make hiring and getting hired reliable, simple, and fast. Our role will be to help all our talents find and apply for relevant product and engineering job opportunities and progress in their career. (Note: There are many more opportunities apart from this on the portal.) So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!

Posted 3 weeks ago

15.0 - 24.0 years

60 - 90 Lacs

Bengaluru

Remote

Principal SRE Experience: 15 - 25 Years Exp Salary: Competitive Preferred Notice Period: Within 60 Days Shift: 10:00AM to 6:00PM IST Opportunity Type: Remote Placement Type: Permanent (*Note: This is a requirement for one of Uplers' Clients) Must have skills required: CI/CD OR RCA OR Performance Engineering and AWS OR Scaled Agile OR Python and Grafana OR Linux Netskope (One of Uplers' Clients) is Looking for: Principal SRE who is passionate about their work, eager to learn and grow, and who is committed to delivering exceptional results. If you are a team player, with a positive attitude and a desire to make a difference, then we want to hear from you. Role Overview Description Job Summary: Please note, this team is hiring across all levels and candidates are individually assessed and appropriately leveled based upon their skills and experience. The Application SRE Team supports several critical components of our foundational technologies for real-time protection, as well as our RBI and SSPM services. We are a team of software engineers focused on improving availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of the engineering stacks. If you are passionate about solving complex problems and developing cloud services at scale, we would like to speak with you. What's in it for you: You will be part of a high-caliber engineering team in the exciting space of cloud tools and infrastructure management. You will have an opportunity to work on hybrid cloud (Google Cloud, on-prem cloud) and work with cutting-edge tooling like Spinnaker, Kubernetes, Docker and more. You will solve complex, exciting challenges and improve the depth and breadth of your technical and analytical skills. Your contributions to our market-leading product support will significantly impact our rapidly growing global customer base.
What you will be doing Partner closely with our development teams and product managers to architect and build features that are highly available, performant and secure Develop innovative ways to smartly measure, monitor & report application and infrastructure health Gain deep knowledge of our application stack Experience improving the performance of micro-services and solve scaling/performance issues Capacity management and planning Function well in a fast-paced and rapidly-changing environment Participate with the dev teams in a 24X7 on-call rotations. Ability to debug and optimize code and automate routine tasks. Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis. Required skills and experience 15+ years of experience troubleshooting Unix/Linux Experience in managing a large-scale web operations role Experience in one or more of the following: C, C++, Java, Python, Go, Perl or Ruby Experience with algorithms, data structures, complexity analysis, and software design Hands-on working with private or public cloud services in a highly available and scalable production environment. Experience with continuous integration and deployment automation tools such as Jenkins, Ansible etc. Knowledge of distributed systems a big plus Previous experience working with geographically-distributed coworkers. Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, developers, Product Managers, etc Should have led teams, collaborating cross-functionally to deliver complex software features and solutions. Education BSCS or equivalent required, MSCS or equivalent strongly preferred How to apply for this opportunity: Easy 3-Step Process: 1. Click On Apply! And Register or log in on our portal 2. Upload updated Resume & Complete the Screening Form 3. 
Increase your chances to get shortlisted & meet the client for the Interview! About Our Client: Netskope, a global SASE leader, helps organizations apply zero trust principles and AI/ML innovations to protect data and defend against cyber threats. Fast and easy to use, the Netskope platform provides optimized access and real-time security for people, devices, and data anywhere they go. Netskope helps customers reduce risk, accelerate performance, and get unrivaled visibility into any cloud, web, and private application activity. Thousands of customers trust Netskope and its powerful NewEdge network to address evolving threats, new risks, technology shifts, organizational and network changes, and new regulatory requirements About Uplers: Our goal is to make hiring and getting hired reliable, simple, and fast. Our role will be to help all our talents find and apply for relevant product and engineering job opportunities and progress in their career. (Note: There are many more opportunities apart from this on the portal.) So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!

Posted 3 weeks ago

8.0 - 12.0 years

10 - 14 Lacs

Bengaluru

Work from Office

Your Impact: We are seeking a highly skilled and experienced Lead Software Engineer to design, implement, and manage robust and scalable CI/CD pipelines using GitLab CI and other DevOps tools. The ideal candidate will have a deep understanding of infrastructure as code (IaC), cloud platforms (AWS, GCP, Azure), and automation techniques to streamline deployment and infrastructure management. This role requires expertise in Terraform, Ansible, Kubernetes, and Python to enhance operational efficiency and security. What the role offers: Define and lead the architectural vision for Helm-based installers and upgrade frameworks for Kubernetes applications. Design and optimize Helm charts to streamline installation, upgrades, and rollbacks. Establish best practices for Kubernetes-based deployments, scalability, and fault tolerance. Lead the evaluation and adoption of new tools and technologies in the Kubernetes ecosystem. Oversee security, compliance, and performance considerations in installation and upgrade processes. Provide technical leadership and mentorship to engineers in the Helm/Kubernetes domain. Troubleshoot and resolve complex deployment and upgrade issues. Stay up to date with emerging trends and advancements in Kubernetes, Helm, and cloud-native technologies. What you need to succeed: Bachelor's or Master's degree in Computer Science, Information Technology, or a related field. 8 to 12 years of experience in software engineering or DevOps/CD. Strong understanding of Cloud Platforms: AWS, Azure, GCP, OpenShift. Proficiency in Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation. Hands-on experience with CI/CD pipelines, GitOps methodologies and automation frameworks. DevOps Tools: GitLab CI/CD, ArgoCD/FluxCD, Helm, Maven, NodeJs, JFrog Artifactory, SonarQube. Database: Postgres. Experience in Monitoring & Logging tools like Prometheus, Grafana.
Automation & Scripting: Shell Script, Python. Deep knowledge of security, networking, and performance optimization in Kubernetes environments. Strong problem-solving, leadership, and communication skills.

Posted 3 weeks ago

10.0 - 14.0 years

0 Lacs

pune, maharashtra

On-site

Position Overview: As the SRE Lead, you will oversee the reliability and performance of our systems, ensuring they meet the high standards our customers expect. You will lead a team of skilled engineers, guiding them in implementing best practices for reliability, automation, and operational excellence. This role requires a blend of technical expertise, leadership skills, and a strong commitment to continuous improvement. Key Responsibilities: Team Leadership: Manage and mentor a team of Site Reliability Engineers, fostering a culture of collaboration, innovation, and accountability. System Reliability: Drive initiatives to improve system reliability, availability, scalability, and performance. Incident Management: Lead the response to critical incidents, coordinating efforts across teams to ensure swift resolution and minimal customer impact. Automation: Champion automation efforts to streamline operational processes, reduce manual intervention, and increase efficiency. Monitoring and Alerting: Establish and maintain robust monitoring and alerting systems to proactively identify issues and prevent service disruptions. Capacity Planning: Collaborate with cross-functional teams to forecast capacity requirements and optimize resource utilization. Continuous Improvement: Promote a culture of continuous improvement through regular retrospectives, post-incident reviews, and knowledge-sharing sessions. Documentation: Ensure comprehensive documentation of systems, processes, and procedures to facilitate knowledge transfer and training. Qualifications: Technical Expertise: Strong background in Linux/Unix systems administration, networking, and cloud infrastructure (AWS, GCP, Azure). Leadership Skills: Proven experience leading and developing high-performing engineering teams. Problem Solving: Ability to troubleshoot complex issues, prioritize tasks, and make data-driven decisions under pressure. Automation Tools: Proficiency in automation tools and configuration management (e.g., Terraform, Ansible, Chef, Puppet). Monitoring and Logging: Experience with monitoring tools (e.g., Prometheus, Grafana, ELK stack) and log management solutions. CI/CD: Familiarity with CI/CD pipelines and practices. Communication: Excellent communication skills with the ability to articulate technical concepts to non-technical stakeholders. Education and Experience: Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience). 10+ years of experience in a Site Reliability Engineering or similar role. 5+ years of experience in a leadership or managerial position.

Posted 3 weeks ago

1.0 - 6.0 years

7 - 17 Lacs

Noida

Work from Office

Job Summary: Site Reliability Engineers (SREs) cover the intersection of Software Engineer and Systems Administrator. In other words, they can both create code and manage the infrastructure on which the code runs. This is a very wide skillset, but the end goal of an SRE is always the same: to ensure that all SLAs are met, but not exceeded, so as to balance performance and reliability with operational costs. As a Site Reliability Engineer II, you will be learning our systems, improving your craft as an engineer, and taking on tasks that improve the overall reliability of the VP platform. Key Responsibilities: Design, implement, and maintain robust monitoring and alerting systems. Lead observability initiatives by improving metrics, logging, and tracing across services and infrastructure. Collaborate with development and infrastructure teams to instrument applications and ensure visibility into system health and performance. Write Python scripts and tools for automation, infrastructure management, and incident response. Participate in and improve the incident management and on-call process, driving down Mean Time to Resolution (MTTR). Conduct root cause analysis and postmortems following incidents and champion efforts to prevent recurrence. Optimize systems for scalability, performance, and cost-efficiency in cloud and containerized environments. Advocate and implement SRE best practices, including SLOs/SLIs, capacity planning, and reliability reviews. Required Skills & Qualifications: 1+ years of experience in a Site Reliability Engineer or similar role. Excellent communication skills in English. Proficiency in Python for automation and tooling. Hands-on experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, New Relic, OpenTelemetry, etc. Experience with log aggregation and analysis tools like ELK Stack (Elasticsearch, Logstash, Kibana) or Fluentd.
Good understanding of cloud platforms (AWS, GCP, or Azure) and container orchestration (Kubernetes). Familiarity with infrastructure-as-code (Terraform, Ansible, or similar). Strong debugging and incident response skills. Knowledge of CI/CD pipelines and release engineering practices.

Posted 3 weeks ago

1.0 - 5.0 years

0 Lacs

karnataka

On-site

Job Title: Network & Security Engineer (NOC) Job Description Designation: NOC Engineer (Network & Security Operations) Location: Bengaluru, India Why you should choose us: Are you interested in working for a Global Leader in E-commerce? Are you excited about working on highly scalable platforms and applications that are accessed by millions of users every day? If so, read on to find out more about the opportunity. Rakuten is the largest E-commerce company in Japan and one of the largest E-commerce and Internet Services companies in the World. Rakuten is ranked in top 20 most innovative companies in the world by Forbes. What will you do: Monitor, Implement & Troubleshoot Network issues. Build & deploy monitoring solutions for NOC. Managing Network Security Operations (Firewall & LB). Working on Routing, Switching & IPAM Operations (Network). Change Requests Implementation pertaining to Network & Security Infrastructure. Manage Infrastructure, Data Center, Cloud, Virtual & Physical appliances including Cisco, Palo Alto, Juniper, Wireless. Perform monitoring and incident response of cyber security events as part of a highly available Security Operation Center (SOC). What we look for: Experience: Relevant Experience (Network & Security Operations): 1 to 3 years. Overall Experience: 3 years. Work experience with global stakeholders. Enterprise support standard & processes. Technical Skills: Experience in managing DNS/DHCP, Juniper and Cisco Routers, Switches, Firewalls & load balancers. Relevant IT qualifications and certs e.g. CCNA & CCNP. Routing: experience in BGP & OSPF configuring & troubleshooting. Exposure on Silver Peak's Unity Orchestrator and Edge Connect SDWAN appliances. Experience in supporting Juniper MIST wireless network-related issues. Good knowledge on spine/leaf architecture, Overlay/Underlay network, VXLAN, EVPN. In-depth knowledge on configuring and optimizing Network device monitoring and alerting.
Hands-on experience with Cloud networking with the public cloud. VMware networking and virtualization technologies. Network monitoring technologies and SNMP management (PRTG, Syslog, Kentik, Thruk, Thousand Eyes, Grafana & Prometheus etc.). Drive the Automation of network related activities across Deployment and Operations. Ability to read, investigate, evaluate, and interpret security related logs from disparate sources. Create and review alerts generated by the SIEM for false positives, modify and optimize alerts as needed to reduce noise. Develop and follow detailed operational processes, procedures, and playbooks to appropriately analyze, escalate and assist in the remediation of information security related incidents. Develop deep expertise regarding the current SIEM platform. Assist in the administration, maintenance, and optimization of the current SIEM platform. Develop advanced queries and alerts to detect adversary actions. Contribute to incident and root cause analysis reports. Other Skills: Should possess excellent communication skills. Ability to participate in Global change control meetings, manage approvals from stakeholders, follow processes and implement enterprise network changes. Work with internal and external technical and service teams to provide break-fix. Create and/or update knowledge base articles. Support 24 x 7 operational environments with high uptime requirements. Varied shift schedules may include day or evening hours. Provide timely response to all incidents, outages, and performance alerts. Categorize issues for escalation to appropriate technical teams. Recognize, identify, and prioritize incidents in accordance with customer business requirements, organizational policies, and operational impact. Hands-on experience on any automation/scripting tools or languages. Other duties as assigned.

Posted 3 weeks ago

0.0 - 3.0 years

0 Lacs

maharashtra

On-site

Experience: You should have 0-2 years of experience in a product support or technical support role. Along with that, you must possess excellent verbal and written communication skills. Strong problem-solving skills are essential, with the ability to troubleshoot and find workable solutions. You should be able to prioritize tickets based on urgency and business impact, with attention to detail. Being self-motivated and eager to learn and adapt to new technologies is a key requirement. A basic understanding of REST APIs and their functionality is necessary. Also, you should be willing to work in rotational shifts, including weekends. Good-to-Have: It would be beneficial if you have knowledge of SQL and can write basic queries. Hands-on experience with ticketing systems like Freshdesk, Jira, Zendesk, or ServiceNow will be an added advantage. Ability to write simple scripts in languages like Python to automate repetitive tasks or improve troubleshooting efficiency is a plus. Experience in working closely with customers to resolve issues and build strong relationships through effective communication is desirable. Additionally, a basic understanding of cloud platforms such as AWS, Azure, or Google Cloud, particularly for products hosted on cloud infrastructure, would be helpful. Familiarity with any one of the monitoring and logging tools like Kibana, Splunk, Nagios, or Grafana for proactive system monitoring and health checks is good-to-have. Your Day: In this role, you will be responsible for defining monitoring events for IDfy's services and setting up the corresponding alerts. Responding to alerts, triaging, investigating, and resolving issues will be part of your daily tasks. You will learn about various IDfy applications and understand the events emitted. Creating MIS reports for service performance and usage monitoring is also a crucial aspect of this role. 
Responding to incidents and customer tickets in a timely manner and defining monitoring events for software services are key responsibilities. You will also help improve the IDfy Platform by providing insights based on investigations and root cause analysis. As part of this role, you will often be required to provide support during non-office hours as part of a rotational shift or roster. If you are passionate about customer support, enjoy solving problems, and thrive in a dynamic and fast-paced environment, this role is perfect for you. How to Apply: To apply for this opportunity, you need to register or log in on the portal and fill out the application form. Clear the given Video Screening (30 min) and click on "Apply" to get shortlisted. Once you have completed these steps, your profile will be shared with the client for the Interview round. When selected, you will meet the client and kickstart your exciting career journey. About Uplers: Uplers' goal is to make hiring reliable, simple, and fast. They aim to help all talents find and apply for relevant contractual onsite opportunities and progress in their careers. Uplers provide support for any grievances or challenges faced during the engagement and assign a dedicated Talent Success Coach to each individual during the engagement period. If you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. Uplers are looking forward to welcoming you!

Posted 3 weeks ago

Apply

3.0 - 7.0 years

0 Lacs

Hyderabad, Telangana

On-site

We are looking for an experienced KQL Developer with a strong background in data querying, preferably using KQL (Kusto Query Language) or SQL, and data visualization. The ideal candidate will have expertise in querying large datasets and visualizing insights, ideally using tools like Grafana. It would be highly desirable to have additional experience or understanding in data science and analytics. As a KQL Developer, your responsibilities will include developing, optimizing, and maintaining complex queries using KQL (or SQL if KQL is not available). You will be analyzing and interpreting data patterns to derive actionable insights and designing and building compelling visualizations, preferably using Grafana, to effectively communicate data insights. Collaboration with cross-functional teams to understand data requirements and build tailored analytics solutions will also be a key part of your role. Additionally, you will support data integration and ETL processes for analysis and reporting while documenting and ensuring data quality, consistency, and reliability across data sources. Key Requirements: - Proficiency in KQL (preferred) or strong SQL skills for handling large datasets and complex queries. - Experience with Grafana (or similar tools) to create insightful, interactive dashboards and reports. - Basic knowledge in data science concepts such as predictive modeling, statistical analysis, or machine learning. - Ability to interpret and analyze data trends, anomalies, and patterns. - Strong critical thinking and troubleshooting skills for data issues. - Excellent written and verbal communication skills, flexibility, self-driven, and a team player. At GlobalLogic, we offer a culture of caring where people come first, a commitment to continuous learning and development, interesting and meaningful work that makes an impact, balance, and flexibility to achieve work-life harmony, and a high-trust organization built on integrity and trust. 
As a trusted digital engineering partner to leading companies worldwide, we continue to drive innovation and create intelligent products, platforms, and services that redefine industries and transform businesses.

Posted 3 weeks ago

Apply

11.0 - 16.0 years

30 - 40 Lacs

Hyderabad

Remote

About the Role: We are seeking a highly skilled Senior Technical Lead / Principal Software Engineer with extensive expertise in React.js, Node.js, JavaScript, PostgreSQL, TimescaleDB, NoSQL, Grafana, Prometheus, and cloud technologies. The ideal candidate will have a strong background in IoT, MQTT, and the telecom industry, with proven experience in architecting solutions, leading technical teams, and driving end-to-end product development. Key Responsibilities: Architectural Leadership: Design, develop, and implement scalable, resilient, and secure architectures for complex systems, ensuring alignment with business requirements and industry standards. Legacy Application Reengineering: Lead the transformation of legacy applications into modern, highly scalable platforms. Design and implement microservices architecture to improve scalability, maintainability, and deployment flexibility. Collaborate with stakeholders to understand the current system and identify opportunities for modernization. Technical Ownership: Drive end-to-end ownership of system components, from design through deployment and maintenance. Team Leadership: Lead, mentor, and manage a technical team, fostering a collaborative environment and ensuring delivery excellence. Development: Build and maintain robust, high-performance applications using React.js, Node.js, and JavaScript. Database Expertise: Optimize data storage and management strategies using PostgreSQL, TimescaleDB, and NoSQL databases; optimize queries and improve performance. Cloud Technologies: Architect and implement cloud-based solutions leveraging Azure or AWS and IoT Core. IoT Integration: Develop and implement IoT solutions with protocols like MQTT and tools for real-time data streaming. Telecom Domain: Apply domain expertise in telecom to address unique industry challenges and develop cutting-edge solutions. 
Collaboration: Work closely with cross-functional teams, including product managers, designers, and other stakeholders, to ensure project success. Best Practices: Advocate and implement coding standards, code reviews, CI/CD pipelines, and other engineering best practices, including OWASP security compliance. Documentation Responsibilities: Technical or Project Documentation: Create and maintain detailed technical documentation, including system architecture diagrams, design specifications, APIs, and technical roadmaps, ensuring clear communication with stakeholders. Knowledge Base Development: Establish and maintain a repository of knowledge articles, code best practices, and reusable design patterns to support team productivity. Process Documentation: Document CI/CD pipelines, DevOps processes, and deployment workflows for team reference and operational efficiency. Compliance and Standards: Ensure all documentation aligns with company standards, industry best practices, and relevant compliance requirements. Collaboration Tools: Leverage tools like Confluence, Jira, and GitLab to ensure seamless documentation management and accessibility. Training Materials: Create technical training materials and onboarding guides for new team members to accelerate learning and alignment with company practices. Required Skills and Qualifications: Technical Expertise: Proficiency in React.js, Node.js, JavaScript, TypeScript, Grafana, Prometheus, PostgreSQL, TimescaleDB, and NoSQL databases. Cloud Platforms: Hands-on experience with any cloud platform and cloud-native design principles. IoT and Messaging: Strong knowledge of IoT protocols (MQTT) and real-time messaging systems. Telecom Industry: Familiarity with telecom systems, challenges, and emerging trends. Architecture: Proven experience in designing distributed systems and microservices architecture. Leadership: Exceptional ability to lead technical teams, mentor engineers, and ensure project timelines are met. 
Communication: Excellent problem-solving, organizational, and communication skills. Tools and Processes: Experience with DevOps tools, CI/CD pipelines, version control (Git), and agile methodologies. Preferred Qualifications: A cloud certification is good to have. Experience with full-stack development and leading a technical team. Experience with TimescaleDB and time-series data processing.

Posted 3 weeks ago

Apply

6.0 - 8.0 years

15 - 30 Lacs

Pune

Hybrid

Responsibilities: - Design, implement, and maintain AWS infrastructure - Build and manage Infrastructure as Code (IaC) using Terraform / Terragrunt - Implement auto-scaling strategies and load balancing solutions - Optimize CDN configurations with AWS CloudFront - Harden infrastructure and application security across all layers - Implement security best practices for AWS configurations - Manage IAM policies, security groups, and network ACLs - Build and maintain robust CI/CD pipelines with GitHub Actions - Automate deployment, monitoring, and management processes - Set up comprehensive monitoring, logging, and alerting systems - Create and maintain operational best practices - Manage and resolve DevSecOps-related incidents, service requests, and operational issues in a timely manner - Monitor system health and performance of DevSecOps platforms, investigate anomalies, and mitigate risks - Lead a team of DevOps Engineers
Technical Skills: - Minimum 6 years of DevOps experience, with a strong focus on AWS - Expert knowledge of AWS Cloud services (e.g., VPC, EC2, S3, ELB, RDS, ECS/EKS, IAM, Vault, CloudFront, CloudWatch, SQS/SNS, Lambda, Load Balancer, Security Groups, Redis) - Strong background in infrastructure and application security - Proficiency with containerization and orchestration (Docker, Kubernetes, Helm, Istio, EKS) - Experience with Continuous Integration and Continuous Deployment pipelines and tooling (GitHub, GitHub Actions, Jenkins, Jira) - Proficient in scripting languages (e.g., Python, Bash) and automation tools - Good exposure to monitoring, logging, and alerting tools and systems: Prometheus, Grafana, AWS CloudWatch, New Relic, Splunk, Fluentd, Fluent Bit - Solid understanding of networking concepts and protocols - Hands-on experience with Linux system administration - Proficient in using Terraform / Terragrunt for Infrastructure as Code (IaC) to manage and provision infrastructure - Exposure to OpenSearch / Elasticsearch and Kibana (ELK) - Methodological knowledge of SAFe, Agile product development with Scrum, ITIL processes, and DevSecOps - AWS certifications (e.g., AWS Certified DevOps Engineer - Professional, AWS Certified Solutions Architect - Associate or Professional) - Security certifications (e.g., CCSP, AWS Security Specialty) are a plus - Proven track record of leading an AWS DevOps Platform team Soft Skills: - Strong analytical and logical thinking skills, with the ability to think and act rationally when faced with challenges - Keen eye for detail - Sense of ownership and accountability - Fluent communication skills (verbal and written), with the ability to present ideas and thoughts clearly - Committed team player with good interpersonal skills - Zeal and enthusiasm for learning and exploring new avenues

Posted 3 weeks ago

Apply

7.0 - 12.0 years

16 - 27 Lacs

Pune

Hybrid

Key Responsibilities (Pune location): Manage production incidents, perform root cause analysis, and ensure preventive actions are implemented. Collaborate with development, QA, and infrastructure teams to ensure applications are production-ready and reliable. Implement and maintain observability tools (monitoring, logging, alerting) for proactive issue detection and resolution. Support CI/CD pipelines and help enforce SRE best practices. Automate routine production tasks and manual interventions. Participate in on-call rotations for incident response and escalation handling. Drive continuous improvement in system reliability, performance, and supportability. Ensure compliance with internal controls, security standards, and disaster recovery protocols. Required Skills & Experience: 6+ years of experience in Production Support or SRE roles. Strong knowledge of Linux/Unix systems and scripting (Shell, Python, etc.). Experience with monitoring tools like Prometheus, Grafana, AppDynamics, Splunk, or ELK Stack. Familiarity with cloud platforms (AWS). Exposure to containerization tools like Docker and orchestration with Kubernetes. Experience with incident management processes and tools (ServiceNow, Jira). Understanding of SRE principles such as SLAs, SLOs, SLIs, and error budgets. Background in Core Banking/Financial applications is a plus.

Posted 3 weeks ago

Apply

Featured Companies