
1633 Grafana Jobs - Page 16

Set up a job alert
JobPe aggregates job listings for easy access, but applications are submitted directly on the original job portal.

8.0 - 12.0 years

0 Lacs

karnataka

On-site

As a Sr. DevOps SRE at VARITE INDIA PRIVATE LIMITED, you will be responsible for developing Ansible playbooks to configure the Client's devices. You will design, configure, and maintain Grafana dashboards for real-time monitoring and visualization of infrastructure, application, and business metrics. Your role will involve developing and optimizing alerting rules to proactively detect and resolve issues, as well as creating custom Splunk queries, dashboards, and reports for incident detection and troubleshooting. Additionally, you will build, deploy, and manage containers using Docker, and create, manage, and troubleshoot Kubernetes manifests such as Deployments, Services, ConfigMaps, etc. Your responsibilities will also include developing, maintaining, and optimizing CI/CD pipelines for automated build, test, and deployment processes using tools like Jenkins, GitLab CI, GitHub Actions, etc. You will be expected to implement best practices for infrastructure as code, automated testing, and continuous integration/delivery.

To qualify for this position, you should have a minimum of 8 years of experience with a strong focus on DevOps/SRE and automation. If you are interested in this opportunity, please submit your resume by clicking on the apply online button on this job post.

VARITE is a global staffing and IT consulting company that provides technical consulting and team augmentation services to Fortune 500 companies in the USA, UK, Canada, and India. VARITE serves as a primary and direct vendor to leading corporations in various verticals, including Networking, Cloud Infrastructure, Hardware and Software, Digital Marketing and Media Solutions, Clinical Diagnostics, Utilities, Gaming and Entertainment, and Financial Services.
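As a rough illustration of the Kubernetes manifest work this posting describes, the sketch below uses the official kubernetes Python client to create a small Deployment from a dict-style manifest. The deployment name, image, and namespace are placeholder assumptions, not details from the posting.

```python
from kubernetes import client, config

def create_demo_deployment(namespace: str = "default") -> None:
    """Create a minimal Deployment; assumes a valid kubeconfig is available."""
    config.load_kube_config()  # use config.load_incluster_config() when running inside a pod
    apps = client.AppsV1Api()

    # Dict-style manifest, equivalent to a small Deployment YAML file.
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "demo-web", "labels": {"app": "demo-web"}},
        "spec": {
            "replicas": 2,
            "selector": {"matchLabels": {"app": "demo-web"}},
            "template": {
                "metadata": {"labels": {"app": "demo-web"}},
                "spec": {
                    "containers": [{
                        "name": "web",
                        "image": "nginx:1.25",
                        "ports": [{"containerPort": 80}],
                    }]
                },
            },
        },
    }
    apps.create_namespaced_deployment(namespace=namespace, body=deployment)

if __name__ == "__main__":
    create_demo_deployment()
```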

Posted 2 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

vadodara, gujarat

On-site

As a Senior Software Engineer (Java Developer) at our organization, you will play a crucial role in designing, developing, and deploying high-performance Java-based microservices. Your expertise in Core Java, Spring Boot, and Microservices Architecture will be essential in implementing REST APIs following OpenAPI/Swagger standards. Your responsibilities will focus on ensuring the quality, automation, testing, performance optimization, and monitoring of our systems. In terms of design and development, you will be required to adhere to API-first and Cloud-native design principles while driving the adoption of automated unit tests, integration tests, and contract tests. Your role will involve developing and extending automation frameworks for API and integration-level testing, as well as supporting BDD/TDD practices across development teams. Furthermore, you will contribute to performance tuning, scalability, asynchronous processing, and fault tolerance aspects of the system. Your collaboration with DevOps, Product Owners, and QA teams will be crucial for feature delivery. Additionally, mentoring junior developers, conducting code walkthroughs, and leading design discussions will be part of your responsibilities.

The ideal candidate should have at least 5 years of hands-on Java development experience and a deep understanding of Microservices design patterns, API Gateways, and service discovery. Exposure to Cloud deployment models like AWS ECS/EKS, Azure AKS, or GCP GKE is preferred. Proficiency with Git, Jenkins, SonarQube, and containerization (Docker/Kubernetes), along with experience working in Agile/Scrum teams, is highly desired. Experience with API security standards (OAuth2, JWT), event-driven architecture using Kafka or RabbitMQ, Infrastructure as Code (IaC) tools like Terraform or CloudFormation, and performance testing tools like JMeter or Gatling would be considered a plus. Your ownership-driven mindset, strong communication skills, and ability to solve technical problems under tight deadlines will be valuable assets in this role.

It is essential for every individual working with or on behalf of our organization to prioritize information security. This includes abiding by security policies, ensuring confidentiality and integrity of information, reporting any security violations or breaches, and completing mandatory security trainings as per company guidelines. If you are a passionate and skilled Senior Software Engineer with expertise in Java development and a desire to contribute to scalable backend systems, we encourage you to apply for this role and join our dynamic team.

Posted 2 weeks ago

Apply

2.0 - 6.0 years

0 Lacs

chennai, tamil nadu

On-site

The job is located in Chennai, Tamil Nadu, India with the company Hitachi Energy India Development Centre (IDC). As part of the Engineering & Science profession, the job is full-time and not remote. The primary focus of the India Development Centre is on research and development, with around 500 R&D engineers, specialists, and experts dedicated to creating and sustaining digital solutions, new products, and technology. The centre collaborates with Hitachi Energy's R&D and research centres across more than 15 locations in 12 countries. The mission of Hitachi Energy is to advance the world's energy system to be more sustainable, flexible, and secure while considering social, environmental, and economic aspects. The company has a strong global presence with installations in over 140 countries.

As a potential candidate for this role, your responsibilities include:
- Meeting milestones and deadlines while staying on scope
- Providing suggestions for improvements and being open to new ideas
- Collaborating with a diverse team across different time zones
- Enhancing processes for continuous integration, deployment, testing, and release management
- Ensuring the highest standards of security
- Developing, maintaining, and supporting Azure infrastructure and system software components (a brief provisioning sketch follows this description)
- Providing guidance to developers on building solutions using Azure technologies
- Owning the overall architecture in Azure
- Ensuring application performance, uptime, and scalability
- Leading CI/CD processes design and implementation
- Defining best practices for application deployment and infrastructure maintenance
- Monitoring and reporting on compute/storage costs
- Managing deployment of a .NET microservices based solution
- Upholding Hitachi Energy's core values of safety and integrity

Your background should ideally include:
- 3+ years of experience in Azure DevOps, CI/CD, configuration management, and test automation
- 2+ years of experience in various Azure technologies such as IaC, ARM, YAML, Azure PaaS, Azure Active Directory, Kubernetes, and Application Insights
- Proficiency in Bash scripting
- Hands-on experience with Azure components and services
- Building and maintaining large-scale SaaS solutions
- Familiarity with SQL, PostgreSQL, NoSQL, and Redis databases
- Expertise in infrastructure-as-code automation and monitoring
- Understanding of security concepts and best practices
- Experience with deployment tools like Helm charts and docker-compose
- Proficiency in at least one programming language (e.g., Python, C#)
- Experience with system management in a Linux environment
- Knowledge of logging and visualization tools like the ELK stack, Prometheus, and Grafana
- Experience in Azure Data Factory, WAF, streaming data, and big data/analytics

Proficiency in spoken and written English is essential for this role. If you have a disability and require accommodations during the job application process, you can request reasonable accommodations through Hitachi Energy's website by completing a general inquiry form. This assistance is specifically for individuals with disabilities needing accessibility support during the application process.
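For the Azure infrastructure-as-code work mentioned above, here is a minimal sketch using the Azure SDK for Python to idempotently create a resource group; the subscription ID, resource group name, and region are placeholder assumptions for illustration only.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder subscription id

def ensure_resource_group(name: str, location: str = "westeurope") -> None:
    """Create the resource group if it does not already exist (create_or_update is idempotent)."""
    credential = DefaultAzureCredential()  # resolves env vars, managed identity, or az CLI login
    client = ResourceManagementClient(credential, SUBSCRIPTION_ID)
    client.resource_groups.create_or_update(name, {"location": location})

if __name__ == "__main__":
    ensure_resource_group("rg-demo-monitoring")
```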

Posted 2 weeks ago

Apply

1.0 - 5.0 years

0 Lacs

kochi, kerala

On-site

The Software DevOps Engineer (1-3 Years Experience) position requires a Bachelor's degree in Computer Science, Information Technology, or a related field, along with 1-3 years of experience in a DevOps or related role.

As a Software DevOps Engineer, your responsibilities will include designing, implementing, and maintaining CI/CD pipelines to ensure efficient and reliable software delivery. You will collaborate with Development, QA, and Operations teams to streamline the deployment and operation of applications. Monitoring system performance, identifying bottlenecks, and troubleshooting issues to ensure high availability and reliability are also part of your role. Furthermore, you will automate repetitive tasks and processes to improve efficiency and reduce manual intervention. Participating in code reviews, contributing to the improvement of best practices and standards, and implementing and managing infrastructure as code (IaC) using Terraform are essential duties. Documentation of processes, configurations, and procedures for future reference is required. Staying updated with the latest industry trends and technologies to continuously improve DevOps processes, as well as creating POCs for the latest tools and technologies, are part of the job.

The mandatory skills for this position include proficiency in Azure Cloud, Azure DevOps, CI/CD pipelines, version control (Git), Linux commands, Bash scripting, Docker, Kubernetes, Helm charts, monitoring tools like Grafana, Prometheus, the ELK Stack, and Azure Monitor, plus Azure, AKS, Azure Storage, Virtual Machines, an understanding of microservices architecture and orchestration, and SQL Server. Optional skills that are beneficial for this role include Ansible scripting, Kafka, MongoDB, Key Vault, and Azure CLI.

Overall, the ideal candidate for this role should possess a strong understanding of CI/CD concepts and tools, experience with cloud platforms and containerization technologies, a basic understanding of networking and security principles, strong problem-solving skills, attention to detail, excellent communication and teamwork skills, and the ability to learn and adapt to new technologies and methodologies. Additionally, being ready to work with clients directly is a key requirement for this position.
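To make the monitoring responsibilities above concrete, here is a minimal sketch that runs an instant query against a Prometheus server's HTTP API and prints the result; the server URL and the PromQL expression are illustrative assumptions, not values from the posting.

```python
import requests

PROM_URL = "http://prometheus.example.internal:9090"  # assumed Prometheus endpoint

def instant_query(promql: str) -> list[dict]:
    """Run an instant query against Prometheus's /api/v1/query endpoint."""
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": promql}, timeout=10)
    resp.raise_for_status()
    payload = resp.json()
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]

if __name__ == "__main__":
    # Example: per-instance CPU usage rate over the last 5 minutes.
    for series in instant_query('rate(node_cpu_seconds_total{mode!="idle"}[5m])'):
        print(series["metric"].get("instance"), series["value"][1])
```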

Posted 2 weeks ago

Apply

4.0 - 8.0 years

5 - 9 Lacs

Noida

Work from Office

Architecture, Lifecycle Management, Platform Governance, Linux, OpenShift, Prometheus, Grafana, Helm, EFK Stack, Ansible, Vault, SCCs, RBAC, NetworkPolicies; CI/CD: Jenkins, GitLab CI, ArgoCD, Tekton; lead SEV1 issue resolution

Posted 2 weeks ago

Apply

3.0 - 6.0 years

3 - 6 Lacs

Noida

Work from Office

Advanced Troubleshooting, Change Management, Automation, Linux, YAML/Helm/Kustomize; maintain Operators and upgrade OpenShift clusters; work with CI/CD pipelines and DevOps teams; maintain logs, monitoring, and alerting tools (Prometheus, EFK, Grafana)

Posted 2 weeks ago

Apply

6.0 - 8.0 years

18 - 30 Lacs

Hyderabad

Work from Office

Key Skills: Hadoop, Cloudera, HDFS, YARN, Spark, Delta Lake, Linux, Docker, Kubernetes, Jenkins, REST API, Prometheus, Grafana, Splunk, PySpark, Python, Terraform, Ansible, GCP, DevOps, CI/CD, SRE, Agile, Infrastructure Automation

Roles & Responsibilities:
- Lead and support technology teams in designing, developing, and managing data engineering and CI/CD pipelines, and infrastructure.
- Act as an Infrastructure/DevOps SME in designing and implementing solutions for risk analytics systems transformation, both tactical and strategic, aligned with regulatory and business initiatives.
- Collaborate with other technology teams, IT support teams, and architects to drive improvements in product delivery.
- Manage daily interactions with IT and central DevOps/infrastructure teams to ensure continuous support and delivery.
- Grow the technical expertise within the engineering community by mentoring and sharing knowledge.
- Design, maintain, and improve the full software delivery lifecycle.
- Enforce process discipline and improvements in areas like agile software delivery, production support, and DevOps pipeline development.

Experience Requirement:
- 6-8 years of experience in platform engineering, SRE roles, and managing distributed/big data infrastructures.
- Strong hands-on experience with the Hadoop ecosystem, big data pipelines, and Delta Lake.
- Proven expertise in Cloudera Hadoop cluster management including HDFS, YARN, and Spark.
- In-depth knowledge of networking, Linux, HDFS, and DevSecOps tools like Docker, Kubernetes, and Jenkins.
- Skilled in containerization with Docker and orchestration using Kubernetes.
- Hands-on experience with designing and managing large-scale tech projects, including REST API standards.
- Experience with monitoring and logging tools such as Prometheus, Grafana, and Splunk.
- Global collaboration experience with IT and support teams across geographies.
- Strong coding skills in Spark (PySpark) and Python with at least 3 years of experience (see the PySpark sketch after this description).
- Expertise in Infrastructure as Code (IaC) tools such as Terraform and Ansible.
- Working knowledge of GCP or other cloud platforms and their data engineering products is preferred.
- Familiarity with agile methodologies, with strong problem-solving and team collaboration skills.

Education: B.Tech + M.Tech (Dual), B.Tech, M.Tech.
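Since the role calls for hands-on Spark (PySpark) coding, here is a minimal, self-contained PySpark sketch that reads a Parquet dataset and computes a simple daily aggregation; the input path and column names are placeholders, and writing to Delta Lake instead would additionally require the delta-spark package on the cluster.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

def daily_event_counts(input_path: str, output_path: str) -> None:
    """Aggregate raw events by day and event type, then write Parquet output."""
    spark = SparkSession.builder.appName("daily-event-counts").getOrCreate()

    events = spark.read.parquet(input_path)          # assumed columns: event_time, event_type
    counts = (
        events
        .withColumn("event_date", F.to_date("event_time"))
        .groupBy("event_date", "event_type")
        .count()
    )
    counts.write.mode("overwrite").partitionBy("event_date").parquet(output_path)
    spark.stop()

if __name__ == "__main__":
    daily_event_counts("/data/raw/events", "/data/curated/daily_event_counts")
```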

Posted 2 weeks ago

Apply

8.0 - 12.0 years

25 - 35 Lacs

Bengaluru

Remote

Job Title: Sr. DevOps SRE
Location State: Karnataka
Location City: Bangalore (Hybrid/Remote)
Experience Required: 8 to 12 Year(s)
CTC Range: 25 to 38 LPA
Shift: Day Shift
Work Mode: Hybrid/Remote
Position Type: Contract (with possible extension)
Openings: 6
Company Name: VARITE INDIA PRIVATE LIMITED

About The Client: An American multinational digital communications technology conglomerate corporation headquartered in San Jose, California. The Client develops, manufactures, and sells networking hardware, software, telecommunications equipment, and other high-technology services and products. The Client specializes in specific tech markets, such as the Internet of Things (IoT), domain security, videoconferencing, and energy management. It is one of the largest technology companies in the world, ranking 82nd on the Fortune 100 with over $51 billion in revenue and nearly 83,300 employees.

About The Job: Hiring for Sr. DevOps SRE.

Essential Job Functions / Key Responsibilities:
- Help build a new platform to support business transformation
- Focus on automation within DevOps (tools, processes)
- Operate in production environments (Amazon cloud or on-prem datacenters)
- Strong exposure to Kubernetes clusters and observability tools

Top 3 Skills Needed:
- Kubernetes: highest priority (hands-on in production cluster setup and management)
- Observability & Monitoring Tools: Grafana, Splunk (logging), Prometheus
- DevOps Tools & Practices

Must Have Skills:
- Git (code repository)
- Python (basic to intermediate scripting)
- Docker
- Pipelines (CI/CD)

Qualifications: Any Graduate

How to Apply: Interested candidates are invited to submit their resume using the apply online button on this job post.

Equal Opportunity Employer: VARITE is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, marital status, veteran status, or disability status.

Unlock Rewards: Refer Candidates and Earn. If you're not available or interested in this opportunity, please pass this along to anyone in your network who might be a good fit and interested in our open positions. VARITE offers a Candidate Referral program, where you'll receive a one-time referral bonus based on the following scale if the referred candidate completes a three-month assignment with VARITE:
- 0 - 2 Yrs.: INR 5,000
- 2 - 6 Yrs.: INR 7,500
- 6+ Yrs.: INR 10,000

About VARITE: VARITE is a global staffing and IT consulting company providing technical consulting and team augmentation services to Fortune 500 companies in the USA, UK, Canada, and India. VARITE is currently a primary and direct vendor to leading corporations in the verticals of Networking, Cloud Infrastructure, Hardware and Software, Digital Marketing and Media Solutions, Clinical Diagnostics, Utilities, Gaming and Entertainment, and Financial Services.

Posted 2 weeks ago

Apply

4.0 - 8.0 years

8 - 18 Lacs

Hyderabad

Work from Office

Key Skills: Elastic Cloud on Kubernetes (ECK), Elasticsearch, Kibana, Logstash, Kafka, Cloud Infrastructure, Data Engineering, Prometheus, Grafana, Docker, Kubernetes, CI/CD, Jenkins, High Availability, Disaster Recovery, Networking.

Roles & Responsibilities:
- Manage and configure Elastic Cloud on Kubernetes (ECK) clusters in Kubernetes (K8s).
- Work with Elastic Stack components such as Elasticsearch, Kibana, Logstash, and Fleet, and integrate with other tools.
- Design and implement application services with a focus on OSS integration, availability management, event/incident management, and patch management.
- Design and develop data pipelines to ingest data into Elastic (see the ingestion sketch after this description).
- Manage log collection tools and technologies such as Beats, Elastic Agent, and Logstash.
- Implement log management and monitoring tools such as Prometheus and Grafana.
- Manage cloud infrastructure, including networking concepts like load balancers, firewalls, and VPCs.
- Ensure high availability and disaster recovery planning for critical infrastructure.
- Collaborate on continuous integration and continuous delivery (CI/CD) using tools like Jenkins.

Experience Requirement:
- 4-8 years of experience in the Elastic Stack (Elasticsearch, Kibana, Logstash), Kafka integration, and cloud technologies.
- Strong expertise in Kubernetes (K8s), Docker, Linux, and managing cloud infrastructure.
- Experience with monitoring tools such as Prometheus and Grafana.
- Hands-on experience with log collection and management technologies (Beats, Logstash, Elastic Agent).
- Experience with CI/CD tools like Jenkins is a plus.

Education: Any Graduation.
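As a small illustration of the data-ingestion work this role describes, the sketch below uses the official elasticsearch Python client (8.x-style keyword arguments) to index a document and run a query; the cluster URL, index name, and fields are assumptions made for the example, and a production cluster would use TLS plus API-key authentication.

```python
from datetime import datetime, timezone
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local/dev cluster

def index_log_event(service: str, level: str, message: str) -> str:
    """Index one log event and return the generated document id."""
    doc = {
        "service": service,
        "level": level,
        "message": message,
        "@timestamp": datetime.now(timezone.utc).isoformat(),
    }
    resp = es.index(index="app-logs", document=doc)
    return resp["_id"]

def recent_errors(service: str) -> list[dict]:
    """Return error-level events for a service (up to the default 10 hits)."""
    resp = es.search(
        index="app-logs",
        query={"bool": {"must": [
            {"term": {"service": service}},
            {"term": {"level": "error"}},
        ]}},
    )
    return [hit["_source"] for hit in resp["hits"]["hits"]]

if __name__ == "__main__":
    index_log_event("checkout", "error", "payment gateway timeout")
    print(recent_errors("checkout"))
```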

Posted 2 weeks ago

Apply

4.0 - 8.0 years

15 - 30 Lacs

Bengaluru

Hybrid

Preferred candidate profile:
- Experience with system design and architecture
- Experience with project execution
- Experience with Golang
- Experience working with major cloud solutions: AWS (preferred), Azure, GCP
- Familiarity with 3-tier and microservices architectures and distributed systems
- Experience with design and development of RESTful services
- Experience with different data stores, data modeling, and scaling them
- Familiarity with datastores such as Aerospike, MySQL, MongoDB, etc.
- Good understanding of data structures, multi-threading, and concurrency concepts
- Experience with DevOps tools like Jenkins, Ansible, Kubernetes, and Git is a plus
- Familiarity with Elasticsearch queries and visualization tools like Grafana and Kibana
- Strong networking fundamentals: firewalls, proxies, DNS, load balancing, etc.

Posted 2 weeks ago

Apply

2.0 - 4.0 years

14 - 18 Lacs

Pune

Hybrid

So, what's the role all about? As a Site Reliability Engineer (SRE) for our large and regionally distributed SaaS platform, your primary responsibilities will be to improve the reliability and availability of our mission-critical cloud-based services.

How will you make an impact? Essential Duties and Responsibilities:
- Observability and Monitoring: Create new dashboards and metrics to provide comprehensive observability into the health and performance of development teams' applications, including SLI/SLO metrics (see the error-budget sketch after this description). Work with development teams to ensure proper monitoring is set up and enabled for their services. Identify evolutionary improvements to the observability and monitoring solutions.
- Reliability Consulting and Automation: Consult with development teams on SRE services and best practices to help them improve the reliability of their applications. Create automation and tooling to reduce toil and manual intervention.
- Incident and Problem Management: Assist other teams in data and performance analysis to identify the root causes of issues and recommend automation actions.
- Knowledge Sharing and Mentoring: Review the work of other SREs and provide training and guidance to help them improve their skills. Communicate effectively with both technical and non-technical peers and customers.
- Process and Documentation: Follow established processes when performing work or help document and create processes, as necessary. Document troubleshooting steps and results in appropriate locations for historical access. Ensure compliance with policies, procedures, and standards. Implement or coordinate remediation required by audits and assessments, and document, as necessary.
- Time Estimation: Estimate the time required to complete activities and projects.

Have you got what it takes?
- 4+ years of programming/scripting experience with any of the following: Go, Python, .NET (C#), Node
- 4+ years of experience working within public or private cloud environments
- 4+ years of SRE/DevOps/Observability or related experience
- 4+ years of AWS
- Experience with Agile, Jira, GitHub, monitoring, automation, dashboarding

You will have an advantage if you also have: Kubernetes (plus certification), Grafana, AWS, Azure, and DevOps experience.

What's in it for you? Join an ever-growing, market-disrupting, global company where the teams, comprised of the best of the best, work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NICE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NICEr!

Enjoy NICE-FLEX! At NICE, we work according to the NICE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 7547
Reporting into: Manager, Cloud Operations
Role Type: Individual Contributor
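The SLI/SLO work mentioned above comes down to simple arithmetic. Here is a minimal sketch that computes the measured SLI and the remaining error budget for an availability SLO from raw request counts; the window size and the numbers themselves are made up for illustration.

```python
def error_budget_report(total_requests: int, failed_requests: int, slo_target: float) -> dict:
    """Summarize SLI, allowed failures, and remaining error budget for one window."""
    sli = 1 - failed_requests / total_requests            # measured availability
    allowed_failures = total_requests * (1 - slo_target)  # total error budget in requests
    budget_remaining = 1 - failed_requests / allowed_failures
    return {
        "sli": round(sli, 5),
        "slo": slo_target,
        "allowed_failures": int(allowed_failures),
        "budget_remaining_pct": round(budget_remaining * 100, 1),
    }

if __name__ == "__main__":
    # Hypothetical 30-day window: 12.4M requests, 3,100 failures, 99.9% availability SLO.
    print(error_budget_report(12_400_000, 3_100, 0.999))
```

With these numbers the SLI is 99.975% and 75% of the monthly error budget is still unspent, which is the kind of signal a team would surface on a Grafana SLO dashboard.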

Posted 2 weeks ago

Apply

6.0 - 11.0 years

14 - 19 Lacs

Pune

Work from Office

What You'll Do: We are looking for accomplished Machine Learning Engineers with a background in software development and a deep enthusiasm for solving complex problems. You will lead a dynamic team dedicated to designing and implementing a large language model framework to power diverse applications across Avalara. Your responsibilities will span the entire development lifecycle, including conceptualization, prototyping, development, and delivery of the LLM platform features. You will build core agent infrastructure (A2A orchestration and MCP-driven tool discovery) so teams can launch secure, scalable agent workflows. You will be reporting to the Senior Manager, ML Engineering.

What Your Responsibilities Will Be: We are looking for engineers who can think quickly and have a background in implementation. Your responsibilities will include:
- Build on top of the foundational framework for supporting Large Language Model applications at Avalara
- Experience with LLMs such as GPT, Claude, Llama, and other Bedrock models
- Leverage best practices in software development, including Continuous Integration/Continuous Deployment (CI/CD) along with appropriate functional and unit testing in place.
- Drive innovation by researching and applying the latest technologies and methodologies in machine learning and software development.
- Write, review, and maintain high-quality code that meets industry standards, contributing to the project's technical expertise.
- Lead code review sessions, ensuring good code quality and documentation.
- Mentor junior engineers, promoting a culture of collaboration and engineering expertise.
- Proficiency in developing and debugging software with a preference for Python, though familiarity with additional programming languages is valued and encouraged.

What You'll Need to be Successful:
- 6+ years of experience building Machine Learning models and deploying them in production environments as part of creating solutions to complex customer problems.
- Proficiency working in cloud computing environments (AWS, Azure, GCP), Machine Learning frameworks, and software development best practices.
- Demonstrated experience staying current with breakthroughs in AI/ML, with a focus on GenAI.
- Experience with design patterns and data structures.

Technologies you will work with: Python, LLMs, Agents, A2A, MCP, MLflow, Docker, Kubernetes, Terraform, AWS, GitLab, Postgres, Prometheus, and Grafana.

We are the AI & ML enablement group in Avalara. We empower Avalara's Product and Engineering teams with the latest AI & ML capabilities, driving easy-to-use, automated compliance solutions that position Avalara as the industry AI technology leader and the go-to choice for all compliance needs.
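Because the technology stack listed above includes MLflow for experiment tracking, here is a minimal sketch of logging parameters and metrics for a run; the tracking URI, experiment name, and metric values are illustrative assumptions rather than details from the posting.

```python
import mlflow

# Assumed tracking server; MLflow falls back to a local ./mlruns directory if no URI is set.
mlflow.set_tracking_uri("http://mlflow.example.internal:5000")
mlflow.set_experiment("llm-prompt-eval")

def log_eval_run(model_name: str, temperature: float, accuracy: float, latency_ms: float) -> None:
    """Record one evaluation run's configuration and results in MLflow."""
    with mlflow.start_run(run_name=f"eval-{model_name}"):
        mlflow.log_param("model_name", model_name)
        mlflow.log_param("temperature", temperature)
        mlflow.log_metric("accuracy", accuracy)
        mlflow.log_metric("latency_ms", latency_ms)

if __name__ == "__main__":
    log_eval_run("claude-3", temperature=0.2, accuracy=0.91, latency_ms=840.0)
```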

Posted 2 weeks ago

Apply

6.0 - 11.0 years

10 - 15 Lacs

Pune

Work from Office

What You'll Do: We are looking for experienced Machine Learning Engineers with a background in software development and a deep enthusiasm for solving complex problems. You will lead a dynamic team dedicated to designing and implementing a large language model framework to power diverse applications across Avalara. Your responsibilities will span the entire development lifecycle, including conceptualization, prototyping, and delivery of the LLM platform features. You will build core agent infrastructure (A2A orchestration and MCP-driven tool discovery) so teams can launch secure, scalable agent workflows. You will be reporting to the Senior Manager, Machine Learning.

What Your Responsibilities Will Be: We are looking for engineers who can think quickly and have a background in implementation. Your responsibilities will include:
- Build on top of the foundational framework for supporting Large Language Model applications at Avalara
- Experience with LLMs such as GPT, Claude, Llama, and other Bedrock models
- Leverage best practices in software development, including Continuous Integration/Continuous Deployment (CI/CD) along with appropriate functional and unit testing in place.
- Promote innovation by researching and applying the latest technologies and methodologies in machine learning and software development.
- Write, review, and maintain high-quality code that meets industry standards, contributing to the project's technical quality.
- Lead code review sessions, ensuring good code quality and documentation.
- Mentor junior engineers, encouraging a culture of collaboration.
- Proficiency in developing and debugging software with a preference for Python, though familiarity with additional programming languages is valued and encouraged.

What You'll Need to be Successful:
- 6+ years of experience building Machine Learning models and deploying them in production environments as part of creating solutions to complex customer problems.
- Proficiency working in cloud computing environments (AWS, Azure, GCP), Machine Learning frameworks, and software development best practices.
- Experience working with technological innovations in AI & ML (esp. GenAI) and applying them.
- Experience with design patterns and data structures.
- Good analytical, design, and debugging skills.

Technologies you will work with: Python, LLMs, Agents, A2A, MCP, MLflow, Docker, Kubernetes, Terraform, AWS, GitLab, Postgres, Prometheus, and Grafana.

We are the AI & ML enablement group in Avalara.

Posted 2 weeks ago

Apply

8.0 - 13.0 years

12 - 16 Lacs

Pune

Work from Office

What You'll Do: We are looking for an experienced Machine Learning Engineer with a background in software development and a deep enthusiasm for solving complex problems. You will lead a dynamic team dedicated to designing and implementing a large language model framework to power diverse applications across Avalara. Your responsibilities as a Senior Technical Lead will span the entire development lifecycle, including conceptualization, prototyping, and delivery of the LLM platform features. You will be reporting to the Senior Manager, Software Engineering.

What Your Responsibilities Will Be: You have a blend of technical skills in the fields of AI & Machine Learning, especially with LLMs, and a deep-seated understanding of software development practices; you'll work with a team to ensure our systems are scalable, performant, and accurate. We are looking for engineers who can think quickly and have a background in implementation. Your responsibilities will include:
- Build on top of the foundational framework for supporting Large Language Model applications at Avalara
- Experience with LLMs such as GPT, Claude, Llama, and other Bedrock models
- Leverage best practices in software development, including Continuous Integration/Continuous Deployment (CI/CD) along with appropriate functional and unit testing in place.
- Inspire creativity by researching and applying the latest technologies and methodologies in machine learning and software development.
- Write, review, and maintain high-quality code that meets industry standards, contributing to the project's technical quality.
- Lead code review sessions, ensuring good code quality and documentation.
- Mentor junior engineers, encouraging a culture of collaboration.
- Proficiency in developing and debugging software with a preference for Python, though familiarity with additional programming languages is valued and encouraged.

What You'll Need to be Successful:
- 8+ years of experience building Machine Learning models and deploying them in production environments as part of creating solutions to complex customer problems.
- Bachelor's degree with computer science exposure.
- Proficiency working in cloud computing environments (AWS, Azure, GCP), Machine Learning frameworks, and software development best practices.
- Experience with technological innovations in AI & ML (esp. GenAI).
- Expertise in design patterns, data structures, distributed systems, and experience with cloud technologies.
- Good analytical, design, and debugging skills.

Technologies you will work with: Python, LLMs, MLflow, Docker, Kubernetes, Terraform, AWS, GitLab, Postgres, Prometheus, Grafana.

Posted 2 weeks ago

Apply

6.0 - 10.0 years

14 - 19 Lacs

Noida

Work from Office

With 80,000 customers across 150 countries, UKG is the largest U.S.-based private software company in the world. And we're only getting started. Ready to bring your bold ideas and collaborative mindset to an organization that still has so much more to build and achieve? Read on.

Here, we know that you're more than your work. That's why our benefits help you thrive personally and professionally, from wellness programs and tuition reimbursement to U Choose, a customizable expense reimbursement program that can be used for 200+ needs that best suit you and your family, from student loan repayment to childcare to pet insurance. Our inclusive culture, active and engaged employee resource groups, and caring leaders value every voice and support you in doing the best work of your career. If you're passionate about our purpose, people, then we can't wait to support whatever gives you purpose. We're united by purpose, inspired by you.

Key Responsibilities:
- Monitor and support Kronos Private Cloud and hosted environments remotely.
- Perform remote monitoring of Microsoft Windows (2003/2008/2012/2016) and Linux servers for: system performance and uptime, SQL database health, application service and web application status, and server resource utilization.
- Respond to alerts from monitoring tools and take corrective actions.
- Troubleshoot and identify root causes of server and application performance issues.
- Handle Level 1 escalations and follow the defined escalation matrix.
- Administer and maintain Windows and Linux operating systems.
- Support web applications and hosting services including IIS, JBoss, and Apache Tomcat.
- Understand and troubleshoot server-client architecture issues.
- Collaborate with internal teams to ensure high availability and performance of hosted services.
- Document incidents, resolutions, and standard operating procedures.
- Participate in 24/7 rotational shifts, including nights and weekends.

Preferred Requirements and Skills:
- Experience with the UKG Workforce Central (WFC) application.
- Familiarity with ServiceNow for incident, problem, and change management.
- Strong understanding of cloud infrastructure, virtualisation (VMware), and hybrid environments.
- Knowledge of web server configurations, deployments, and troubleshooting.
- Excellent communication, analytical, and problem-solving skills.
- Familiarity with monitoring tools (DataDog, Grafana, Splunk) and alert management.
- Willingness to work in rotational shifts, including nights and weekends.

Where we're going: UKG is on the cusp of something truly special. Worldwide, we already hold the #1 market share position for workforce management and the #2 position for human capital management. Tens of millions of frontline workers start and end their days with our software, with billions of shifts managed annually through UKG solutions today. Yet it's our AI-powered product portfolio, designed to support customers of all sizes, industries, and geographies, that will propel us into an even brighter tomorrow! UKGCareers@ukg.com

Posted 2 weeks ago

Apply

5.0 - 8.0 years

18 - 20 Lacs

Noida, Madurai, Chennai

Hybrid

1. Expertise in Observability/SRE tools, platforms, and standards, including the ELK Stack, Grafana, Prometheus, Loki, VictoriaMetrics, and Telegraf.
2. Familiarity with modern logging frameworks and best practices: OpenTelemetry, Kafka, etc.
3. Experience with data visualization tools like Grafana and Kibana to create informative and actionable dashboards, reports, and alerts.
4. Proficiency in scripting languages like Python, Bash, or PowerShell is valuable for automating data collection, analysis, and visualization processes.
5. Good to have: experience with the monitoring tools SCOM and OpenSearch.

Posted 2 weeks ago

Apply

15.0 - 20.0 years

5 - 9 Lacs

Chennai

Work from Office

Project Role: Application Developer
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must have skills: Spring Boot
Good to have skills: NA
Minimum 5 year(s) of experience is required.
Educational Qualification: 15 years full time education

Summary: As an Application Developer, you will design, build, and configure applications to meet business process and application requirements. A typical day involves collaborating with team members to understand project needs, developing application features, and ensuring that the applications are aligned with business objectives. You will also engage in problem-solving discussions and contribute to the overall success of the projects by implementing effective solutions.

Roles & Responsibilities:
- Expected to be an SME.
- Collaborate and manage the team to perform.
- Responsible for team decisions.
- Engage with multiple teams and contribute on key decisions.
- Provide solutions to problems for their immediate team and across multiple teams.
- Facilitate knowledge sharing sessions to enhance team capabilities.
- Monitor project progress and ensure timely delivery of application features.

Professional & Technical Skills:
- DS & Algo, Java 17/Java EE, Spring Boot, CI/CD
- Web services using RESTful APIs, the Spring framework, caching techniques, PostgreSQL, JUnit for testing, and containerization with Kubernetes/Docker; Airflow, GCP, Spark, Kafka
- Hands-on experience in building alerting/monitoring/logging for microservices using frameworks like OpenObserve/Splunk, Grafana, and Prometheus

Additional Information:
- The candidate should have a minimum of 5 years of experience in Spring Boot.
- This position is based in Chennai.
- A 15 years full time education is required.

Posted 2 weeks ago

Apply

7.0 - 9.0 years

12 - 20 Lacs

Bengaluru

Work from Office

This is a contract position for 6 months to 1 year, based at our Electronic City office, for one of our clients.
- Design, implement, and manage scalable infrastructure solutions in Kubernetes to ensure optimal performance and reliability of services.
- Monitor and manage Kubernetes clusters, focusing on service availability, scaling, and resource optimization to meet SLA requirements.
- Automate scaling (up and down) of services using tools like Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler (see the sketch after this list).
- Develop and maintain CI/CD pipelines for automated deployment, testing, and delivery of infrastructure and services.
- Set up, configure, manage, and monitor self-hosted services such as MQTT, Kafka, Redis, databases, and Nginx within Kubernetes clusters.
- Implement robust alerting and monitoring solutions using tools like Prometheus, Grafana, and Loki (for log aggregation) to ensure continuous observability of infrastructure and services.
- Handle the deployment, maintenance, and upgrades of both stateful and stateless services across development, staging, and production environments.
- Optimize Kubernetes workloads for cost efficiency, reliability, and performance.
- Design and implement log aggregation solutions using Loki and its tech stack, enabling efficient centralized log management across environments.
- Collaborate with cross-functional teams to troubleshoot and resolve infrastructure issues while adhering to SLA and operational requirements.
- Ensure compliance with IT security standards and successfully pass IT security assessments and penetration tests.
- Maintain high availability and performance of production systems by proactively managing scalability, disaster recovery, and incident response.
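A minimal sketch of the HPA automation mentioned above, using the official kubernetes Python client to create an autoscaling/v1 HorizontalPodAutoscaler from a dict-style manifest; the deployment name, namespace, and scaling thresholds are assumed values for illustration only.

```python
from kubernetes import client, config

def create_hpa(namespace: str = "default", deployment: str = "demo-web") -> None:
    """Create a CPU-based HorizontalPodAutoscaler targeting an existing Deployment."""
    config.load_kube_config()  # assumes a valid kubeconfig on the machine running this
    autoscaling = client.AutoscalingV1Api()

    hpa = {
        "apiVersion": "autoscaling/v1",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": 2,
            "maxReplicas": 10,
            "targetCPUUtilizationPercentage": 70,
        },
    }
    autoscaling.create_namespaced_horizontal_pod_autoscaler(namespace=namespace, body=hpa)

if __name__ == "__main__":
    create_hpa()
```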

Posted 2 weeks ago

Apply

15.0 - 20.0 years

5 - 9 Lacs

Coimbatore

Work from Office

Project Role: Application Developer
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must have skills: Spring Boot
Good to have skills: NA
Minimum 5 year(s) of experience is required.
Educational Qualification: 15 years full time education

Summary: As an Application Developer, you will design, build, and configure applications to meet business process and application requirements. A typical day involves collaborating with team members to understand project needs, developing application features, and ensuring that the applications are aligned with business objectives. You will also engage in problem-solving discussions and contribute to the overall success of the projects by implementing effective solutions.

Roles & Responsibilities:
- Expected to be an SME; collaborate and manage the team to perform.
- Responsible for team decisions.
- Engage with multiple teams and contribute on key decisions.
- Provide solutions to problems for their immediate team and across multiple teams.
- Facilitate knowledge sharing sessions to enhance team capabilities.
- Monitor project progress and ensure timely delivery of application features.

Professional & Technical Skills:
- DS & Algo, Java 17/Java EE, Spring Boot, CI/CD
- Web services using RESTful APIs, the Spring framework, caching techniques, PostgreSQL, JUnit for testing, and containerization with Kubernetes/Docker; Airflow, GCP, Spark, Kafka
- Hands-on experience in building alerting/monitoring/logging for microservices using frameworks like OpenObserve/Splunk, Grafana, and Prometheus

Additional Information:
- The candidate should have a minimum of 5 years of experience in Spring Boot.
- This position is based at our Coimbatore office.
- A 15 years full time education is required.

Posted 2 weeks ago

Apply

8.0 - 13.0 years

20 - 25 Lacs

Noida

Work from Office

Who we are and what we do: Innovation in every byte. India has witnessed a journey of innovation in digital payments and today it leads the world with over 45% of the global digital transaction volume. At NPST, we believe that our decade-long journey has carved an opportunity for building the future roadmap for the world to follow. We are determined to contribute immensely to the nation's growth story, with our vision to provide digital technology across the financial value chain and our mission to create a leadership position in the digital payment space. Founded in 2013, NPST is a leading fintech firm in India, part of the Make in India initiative and listed on the BSE and National Stock Exchange. We specialize in digital payments, operating as a Technology Service Provider to regulated entities and providing a payment platform to the industry, empowered by a payment processing engine, financial super app, risk intelligence engine, and digital merchant solution. While we drive 3% of global digital transaction volume for over 100+ clients, we aim to increase our market share by 5X in the next five years through innovation and industry-first initiatives.

What will you do: The ideal candidate will have deep expertise in application support, IT operations, and incident/problem management, and hands-on experience with enterprise tools and technologies. The person will play a key role in ensuring high availability, security, and performance of our business-critical applications.

Job Responsibilities:
1. Leadership & Team Management
- Lead and mentor a team of application support engineers.
- Define KPIs, monitor team performance, and conduct regular reviews.
- Establish and continuously improve support processes and best practices.
- Act as a point of escalation for critical application incidents.
2. Application Monitoring & Support
- Manage the end-to-end support lifecycle for web-based and backend applications.
- Ensure application uptime and response SLAs are met.
- Coordinate with developers, infrastructure, and security teams for efficient resolution.
3. Incident, Problem & Change Management
- Handle critical incidents and ensure root cause analysis and permanent resolution.
- Drive preventive measures and post-incident reviews.
- Participate in change advisory boards and ensure minimal downtime during releases.
4. Documentation & Compliance
- Maintain SOPs, runbooks, and a knowledge base for all supported applications.
- Ensure compliance with internal audit, IT security policies, and external standards like ISO 27001.

What are we looking for:
Technical Skills:
- Operating Systems: Linux (RHEL/CentOS/Ubuntu), Windows Server
- Databases: SQL Server, MySQL, Oracle
- Monitoring Tools: Zabbix, Grafana, Prometheus, Nagios
- Ticketing & ITSM Tools: ServiceNow, JIRA, Freshservice
- Middleware: Apache, Nginx, WebLogic, Tomcat
- Scripting: Shell scripting, PowerShell, basic Python (preferred)
- Version Control/Deployment Tools: Git, Jenkins, Ansible
- Security Awareness: Understanding of patch management, encryption, firewalls, and access controls
Soft Skills:
- Strong analytical and troubleshooting skills.
- Effective communication with technical and non-technical stakeholders.
- Ability to work under pressure and manage multiple priorities.
- Strong ownership and a customer-centric approach.

Education Qualification: Bachelor's degree in software engineering or computer science.
Experience: 8-12 years
Industry: IT/Software/BFSI/Banking/Fintech
Work arrangement: 5 days working from office
Location: Noida/Bengaluru

What do we offer: An organization where we strongly believe in one organization, one goal. A fun workplace which compels us to challenge ourselves and aim higher. A team that strongly believes in collaboration and celebrating success together. Benefits that resonate 'We Care'. If this opportunity excites you, we invite you to apply and contribute to our success story. If your resume is shortlisted, you will hear back from us.

Posted 2 weeks ago

Apply

4.0 - 6.0 years

10 - 14 Lacs

Noida, Greater Noida

Work from Office

Hiring an OpenShift L2 Support Engineer with 4-6 years of experience for 24x7 onsite support in Noida. Scope: OpenShift infrastructure administration in the telecom domain, cluster management, change and incident management, and networking design, integration, and deployment. Certification in OCP required.

Posted 2 weeks ago

Apply

8.0 - 12.0 years

10 - 15 Lacs

Kochi

Work from Office

Job Title: Cloud Platform Engineer Specialist - ACS Song
Management Level: Level 9 (Specialist)
Location: Kochi, Coimbatore, Trivandrum
Must have skills: AWS, Terraform
Good to have skills: Hybrid Cloud
Experience: 8-12 years of experience is required
Educational Qualification: Graduation

Job Summary: Within our Cloud Platforms & Managed Services Solution Line, we apply an agile approach to provide true on-demand cloud platforms. We implement and operate secure cloud and hybrid global infrastructures using automation techniques for our clients' business-critical application landscape. As a Cloud Platform Engineer you are responsible for implementing on-cloud and hybrid global infrastructures using infrastructure-as-code.

Roles and Responsibilities:
- Implement cloud and hybrid infrastructures using infrastructure-as-code.
- Automate provisioning and maintenance for streamlined operations.
- Design and estimate infrastructure with an emphasis on observability and security.
- Establish CI/CD pipelines for seamless application deployment.
- Ensure data integrity and security through robust mechanisms.
- Implement backup and recovery procedures for data protection.
- Build self-service systems for enhanced developer autonomy.
- Collaborate with development and operations teams for platform optimization.

Professional and Technical Skills:
- Customer-focused communicator adept at engaging cross-functional teams.
- Cloud infrastructure expert in AWS, Azure, or GCP.
- Proficient in infrastructure as code with tools like Terraform.
- Experienced in container orchestration (Kubernetes, OpenShift, Docker Swarm).
- Skilled in observability tools like Prometheus and Grafana, competent in log aggregation tools (Loki, ELK, Graylog), and familiar with tracing systems such as Tempo.
- CI/CD and GitOps savvy, with potential knowledge of Argo CD or Flux.
- Automation proficiency in Bash and high-level languages (Python, Golang).
- Linux, networking, and database knowledge for robust infrastructure management.
- Hybrid cloud experience is a plus.

Additional Information: About Our Company | Accenture
Qualification - Experience: 3-5 years of experience is required. Educational Qualification: Graduation

Posted 2 weeks ago

Apply

3.0 - 8.0 years

9 - 19 Lacs

Hyderabad, Ahmedabad, Bengaluru

Work from Office

Kubernetes Engineer - Build bulletproof infrastructure for regulated industries

At Ajmera Infotech, we're building planet-scale software for NYSE-listed clients with a 120+ strong engineering team. Our work powers mission-critical systems in HIPAA, FDA, and SOC2-compliant domains where failure is not an option.

Why You'll Love It:
- Own production-grade Kubernetes deployments at real scale
- Drive TDD-first DevOps in CI/CD environments
- Work in a compliance-first org (HIPAA, FDA, SOC2) with code-first values
- Collaborate with top-tier engineers in multi-cloud deployments
- Career growth via mentorship, deep-tech projects, and leadership tracks

Key Responsibilities:
- Design, deploy, and manage resilient Kubernetes clusters (k8s/k3s)
- Automate workload orchestration using Ansible or custom scripting
- Integrate Kubernetes deeply into CI/CD pipelines
- Tune infrastructure for performance, scalability, and regulatory reliability
- Support secure multi-tenant environments and compliance needs (e.g., HIPAA/FDA)

Must-Have Skills:
- 3-8 years of hands-on experience in production Kubernetes environments
- Expert-level knowledge of containerization with Docker
- Proven experience with CI/CD integration for k8s
- Automation via Ansible, shell scripting, or similar tools
- Infrastructure performance tuning within Kubernetes clusters

Nice-to-Have Skills:
- Multi-cloud cluster management (AWS/GCP/Azure)
- Helm, ArgoCD, or Flux for deployment and GitOps
- Service mesh, ingress controllers, and pod security policies

Posted 2 weeks ago

Apply

3.0 - 8.0 years

7 - 17 Lacs

Hyderabad, Ahmedabad, Bengaluru

Work from Office

Sr. Site Reliability Engineer - Keep Planet-Scale Systems Reliable, Secure, and Fast

At Ajmera Infotech, we build planet-scale platforms for NYSE-listed clients, from HIPAA-compliant health systems to FDA-regulated software that simply cannot fail. Our 120+ elite engineers design, deploy, and safeguard mission-critical infrastructure trusted by millions.

Why You'll Love It:
- Dev-first SRE culture: automation, CI/CD, zero-toil mindset
- TDD, monitoring, and observability baked in, not bolted on
- Code-first reliability: script, ship, and scale with real ownership
- Mentorship-driven growth, with exposure to regulated industries (HIPAA, FDA, SOC2)
- End-to-end impact: own infra across Dev and Ops

Key Responsibilities:
- Architect and manage scalable, secure Kubernetes clusters (k8s/k3s) in production
- Develop scripts in Python, PowerShell, and Bash to automate infrastructure operations
- Optimize performance, availability, and cost across cloud environments
- Design and enforce CI/CD pipelines using Jenkins, Bamboo, GitHub Actions
- Implement log monitoring and proactive alerting systems
- Integrate and tune observability tools like Prometheus and Grafana
- Support both development and operations pipelines for continuous delivery
- Manage infrastructure components including Artifactory, Nginx, Apache, IIS
- Drive compliance-readiness across HIPAA, FDA, ISO, SOC2

Must-Have Skills:
- 3-8 years in SRE or infrastructure engineering roles
- Kubernetes (k8s/k3s) production experience
- Scripting: Python, PowerShell, Bash
- CI/CD tools: Jenkins, Bamboo, GitHub Actions
- Experience with log monitoring, alerting, and observability stacks
- Cross-functional pipeline support (Dev + Ops)
- Tooling: Artifactory, Nginx, Apache, IIS
- Performance, availability, and cost-efficiency tuning

Nice-to-Have Skills:
- Background in regulated environments (HIPAA, FDA, ISO, SOC2)
- Multi-OS platform experience
- Integration of Prometheus, Grafana, or similar observability platforms

Posted 2 weeks ago

Apply

7.0 - 11.0 years

17 - 22 Lacs

Mandya

Work from Office

Position Summary: F5 Inc. is actively seeking an exceptional Sr. Principal Software Engineer (Individual Contributor) to play a pivotal role in our SRE Operations team for the groundbreaking F5XC product. Are you an SRE Operations specialist with automation in your DNA? Do you thrive in fast-paced SaaS environments?

Why This Role is Unique: Our SaaS is hybrid, running across public cloud and a global network of 50+ PoPs, delivering terabits of capacity. Our infrastructure spans cloud-native services and physical networking gear (routers, switches, firewalls), creating a uniquely challenging and exciting observability landscape. The Analytics & Observability platform will have deep reach across these layers, ensuring reliability, security, and performance at a massive scale.

What You'll Do:
Be the Force Behind Observability & Stability
- Drive end-to-end observability (logs, metrics, and alerts) across our hybrid SaaS stack, spanning cloud, edge, and physical network devices.
- Take ownership of alerting strategy, cutting through noise while ensuring actionable, high-fidelity alerts.
- Implement intelligent automation to reduce operational toil and enhance real-time visibility.
Own & Automate Operations
- Design, build, and manage automation for self-healing infrastructure across cloud + global PoPs.
- Develop automation for Kubernetes, ArgoCD, Helm charts, Golang-based services, AWS, GCP, and Terraform.
- Improve networking observability, ensuring our routers, switches, and firewalls are monitored at scale.
- Continuously eliminate manual ops work through automation and platform improvements.
Lead Incident Response & Operational Excellence
- Participate in on-call rotations, ensuring rapid incident response across our cloud + edge stack.
- Drive incident response automation, reducing MTTR and increasing system resilience.
- Ensure security, compliance, and best practices in observability & automation.
Collaborate & Mentor
- Work closely with application teams, network engineers, and SREs to improve reliability and performance.
- Mentor junior engineers, fostering a culture of automation-first thinking and deep observability.

What Makes You a Great Fit?
- Deep expertise in logs, metrics, and alerting, with a strong focus on alerting automation.
- Experience in hybrid SaaS environments spanning cloud-native and global infrastructure.
- Strong background in Kubernetes, Infrastructure-as-Code (Terraform), Golang, AWS/GCP, and networking observability.
- Proven track record of eliminating toil and improving operational efficiency through automation.
- Passion for deep observability, networking-scale analytics, and automation at the edge.

If you love solving reliability challenges at global scale, automating everything, and working in a hybrid cloud + networking environment, we want to talk to you! This description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.

Must-Have:
- Observability & Alerting Expertise: strong experience with logs, metrics, and alerts, with a focus on high-fidelity alerting and automation.
- Automation & Infrastructure as Code: deep knowledge of Terraform, ArgoCD, Helm, Kubernetes, and Golang for automation.
- Cloud & Hybrid SaaS Experience: hands-on experience managing cloud-native (AWS/GCP) and edge infrastructure.
- Incident Response & Reliability Engineering: strong on-call experience, with a track record of reducing MTTR through automation.
- Kubernetes Mastery: hands-on experience deploying, managing, and troubleshooting Kubernetes in production environments.

Nice-to-Have:
- Networking & Edge Observability: familiarity with monitoring routers, switches, and firewalls in a global PoP environment.
- Data & Analytics in Observability: experience with time-series databases and tooling (Prometheus, Grafana, OpenTelemetry, etc.).
- Security & Compliance Awareness: understanding of secure-by-design principles for monitoring and alerting.
- Mentorship & Collaboration: ability to mentor junior engineers and work cross-functionally with SREs, application teams, and network engineers.
- High Availability / Disaster Recovery: experience with HA/DR and migration.

Qualifications: Typically requires at least 18 years of related experience with a bachelor's degree, 15 years and a master's degree, or a PhD with 12 years of experience; or equivalent experience. Excellent organizational agility and communication skills throughout the organization.

Environment:
- Empowered Work Culture: Experience an environment that values autonomy, fostering a culture where creativity and ownership are encouraged.
- Continuous Learning: Benefit from the mentorship of experienced professionals with solid backgrounds across diverse domains, supporting your professional growth.
- Team Cohesion: Join a collaborative and supportive team where you'll feel at home from day one, contributing to a positive and inspiring workplace.

F5 Networks, Inc. is an equal opportunity employer and strongly supports diversity in the workplace.

Posted 2 weeks ago

Apply