
15 Fluentd Jobs

JobPe aggregates listings for easy access, but you apply directly on the employer's job portal.

5.0 - 9.0 years

0 Lacs

Noida, Uttar Pradesh

On-site

You will work as an AI Platform Engineer in Bangalore as part of the GenAI COE team. Your key responsibilities will involve developing and promoting scalable AI platforms for customer-facing applications, and evangelizing the platform with customers and internal stakeholders while ensuring the scalability, reliability, and performance needed to meet business needs.

You will design machine learning pipelines covering experiment management, model management, feature management, and model retraining. Implementing A/B testing of models and designing APIs for model inference at scale will be crucial, and proven expertise with MLflow, SageMaker, Vertex AI, and Azure AI is expected.

You will serve as a subject matter expert in LLM serving paradigms, with in-depth knowledge of GPU architectures. Expertise in distributed training and serving of large language models is required, along with proficiency in model- and data-parallel training using frameworks such as DeepSpeed and serving frameworks such as vLLM. You will apply model fine-tuning and optimization techniques to improve latency and accuracy, and work to reduce the training and resource requirements for fine-tuning LLM and LVM models. Extensive knowledge of different LLM models, and the ability to advise on their applicability to specific use cases, is crucial, as is proven experience delivering end-to-end solutions from engineering to production for specific customer use cases. Proficiency in DevOps and LLMOps practices, knowledge of Kubernetes, Docker, and container orchestration, and a deep understanding of LLM orchestration frameworks such as Flowise, Langflow, and LangGraph are also required.

In terms of skills, you should be familiar with LLM models such as Hugging Face OSS LLMs, GPT, Gemini, Claude, Mixtral, and Llama, and with LLMOps tools such as MLflow, LangChain, LangGraph, LangFlow, Flowise, LlamaIndex, SageMaker, AWS Bedrock, Vertex AI, and Azure AI. Knowledge of database and data-warehouse systems (DynamoDB, Cosmos DB, MongoDB, RDS, MySQL, PostgreSQL, Aurora, Google BigQuery) and of the AWS, Azure, and GCP cloud platforms is essential. Proficiency with DevOps tools (Kubernetes, Docker, Fluentd, Kibana, Grafana, Prometheus), along with cloud certifications such as AWS Certified Solutions Architect - Professional or Azure Solutions Architect Expert, will be beneficial. Strong programming skills in Python, SQL, and JavaScript are required for this full-time, in-person role.
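The model A/B-testing duty above typically starts with deterministic traffic splitting. A minimal sketch, with function and variant names that are illustrative rather than from the posting:

```python
import hashlib

def assign_variant(user_id: str, split: float = 0.5) -> str:
    """Deterministically bucket a user into model A or B.

    Hashing the ID (rather than choosing randomly per request) pins
    each user to the same model across sessions, which keeps the
    experiment's metrics clean.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map hash to [0, 1]
    return "model_a" if bucket < split else "model_b"

for uid in ("user-1", "user-2", "user-3"):
    print(uid, "->", assign_variant(uid))
```

The same bucketing value can drive more than two variants by slicing [0, 1] into as many ranges as there are models.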

Posted 2 days ago


5.0 - 10.0 years

13 - 18 Lacs

Bengaluru

Work from Office

Help design, develop, and maintain robust cloud-native applications in an as-a-service model on the cloud platform. Evaluate, implement, and standardize new tools and solutions to continuously improve the platform. Leverage your expertise to drive the organization's and department's technical vision in development teams. Liaise with global and local stakeholders and influence technical roadmaps. Contribute passionately towards hosting a thriving developer community, and encourage inner and open sourcing, leading by example.

Profile required:
- Experience with and exposure to good programming practices, including coding and testing standards
- Passion for and experience in proactively investigating, evaluating, and implementing new technical solutions with continuous improvement
- A good development culture and familiarity with industry-wide best practices
- A production mindset with a keen focus on reliability and quality
- Passion for being part of a distributed, self-sufficient feature team with regular deliverables
- A proactive learner who owns their skills around Scrum, data, and automation
- Strong technical ability to monitor, investigate, analyze, and fix production issues
- Ability to ideate and collaborate through inner and open sourcing
- Ability to interact with client managers, developers, testers, and cross-functional teams such as architects
- Experience working in an Agile team and exposure to Agile/SAFe development methodologies
- Minimum 5+ years of experience in software development and architecture
- Good experience in design and development, including object-oriented programming in Python, cloud-native application development, APIs, and microservices
- Good experience with relational databases such as PostgreSQL and the ability to build robust SQL queries
- Knowledge of Grafana for data visualization and the ability to build dashboards from various data sources
- Experience with big data technologies such as Elasticsearch and Fluentd
- Experience hosting applications using containerization (Docker, Kubernetes)
- Good understanding of CI/CD and DevOps, and proficiency with tools such as Git, Jenkins, and Sonar
- Good system skills with the Linux OS and Bash scripting
- Understanding of the cloud and cloud services
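For the "robust SQL queries" requirement above, the core habit is parameterization, never string interpolation. A hedged sketch using stdlib sqlite3 as a stand-in for PostgreSQL (with psycopg2 the placeholder style would be %s rather than ?; the table and data are invented):

```python
import sqlite3

# In-memory database standing in for a real PostgreSQL instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deploys (service TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO deploys VALUES (?, ?)",
    [("api", "ok"), ("worker", "failed"), ("api", "failed")],
)

def failed_deploys(conn, service: str) -> int:
    """Count failed deploys for one service.

    The service name is passed as a bound parameter, so user input is
    never interpolated into the SQL text (no injection risk).
    """
    cur = conn.execute(
        "SELECT COUNT(*) FROM deploys WHERE service = ? AND status = 'failed'",
        (service,),
    )
    return cur.fetchone()[0]

print(failed_deploys(conn, "api"))  # 1
```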

Posted 1 week ago


3.0 - 8.0 years

6 - 12 Lacs

Gurugram

Work from Office

Location: NCR
Team Type: Platform Operations
Shift Model: 24x7 rotational coverage / on-call support (L2/L3)

Team Overview: The OpenShift Container Platform (OCP) Operations Team is responsible for the continuous availability, health, and performance of OpenShift clusters that support mission-critical workloads. The team operates under a tiered structure (L2, L3) to manage day-to-day operations, incident management, automation, and lifecycle management of the container platform. It is central to supporting stakeholders by ensuring the container orchestration layer is secure, resilient, scalable, and optimized.

L2 – OCP Support & Platform Engineering (Platform Analyst)
Role Focus: Advanced troubleshooting, change management, automation
Experience: 3–6 years | Resources: 5
Key Responsibilities: Analyze and resolve platform issues related to workloads, PVCs, ingress, services, and image registries. Implement configuration changes via YAML/Helm/Kustomize. Maintain Operators, upgrade OpenShift clusters, and validate post-patching health. Work with CI/CD pipelines and DevOps teams on build and deploy troubleshooting. Manage and automate namespace provisioning, RBAC, and NetworkPolicies. Maintain logging, monitoring, and alerting tools (Prometheus, EFK, Grafana). Participate in CR and patch planning cycles.

L3 – OCP Platform Architect & Automation Lead (Platform SME)
Role Focus: Architecture, lifecycle management, platform governance
Experience: 6+ years | Resources: 2
Key Responsibilities: Own lifecycle management: upgrades, patching, cluster DR, and backup strategy. Automate platform operations via GitOps, Ansible, and Terraform. Lead SEV1 issue resolution, post-mortems, and RCA reviews. Define compliance standards: RBAC, SCCs, network segmentation, CIS hardening. Integrate OCP with IDPs (ArgoCD, Vault, Harbor, GitLab). Drive platform observability and performance-tuning initiatives. Mentor L1/L2 team members and lead operational best practices.
Core Tools & Technology Stack
Container Platform: OpenShift, Kubernetes
CLI Tools: oc, kubectl, Helm, Kustomize
Monitoring: Prometheus, Grafana, Thanos
Logging: Fluentd, EFK Stack, Loki
CI/CD: Jenkins, GitLab CI, ArgoCD, Tekton
Automation: Ansible, Terraform
Security: Vault, SCCs, RBAC, NetworkPolicies
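Automating namespace provisioning and RBAC, as in the L2 duties above, often reduces to templating manifests. A sketch that emits a RoleBinding as JSON, which oc/kubectl accept interchangeably with YAML (the group and namespace names are illustrative):

```python
import json

def namespace_rolebinding(namespace: str, group: str, role: str = "edit") -> dict:
    """Build a RoleBinding granting a group a ClusterRole in one namespace.

    "edit" is one of the default aggregated ClusterRoles; swap in a
    custom role name for stricter policies.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{group}-{role}", "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": role,
        },
        "subjects": [
            {"apiGroup": "rbac.authorization.k8s.io", "kind": "Group", "name": group}
        ],
    }

# Pipe this to `oc apply -f -` in a provisioning script.
print(json.dumps(namespace_rolebinding("team-a", "team-a-devs"), indent=2))
```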

Posted 1 week ago


5.0 - 8.0 years

5 - 8 Lacs

Chennai

Work from Office

Kafka Admin: Consult with inquiring teams on how to leverage Kafka within their pipelines. Architect, build, and support existing and new Kafka clusters via IaC. Partner with Splunk teams to route traffic through Kafka using open-source agents and collectors deployed via Chef. Remediate any health issues within Kafka. Automate (where possible) operational processes on the team. Create new and/or update monitoring dashboards and alerts as needed. Manage a continuous integration / continuous delivery (CI/CD) pipeline. Perform PoCs on new components to expand and enhance the team's Kafka offerings.

Preferred Qualifications: Knowledge of and experience with Splunk, Elastic, Kibana, and Grafana; log collection agents such as OpenTelemetry, Fluent Bit, Fluentd, Beats, and Logstash; Kubernetes/Docker; Kafka Connect; AWS or Azure; and streaming analytics.

Mandatory Skills: API Microservice Integration. Experience: 5-8 years.
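A common question when consulting teams on Kafka pipelines is how keyed messages map to partitions. Kafka's default partitioner hashes the key with murmur2; the sketch below substitutes stdlib crc32 to stay dependency-free, but the invariant it illustrates is the real one: equal keys always land on the same partition, preserving per-key ordering.

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Illustrative key -> partition mapping (crc32 standing in for
    Kafka's murmur2). Equal keys always map to the same partition,
    so all events for one customer stay ordered."""
    return zlib.crc32(key) % num_partitions

orders = [(b"cust-42", "created"), (b"cust-7", "created"), (b"cust-42", "paid")]
for key, event in orders:
    print(key.decode(), event, "-> partition", partition_for(key, 6))
```

This is also why increasing a topic's partition count reshuffles key placement: the modulus changes, so existing keys may hash to new partitions.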

Posted 2 weeks ago


3.0 - 6.0 years

5 - 9 Lacs

Bengaluru

Work from Office

Kubernetes (K8s); Python, Java; Ansible; shell scripting; experience with the OpenShift Container Platform; Go; Robot Framework; experience with logging stacks such as Grafana, OpenSearch, Fluentd, and Logstash on K8s; DevOps frameworks (Jenkins, Ansible, etc.); experience migrating legacy applications to K8s; application development experience on Kubernetes.

Posted 2 weeks ago


3.0 - 7.0 years

9 - 13 Lacs

Pune

Work from Office

As a Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and maintain systems for the IBM client business. In this role, you will lead the problem-resolution process for our clients, from analysis and troubleshooting to deploying the latest software updates and fixes.

Your primary responsibilities include:
24x7 Observability: Be part of a worldwide team that monitors the health of production systems and services around the clock, ensuring continuous reliability and an optimal customer experience.
Cross-Functional Troubleshooting: Collaborate with engineering teams to provide initial assessments and possible workarounds for production issues, and troubleshoot and resolve production issues effectively.
Deployment and Configuration: Leverage continuous delivery (CI/CD) tools to deploy services and configuration changes at enterprise scale.
Security and Compliance: Implement security measures that meet or exceed industry standards for regulations such as GDPR, SOC 2, ISO 27001, PCI, HIPAA, and FBA.
Maintenance and Support: Apply Couchbase security patches and upgrades, support Cassandra and MongoDB on the pager-duty rotation, and collaborate with Couchbase product support on issue resolution.

Required education: Bachelor's degree

Required technical and professional expertise:
System Monitoring and Troubleshooting: Strong skills in monitoring/observability, issue response, and troubleshooting for optimal system performance.
Automation Proficiency: Proficiency in automating production-environment changes, streamlining processes for efficiency, and reducing toil.
Linux Proficiency: Strong knowledge of Linux operating systems.
Operations and Support Experience: Demonstrated experience handling day-to-day operations, alert management, incident support, migration tasks, and break-fix support.
Experience with Infrastructure as Code (Terraform/OpenTofu).
Experience with the ELK/EFK stack (Elasticsearch, Logstash/Fluentd, and Kibana).

Preferred technical and professional experience:
Kubernetes/OpenShift: Experience working with production Kubernetes/OpenShift environments is strongly preferred.
Automation/Scripting: In-depth experience with Ansible, Python, Terraform, and CI/CD tools such as Jenkins, IBM Continuous Delivery, and ArgoCD.
Monitoring/Observability: Hands-on experience crafting alerts and dashboards using tools such as Instana and Grafana/Prometheus.
Experience working in an agile team, e.g., Kanban.
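The around-the-clock reliability work above is usually framed in terms of error budgets: an availability SLO implies a fixed allowance of downtime per window, and alerting is tuned against how fast that allowance is being spent. A minimal calculation (the SLO figure is just an example):

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    return (1 - slo) * window_days * 24 * 60

def budget_remaining(slo: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Minutes of budget left after the downtime already incurred."""
    return error_budget_minutes(slo, window_days) - downtime_minutes

# A 99.9% SLO allows 43.2 minutes of downtime per 30 days.
print(round(error_budget_minutes(0.999), 1))  # 43.2
print(round(budget_remaining(0.999, 10.0), 1))
```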

Posted 2 weeks ago


7.0 - 12.0 years

25 - 32 Lacs

Pune

Work from Office

Hi, wishes from GSN! It is a pleasure connecting with you. We have been in corporate search services, identifying and bringing in stellar, talented professionals for our reputed IT and non-IT clients in India, and have been successfully meeting the varied needs of our clients for the last 20 years.

Who are we looking for? A skilled IT Operations Consultant specializing in monitoring and observability to design, implement, and optimize monitoring solutions for our customers. A strong background in monitoring, observability, and IT service management is a MUST.

1. Work Location: Pune
2. Job Role: Lead Engineer
3. Experience: 7+ years
4. CTC Range: Rs. 25 LPA to Rs. 30 LPA
5. Work Type: WFO

****** Looking for SHORT JOINERS ******

Job Description

Required Skills: Strong understanding of infrastructure and platform development principles, and experience with languages such as Python and Ansible for developing custom scripts. Strong knowledge of monitoring frameworks, logging systems (ELK stack, Fluentd), and tracing tools (Jaeger, Zipkin), along with open-source solutions such as Prometheus and Grafana. Extensive experience with monitoring and observability solutions such as OpsRamp, Dynatrace, and New Relic; must have worked with ITSM integration (e.g., ServiceNow, BMC Remedy). Working experience with RESTful APIs and an understanding of API integration with monitoring tools. Knowledge of ITIL processes and service management frameworks. Familiarity with security monitoring and compliance requirements. Familiarity with AIOps and machine learning techniques for anomaly detection and incident prediction. Excellent analytical and problem-solving skills, with the ability to debug and troubleshoot complex automation issues.

Roles & Responsibilities: Design end-to-end monitoring and observability solutions that provide comprehensive visibility into infrastructure, applications, and networks. Implement monitoring tools and frameworks (e.g., Prometheus, Grafana, OpsRamp, Dynatrace, New Relic) to track key performance indicators and system health metrics. Integrate monitoring and observability solutions with IT service management tools. Develop and deploy dashboards and reports to proactively identify and address system performance issues. Architect scalable observability solutions to support hybrid and multi-cloud environments. Collaborate with infrastructure, development, and DevOps teams to ensure seamless integration of monitoring systems into CI/CD pipelines. Continuously optimize monitoring configurations and thresholds to minimize noise and improve incident-detection accuracy. Utilize AIOps and machine learning capabilities for intelligent incident management and predictive analytics. Work closely with business stakeholders to define monitoring requirements and success metrics. Document monitoring architectures, configurations, and operational procedures.

****** Looking for SHORT JOINERS ******

If interested, don't hesitate to click APPLY for an immediate response.

Best wishes,
GSN HR | Google review: https://g.co/kgs/UAsF9W
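The AIOps and anomaly-detection familiarity asked for above can be illustrated with the simplest possible detector, a z-score threshold. This is a deliberate toy: production systems use rolling windows and seasonality-aware models, and the latency numbers here are invented.

```python
from statistics import mean, stdev

def zscore_anomalies(series, threshold=3.0):
    """Return points lying more than `threshold` standard deviations
    from the mean of the series."""
    mu, sigma = mean(series), stdev(series)
    if sigma == 0:
        return []  # a flat series has no outliers
    return [x for x in series if abs(x - mu) / sigma > threshold]

# Response times in ms, with one obvious spike.
latencies = [120, 118, 121, 119, 122, 120, 950]
print(zscore_anomalies(latencies, threshold=2.0))  # [950]
```

One known weakness worth mentioning to stakeholders: a single huge outlier inflates the standard deviation itself, which is why robust variants use the median and MAD instead.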

Posted 2 weeks ago


1.0 - 6.0 years

7 - 17 Lacs

Noida

Work from Office

Job Summary: Site Reliability Engineers (SREs) sit at the intersection of software engineer and systems administrator: they both write code and manage the infrastructure the code runs on. This is a very wide skill set, but the end goal of an SRE is always the same: to ensure that all SLAs are met, but not exceeded, so as to balance performance and reliability against operational costs. As a Site Reliability Engineer II, you will be learning our systems, improving your craft as an engineer, and taking on tasks that improve the overall reliability of the VP platform.

Key Responsibilities: Design, implement, and maintain robust monitoring and alerting systems. Lead observability initiatives by improving metrics, logging, and tracing across services and infrastructure. Collaborate with development and infrastructure teams to instrument applications and ensure visibility into system health and performance. Write Python scripts and tools for automation, infrastructure management, and incident response. Participate in and improve the incident management and on-call process, driving down Mean Time to Resolution (MTTR). Conduct root-cause analyses and postmortems following incidents and champion efforts to prevent recurrence. Optimize systems for scalability, performance, and cost-efficiency in cloud and containerized environments. Advocate and implement SRE best practices, including SLOs/SLIs, capacity planning, and reliability reviews.

Required Skills & Qualifications: 1+ years of experience in a Site Reliability Engineer or similar role. Excellent communication skills in English. Proficiency in Python for automation and tooling. Hands-on experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, New Relic, and OpenTelemetry. Experience with log aggregation and analysis tools such as the ELK stack (Elasticsearch, Logstash, Kibana) or Fluentd.
Good understanding of cloud platforms (AWS, GCP, or Azure) and container orchestration (Kubernetes). Familiarity with infrastructure-as-code (Terraform, Ansible, or similar). Strong debugging and incident response skills. Knowledge of CI/CD pipelines and release engineering practices.
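Driving down MTTR, as the responsibilities above mention, starts with measuring it consistently. A sketch over (opened, resolved) timestamp pairs (the incident data is invented):

```python
from datetime import datetime, timedelta

def mttr(incidents) -> timedelta:
    """Mean Time to Resolution across (opened, resolved) pairs."""
    durations = [resolved - opened for opened, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)

incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),  # 30 min
    (datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 10, 30)),   # 90 min
]
print(mttr(incidents))  # 1:00:00
```

In practice the timestamps would come from the incident tracker's API (PagerDuty, Opsgenie, etc.), and the mean is usually reported alongside the median, since one long incident can dominate the average.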

Posted 3 weeks ago


8.0 - 10.0 years

15 - 30 Lacs

Pune

Work from Office

Role Overview: We are looking for experienced DevOps Engineers (8+ years) with a strong background in cloud infrastructure, automation, and CI/CD processes. The ideal candidate will have hands-on experience in building, deploying, and maintaining cloud solutions using Infrastructure-as-Code (IaC) best practices. The role requires expertise in containerization, cloud security, networking, and monitoring tools to optimize and scale enterprise-level applications.

Key Responsibilities: Design, implement, and manage cloud infrastructure solutions on AWS, Azure, or GCP. Develop and maintain Infrastructure-as-Code (IaC) using Terraform, CloudFormation, or similar tools. Implement and manage CI/CD pipelines using tools like GitHub Actions, Jenkins, GitLab CI/CD, Bitbucket Pipelines, or AWS CodePipeline. Manage and orchestrate containers using Kubernetes, OpenShift, AWS EKS, AWS ECS, and Docker. Work on cloud migrations, helping organizations transition from on-premises data centers to cloud-based infrastructure. Ensure system security and compliance with industry standards such as SOC 2, PCI, HIPAA, GDPR, and HITRUST. Set up and optimize monitoring, logging, and alerting using tools like Datadog, Dynatrace, AWS CloudWatch, Prometheus, ELK, or Splunk. Automate deployment, configuration, and management of cloud-native applications using Ansible, Chef, Puppet, or similar configuration management tools. Troubleshoot complex networking, Linux/Windows server issues, and cloud-related performance bottlenecks. Collaborate with development, security, and operations teams to streamline the DevSecOps process.

Must-Have Skills: 3+ years of experience in DevOps, cloud infrastructure, or platform engineering. Expertise in at least one major cloud provider: AWS, Azure, or GCP. Strong experience with Kubernetes, ECS, OpenShift, and container orchestration technologies. Hands-on experience with Infrastructure-as-Code (IaC) using Terraform, AWS CloudFormation, or similar tools. Proficiency in scripting/programming languages like Python, Bash, or PowerShell for automation. Strong knowledge of CI/CD tools such as Jenkins, GitHub Actions, GitLab CI/CD, or Bitbucket Pipelines. Experience with Linux operating systems (RHEL, SUSE, Ubuntu, Amazon Linux) and Windows Server administration. Expertise in networking (VPCs, subnets, load balancing, security groups, firewalls). Experience with log management and monitoring tools like Datadog, CloudWatch, Prometheus, ELK, and Dynatrace. Strong communication skills to work with cross-functional teams and external customers. Knowledge of cloud security best practices, including IAM, WAF, GuardDuty, CVE scanning, and vulnerability management.

Good-to-Have Skills: Knowledge of cloud-native security solutions (AWS Security Hub, Azure Security Center, Google Security Command Center). Experience with compliance frameworks (SOC 2, PCI, HIPAA, GDPR, HITRUST). Exposure to Windows Server administration alongside Linux environments. Familiarity with centralized logging solutions (Splunk, Fluentd, AWS OpenSearch). GitOps experience with tools like ArgoCD or Flux. Background in penetration testing, intrusion detection, and vulnerability scanning. Experience with cost-optimization strategies for cloud infrastructure. Passion for mentoring teams and sharing DevOps best practices.
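One concrete form of the cost-optimization work listed above is flagging under-utilized instances for rightsizing. A sketch assuming average-CPU figures have already been pulled from CloudWatch or Datadog (the fleet data and threshold are invented):

```python
def rightsizing_candidates(instances: dict, cpu_threshold: float = 10.0) -> list:
    """Return instance names whose average CPU sits below the threshold.

    `instances` maps instance name -> average CPU utilization (%).
    Flagged instances are candidates for a smaller instance type or
    consolidation, subject to a look at memory and network as well.
    """
    return sorted(
        name for name, avg_cpu in instances.items() if avg_cpu < cpu_threshold
    )

fleet = {"web-1": 42.0, "batch-7": 3.5, "cache-2": 8.9}
print(rightsizing_candidates(fleet))  # ['batch-7', 'cache-2']
```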

Posted 1 month ago


10.0 - 12.0 years

0 Lacs

Delhi, India

On-site

Job Title: Cloud Application Migration Architect, Government Sector (NIC, Cloud Native, OpenStack, Azure)
Location: New Delhi / PAN India / Hybrid
Job Type: Full-time

Job Summary: We are looking for an experienced Cloud Application Migration Architect to lead the modernization and migration of government IT systems to secure, scalable, and cloud-native platforms. This role demands deep knowledge of OpenStack, Kubernetes, Red Hat OpenShift, and Microsoft Azure, along with experience operating in regulated government environments. You will be responsible for both lift-and-shift and cloud-native migration strategies for legacy applications running on aging infrastructure across ministries and departments.

Key Responsibilities:
- Lead migration of legacy government applications to cloud platforms (OpenStack, Azure), including both lift-and-shift (as-is) and modernization approaches.
- Design cloud architectures using OpenStack, Kubernetes, OpenShift, and Azure with a strong focus on security, scalability, and compliance.
- Containerize legacy applications where applicable and deploy using Kubernetes/OpenShift with automated CI/CD pipelines.
- Ensure compliance with government IT regulations including GIGW, CERT-In, MeitY, STQC, and data residency requirements.
- Develop migration playbooks, automation scripts, and infrastructure-as-code templates using tools like Ansible, Terraform, and Heat.
- Work with customer teams and government stakeholders to conduct application discovery, define migration sequencing, and ensure minimal downtime during cutovers.
- Implement observability and security measures (e.g., Prometheus, Grafana, Fluentd, Keycloak, Vault) for post-migration support.
- Provide documentation, training, and handover to operational teams post-migration.

Required Qualifications:
- Bachelor's/Master's degree in Computer Science, IT, or Engineering.
- 10+ years of IT experience with at least 3+ years in cloud architecture and migration projects.
- Proven experience migrating legacy applications as-is (lift-and-shift) to cloud platforms such as OpenStack or Azure.
- Strong hands-on expertise in OpenStack (Nova, Neutron, Cinder, Keystone, etc.) within government/private cloud environments.
- In-depth knowledge of Kubernetes, Helm, and cloud-native design patterns.
- Practical experience with Red Hat OpenShift, including Operators, Pipelines, and GitOps workflows.
- Proficiency in Microsoft Azure, particularly AKS, Azure AD, VM migration, and hybrid models (e.g., Azure Arc).
- Familiarity with infrastructure automation (Ansible, Terraform) and container security best practices.
- Understanding of Indian government IT and security guidelines including CERT-In, GIGW, and MeitY frameworks.

Preferred Qualifications:
- Prior project experience with NIC/MeitY initiatives such as eDistrict, eOffice, CPGRAMS, or other Digital India platforms.
- Relevant certifications: Red Hat Certified Specialist in OpenShift, CKA/CKAD, Azure Solutions Architect, OpenStack Administrator.
- Familiarity with STQC audit processes and GoI documentation workflows.
- Knowledge of GoI data sovereignty and hybrid cloud governance models.

Soft Skills:
- Strong written and verbal communication, especially when interfacing with senior government officials and customer staff.
- Ability to operate in structured environments with clear documentation, audit trails, and compliance checkpoints.
- Independent, accountable, and proactive in managing mission-critical deployments.
- Leadership experience with multi-vendor and inter-departmental teams.

Posted 1 month ago


3.0 - 8.0 years

15 - 30 Lacs

Bengaluru

Remote

Hiring for a USA-based big multinational company (MNC). The Cloud Engineer is responsible for designing, implementing, and managing cloud-based infrastructure and services. This role involves working with cloud platforms such as AWS, Microsoft Azure, or Google Cloud to ensure scalable, secure, and efficient cloud environments that meet the needs of the organization.

Responsibilities: Design, deploy, and manage cloud infrastructure in AWS, Azure, GCP, or hybrid environments. Automate cloud infrastructure provisioning and configuration using tools like Terraform, Ansible, or CloudFormation. Ensure cloud systems are secure, scalable, and reliable through best practices in architecture and monitoring. Work closely with development, operations, and security teams to support cloud-native applications and services. Monitor system performance and troubleshoot issues to ensure availability and reliability. Manage CI/CD pipelines and assist in DevOps practices to streamline software delivery. Implement and maintain disaster recovery and backup procedures. Optimize cloud costs and manage billing/reporting for cloud resources. Ensure compliance with data security standards and regulatory requirements. Stay current with new cloud technologies and make recommendations for continuous improvement.

Qualifications: Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field. 3+ years of experience working with cloud platforms such as AWS, Azure, or Google Cloud. Proficiency in infrastructure-as-code (IaC) tools (e.g., Terraform, CloudFormation). Experience with CI/CD tools (e.g., Jenkins, GitLab CI, Azure DevOps). Familiarity with containerization and orchestration (e.g., Docker, Kubernetes). Strong scripting skills (e.g., Python, Bash, PowerShell). Solid understanding of networking, security, and identity management in the cloud. Excellent problem-solving and communication skills. Ability to work independently and as part of a collaborative team.

Posted 1 month ago


10.0 - 15.0 years

3 - 5 Lacs

Hyderabad, India

Hybrid

Job Purpose: Designs, develops, and implements Java applications to support business requirements. Follows approved life-cycle methodologies, creates design documents, writes code, and performs unit and functional testing of software. Contributes to the overall architecture and standards of the group, acts as an SME, and plays a software governance role.

Key Activities / Outputs:
• Work closely with business analysts to analyse and understand the business requirements and business case, in order to produce simple, cost-effective, and innovative solution designs
• Implement the designed solutions in the required development language (typically Java) in accordance with the Vitality Group standards, processes, tools, and frameworks
• Test the quality of produced software thoroughly through participation in code reviews, the use of static code analysis tools, creation and execution of unit tests, functional regression tests, load tests, and stress tests, and by evaluating the performance metrics collected on the software
• Participate in feasibility studies, proofs of concept, JAD sessions, and estimation and costing sessions; evaluate and review programming methods, tools, and standards; etc.
• Maintain the system in production and provide support in the form of query resolution and defect fixes
• Prepare the necessary technical documentation, including payload definitions, class diagrams, activity diagrams, ERDs, operational and support documentation, etc.
• Drive the skills development of team members: coaching for performance and career development, recruitment, staff training, performance management, etc.

Technical Skills or Knowledge: Extensive experience working with Java; a solid understanding of object-oriented programming fundamentals; a high-level understanding of the common frameworks in the Java technology stack; extensive knowledge of design patterns and the ability to recognize and apply them; Spring, Hibernate, JUnit, SOA, microservices, Docker, data modelling, UML, SQL, SoapUI (SOAP) / REST client (JSON), architectural styles, Kafka, ZooKeeper, Zuul, Eureka, Obsidian, Elasticsearch, Kibana, Fluentd.

Preferred Technical Skills (would be advantageous): This position is a hybrid role based in Hyderabad which requires you to be in the office on Tuesday, Wednesday, and Thursday.

Posted 1 month ago


3.0 - 6.0 years

4 - 8 Lacs

Bengaluru

Work from Office

We are looking for a Kibana Subject Matter Expert (SME) to support our Network Operations Center (NOC) by designing, developing, and maintaining real-time dashboards and alerting mechanisms. The ideal candidate will have strong experience in working with Elasticsearch and Kibana to visualize key performance indicators (KPIs), system health, and alerts related to NOC-managed infrastructure. Key Responsibilities: Design and develop dynamic and interactive Kibana dashboards tailored for NOC monitoring. Integrate various NOC elements such as network devices, servers, applications, and services into Elasticsearch/Kibana. Create real-time visualizations and trend reports for system health, uptime, traffic, errors, and performance metrics. Configure alerts and anomaly detection mechanisms for critical infrastructure issues using Kibana or related tools (e.g., ElastAlert, Watcher). Collaborate with NOC engineers, infrastructure teams, and DevOps to understand monitoring requirements and deliver customized dashboards. Optimize Elasticsearch queries and index mappings for performance and data integrity. Provide expert guidance on best practices for log ingestion, parsing, and data retention strategies. Support troubleshooting and incident response efforts by providing actionable insights through Kibana visualizations. Primary Skills Proven experience as a Kibana SME or similar role with a focus on dashboards and alerting. Strong hands-on experience with Elasticsearch and Kibana (7.x or higher). Experience in working with log ingestion tools (e.g., Logstash, Beats, Fluentd). Solid understanding of NOC operations and common infrastructure elements (routers, switches, firewalls, servers, etc.). Proficiency in JSON, Elasticsearch Query DSL, and Kibana scripting for advanced visualizations. Familiarity with alerting frameworks such as ElastAlert, Kibana Alerting, or Watcher. Good understanding of Linux-based systems and networking fundamentals. 
Strong problem-solving skills and attention to detail. Excellent communication and collaboration skills. Preferred Qualifications: Experience in working within telecom, ISP, or large-scale IT operations environments. Exposure to Grafana, Prometheus, or other monitoring and visualization tools. Knowledge of scripting languages such as Python or Shell for automation. Familiarity with SIEM or security monitoring solutions.
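The Elasticsearch Query DSL proficiency above might look like the following in practice: a query body that counts recent error logs for one service, bucketed per minute, as a dashboard or alert rule would issue it. The field names follow the Elastic Common Schema and would need adjusting to the actual index mapping:

```python
import json

def error_rate_query(service: str, minutes: int = 15) -> dict:
    """Query DSL body: error-level logs for one service over the last
    `minutes`, aggregated into one-minute buckets; size 0 suppresses
    the raw hits since only the aggregation matters."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"service.name": service}},
                    {"term": {"log.level": "error"}},
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                ]
            }
        },
        "aggs": {
            "per_minute": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": "1m"}
            }
        },
        "size": 0,
    }

# The body would be POSTed to /<index>/_search.
print(json.dumps(error_rate_query("checkout"), indent=2))
```

Using `filter` rather than `must` matters at scale: filter clauses skip scoring and are cacheable, which keeps dashboard refreshes cheap.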

Posted 1 month ago


4.0 - 12.0 years

8 - 11 Lacs

Pune, Bengaluru

Work from Office

Proven experience as a DevOps engineer or in a similar role with a focus on monitoring and observability. Expert-level knowledge of Splunk (advanced configuration, data indexing, search optimization, and alerting). Advanced experience with Grafana for creating real-time, interactive dashboards and visualizations. Strong proficiency in Linux/Unix systems administration and scripting (Bash, Python, etc.). Solid understanding of cloud platforms like AWS, Azure, or GCP and how to integrate monitoring solutions into these environments. Experience with containerization (Docker, Kubernetes) and orchestration tools. Familiarity with Infrastructure as Code tools (Terraform, Ansible, etc.). Experience with automation tools (Jenkins, GitLab CI, etc.) for deploying and managing infrastructure. Strong problem-solving skills with the ability to troubleshoot and resolve complex technical issues in a fast-paced environment. Experience with distributed systems and knowledge of performance tuning, scaling, and high-availability setups.

Preferred Skills: Experience in managing large-scale Splunk and Grafana environments. Knowledge of log aggregation technologies (Fluentd, Logstash, etc.). Familiarity with alerting and incident-management tools (PagerDuty, Opsgenie, etc.). Certifications in cloud platforms (AWS Certified DevOps Engineer, Azure DevOps Engineer, etc.). Familiarity with Agile methodologies (Scrum/Kanban) and DevOps practices. Understanding of security principles and practices as they relate to logging and monitoring.
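The log-aggregation knowledge above boils down to parsing structured events and tolerating malformed ones, much as a Fluentd or Logstash stage does before forwarding to Splunk or Loki. A stdlib sketch (the log lines are invented):

```python
import json
from collections import Counter

def level_counts(raw_lines) -> Counter:
    """Parse JSON log lines and count events by severity level.

    Malformed lines are counted under "_unparsed" rather than dropped
    silently, so pipeline breakage shows up in the metrics.
    """
    counts = Counter()
    for line in raw_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            counts["_unparsed"] += 1
            continue
        counts[event.get("level", "unknown")] += 1
    return counts

logs = [
    '{"level": "info", "msg": "started"}',
    '{"level": "error", "msg": "timeout"}',
    'not json at all',
]
print(level_counts(logs))
```

Tracking the `_unparsed` count as its own metric is a cheap early-warning signal: a sudden rise usually means an upstream service changed its log format.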

Posted 2 months ago

Apply

6 - 8 years

16 - 20 Lacs

Bengaluru

Work from Office

Senior DevOps Engineer
Location: Bengaluru South, Karnataka, India
Experience: 6-8 Years
Compensation: 16-20 LPA
Industry: PropTech | AgriTech | Cloud Infrastructure | Platform Engineering
Employment Type: Full-Time | On-Site/Hybrid

Are you a DevOps Engineer passionate about building scalable and efficient infrastructure for innovative platforms? If you're excited by the challenge of automating and optimizing cloud infrastructure for a mission-driven PropTech platform, this opportunity is for you. We are seeking a seasoned DevOps Engineer to be a key player in scaling a pioneering property-tech ecosystem that reimagines how people discover, trust, and own their dream land or property. Our ideal candidate thrives in dynamic environments, embraces automation, and values security, performance, and reliability. You'll be working alongside a passionate and agile team that blends technology with sustainability, enabling seamless experiences for both property buyers and developers.

Key Responsibilities
Architect, deploy, and maintain highly available, scalable, and secure cloud infrastructure, preferably on AWS.
Design, develop, and optimize CI/CD pipelines for automated software build, test, and deployment.
Implement and manage Infrastructure as Code (IaC) using Terraform, CloudFormation, or similar tools.
Set up and manage robust monitoring, logging, and alerting systems (Prometheus, Grafana, ELK, etc.).
Proactively monitor and improve system performance, availability, and resilience.
Ensure compliance, access control, and secrets management across environments using best-in-class DevSecOps practices.
Collaborate closely with development, QA, and product teams to streamline software delivery lifecycles.
Troubleshoot production issues, identify root causes, and implement long-term solutions.
Optimize infrastructure costs while maintaining performance SLAs.
Build and maintain internal tools and automation scripts to support development workflows.
Stay updated with the latest in DevOps practices, cloud technologies, and infrastructure design.
Participate in on-call support rotation for critical incidents and infrastructure health.

Preferred Qualifications
Bachelor's degree in Computer Science, Engineering, or related field.
6-8 years of hands-on experience in DevOps, SRE, or Infrastructure roles.
Strong proficiency in AWS (EC2, S3, RDS, Lambda, ECS/EKS).
Expert-level scripting skills in Python, Bash, or Go.
Solid experience with CI/CD tools such as Jenkins, GitLab CI, CircleCI, etc.
Expertise in Docker, Kubernetes, and container orchestration at scale.
Experience with configuration management tools like Ansible, Chef, or Puppet.
Solid understanding of networking, DNS, SSL, firewalls, and load balancing.
Familiarity with relational and non-relational databases (PostgreSQL, MySQL, etc.) is a plus.
Excellent troubleshooting and analytical skills with a performance- and security-first mindset.
Experience working in agile, fast-paced startup environments is a strong plus.

Nice to Have
Experience working in PropTech, AgriTech, or sustainability-focused platforms.
Exposure to geospatial mapping systems, virtual land visualization, or real-time data platforms.
Prior work with DevSecOps, service meshes like Istio, or secrets management with Vault.
Passion for building tech that positively impacts people and the planet.

Why Join Us?
Join India's first revolutionary PropTech platform, blending human-centric design with cutting-edge technology to empower property discovery and ownership. Be part of a company that doesn't just build products; it builds ecosystems: for urban buyers, rural farmers, and the environment. Work with a forward-thinking leadership team from one of India's most respected sustainability and land stewardship organizations. Collaborate across cross-disciplinary teams solving real-world challenges at the intersection of tech, land, and sustainability.
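The Infrastructure as Code responsibility called out in this role (Terraform, CloudFormation) can be illustrated with a minimal Terraform sketch; the bucket name, region, and tags are hypothetical, not taken from the listing:

```hcl
# Minimal Terraform sketch (illustrative assumptions throughout):
# provision a versioned S3 bucket for CI/CD artifacts on AWS.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "ap-south-1" # assumed region
}

resource "aws_s3_bucket" "artifacts" {
  bucket = "example-ci-artifacts" # hypothetical name; must be globally unique
  tags = {
    Environment = "dev"
    ManagedBy   = "terraform"
  }
}

# In AWS provider v4+, versioning is configured as a separate resource.
resource "aws_s3_bucket_versioning" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  versioning_configuration {
    status = "Enabled"
  }
}
```

Keeping resources like this in version-controlled HCL, reviewed and applied through the same CI/CD pipelines the role describes, is the core of the IaC practice these listings expect.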

Posted 2 months ago

Apply