
257 OpenTelemetry Jobs - Page 10

JobPe aggregates job listings for easy access; applications are submitted directly on the original job portal.

5.0 years

0 Lacs

Sahibzada Ajit Singh Nagar, Punjab, India

On-site


Everything we do is powered by our customers! Featured on Deloitte's Technology Fast 500 list and G2's leaderboard, Maropost offers the connected experience our customers anticipate, transforming marketing, merchandising, and operations with commerce tools designed to scale with fast-growing businesses. With a relentless focus on our customers' success, we are motivated by curiosity, creativity, and collaboration to power 5,000+ global brands. Driven by a customer-first mentality, we empower businesses to achieve their goals and grow alongside us. If you're ready to make a significant impact and be part of our transformative journey, Maropost is the place for you. Become a part of Maropost today and help shape the future of commerce!

What You'll Be Responsible For
- Building and managing the REST API stack for Maropost web apps, preferably using the ASP.NET framework.
- Must have strong debugging skills.
- Good knowledge of the .NET, .NET Standard, and .NET Core stacks; must have experience with at least one production-level application built on the .NET stack.
- Good understanding of JavaScript internals; must have experience with at least one production-level application using any JavaScript UI framework.
- Must have good SQL experience with SQL Server.
- Must have effective communication skills.
- Driving innovation within the engineering team, identifying opportunities to improve processes, tools, and technologies.
- Evaluating and improving the tools and frameworks used in software development.
- Reviewing the architecture and code written by other developers.

What You'll Bring To Maropost
- B.E/B.Tech and 5+ years of hands-on experience building enterprise-grade applications.
- Enthusiasm to learn building and managing API endpoints for multimodal clients.
- Enthusiasm to learn and contribute to a challenging and fun-filled startup.
- A knack for problem-solving and following efficient coding practices.
- Extraordinarily strong interpersonal communication and collaboration skills.
- Hands-on experience with the .NET / .NET Core / VB.NET tech stacks.
- Hands-on experience with any JavaScript UI framework (preferably Angular).
- Hands-on experience with any database (preferably MS SQL Server).
- Hands-on experience with a code-versioning platform like GitHub/Bitbucket and a CI/CD platform like Jenkins/Azure DevOps.
- Frontend: HTML, CSS, JavaScript.
- Familiarity with any of the following is an added advantage:
  · Databases and caching: Redis, CosmosDB, DynamoDB
  · Cloud services: managing infrastructure with basic services from GCP/AWS/Azure, such as VMs, API Gateway, and load balancers
  · Monitoring and observability tools: Prometheus, Grafana, Loki, OpenTelemetry
  · Network protocols and libraries: HTTP, WebSocket, Socket.io
  · Version control and CI/CD: Jenkins, Argo CD, Spinnaker, Terraform

What's in it for you? You will have the autonomy to take ownership of your role and contribute to the growth and success of our brand. If you are driven to make an immediate impact, achieve results, thrive in a high-performing team, and want to grow in a dynamic and rewarding environment, you belong at Maropost!

Posted 3 weeks ago


6.0 years

0 Lacs

India

On-site


We’re looking for problem solvers, innovators, and dreamers who are searching for anything but business as usual. Like us, you’re a high performer who’s an expert at your craft, constantly challenging the status quo. You value inclusivity and want to join a culture that empowers you to show up as your authentic self. You know that success hinges on commitment, that our differences make us stronger, and that the finish line is always sweeter when the whole team crosses together.

About Alteryx
Alteryx is a leader in data analytics automation, helping organizations unlock the power of their data with a platform that makes analytics accessible to all. As we advance our cloud platform capabilities, we’re looking for a skilled Lead Backend Engineer to join our Cloud Connectivity team, the backbone of our integration platform that powers internal and external connectivity experiences across the organization.

Role Overview
As a Lead Software Engineer in the Cloud Connectivity Platform domain, you’ll play a critical role in designing and building core backend services that enable seamless, secure, and scalable data connectivity. This is a platform role: you will build reusable services and APIs that serve as the foundation for multiple product teams and external customers. You’ll bring technical leadership and hands-on engineering expertise to evolve our connectivity platform, ensure operational excellence, and foster alignment across cross-functional teams.

Must Have
- 6+ years of backend software development experience with a strong track record of shipping scalable production systems
- Expertise in Java and frameworks like Spring Boot
- Solid understanding of designing and operating distributed systems and microservices architectures
- Hands-on experience with Kubernetes and Docker for containerized application development and deployment
- Strong background in modern observability tooling (e.g., Prometheus, Grafana, OpenTelemetry, or similar)
- Experience building platform-level capabilities, including APIs, SDKs, and services consumed by other engineering teams
- Solid grasp of RESTful API design, service reliability, versioning, and backward compatibility
- Strong understanding of cloud infrastructure (AWS, Azure, or GCP)
- Excellent communication and collaboration skills, with the ability to influence and align stakeholders across teams
- Proactive and self-driven, able to lead technical initiatives and drive them to completion with minimal guidance

Nice To Have
- Experience in the data domain, particularly data integration, ingestion, or connectivity patterns
- Prior experience with Node.js is a big plus
- Familiarity with API gateways, connector frameworks, or third-party data integrations
- Background working with multi-tenant, secure cloud platforms
- Experience designing internal developer platforms or SDKs
- Prior experience mentoring junior engineers or driving engineering best practices across teams

What You'll Be Doing
- Lead the design and implementation of core backend services for data connectivity, powering integrations across Alteryx’s cloud platform
- Develop platform services that are reusable, reliable, and performant, with high standards for quality and observability
- Collaborate with product, infrastructure, and other engineering teams to align on architecture and integration strategies
- Contribute to roadmap planning, technical design reviews, and team-level decision-making
- Foster a culture of engineering excellence through mentorship, collaboration, and continuous improvement

Why Join Us?
- Build foundational systems that enable thousands of users and teams to connect to the data they care about
- Work on technically challenging and high-impact problems at the platform level
- Be part of a company that values innovation, inclusivity, and continuous learning
- Collaborate with global teams while working from our Bengaluru office

Find yourself checking a lot of these boxes but doubting whether you should apply? At Alteryx, we support a growth mindset for our associates through all stages of their careers. If you meet some of the requirements and you share our values, we encourage you to apply. As part of our ongoing commitment to a diverse, equitable, and inclusive workplace, we’re invested in building teams with a wide variety of backgrounds, identities, and experiences. This position involves access to software/technology that is subject to U.S. export controls. Any job offer made will be contingent upon the applicant’s capacity to serve in compliance with U.S. export controls.
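The "Must Have" list above calls out RESTful versioning and backward compatibility. As a minimal illustration of that idea (in Python for brevity, though this role is Java-centric; the resource shape and the v1-to-v2 field rename are invented, not Alteryx's actual API):

```python
# Sketch: serve a richer v2 response shape while keeping the older v1
# contract intact, so existing clients are never broken by the upgrade.
# All field names here are hypothetical.

def connector_record_v2(connector_id: str) -> dict:
    """Canonical (v2) representation of a connector resource."""
    return {"id": connector_id, "display_name": "Snowflake", "auth": {"type": "oauth2"}}

def to_v1(record: dict) -> dict:
    """Down-convert v2 to the legacy v1 shape (v1 used a flat 'name' field)."""
    return {"id": record["id"], "name": record["display_name"]}

def get_connector(connector_id: str, api_version: int) -> dict:
    record = connector_record_v2(connector_id)
    return to_v1(record) if api_version == 1 else record

print(get_connector("c-42", 1))  # v1 clients keep seeing the legacy shape
print(get_connector("c-42", 2))  # v2 clients get the richer shape
```

The design choice, keeping one canonical internal representation and down-converting at the edge, is a common way to support several API versions without forking business logic.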

Posted 3 weeks ago


0.0 - 5.0 years

0 Lacs

Chennai, Tamil Nadu

On-site


Network Python Automation Engineer (In-Office), Chennai, Tamil Nadu, India

Job Description: Telco Network Automation Engineer
Experience Level: Mid-Senior (5 years & above)
Industry: Telecommunications

Role Overview:
We are seeking a Telco Network Automation Engineer with expertise in Python, DevOps, and GUI automation to streamline and enhance network operations. This role focuses on automating telecom network functions, service provisioning, monitoring, and troubleshooting using modern automation frameworks. You will work closely with Network, SRE, and DevOps teams to develop scalable automation solutions, optimize workflows, and improve service reliability.

Key Responsibilities:
✅ Network Automation & Scripting
· Develop and maintain Python-based automation scripts for telecom network configurations, provisioning, and monitoring.
· Automate network deployment, scaling, and optimization using APIs, Ansible, NETCONF, REST, and CLI scripting.
✅ DevOps & CI/CD Integration
· Implement network automation pipelines using Jenkins, GitLab CI/CD, or Ansible AWX.
· Deploy and manage containerized network functions (CNFs) and virtual network functions (VNFs) using Docker and Kubernetes.
· Work with Infrastructure as Code (IaC) tools like Terraform for network automation.
✅ GUI Automation & RPA
· Automate GUI-based telecom applications using Selenium and AI-based Python automation tools.
· Develop robotic process automation (RPA) scripts in Python for repetitive network tasks.
✅ Observability & Monitoring
· Implement network observability solutions using Prometheus, Grafana, and ELK (Elasticsearch, Logstash, Kibana).
· Automate alerts and self-healing mechanisms using Python-based event-driven automation.
✅ Collaboration & Documentation
· Work with cross-functional teams (Network, DevOps, and SRE) to align automation strategies.
· Document automation workflows, playbooks, and best practices.

Required Skills & Experience:
✔ Programming & Scripting:
· Proficiency in Python (automation, API development, multi-threading, network libraries).
· Experience with Bash, Shell, PowerShell, or YAML scripting.
✔ Network & Telecom Knowledge:
· Understanding of telco networks, SDH, routing & networking, and VoIP/SIP protocols.
· Hands-on experience with network devices (routers, switches, firewalls, EPC, IMS, BSS/OSS systems).
· Design and implement automation workflows for telecommunication networks, including provisioning, configuration, and monitoring.
· Automate repetitive network tasks such as circuit provisioning, topology adjustments, and fault monitoring.
· Optimize legacy network management systems (e.g., ECI NMS) through automation.
· Work with technologies like NMS, EMS, IMS, 5G, and virtualization platforms (e.g., vBlock, CNIS, NCP).
✔ Automation & DevOps Tools:
· Ansible, Terraform, Kubernetes, Docker (infrastructure automation).
· Git, Jenkins, CI/CD pipelines (continuous deployment of automation scripts).
· REST APIs, NETCONF, SNMP, gRPC (network automation protocols).
· Develop Python scripts to automate manual processes, such as data extraction, email parsing, and log analysis.
· Build custom modules and libraries for network automation and GUI automation.
· Use Python to interact with APIs, databases, and file systems to streamline automation workflows.
· Write scripts for fetching CRQ IDs from emails, managing dependencies, and performing data transformations.
· Maintain script repositories and ensure proper documentation.
✔ GUI & RPA Automation:
· Selenium and PyAutoGUI for GUI automation.
· Proficiency with tools like Pywinauto or AutoIt for desktop application automation.
· Strong Python programming skills, including libraries like Selenium and Pywinauto, plus AI-oriented Python libraries.
· Experience with network management systems (e.g., ECI NMS) and telco protocols.
· Familiarity with virtualization platforms like VINO, vBlock, CNIS, and NCP.
· Experience with REST APIs and web services for network and application integration.
· Hands-on experience with Ansible, Jenkins, and Docker.
· Knowledge of Git and CI/CD pipelines.
✔ Monitoring & Observability:
· Experience with Prometheus, Grafana, ELK Stack, Splunk, or OpenTelemetry.

Preferred Qualifications:
➕ Experience with public cloud networking (AWS, GCP, Azure).
➕ Knowledge of AI/ML-driven network automation.
➕ Exposure to telecom service orchestration tools (ONAP, OpenStack, Cisco NSO).

Why Join Us?
Work on cutting-edge transformation projects across 4G, 5G & SDN/NFV automation. Be part of an innovative DevOps- and SRE-driven network team. Competitive salary and growth opportunities in telecom automation.

Job Type: Full-time
Pay: ₹400,000.00 - ₹1,400,000.00 per year
Benefits: Health insurance, Provident Fund
Schedule: Rotational shift
Supplemental Pay: Shift allowance, yearly bonus
Work Location: In person
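The provisioning automation described above typically means driving an NMS REST or NETCONF interface from Python. A minimal sketch of the REST flavor is below; the endpoint path and payload schema are entirely hypothetical (a real script would follow the vendor's documented API, e.g. ECI NMS):

```python
import json
import urllib.request

# Sketch: build and send a VLAN-provisioning request to a hypothetical
# NMS REST endpoint. Only the payload builder is exercised here; the HTTP
# call is shown for shape.

def build_vlan_provisioning_payload(device: str, vlan_id: int, description: str) -> str:
    """Return a JSON body for a hypothetical VLAN-provisioning REST call."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"invalid 802.1Q VLAN id: {vlan_id}")
    payload = {
        "device": device,
        "vlan": {"id": vlan_id, "description": description, "admin_state": "up"},
    }
    return json.dumps(payload)

def provision_vlan(base_url: str, device: str, vlan_id: int, description: str):
    """POST the payload (no real endpoint exists here; path is invented)."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/devices/{device}/vlans",
        data=build_vlan_provisioning_payload(device, vlan_id, description).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)

body = json.loads(build_vlan_provisioning_payload("edge-rtr-01", 120, "CRM uplink"))
print(body["vlan"]["id"])
```

Validating inputs (here, the 802.1Q VLAN range) before touching the network is the part worth keeping regardless of which vendor API sits behind the call.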

Posted 3 weeks ago


3.0 - 7.0 years

15 - 20 Lacs

Pune

Work from Office


What You'll Do
- Configure and manage observability agents across AWS, Azure & GCP
- Use IaC techniques and tools such as Terraform, Helm & GitOps to automate deployment of the observability stack
- Work with different language stacks such as Java, Ruby, Python and Go
- Instrument services using OpenTelemetry and integrate telemetry pipelines
- Optimize telemetry metrics storage using time-series databases such as Mimir & NoSQL DBs
- Create dashboards, set up alerts, and track SLIs/SLOs
- Enable RCA and incident response using observability data
- Secure the observability pipeline

You Bring
- BE/BTech/MTech (CS/IT or MCA), with an emphasis in Software Engineering
- Strong skills in reading and interpreting logs, metrics, and traces
- Proficiency with the LGTM stack (Loki, Grafana, Tempo, Mimir) or similar tools: Jaeger, Datadog, Zipkin, InfluxDB, etc.
- Familiarity with log frameworks such as log4j, lograge, Zerolog, loguru, etc.
- Knowledge of OpenTelemetry, IaC, and security best practices
- Clear documentation of observability processes, logging standards & instrumentation guidelines
- Ability to proactively identify, debug, and resolve issues using observability data
- Focus on maintaining data quality and integrity across the observability pipeline
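Tracking SLIs/SLOs, as listed above, boils down to simple arithmetic over request counts. A minimal sketch (the 99.9% target is an illustrative assumption, not a value from the posting):

```python
# Sketch: compute an availability SLI and the remaining error budget
# from raw request counts over a reporting window.

def availability_sli(total_requests: int, failed_requests: int) -> float:
    """Fraction of successful requests (the SLI)."""
    if total_requests == 0:
        return 1.0
    return (total_requests - failed_requests) / total_requests

def error_budget_remaining(total: int, failed: int, slo: float = 0.999) -> float:
    """Share of the error budget still unspent (1.0 = untouched, <0 = blown)."""
    allowed_failures = total * (1 - slo)
    if allowed_failures == 0:
        return 1.0 if failed == 0 else float("-inf")
    return 1 - failed / allowed_failures

sli = availability_sli(1_000_000, 400)            # 999,600 successes out of 1M
budget = error_budget_remaining(1_000_000, 400)   # 400 of 1,000 allowed failures used
print(round(sli, 4), round(budget, 2))
```

In practice the counts would come from a metrics backend (Prometheus/Mimir queries) rather than literals, and the remaining-budget figure is what alerting and release-freeze policies key off.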

Posted 3 weeks ago


0.0 years

0 Lacs

Bengaluru District, Karnataka

On-site


Job Title: System Architect - Kubernetes & Cloud Infrastructure (12-14 Years Experience)
Period: 6-Month Contractual Role
Location: Bangalore/Gurgaon

Job Description:
This is a senior position. The candidate will plan and lead the modernization of legacy container-based systems into scalable, secure, and production-grade Kubernetes (K8s) platforms across both cloud (AWS EKS, GKE, AKS) and on-prem (OpenShift, Tanzu) environments. This role requires deep expertise in Kubernetes architecture, orchestration and roll-out strategy, prior coding/development experience, and telecom or large-scale production experience. The position also requires team leadership and strategy planning, along with hands-on implementation of highly available and resilient systems in K8s environments.

Key Responsibilities:
1. Analyze existing monolithic but containerized applications to plan their transformation into a K8s-native microservices architecture.
2. Create end-to-end Kubernetes orchestration architecture, addressing ingress, service mesh, PVCs, networking, observability, scaling, resiliency and secure communication.
3. Lead the migration of applications to K8s clusters, ensuring zero data loss, seamless user experience, and alignment with business SLAs.
4. Define and implement horizontal and vertical scaling strategies, multi-zone resilience, volume management, and optimized ingress/data flow.
5. Integrate supporting systems such as Kafka, ClickHouse, Keycloak, MySQL/Oracle, Redis, and Nginx, ensuring security, HA, and performance tuning.
6. Architect and execute deployment strategies, including blue-green, canary, rollback, backup & restore, and K8s upgrade strategies with minimal downtime.
7. Debug complex production issues related to Kubernetes, networking (CNI), ingress, persistent volumes, and pod scheduling.
8. Mentor and lead a team of 5+ engineers, guiding them across the design, implementation, and production rollout phases.
9. Collaborate with engineering leadership to deliver project plans, timelines, architectural reviews, and technical risk assessments.
10. Serve as the technical authority and hands-on contributor for the full lifecycle of Kubernetes enablement.
11. Harden Kubernetes environments with Zero Trust, container vulnerability management, and DevSecOps best practices for secure, production-grade deployments.

Required Skills & Experience:
- Kubernetes (K8s): EKS, OpenShift (OCP), Tanzu, Cluster API, Helm, Operators
- Container Orchestration: Docker, CRI-O, containerd, K3s
- Infra as Code & Automation: Terraform, Ansible, Helm charts, Shell, Python
- K8s Storage & Networking: PVCs, CSI drivers, load balancers, ingress controllers, CNI (Calico, Flannel, Multus, etc.)
- Observability & Monitoring: Prometheus, Grafana, Loki, OpenTelemetry
- Messaging & Databases: Kafka, ClickHouse, MySQL/Oracle, Redis
- Security & Authentication: Keycloak, OAuth2, OIDC, RBAC, GDPR
- Networking & API Gateways: Nginx, HAProxy
- Microservices Understanding: Java, GoLang, C-based services

Job Type: Contractual / Temporary
Contract length: 6 months
Pay: From ₹80,619.83 per month
Schedule: Day shift
Ability to commute/relocate: Bengaluru District, Karnataka: reliably commute or planning to relocate before starting work (Required)
Work Location: In person
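The canary deployment strategy mentioned above hinges on an automated promote-or-rollback gate. A minimal sketch of such a gate follows; the thresholds are illustrative assumptions, and a real rollout would read live error rates from a metrics backend and shift traffic via the mesh or ingress controller:

```python
# Sketch: decide whether a canary release should be promoted or rolled back,
# comparing its error rate against an absolute ceiling and against the
# stable (baseline) version.

def canary_decision(canary_error_rate: float,
                    baseline_error_rate: float,
                    max_absolute: float = 0.01,
                    max_relative: float = 2.0) -> str:
    """Return 'promote' or 'rollback' for the canary based on error rates."""
    if canary_error_rate > max_absolute:
        return "rollback"  # hard ceiling breached regardless of baseline
    if baseline_error_rate > 0 and canary_error_rate > max_relative * baseline_error_rate:
        return "rollback"  # markedly worse than the stable version
    return "promote"

print(canary_decision(0.002, 0.0015))  # within both limits
print(canary_decision(0.02, 0.001))    # breaches the absolute ceiling
```

Tools like Argo Rollouts or Flagger implement this analysis loop natively; the sketch just makes the decision rule explicit.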

Posted 4 weeks ago


0 years

0 Lacs

Bengaluru, Karnataka, India

On-site


Job Description - DevOps Engineer

This role involves strong leadership with proven skills in people management, project management and Agile Scrum development methodologies, incorporating DevOps best practices in the team.

Responsibilities:
● Monitor the progress of technical personnel, ensure that application development and deployment are done in the best possible way, and implement quality control and review systems throughout the development and deployment processes.
● Be responsible for ensuring that the DevOps strategy is implemented in the end-to-end development of the product, while ensuring scalability, stability and high performance.
● Find ways to improve the existing architecture of the product, keeping in mind the various automation tools available and the skills required.
● Manage other DevOps roles and obtain full efficiency from the team as the primary target.
● Employ and leverage standard tools and techniques to maximize team effectiveness (physical and/or virtual boards, collaboration and productivity software, etc.).
● Be an evangelist for DevOps technology and champion agile development best practices, including automated testing using CI/CD and Perl/Python/Groovy/Java/Bash.
● Manage build and release, provide CI/CD expertise to agile teams in the enterprise, and drive infrastructure automation using Ansible and IaC tools.
● Work on a cloud-based infrastructure spanning Amazon Web Services, Microsoft Azure, and Google Cloud.
● Be responsible for defining the business continuity and disaster recovery framework of the group.
● Evaluate and collaborate with cross-functional teams on how to achieve strategic development objectives using DevOps methodologies.
● Work with senior software managers and architects to develop multi-generation application/product/cloud plans.
● Work with tech partners and professional consultants for a successful DevOps and microservices adoption journey.
● Contribute to and create integrations and orchestration blueprints.

Technical Expertise:
● Deep knowledge of infrastructure, cloud, DevOps, SRE, database management, observability, and cybersecurity services.
● Solid 4+ years of experience as an SRE/DevOps engineer with a proven track record of handling large-scale production environments.
● Strong experience with databases and DataOps (MySQL, PostgreSQL, MongoDB, Elasticsearch, Kafka).
● Hands-on experience with ELK or other logging and observability tools.
● Hands-on experience with Prometheus, Grafana & Alertmanager, and on-call tooling such as PagerDuty.
● Strong skills in K8s, Terraform, Helm, ArgoCD, AWS/GCP/Azure, etc.
● Good Python/Go scripting automation.
● Strong fundamentals: DNS, networking, Linux.
● Experience with APM tools such as New Relic, Datadog, and OpenTelemetry.
● Good experience with incident response, incident management, and writing detailed RCAs.
● Experience with Git and coding best practices.
● Solutioning & architecture: proven ability to design, implement, and optimize end-to-end cloud solutions, following well-architected frameworks and best practices.
● Expertise in developing software applications and managing high-demand infrastructure.
● Experience deploying SaaS products across the major cloud providers (AWS, Azure & GCP), well beyond basic compute.
● Hands-on experience with DevOps and related best practices using Bash, PowerShell, Python, etc.
● Experience with object-oriented design using programming languages such as Python/Java/Go.
● Decent skills in configuration management/IaC tools: Ansible and Terraform.
● Good database management skills: MongoDB, Elasticsearch, MySQL, ScyllaDB, Redis, Kafka, etc.
● Strong understanding and hands-on experience of the cloud-native landscape: orchestration using K8s, observability, SecOps, etc.
● Experience with source code management, building CI/CD pipelines, and GitOps is a plus.
● Good communication skills to collaborate with clients and cross-functional teams.
● Problem-solving attitude.
● Collaborative team spirit.

Posted 4 weeks ago


12.0 years

0 Lacs

Thiruvananthapuram, Kerala

Remote


Thiruvananthapuram Office, AEDGE AICC India Pvt Ltd

About the Company
Armada is an edge computing startup that provides computing infrastructure to remote areas where connectivity and cloud infrastructure are limited, as well as areas where data needs to be processed locally for real-time analytics and AI at the edge. We’re looking to bring on the most brilliant minds to help further our mission of bridging the digital divide with advanced technology infrastructure that can be rapidly deployed anywhere.

About the Role
We are looking for a highly experienced and visionary Lead Golang Engineer to spearhead the architecture, design, and implementation of scalable backend systems. The ideal candidate has extensive experience with Golang, distributed systems, and microservices, along with proven leadership skills. You will lead a team of engineers, influence strategic technical decisions, and contribute to the success of key initiatives.

Location: This role is office-based at our Trivandrum, Kerala office.

What You'll Do (Key Responsibilities)
- Lead the design and development of complex, high-performance backend services using Golang.
- Define system architecture and best practices for scalable, secure, and maintainable code.
- Mentor and guide a team of backend engineers, conducting code reviews and promoting engineering excellence.
- Collaborate with product managers, architects, and other stakeholders to align engineering goals with business objectives.
- Drive DevOps best practices including CI/CD, observability, and incident management.
- Proactively identify technical risks and implement effective mitigation strategies.
- Stay current with industry trends and apply new technologies to improve system performance and developer productivity.

Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 12+ years of software development experience, with at least 5+ years of hands-on Golang development.
- Proven track record in building large-scale, distributed backend systems.
- Strong knowledge of microservices architecture, API design, and cloud-native applications.
- Experience with Docker, Kubernetes, and cloud platforms (AWS, GCP, or Azure).
- Proficiency with relational and NoSQL databases.
- Deep understanding of concurrency, performance optimization, and systems design.
- Strong communication skills and the ability to work cross-functionally.

Preferred Experience and Skills
- Experience leading geographically distributed engineering teams.
- Knowledge of event-driven architecture and tools like Kafka or RabbitMQ.
- Familiarity with observability tools like Prometheus, Grafana, and OpenTelemetry.
- Contributions to open-source Golang projects or community initiatives.

Compensation & Benefits
For India-based candidates: we offer a competitive base salary along with equity options, providing an opportunity to share in the success and growth of Armada.

You're a Great Fit if You're
- A go-getter with a growth mindset. You're intellectually curious, have strong business acumen, and actively seek opportunities to build relevant skills and knowledge.
- A detail-oriented problem-solver. You can independently gather information, solve problems efficiently, and deliver results with a "get-it-done" attitude.
- Someone who thrives in a fast-paced environment. You're energized by an entrepreneurial spirit, capable of working quickly, and excited to contribute to a growing company.
- A collaborative team player. You focus on business success and are motivated by team accomplishment rather than personal agenda.
- Highly organized and results-driven. Strong prioritization skills and a dedicated work ethic are essential for you.

Equal Opportunity Statement
At Armada, we are committed to fostering a work environment where everyone is given equal opportunities to thrive. As an equal opportunity employer, we strictly prohibit discrimination or harassment based on race, color, gender, religion, sexual orientation, national origin, disability, genetic information, pregnancy, or any other characteristic protected by law. This policy applies to all employment decisions, including hiring, promotions, and compensation. Our hiring is guided by qualifications, merit, and the business needs at the time.

Posted 4 weeks ago


0 years

0 Lacs

India

On-site


About Us
At Valiance, we are building next-generation AI solutions to solve high-impact business problems. As part of our AI/ML team, you’ll work on deploying cutting-edge Gen AI models, optimizing performance, and enabling scalable experimentation.

Role Overview
We are looking for a skilled MLOps Engineer with hands-on experience in deploying open-source Generative AI models in cloud and on-prem environments. The ideal candidate should be adept at setting up scalable infrastructure, observability, and experimentation stacks while optimizing for performance and cost.

Responsibilities
- Deploy and manage open-source Gen AI models (e.g., LLaMA, Mistral, Stable Diffusion) in cloud and on-prem environments
- Set up and maintain observability stacks (e.g., Prometheus, Grafana, OpenTelemetry) for monitoring Gen AI model health and performance
- Optimize infrastructure for latency, throughput, and cost-efficiency in GPU/CPU-intensive environments
- Build and manage an experimentation stack to enable rapid testing of various open-source Gen AI models
- Work closely with ML scientists and data teams to streamline model deployment pipelines
- Maintain CI/CD workflows and automate key stages of the model lifecycle
- Leverage NVIDIA tools (Triton Inference Server, TensorRT, CUDA, etc.) to improve model serving performance (preferred)

Required Skills & Qualifications
- Strong experience deploying ML/Gen AI models using Kubernetes, Docker, and CI/CD tools
- Proficiency in Python, Bash scripting, and infrastructure-as-code tools (e.g., Terraform, Helm)
- Experience with ML observability and monitoring stacks
- Familiarity with cloud services (GCP, AWS, or Azure) and/or on-prem environments
- Exposure to model tracking tools like MLflow, Weights & Biases, or similar
- Bachelor’s/Master’s in Computer Science, Engineering, or a related field

Nice to Have
- Hands-on experience with the NVIDIA ecosystem (Triton, CUDA, TensorRT, NGC)
- Familiarity with serving frameworks like vLLM, DeepSpeed, or Hugging Face Transformers
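Optimizing for latency, as this role requires, starts with measuring it. A minimal sketch of a tail-latency (p95) check for a model-serving endpoint, using only the standard library; the 300 ms threshold is an illustrative assumption, and a real pipeline would export these numbers via Prometheus or OpenTelemetry rather than compute them inline:

```python
import statistics

# Sketch: track a p95 latency SLI over a window of request durations and
# flag when it breaches a serving-latency objective.

def p95_latency_ms(samples_ms: list[float]) -> float:
    """95th-percentile latency over a window of request durations."""
    if not samples_ms:
        raise ValueError("no samples in window")
    return statistics.quantiles(samples_ms, n=100)[94]  # 95th percentile

def breaches_slo(samples_ms: list[float], threshold_ms: float = 300.0) -> bool:
    return p95_latency_ms(samples_ms) > threshold_ms

# Mostly fast requests with a slow tail (e.g. GPU cold starts):
window = [120.0] * 95 + [900.0] * 5
print(round(p95_latency_ms(window), 1), breaches_slo(window))
```

Percentiles rather than averages are the standard choice here because a mean hides exactly the slow tail (cold starts, batching stalls) that users actually feel.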

Posted 4 weeks ago


0.0 - 5.0 years

0 Lacs

Musheerabad, Hyderabad, Telangana

On-site


As the Senior DevOps Engineer focused on Observability, you will set observability standards, lead automation efforts and mentor engineers, ensuring all monitoring and Datadog configuration changes are implemented as Infrastructure-as-Code (IaC). You will lead the design and management of a code-driven Datadog observability platform, providing end-to-end visibility into Java applications, Kubernetes workloads and containerized infrastructure. This role emphasizes cost-effective observability at scale, requiring deep expertise in Datadog monitoring, logging, tracing and optimization techniques. You'll collaborate closely with SRE, DevOps and Software Engineering teams to standardize monitoring and logging practices and to deliver scalable, reliable and cost-efficient observability solutions.

This is a hands-on engineering role focused on observability-as-code. All monitoring, logging, alerting, and Datadog configurations are defined and managed through Terraform, APIs and CI/CD workflows, not manual configuration in the Datadog UI.

PRIMARY RESPONSIBILITIES:
- Own and define observability standards for Java applications, Kubernetes workloads and cloud infrastructure
- Configure and manage the Datadog platform using Terraform and Infrastructure-as-Code (IaC) best practices
- Drive adoption of structured JSON logging, distributed tracing and custom metrics across Java and Python services
- Optimize Datadog usage through cost governance, log filtering, sampling strategies and automated reporting
- Collaborate closely with Java developers and platform engineers to standardize instrumentation and alerting
- Troubleshoot and resolve issues with missing or misconfigured logs, metrics and traces, working with developers to ensure proper instrumentation and data flow into Datadog
- Participate in incident response efforts, using Datadog insights for actionable alerting, root cause analysis (RCA) and reliability improvements
- Serve as the primary point of contact for Datadog-related requests, supporting internal teams with onboarding, integration and usage questions
- Continuously audit and tune monitors for alert quality, reducing false positives and improving actionable signal detection
- Maintain clear internal documentation on Datadog usage, standards, integrations and IaC workflows
- Evaluate and propose improvements to the observability stack, including new Datadog features, OpenTelemetry adoption and future architecture changes
- Mentor engineers and develop internal training programs on Datadog, observability-as-code and modern log pipeline architecture

QUALIFICATIONS:
- Bachelor’s degree in Computer Science, Engineering, Mathematics, Physics or a related technical field
- 5+ years of experience in DevOps, Site Reliability Engineering, or related roles with a strong focus on observability and infrastructure as code
- Hands-on experience managing and scaling Datadog programmatically using code-based workflows (e.g. Terraform, APIs, CI/CD)
- Deep expertise in Datadog including APM, logs, metrics, tracing, dashboards and audit trails
- Proven experience integrating Datadog observability into CI/CD pipelines (e.g. GitLab CI, AWS CodePipeline, GitHub Actions)
- Solid understanding of AWS services and best practices for monitoring services on Kubernetes infrastructure
- Strong background in Java application development is preferred

Job Types: Full-time, Permanent, Contractual / Temporary
Contract length: 12 months
Pay: ₹700,000.00 - ₹1,500,000.00 per year
Benefits: Paid sick time
Schedule: Monday to Friday, night shift (US shift)
Ability to commute/relocate: Musheerabad, Hyderabad, Telangana: reliably commute or planning to relocate before starting work (Preferred)
Education: Bachelor's (Preferred)
Experience: DevOps: 5 years (Required)
Language: English (Required)
Location: Musheerabad, Hyderabad, Telangana (Preferred)
Shift availability: Night Shift (Required)
Work Location: In person
Expected Start Date: 01/06/2025
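The structured JSON logging this role drives adoption of can be sketched with the Python standard library alone (the role's services are Java and Python; Python is used here to match the other examples on this page). The field names are a common convention, not a schema mandated by the posting; Datadog's log pipelines can parse any consistent JSON shape:

```python
import json
import logging

# Sketch: a logging.Formatter that emits one JSON object per log line,
# so downstream pipelines (Datadog, Loki, ELK) can index fields directly
# instead of regex-parsing free text.

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # extra={"service": ...} on the log call overrides this default:
            "service": getattr(record, "service", "unknown"),
        }
        return json.dumps(payload)

logger = logging.getLogger("checkout")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("order placed", extra={"service": "checkout-api"})
```

Each call emits a single machine-parseable line such as `{"timestamp": "...", "level": "INFO", "logger": "checkout", "message": "order placed", "service": "checkout-api"}`, which is what makes log filtering and cost governance (dropping or sampling by field) tractable at scale.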

Posted 4 weeks ago

Apply

0.0 years

0 Lacs

Musheerabad, Hyderabad, Telangana

On-site

Indeed logo

As the Senior DevOps Engineer focused on Observability, you will set observability standards, lead automation efforts and mentor engineers, ensuring all monitoring and Datadog configuration changes are implemented as Infrastructure-as-Code (IaC). You will lead the design and management of a code-driven Datadog observability platform, providing end-to-end visibility into Java applications, Kubernetes workloads and containerized infrastructure. This role emphasizes cost-effective observability at scale, requiring deep expertise in Datadog monitoring, logging, tracing and optimization techniques. You'll collaborate closely with SRE, DevOps and Software Engineering teams to standardize monitoring and logging practices to deliver scalable, reliable and cost-efficient observability solutions. This is a hands-on engineering role focused on observability-as-code. All monitoring, logging, alerting, and Datadog configurations are defined and managed through Terraform, APIs and CI/CD workflows, not manual configuration in the Datadog UI.
PRIMARY RESPONSIBILITIES:
- Own and define observability standards for Java applications, Kubernetes workloads and cloud infrastructure
- Configure and manage the Datadog platform using Terraform and Infrastructure-as-Code (IaC) best practices
- Drive adoption of structured JSON logging, distributed tracing and custom metrics across Java and Python services
- Optimize Datadog usage through cost governance, log filtering, sampling strategies and automated reporting
- Collaborate closely with Java developers and platform engineers to standardize instrumentation and alerting
- Troubleshoot and resolve issues with missing or misconfigured logs, metrics and traces, working with developers to ensure proper instrumentation and data flow into Datadog
- Participate in incident response efforts, using Datadog insights for actionable alerting, root cause analysis (RCA) and reliability improvements
- Serve as the primary point of contact for Datadog-related requests, supporting internal teams with onboarding, integration and usage questions
- Continuously audit and tune monitors for alert quality, reducing false positives and improving actionable signal detection
- Maintain clear internal documentation on Datadog usage, standards, integrations and IaC workflows
- Evaluate and propose improvements to the observability stack, including new Datadog features, OpenTelemetry adoption and future architecture changes
- Mentor engineers and develop internal training programs on Datadog, observability-as-code and modern log pipeline architecture

QUALIFICATIONS:
- Bachelor's degree in Computer Science, Engineering, Mathematics, Physics or a related technical field
- 5+ years of experience in DevOps, Site Reliability Engineering or related roles, with a strong focus on observability and infrastructure as code
- Hands-on experience managing and scaling Datadog programmatically using code-based workflows (e.g. Terraform, APIs, CI/CD)
- Deep expertise in Datadog, including APM, logs, metrics, tracing, dashboards and audit trails
- Proven experience integrating Datadog observability into CI/CD pipelines (e.g. GitLab CI, AWS CodePipeline, GitHub Actions)
- Solid understanding of AWS services and best practices for monitoring services on Kubernetes infrastructure
- Strong background in Java application development is preferred

Job Types: Full-time, Permanent, Contractual / Temporary
Contract length: 12 months
Pay: ₹700,000.00 - ₹1,500,000.00 per year
Benefits: Paid sick time
Schedule: Monday to Friday, Night shift (US shift)
Ability to commute/relocate: Musheerabad, Hyderabad, Telangana: Reliably commute or planning to relocate before starting work (Preferred)
Education: Bachelor's (Preferred)
Language: English (Required)
Location: Musheerabad, Hyderabad, Telangana (Preferred)
Shift availability: Night Shift (Required)
Work Location: In person
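The "structured JSON logging" this role drives across Java and Python services usually means one JSON object per log line, so a log pipeline can extract fields without custom parsing rules. A minimal sketch using Python's standard logging module; the field names (such as trace_id) are illustrative assumptions, not a fixed schema:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Attach a trace correlation field if the caller supplied one,
            # so a pipeline (e.g. Datadog) can link this log to a trace.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)

def make_logger(name: str = "payments") -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger

if __name__ == "__main__":
    log = make_logger()
    log.info("charge processed", extra={"trace_id": "abc123"})
```

The same idea applies on the Java side via a JSON layout for Logback or Log4j2; the point is that every line is machine-parseable.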

Posted 4 weeks ago

Apply

0 years

0 Lacs

Ahmedabad, Gujarat, India

On-site

Linkedin logo

Role Description
Location: All UST Locations
Experience Range: 5-8 yrs

Responsibilities
Infrastructure as Code & Cloud Automation
- Design and implement Infrastructure as Code (IaC) using Terraform, Ansible, or equivalent for both Azure and on-prem environments.
- Automate provisioning and configuration management for Azure PaaS services (App Services, AKS, Storage, Key Vault, etc.).
- Manage hybrid cloud deployments, ensuring seamless integration between Azure and on-prem alternatives.
CI/CD Pipeline Development (Without Azure DevOps)
- Develop and maintain CI/CD pipelines using GitHub Actions or Jenkins.
- Automate containerized application deployment using Docker and Kubernetes (AKS).
- Implement canary deployments, blue-green deployments, and rollback strategies for production releases.
Cloud Security & Secrets Management
- Implement role-based access control (RBAC) and IAM policies across cloud and on-prem environments.
- Secure API and infrastructure secrets using HashiCorp Vault (instead of Azure Key Vault).
Monitoring, Logging & Observability
- Set up observability frameworks using Prometheus, Grafana, and the ELK Stack (Elasticsearch, Logstash, Kibana).
- Implement centralized logging and monitoring across cloud and on-prem environments.

Must Have Skills & Experience
Cloud & DevOps
- Azure PaaS services: App Services, AKS, Azure Functions, Blob Storage, Redis Cache
- Kubernetes & containerization: hands-on experience with AKS, Kubernetes, Docker
- CI/CD tools: experience with GitHub Actions, Jenkins
- Infrastructure as Code (IaC): proficiency in Terraform
Security & Compliance
- IAM & RBAC: experience with Active Directory, Keycloak, LDAP
- Secrets management: expertise in HashiCorp Vault or Azure Key Vault
- Cloud security best practices: API security, network security, encryption
Networking & Hybrid Cloud
- Azure networking: knowledge of VNets, Private Endpoints, Load Balancers, API Gateway, Nginx
- Hybrid cloud connectivity: experience with VPN Gateway, Private Peering
Monitoring & Performance Optimization
- Observability tools: Prometheus, Grafana, ELK Stack, Azure Monitor & App Insights
- Logging & monitoring: experience with Elasticsearch, Logstash, OpenTelemetry, Log Analytics

Good To Have Skills & Experience
- Experience with additional IaC tools (Ansible, Chef, Puppet)
- Experience with additional container orchestration platforms (OpenShift, Docker Swarm)
- Knowledge of advanced Azure services (e.g., Azure Logic Apps, Azure Event Grid)
- Familiarity with cloud-native monitoring solutions (e.g., CloudWatch, Datadog)
- Experience in implementing and managing multi-cloud environments

Key Personal Attributes
- Strong problem-solving abilities
- Ability to work in a fast-paced and dynamic environment
- Excellent communication skills and ability to collaborate with cross-functional teams
- Proactive and self-motivated, with a strong sense of ownership and accountability

Skills: Azure, Scripting, CI/CD
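The canary and blue-green rollout strategies listed above hinge on routing a stable slice of traffic to the new release. A hedged sketch of deterministic, hash-based canary assignment (the function names and bucketing scheme are assumptions for illustration, not a specific platform's API):

```python
import hashlib

def in_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically route a stable slice of users to the canary release.

    Hashing the user id (rather than random sampling) keeps each user on
    the same release across requests, which makes canary metrics comparable.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be in [0, 100]")
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < canary_percent

def canary_share(user_ids, canary_percent):
    """Fraction of a user population that would hit the canary."""
    hits = sum(in_canary(u, canary_percent) for u in user_ids)
    return hits / len(user_ids)
```

Rolling back is then just setting the percentage back to zero; no user state needs to be migrated because assignment is pure function of the id.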

Posted 4 weeks ago

Apply

0 years

0 Lacs

India

On-site

Linkedin logo

Project description
We're seeking a strong and creative Software Engineer eager to solve challenging problems of scale and work on cutting-edge technologies. In this project, you will have the opportunity to write code that will impact thousands of users every month. You'll apply your critical thinking and technical skills to develop cutting-edge software, and you'll have the opportunity to interact with teams across disciplines. At Luxoft, our culture is one that thrives on solving difficult problems, focusing on product engineering based on hypothesis testing to empower people to come up with ideas. In this new adventure, you will have the opportunity to collaborate with a world-class team in the field of insurance by building a holistic solution and interacting with multidisciplinary teams.

Responsibilities
As a Lead OpenTelemetry Developer, you will be responsible for developing and maintaining OpenTelemetry-based solutions. You will work on instrumentation, data collection, and observability tools to ensure seamless integration and monitoring of applications. This role involves writing documentation and promoting best practices around OpenTelemetry.

Mandatory Skills
- Experience in instrumentation: expertise in at least one programming language supported by OpenTelemetry and a broad knowledge of other languages (e.g., Python, Java, Go, PowerShell, .NET)
- Passion for observability: strong interest in observability and experience in writing documentation and blog posts to share knowledge
- Technical skills: familiarity with tools and technologies such as Prometheus, Grafana, and other observability platforms (e.g., Dynatrace, AppDynamics (Splunk), Amazon CloudWatch, Azure Monitor, Honeycomb)
- Experience in Java instrumentation techniques (e.g., bytecode manipulation, JVM internals, Java agents)
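OpenTelemetry propagates context between services via the W3C Trace Context traceparent header. A small sketch of parsing that header by hand, which is handy when debugging instrumentation; real services would rely on the SDK's propagators rather than code like this:

```python
import re

# W3C Trace Context `traceparent` header: version-traceid-spanid-flags.
_TRACEPARENT = re.compile(
    r"^(?P<version>[0-9a-f]{2})-"
    r"(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<span_id>[0-9a-f]{16})-"
    r"(?P<flags>[0-9a-f]{2})$"
)

def parse_traceparent(header: str) -> dict:
    """Parse a traceparent header into its fields, rejecting invalid values."""
    m = _TRACEPARENT.match(header.strip().lower())
    if not m:
        raise ValueError(f"malformed traceparent: {header!r}")
    fields = m.groupdict()
    # All-zero trace or span ids are invalid per the spec.
    if set(fields["trace_id"]) == {"0"} or set(fields["span_id"]) == {"0"}:
        raise ValueError("trace_id and span_id must be non-zero")
    # Bit 0 of the flags byte is the sampled flag.
    fields["sampled"] = bool(int(fields["flags"], 16) & 0x01)
    return fields
```

A missing or malformed traceparent between two services is one of the most common causes of broken distributed traces, which is why instrumentation work keeps coming back to this header.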

Posted 4 weeks ago

Apply

10 years

0 Lacs

Bengaluru, Karnataka

Work from Office

Indeed logo

Company Overview: At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our 39,000 employees around the world work to discover and bring life-changing medicines to those who need them the most, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We're looking for people who are determined to make life better for people around the world. Tech@Lilly builds and maintains capabilities using cutting-edge technologies like the most prominent tech companies. What differentiates Tech@Lilly is that we redefine what's possible through tech to advance our purpose of creating medicines that make life better for people around the world, like data-driven drug discovery and connected clinical trials. We hire the best technology professionals from a variety of backgrounds, so they can bring an assortment of knowledge, skills, and diverse thinking to deliver innovative solutions in every area of our business. LRL Tech unites science with technology to accelerate the Research and Development of medicines and to deliver therapeutic innovations. 
The team leverages technology and platforms to streamline scientific experimentation to help Researchers follow the science, to understand the disease and identify potential therapies. They are at the forefront of advanced analytics to enable data driven drug discovery, to innovate so Scientists can rapidly analyze and accelerate the discovery of new medicines. Position Overview: We are looking for a hands-on, Advisor - Informatics & Scientific Applications Architect to design and deliver next-generation digital platforms supporting drug discovery, preclinical research, analytical sciences, and CMC applications. This role demands deep expertise in AWS serverless architectures, cloud-native designs, automation, and microservices for scientific data applications. You will play a key leadership role in architecting multi-tenant, high-performance, modular, and scalable informatics ecosystems that integrate scientific workflows, computational platforms, and cloud infrastructure. Key Responsibilities: Architectural Leadership: Architect and build a multi-tenant, serverless informatics platform leveraging AWS Lambda, DynamoDB, S3, EBS, EFS, Route 53, and API Gateway. Design data partitioning strategies for multi-tenant scientific data storage and explore microservices frameworks for scalable architecture. Lead cloud-native software design using Kubernetes, Docker, and containerized services, ensuring high reproducibility and scalability of scientific applications. Build scalable Research Data Lakes, asset registries, and metadata catalog to support large-scale scientific data ingestion and retrieval. Complex Scientific Data Flow & Interoperability: Architect frameworks that facilitate seamless data exchange between discovery research systems, including LIMS, ELN, analytical tools, and registration systems. Implement RESTful and GraphQL APIs for high-performance data interoperability across computational models, bioassays, and experimental workflows. 
Establish scientific data standards to ensure consistency, traceability, and governance across the R&D landscape. Scientific Workflow Automation & Computational Frameworks: Architect scientific workflow automation platforms using Apache Airflow, EventBridge, RabbitMQ, and Kafka, enabling real-time data acquisition and bioassay processing. Design platforms supporting in silico modeling, AI-driven analytics, and high-throughput screening simulations. Integrate Cloud (AWS/Azure) platforms with HPC clusters to handle bioinformatics, cheminformatics, and translational modeling workloads. Cloud, DevOps, and Observability: Maintain deep technical hands-on expertise with AWS CloudFormation, Ansible, Jenkins, Git, and other DevOps automation tools. Implement observability solutions using Prometheus, Grafana, OpenTelemetry to monitor system health, performance, and workflows. Continuously learn, explore, and drive adoption of cutting-edge cloud-native, containerization, serverless, and scientific informatics trends. Cross-Functional Scientific Collaboration: Collaborate closely with scientists, data scientists, computational biologists, formulation teams, and manufacturing engineers to co-create informatics solutions. Serve as a trusted technical advisor for cloud migration, scientific data modernization, and AI/ML integration projects. Work with UI/UX teams to create intuitive digital interfaces for scientific workflow automation and data exploration. Technology Strategy, Governance & Best Practices: Drive architectural strategy, making informed decisions around buy vs. build, third-party integrations, and platform extensibility. Define and enforce best practices for scientific IT security, data privacy, compliance (GxP, FAIR), and cloud operations. Champion a modular, service-oriented, event-driven architecture to enable rapid innovation, maintainability, and scalability. 
Required Qualifications: Experience: 10+ years of enterprise IT and scientific informatics architecture experience. Deep technical leadership experience in AWS serverless and scientific data integration projects. Proven experience building cloud-native, scalable platforms integrating LIMS, ELN, MES, compound registries, and scientific analysis tools. Education: Bachelor’s or Master’s degree in Computer Science, Bioinformatics, Information Systems, or related disciplines. Technical Expertise: Expertise in AWS serverless architectures (Lambda, DynamoDB, S3, Route53, API Gateway), containerized platforms (Kubernetes, Docker), and scientific workflow tools (Airflow, Kafka, EventBridge). Strong knowledge of microservices design, DevOps automation, scientific data systems, and HPC integration. Experience in observability setup for complex distributed systems in scientific environments. Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response. Lilly does not discriminate on the basis of age, race, color, religion, gender, sexual orientation, gender identity, gender expression, national origin, protected veteran status, disability or any other legally protected status. #WeAreLilly
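The event-driven architecture this role describes (EventBridge, RabbitMQ, Kafka) ultimately reduces to a publish/subscribe contract between producers and consumers. An illustrative in-memory sketch of that contract; the class, topic, and field names are assumptions, and a production system would use a real broker:

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Tiny in-memory publish/subscribe bus illustrating the pattern;
    production systems would use a broker such as Kafka or EventBridge."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        """Register a handler to be called for every event on `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver an event to all handlers for its topic; return the count."""
        for handler in self._subscribers[topic]:
            handler(event)
        return len(self._subscribers[topic])
```

The value of the pattern for scientific workflows is decoupling: an assay-completion event can trigger data ingestion, notification, and analytics consumers without the producer knowing any of them exist.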

Posted 4 weeks ago

Apply

3 years

0 Lacs

Chennai, Tamil Nadu

Work from Office

Indeed logo

Be at the Forefront of Mobility's Future: Join Ford as a Site Reliability Engineer!

Enterprise Technology is the engine driving the future of transportation, and we're looking for a talented Site Reliability Engineer (SRE) to help us redefine mobility. In this role, you'll leverage cutting-edge technology to enhance customer experiences, improve lives, and create vehicles as smart as you are. As an SRE at Ford, you'll be instrumental in developing, enhancing, and expanding our global monitoring and observability platform. You'll blend software and systems engineering to ensure the uptime, scalability, and maintainability of our critical cloud services. You'll be at the intersection of SRE and software development, building and driving the adoption of our global monitoring capabilities. If you're passionate about using your IT expertise and analytical skills to shape the future of transportation, this is your opportunity to make a real impact. Join us and be part of a team that's building the future of mobility!

Qualifications:
- Bachelor's degree in Computer Science, Engineering, Mathematics, or equivalent experience.
- 3+ years of experience as an SRE, DevOps Engineer, Software Engineer, or similar role.
- Strong experience with Golang development; familiarity with Terraform provider development is desired.
- Proficient with monitoring and observability tools, particularly OpenTelemetry or similar tools.
- Proficient with cloud services, with a strong preference for Kubernetes and Google Cloud Platform (GCP) experience.
- Solid programming skills in Golang and scripting languages, with a good understanding of software development best practices.
- Experience with relational and document databases.
- Ability to debug, optimize code, and automate routine tasks.
- Strong problem-solving skills and the ability to work under pressure in a fast-paced environment.
- Excellent verbal and written communication skills.

Responsibilities:
- Write, configure, and deploy code that improves service reliability for existing or new systems; set the standard for others with respect to code quality.
- Provide helpful and actionable feedback and review for code or production changes.
- Drive repair/optimization of complex systems with consideration toward a wide range of contributing factors.
- Lead debugging, troubleshooting, and analysis of service architecture and design.
- Participate in on-call rotation.
- Write documentation: design, system analysis, runbooks, playbooks.
- Provide design feedback and uplevel the design skills of others.
- Implement and manage SRE monitoring application backends using Golang, Postgres, and OpenTelemetry.
- Develop tooling using Terraform and other IaC tools to ensure visibility and proactive issue detection across our platforms.
- Work within GCP infrastructure, optimizing performance and cost, and scaling resources to meet demand.
- Collaborate with development teams to enhance system reliability and performance, applying a platform-engineering mindset to system administration tasks.
- Develop and maintain automated solutions for operational aspects such as on-call monitoring, performance tuning, and disaster recovery.
- Troubleshoot and resolve issues in our dev, test, and production environments.
- Participate in postmortem analysis and create preventative measures for future incidents.
- Implement and maintain security best practices across our infrastructure, ensuring compliance with industry standards and internal policies.
- Participate in security audits and vulnerability assessments.
- Participate in capacity planning and forecasting efforts to ensure our systems can handle future growth and demand; analyze trends and make recommendations for resource allocation.
- Identify and address performance bottlenecks through code profiling, system analysis, and configuration tuning.
- Implement and monitor performance metrics to proactively identify and resolve issues.
- Develop, maintain, and test disaster recovery plans and procedures to ensure business continuity in the event of a major outage or disaster; participate in regular disaster recovery exercises.
- Contribute to internal knowledge bases and documentation.
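Much of day-to-day SRE work like the above is framed in terms of SLOs and error budgets. A hedged sketch of a request-based error-budget calculation; the function and its inputs are illustrative, not any particular team's policy:

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent for a request-based SLO.

    A 99.9% SLO over 1,000,000 requests allows 1,000 failures; if 250 have
    failed, roughly 75% of the budget remains.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction, e.g. 0.999")
    allowed = total * (1 - slo_target)  # failures the SLO permits
    if allowed == 0:
        return 1.0
    return max(0.0, 1 - failed / allowed)
```

Alerting on budget burn rate, rather than raw error counts, is what lets a team distinguish a slow leak from an incident worth paging on.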

Posted 4 weeks ago

Apply


0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Linkedin logo

Matillion is The Data Productivity Cloud. We are on a mission to power the data productivity of our customers and the world, by helping teams get data business ready, faster. Our technology allows customers to load, transform, sync and orchestrate their data. We are looking for passionate, high-integrity individuals to help us scale up our growing business. Together, we can make a dent in the universe bigger than ourselves. With offices in the UK, US and Spain, we are now thrilled to announce the opening of our new office in Hyderabad, India. This marks an exciting milestone in our global expansion, and we are now looking for talented professionals to join us as part of our founding team. Role Scope We are now looking to add a Staff Site Reliability Engineer to #TeamGreen. This role can be based out of our Hyderabad office. Matillion is built around small development teams with responsibility for specific themes and initiatives. The Staff Site Reliability Engineer will be in the core & observability team which owns the operation and efficiency of our cloud platforms and services. The team is responsible for designing and implementing our cloud infrastructure, service reliability, service deployments, service observability and monitoring of Matillion products. 
What you will be doing

Site Reliability Engineering
- Lead designs of major software components, systems, and features to improve the availability, scalability, latency, and efficiency of Matillion's services
- Lead sustainable incident response, blameless postmortems, and production improvements that result in direct business opportunities for Matillion
- Provide guidance to other team members on managing end-to-end availability and performance of critical services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions
- Mentor and train other team members on design techniques and coding standards, and cultivate innovation and collaboration across multiple teams
- Manage individual project priorities, deadlines, and deliverables

What we are looking for

Essential Skills
- Passion for performance, observability, availability, scalability and security
- Previous experience of large-scale web operations in a public cloud environment
- Competence in Ruby, Go, Java, Python or an equivalent programming language
- Experience with some of the following key technologies: Prometheus, Grafana, Elasticsearch, Logstash, Kibana, OpenTelemetry, Micrometer, New Relic, Datadog
- Experience with CloudFormation, Terraform and other infrastructure-as-code technologies
- A solid understanding of networking systems and protocols
- Confidence in your ability to own and deliver projects and issues to resolution using Agile methodologies, with a definite bias for action and focus on results
- Excellent communication and cross-team collaboration
- A drive for personal excellence through continuous development and by keeping current with developments and offerings in the observability field
- A passion for solving problems

Personal Capabilities Required (e.g. skills, attitude, strengths)
- Inquisitiveness: digging into problems and solutions to understand the underlying technology
- Autonomy: ability to work on a task and solve problems independently
- Motivation: sets personal challenges and constantly looks to stretch themselves
- Problem solving: recognition of problems and recasting difficult-to-solve problems in order to find unique and innovative solutions
- Integrity: honest and transparent in dealings, open to voicing and accepting criticism, trustworthy, and builds credibility through actions
- Detail focussed: pays attention to the details and makes a conscious effort to understand causes instead of just the effects
- Big-picture aware: understands the scope and impact of a problem or solution

Matillion has fostered a culture that is collaborative, fast-paced, ambitious, and transparent, and an environment where people genuinely care about their colleagues and communities. Our 6 core values guide how we work together and with our customers and partners. We operate a truly flexible and hybrid working culture that promotes work-life balance, and are proud to be able to offer the following benefits:
- Company Equity
- 27 days paid time off
- 12 days of Company Holiday
- 5 days paid volunteering leave
- Group Mediclaim (GMC)
- Enhanced parental leave policies
- MacBook Pro
- Access to various tools to aid your career development

More about Matillion
Thousands of enterprises including Cisco, DocuSign, Slack, and TUI trust Matillion technology to load, transform, sync, and orchestrate their data for a wide range of use cases from insights and operational analytics, to data science, machine learning, and AI. With over $300M raised from top Silicon Valley investors, we are on a mission to power the data productivity of our customers and the world. We are passionate about doing things in a smart, considerate way. We're honoured to be named a great place to work for several years running by multiple industry research firms. We are dual headquartered in Manchester, UK and Denver, Colorado.

We are keen to hear from prospective Matillioners, so even if you don't feel you match all the criteria please apply and a member of our Talent Acquisition team will be in touch. Alternatively, if you are interested in Matillion but don't see a suitable role, please email talent@matillion.com.

Matillion is an equal opportunity employer. We celebrate diversity and we are committed to creating an inclusive environment for all of our team. Matillion prohibits discrimination and harassment of any type. Matillion does not discriminate on the basis of race, colour, religion, age, sex, national origin, disability status, genetics, sexual orientation, gender identity or expression, or any other characteristic protected by law.
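Several of the roles on this page list Prometheus, whose text exposition format is simple enough to sketch by hand. An illustrative renderer for a single sample (real services would use an official client library; the metric and label names here are assumptions):

```python
def render_counter(name, value, labels=None):
    """Render one sample in the Prometheus text exposition format,
    e.g. for a minimal /metrics endpoint scraped by Prometheus."""
    if labels:
        # Sort labels so output is deterministic and easy to diff/test.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        return f"{name}{{{label_str}}} {value}"
    return f"{name} {value}"
```

Because the format is plain line-oriented text, a scrape target is trivially debuggable with curl, which is part of why Prometheus appears in so many observability stacks.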

Posted 4 weeks ago

Apply

3 years

0 Lacs

Pune, Maharashtra, India

On-site

Linkedin logo

We're Hiring: Datadog Specialist
Experience: 3+ years

Roles & Responsibilities:
✅ Customize and configure the Datadog agent YAML to enable various checks
✅ Build playbooks to automate agent installation & configuration
✅ Work with OpenTelemetry to extract key infrastructure metrics
✅ Modify application code to enable traces and spans
✅ Enable Digital Experience Monitoring for browser and mobile apps
✅ Create and manage API and browser synthetic tests
✅ Handle log ingestion, indexing, parsing, and exploration
✅ Set up pipelines, custom parsers, and archives for logs
✅ Apply Datadog tagging best practices for seamless filtering and grouping
✅ Integrate Datadog with various tools and technologies
✅ Design custom dashboards based on business/application needs
✅ Manage users, roles, and licensing within the platform
✅ Guide L1 & L2 teams in implementing Datadog solutions

Qualifications:
1. Bachelor's degree in CS, IT, or a related field
2. Hands-on experience in Datadog setup and administration
3. Strong in Linux/Unix systems and scripting (Bash/Python)
4. Solid grasp of cloud components and DevOps tools (Ansible, Jenkins, Chef, etc.)
5. Excellent troubleshooting skills and ability to work across teams
6. Strong communication and collaboration skills

Interested candidates, send your resume to rakshita.prabhu@syngrowconsulting.com
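The "custom parsers" in log pipelines like the ones above boil down to turning raw text lines into typed fields that can be filtered and graphed. A plain-Python sketch of such a parser for a common access-log shape; the log format is an assumption for illustration, and in Datadog itself this would be expressed as a grok parser:

```python
import re

# Pattern for a common (nginx/Apache-style) access log line.
ACCESS_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_access_line(line: str) -> dict:
    """Parse one access-log line into typed fields so status codes and
    response sizes can be faceted, rather than searched as raw text."""
    m = ACCESS_LOG.match(line)
    if not m:
        raise ValueError(f"unparseable line: {line!r}")
    d = m.groupdict()
    d["status"] = int(d["status"])
    d["bytes"] = int(d["bytes"])
    return d
```

Lines that fail to parse are worth counting rather than silently dropping; a sudden spike in unparseable lines usually means an upstream format change.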

Posted 4 weeks ago

Apply

0 years

0 Lacs

Gurugram, Haryana, India

On-site

Linkedin logo

Your Impact
At DISCO, our cloud infrastructure and system control plane are the backbone that enables us to deliver cutting-edge solutions to our clients. As a Software Engineer III on our growing DevOps team, you will be an individual contributor, evolving and managing our AWS-based cloud infrastructure, focused on building and maintaining highly performant observability systems. Your expertise will enhance our system's reliability, scalability, and security, and support the overall success of our cloud-based solution development. Your work will help shape our future cloud strategy through your collaboration with our internal customers, our fellow Engineers.

What You'll Do
Deliver Systems: Engage actively in coding, code reviews, and technical discussions, ensuring high-quality output.
Optimize Performance: Continuously enhance system performance, focusing on meeting internal customer needs using best practices for designing scalable distributed systems.
Architect Systems: Contribute to the design, development, deployment, and maintenance of scalable, high-performance, easily modifiable distributed systems.
Collaborate Cross-Functionally: Work closely with global cross-functional teams to translate requirements into robust technical solutions.
Innovative Problem Solving: Address complex technical challenges with innovative solutions.
Continuous Learning and Adaptation: Stay updated with the latest technology trends and advancements, continually enhancing skills and knowledge.
Who You Are
A DevOps Engineer with 7+ years of experience (infrastructure, observability)
5+ years working with cloud-native applications in AWS
5+ years writing Infrastructure as Code
4+ years configuring and maintaining Linux systems with scripting/automation
4+ years building and managing a modern observability stack (Prometheus, Grafana, ELK, DataDog)
3+ years with scripting languages
An engineer who enjoys partnering with engineers across the stack to build, deploy, and maintain highly performant systems
An engineer who prefers to work collaboratively, whether learning from or mentoring others
An engineer who is a team player

Even Better If You Have…
Experience building and managing PaaS (Platform as a Service) systems for internal customers
Experience with Windows Servers and ASP.NET deployments (WebDeploy)
Experience with other cloud platforms such as Azure and GCP
4+ years automating deployment/scaling of containerized apps
4+ years managing Jenkins / CI-CD frameworks
Experience with software development in any tech stack
Experience with cloud networking (VLANs, routing)

DISCO's Technology Stack
Cloud Provider - AWS: EC2, Lambda, Aurora, Redshift, DynamoDB, ECS, EKS, SQS, SNS, Kinesis, S3, CloudFront, CloudFormation, KMS, CodePipeline, etc.
Observability: ELK Stack, OpenTelemetry, Prometheus, Grafana, DataDog, New Relic, Sentry.io
Programming Languages: Python, Bash, Kotlin
CI/CD: Terraform, Docker, Jenkins, CodeDeploy, GitHub, Artifactory, HashiCorp Consul

Perks of DISCO
Open, inclusive, and fun environment
Benefits, including medical and dental insurance
Competitive salary plus discretionary bonus
Opportunity to be a part of a startup that is revolutionizing the legal industry
Growth opportunities throughout the company

About DISCO
DISCO provides a cloud-native, artificial intelligence-powered legal solution that simplifies ediscovery, legal document review, and case management for enterprises, law firms, legal services providers, and governments. Our scalable, integrated solution enables legal departments to easily collect, process, and review enterprise data that is relevant or potentially relevant to legal matters. Are you ready to help us fulfill our mission to use technology to strengthen the rule of law? Join us!
We are an equal opportunity employer and value diversity. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Posted 4 weeks ago

Apply

4 - 9 years

25 - 30 Lacs

Bengaluru

Remote

Naukri logo

DevOps/Site Reliability Engineer (AWS)
Experience: 4 - 9 years
Salary: INR 28-30 Lacs per annum
Preferred Notice Period: Within 30 days
Shift: 10:00 AM to 7:00 PM IST
Opportunity Type: Remote
Placement Type: Permanent
(*Note: This is a requirement for one of Uplers' clients)

Must-have skills: Docker, Python, AWS Networking & Security, Bash, CI/CD, EKS, MongoDB, Terraform
Good-to-have skills: Compliance, Disaster Recovery, OpenTelemetry, SDN, Security

NST Cyber (one of Uplers' clients) is looking for a DevOps/Site Reliability Engineer (AWS) who is passionate about their work, eager to learn and grow, and committed to delivering exceptional results. If you are a team player with a positive attitude and a desire to make a difference, then we want to hear from you.

Role Overview
We are seeking a seasoned DevOps Architect / Senior Engineer with deep expertise in AWS, EKS, Terraform, Infrastructure as Code, and MongoDB Atlas to lead the design, implementation, and management of our cloud-native infrastructure. This is a hands-on leadership role focused on ensuring the scalability, reliability, security, and efficiency of our production-grade systems.

Key Responsibilities:
Cloud Infrastructure Design & Management (AWS)
Architect, build, and manage secure, scalable AWS infrastructure (VPC, EC2, S3, IAM, Security Groups).
Implement secure cloud networking and ensure high availability.
Monitor, optimize, and troubleshoot AWS environments.
Container Orchestration (AWS EKS)
Deploy and manage production-ready EKS clusters, including workload deployments, scaling (manual and via Karpenter), monitoring, and security.
Maintain CI/CD pipelines for Kubernetes applications.
Infrastructure as Code (IaC)
Lead development of Terraform-based IaC modules (clean, reusable, and secure).
Manage Terraform state and promote best practices (modularization, code reviews).
Extend IaC to multi-cloud (Azure, GCP) and leverage CloudFormation or Bicep when needed.
Programming, Automation & APIs
Develop automation scripts using Python, Bash, or PowerShell.
Design, secure, and manage APIs (AWS API Gateway, optionally Azure API Management).
Integrate systems/services via APIs and event-driven architecture.
Troubleshoot and resolve infrastructure or deployment issues.
Database Management
Administer MongoDB Atlas: setup, configuration, performance tuning, backup, and security.
Implement best practices for high availability and resilience.
DevOps Leadership & Strategy
Define and promote DevOps best practices across the organization.
Automate and streamline development-to-deployment workflows.
Mentor junior engineers and foster a culture of technical excellence.
Stay ahead of emerging DevOps and cloud trends.

Mandatory Skills:
Cloud Administration (AWS)
VPC design (subnets, route tables, NAT/IGW, peering).
IAM (users, roles, policies with least-privilege enforcement).
Deep AWS service knowledge and administrative experience.
Container Orchestration (AWS EKS)
Production-grade EKS cluster setup and upgrades.
Workload autoscaling using Karpenter.
Logging/monitoring via Prometheus, Grafana, CloudWatch.
Secure EKS practices: RBAC, PSP/PSA, admission controllers, secret management.
CI/CD & Kubernetes
Experience with Jenkins, GitLab CI, ArgoCD, Flux.
Microservices deployment and Kubernetes cluster federation knowledge.
Infrastructure as Code
Expert in Terraform (HCL, modules, backends, security).
Familiarity with CloudFormation and Bicep for cross-cloud support.
Git-based version control and CI/CD integration.
Automated infrastructure provisioning.
Programming & API
Proficient in Python, Bash, PowerShell.
Secure API design, development, and management.
Database Management
Proven MongoDB Atlas administration: scaling, backups, alerts, and performance monitoring.

Good to Have Skills:
Infrastructure & OS
Server & virtualization management (Linux/Windows).
OS security hardening & automation.
Disaster recovery planning and implementation.
Docker containerization.
Networking & Security
Advanced networking (DNS, BGP, routing).
Software-Defined Networking (SDN), hybrid networking.
Zero Trust Architecture.
Load balancer (ALB/ELB/NLB) security and WAF management.
Compliance: ISO 27001, SOC 2, PCI-DSS.
Secrets management (Vault, AWS Secrets Manager).
Observability & Automation
OpenTelemetry, LangTrace for observability.
AI-powered automation (e.g., CrewAI).
SIEM/security monitoring.
Cloud Governance
Cost optimization strategies.
AWS Well-Architected Framework familiarity.
Incident response, governance, and compliance management.

Qualifications & Experience
Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
5+ years in DevOps / SRE / Cloud Engineering with an AWS focus.
5+ years of hands-on experience with EKS and Terraform.
Proven experience with cloud-native architecture and automation.
AWS certifications (DevOps Engineer Professional, Solutions Architect Professional) preferred.
Agile/Scrum experience a plus.

Interview Process
Technical Round 1 - with Garvit
Technical Round 2 - with Rupesh
Technical Round 3 - with Pradeep
HR Round

How to apply for this opportunity (easy 3-step process):
1. Click on Apply! and register or log in on our portal
2. Upload your updated resume & complete the screening form
3. Increase your chances of getting shortlisted & meet the client for the interview!

About Our Client:
NST Cyber pioneers proactive, AI-driven Cyber Threat Exposure Management (CTEM). Our flagship NST Assure CTEM delivers rapid threat assessment, continuous vulnerability prioritization, and automated responses while maintaining compliance. In a dynamic cyber landscape, we're dedicated to safeguarding the digital assets and operational integrity of our customers.

About Uplers:
Our goal is to make hiring and getting hired reliable, simple, and fast. Our role will be to help all our talents find and apply for relevant product and engineering job opportunities and progress in their career.
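One pattern that recurs in the automation scripting this role describes (API integration, event-driven glue) is retrying flaky network calls with exponential backoff. A stdlib-only sketch of the pattern; the function names and the stubbed "API call" are illustrative, and a real script would wrap an actual client call instead:

```python
import time

def retry(fn, attempts: int = 4, base_delay: float = 0.01):
    """Call fn(); on failure, wait exponentially longer before each retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, 0.04s, ...

# Stubbed flaky call that succeeds on the third invocation.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(retry(flaky))  # → ok
```

In production scripts the same wrapper usually gains jitter and a cap on the delay, but the control flow stays this simple.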
(Note: There are many more opportunities apart from this on the portal.) So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!

Posted 1 month ago

Apply

1 - 3 years

8 - 11 Lacs

Chennai

Work from Office

Naukri logo

About the Role: We are a 100% product development company, primarily focused on helping our customers address application performance problems. We are looking for passionate, thoughtful, and compassionate individuals who have experience building backend applications in Golang. It is a challenging role with scope to dive deep into research and development in cutting-edge technologies. You will work with a cross-functional product development team.

Required Skills:
Strong programming fundamentals in Go (Golang)
Exposure to containerization technologies such as Docker or equivalent
Experience in Kubernetes Operators development using Golang, Helm charts, or Ansible
Familiarity with Linux administration and Bash scripting
Strong knowledge of Kubernetes fundamentals
Experience developing new Kubernetes controllers
Hands-on experience creating Kubernetes Operators using Kubebuilder or the Operator SDK
Working knowledge of or experience with Python (preferred)
Experience with Git

Nice to Have:
Knowledge of OpenTelemetry libraries
Additional experience in Python or Java
Experience troubleshooting production issues using external tools (for Go-related applications)

Posted 1 month ago

Apply

5 - 8 years

15 - 25 Lacs

Chennai, Bengaluru

Work from Office

Naukri logo

We are looking for a Senior Platform Engineer (Airflow & Control-M) with 5-10 years of experience to join our team in Bangalore or Chennai. The ideal candidate will have strong expertise in Airflow, Control-M, Kubernetes, observability (OpenTelemetry), Python, and Bash scripting. The role involves managing critical data workflows, enhancing platform automation, and ensuring system reliability and scalability. Excellent communication skills and hands-on experience in stabilizing production environments are essential.

Posted 1 month ago

Apply

- 3 years

2 - 4 Lacs

Bengaluru

Work from Office

Naukri logo

Key Responsibilities:
Deliver engaging and interactive training sessions (24 hours total) based on structured modules.
Teach integration of monitoring, logging, and observability tools with machine learning.
Guide learners in real-time anomaly detection, incident management, root cause analysis, and predictive scaling.
Support learners in deploying tools like Prometheus, Grafana, OpenTelemetry, Neo4j, Falco, and KEDA.
Conduct hands-on labs using LangChain, Ollama, Prophet, and other AI/ML frameworks.
Help participants set up smart workflows for alert classification and routing using open-source stacks.
Prepare learners to handle security, threat detection, and runtime anomaly classification using LLMs.
Provide post-training support and mentorship when necessary.
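The anomaly-detection module above usually starts from a baseline simpler than any ML framework: flag points that sit far from the series mean. A stdlib-only sketch of that z-score idea; the latency data and threshold are illustrative, not from any of the tools named:

```python
import statistics

def zscore_anomalies(values, threshold=2.5):
    """Return indices of points more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # perfectly flat series: nothing can be anomalous
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]

# Steady latency series (ms) with one spike at index 6.
latencies = [102, 99, 101, 100, 98, 103, 250, 101, 100, 99]
print(zscore_anomalies(latencies))  # → [6]
```

Trainers typically contrast this static threshold with the adaptive baselines the ML-driven tools learn, which handle seasonality and drift that a fixed z-score cannot.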

Posted 1 month ago

Apply

0 years

35 - 45 Lacs

Bengaluru, Karnataka

Work from Office

Indeed logo

Java (11/8 or higher). We are looking for a candidate with 8 to 12 years of experience with observability tools (Prometheus, Grafana, OpenTelemetry, ELK, Jaeger, Zipkin, New Relic), Docker/Kubernetes, CI/CD pipelines, and architecture.
Job Type: Full-time
Pay: ₹3,500,000.00 - ₹4,500,000.00 per year
Schedule: Day shift
Work Location: In person

Posted 1 month ago

Apply

5 - 8 years

0 Lacs

Gurugram, Haryana, India

On-site

Linkedin logo

At Orange Business, we are currently undergoing a fundamental and transversal transformation initiative aimed at keeping us at the forefront of the industry. This role is pivotal in managing our current products to sustain revenue while simultaneously creating space for the development of new products that will pave the way for the future of the company. You will be part of a dynamic team that is driving innovation and shaping the next chapter of our organization. Our mission is to deliver outstanding digital experiences that amaze our customers, partners, and employees. As an IT Portfolio Owner, you will play a crucial role in maximizing the efficiency of our IT capacity while ensuring that we deliver exceptional value aligned with the portfolio budget. Are you ready to shape the future of technology while driving impactful change?

Mission
Your daily missions:
Leading large-scale transformation to build new/strategic platforms
Supporting the existing/legacy tools to keep the ongoing business operational
Leading cross-cultural, cross-functional teams of people to drive business goals
You will manage the IT budget, overseeing both CAPEX and OPEX to ensure financial efficiency.
You will be responsible for assessing and managing the portfolio capacity, including resources and competencies, to optimize project delivery.
You will validate the IT strategy and roadmap, ensuring alignment with Portfolio and Enterprise Architect recommendations.

Desired Profile
We are looking for a candidate with a minimum of 12+ years of experience in IT portfolio management or a related field.
A strong educational background in engineering is essential, as it will provide the technical foundation necessary to excel in this role. You should possess strong leadership skills to effectively drive both external and internal IT teams, ensuring proper prioritization within the portfolios. A results-oriented mindset is crucial, along with the ability to manage cross-functional/global teams and influence stakeholders with impact. Additionally, you should have the capability to build and maintain budgets and roadmaps, prepare executive reporting, and maintain a focus on key objectives.

Technical Skills
Deep expertise in service assurance transformation, particularly in monitoring and observability for enterprise networks.
Hands-on experience with full-stack observability platforms, AIOps, and telemetry solutions.
Strong understanding of cloud-native monitoring (e.g., Prometheus, Grafana, OpenTelemetry, ELK Stack).
Knowledge of automation frameworks for network and application monitoring.
Experience with machine learning-driven anomaly detection and predictive analytics for service assurance.
Familiarity with ITSM and incident management platforms (e.g., ServiceNow, Splunk ITSI).
Ability to drive integration of monitoring solutions across hybrid cloud and on-premise environments.
In-depth knowledge of B2B telecommunications, particularly in voice and collaboration products, is a must.
Proven capacity to manage a team effectively, fostering collaboration and productivity.
Strong financial analysis and budgeting skills to oversee CAPEX and OPEX.
Expertise in IT strategy development to align projects with business objectives.
Proficiency in risk management to identify and mitigate potential challenges.
SAFe understanding is required to ensure familiarity with agile frameworks.
Familiarity with portfolio management tools (e.g., Jira, Microsoft Project).
Soft Skills
You should possess exceptional analytical and problem-solving abilities that empower you to assess complex situations critically and make informed, strategic decisions that drive project success.
Outstanding communication and interpersonal skills are essential, allowing you to effectively collaborate with diverse teams and stakeholders at all levels of the organization, fostering a culture of open dialogue and transparency.
A strong team-oriented mindset with a collaborative approach to working with others.
Adaptability and flexibility in a fast-paced environment, enabling you to respond effectively to changing priorities.
Strategic thinking and decision-making skills to align IT initiatives with business goals.

Posted 1 month ago

Apply

2 - 4 years

0 Lacs

Hyderabad, Telangana, India

On-site

Linkedin logo

Job Title: Golang Developer
Location: Hyderabad
Experience: 2-4 Years

Job Summary
We are looking for a Golang Developer with 2-4 years of experience to join our dynamic team in Hyderabad. As a Golang Developer, you will be responsible for developing high-performance, scalable applications and services. This role requires expertise in Golang along with strong problem-solving skills and the ability to work collaboratively in a fast-paced environment. The role is office-based only, and we require full-time commitment from the office in Hyderabad.

Key Responsibilities
Application Development: Design, develop, and maintain highly scalable applications using Golang.
Code Optimization & Performance Tuning: Continuously optimize code for performance, scalability, and reliability. Conduct code reviews and ensure the highest standards of code quality are maintained.
System Architecture & Design: Design and develop backend systems, services, and APIs that meet functional and non-functional requirements. Contribute to system architecture discussions and help implement robust, scalable solutions.
Troubleshooting & Debugging: Diagnose and resolve technical issues and bottlenecks in applications. Use debugging and profiling tools to identify issues and optimize system performance.
Collaboration & Documentation: Work closely with front-end developers, QA, and product teams to deliver seamless applications. Write clear and concise technical documentation for features and systems.
Agile Development: Work in an Agile development environment, participating in sprints, planning, and retrospectives. Deliver high-quality code on time, ensuring proper testing and integration.

Qualifications
Education: Bachelor’s degree in Computer Science, Information Technology, or a related field.
Certifications: Certification in Golang is a plus (but not mandatory).

Skills & Expertise
Must-Have Technical Skills:
Programming Languages: Proficiency in Golang.
Concurrency: Strong understanding of concurrency models and multi-threading in Golang.
API & RPC Development: Experience with building and maintaining RESTful APIs, along with a strong understanding of gRPC and RPC-based service communication.
Database Systems: Experience with SQL/NoSQL databases (PostgreSQL, MongoDB, etc.).
Version Control: Proficiency in using Git for version control and collaboration.
Testing: Knowledge of writing unit and integration tests.
Problem-Solving: Strong analytical and problem-solving skills.

Soft Skills
Strong analytical and problem-solving skills.
Ability to work independently and collaboratively in a team.
Good communication skills for reporting issues and progress.
Time management and the ability to meet deadlines.

Good To Have Skills
Experience with cloud platforms like AWS, Azure, or Google Cloud.
Knowledge of containerization tools like Docker and container orchestration with Kubernetes.
Experience with CI/CD pipelines and DevOps practices.
Familiarity with microservices architecture.
Event-driven systems: Experience with Kafka/NATS for messaging.
Monitoring & Observability: Experience with Grafana and OpenTelemetry for tracing and monitoring.

Work Experience
2-4 years of professional experience in software development, specifically in Golang.
Proven track record of building high-performance, scalable applications and services.
Experience with building backend systems and APIs.

Compensation & Benefits
Competitive salary and annual performance-based bonuses.
Comprehensive health and optional parental insurance.
Retirement savings plans and tax saving plans.
Work-Life Balance: Flexible work hours.
Professional Development: Opportunities for certifications, workshops, and conferences.

Key Result Areas (KRAs)
High-Performance Code Delivery - Deliver clean, efficient, and scalable code, focusing on the performance of applications.
System Optimization - Optimize backend systems for performance and scalability, especially in high-traffic environments.
Cross-Team Collaboration - Contribute to collaborative efforts across teams to ensure the successful delivery of new features.
Mentoring & Knowledge Sharing - Act as a mentor for junior developers, helping them improve their skills and knowledge of Golang and Rust development.

Key Performance Indicators (KPIs)
Code Quality & Efficiency - Percentage of bugs/defects reported post-deployment, and code quality score based on peer reviews and static code analysis tools.
Performance & Scalability - System performance improvement (e.g., response time, throughput) post-optimization, and load testing results for developed features and applications.
Timely Delivery - Percentage of tasks delivered on time as per sprint timelines, and number of sprints completed successfully with zero delays.
Collaboration & Communication - Positive feedback from peers and cross-functional teams, and number of successfully closed cross-team issues or blockers.

Contact: hr@bigtappanalytics.com

Posted 1 month ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies