Senior DevOps Engineer

4 - 9 years

5 - 15 Lacs

Posted: 19 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Job Title: DevOps Engineer

Location: Ahmedabad / Remote (Hybrid flexibility)

Department: Engineering & Infrastructure

Reports To: CTO / Head of Software Development

________________________________________

About Omnidya Tech LLP

Omnidya is building India's first advanced AI-powered dashcam ecosystem for fleet management, safety analytics, and smart transportation. Our platform fuses edge AI processing (ADAS, DMS, ANPR, telematics) with secure cloud connectivity (AWS IoT, S3, MQTT, and real-time streaming).

We are seeking a DevOps Engineer to scale our infrastructure, automate build and deployment pipelines, and manage GPU-based AI compute clusters both on-premise and in the cloud.

________________________________________

Role Overview

As a DevOps Engineer, you will play a crucial role in automating deployments, managing distributed edge-cloud systems, and maintaining our GPU training and inference environments. You’ll work closely with the AI, firmware, and backend teams to ensure smooth CI/CD workflows, optimal GPU utilization, and high system reliability.

________________________________________

Key Responsibilities

CI/CD & Automation

  • Design, build, and maintain CI/CD pipelines using GitLab CI, Jenkins, or GitHub Actions for backend, AI, and firmware builds.
  • Automate testing and deployment for Yocto-based embedded systems (i.MX8 platforms).
  • Create Docker containers and deployment scripts for AI inference and cloud microservices.

Cloud & Infrastructure Management

  • Manage and scale AWS infrastructure (IoT Core, EC2, ECR, CloudWatch, Lambda, Route 53).
  • Set up and maintain Terraform or CloudFormation for Infrastructure as Code (IaC).
  • Implement robust monitoring, alerting, and log aggregation using Prometheus, Grafana, ELK, or CloudWatch.

GPU Rack & Compute Cluster Management

  • Manage on-premise GPU servers / AI training racks (Ubuntu-based, multi-GPU systems).
  • Configure, optimize, and monitor GPU utilization for PyTorch / TensorFlow workloads.
  • Handle CUDA driver updates, containerized training environments, and model deployment pipelines.
  • Automate job scheduling using Slurm, Docker Swarm, or Kubernetes for GPU workloads.
  • Monitor performance metrics (GPU load, memory, thermals, power usage) to ensure stable training and inference operations.

Device Integration & Fleet Management

  • Streamline OTA (Over-The-Air) update pipelines for connected edge devices.
  • Manage provisioning, authentication, and status monitoring of thousands of IoT devices.
  • Ensure robust MQTT, REST API, and video data sync between dashcams and the cloud.

Security & Compliance

  • Implement AWS IAM policies, TLS/SSL certificates, and secure OTA mechanisms.
  • Collaborate on device and cloud-level security hardening for regulatory compliance (BIS, ICAT).

Documentation & Collaboration

  • Document automation flows, deployment topologies, and infrastructure standards.
  • Collaborate with AI, embedded, and backend teams to align deployment processes across systems.

________________________________________

Required Skills & Experience

Experience

  • 3–7 years of experience in DevOps, Cloud Infrastructure, or Site Reliability Engineering.

Technical Skills

  • Linux system administration (Ubuntu, Yocto, Debian)
  • Containerization: Docker, Podman, Kubernetes (preferably K3s / MicroK8s)
  • CI/CD Tools: GitLab CI, Jenkins, GitHub Actions
  • Cloud Platforms: AWS (EC2, IoT Core, S3, Lambda, CloudWatch)
  • IaC: Terraform, CloudFormation
  • Monitoring: Prometheus, Grafana, ELK Stack
  • Networking: VPN, DNS, load balancing, NAT, SSL certificates
  • GPU Systems:
      • Hands-on with NVIDIA GPU drivers, CUDA, cuDNN, TensorRT
      • Experience with GPU workload management, thermal/power profiling, and optimization
      • Familiarity with multi-GPU training, inference scaling, and model deployment

Bonus Skills

  • Experience with embedded Linux (Yocto, NXP i.MX8)
  • Understanding of RTMP/FLV streaming pipelines or GStreamer
  • Familiarity with Python microservices (FastAPI / Flask)
  • Knowledge of AI/ML model lifecycle management (training → quantization → edge inference)

________________________________________

Soft Skills

  • Strong analytical and problem-solving mindset.
  • Excellent communication and cross-functional collaboration.
  • Passion for automation, reliability, and scalability.
  • Ability to work independently in a fast-paced startup environment.

________________________________________

What We Offer

  • Competitive salary and performance-based bonuses.
  • Opportunity to work on advanced edge-AI and GPU infrastructure projects.
  • Exposure to AWS, IoT, AI training clusters, and fleet-scale deployment systems.
  • Hybrid work setup and rapid growth opportunities in a high-impact product team.
