Backend & MLOps Engineer


Posted: 1 day ago | Platform: LinkedIn


Work Mode: On-site

Job Type: Full Time

Job Description

Backend & MLOps Engineer – Integration, API, and Infrastructure Expert


1. Role Objective:

Responsible for building robust backend infrastructure, managing ML operations, and creating scalable APIs for AI applications. Must excel in deploying and maintaining AI products in production environments with high availability and security standards. The engineer will be expected to build secure, scalable backend systems that integrate AI models into services (REST, gRPC), manage data pipelines, enable model versioning, and deploy containerized applications in secure (air-gapped) Naval infrastructure.


2. Key Responsibilities:


2.1. Create RESTful and/or gRPC APIs for model services.


2.2. Containerize AI applications and maintain Kubernetes-compatible Docker images.


2.3. Develop CI/CD pipelines for model training and deployment.


2.4. Integrate models as microservices using TorchServe, Triton, or FastAPI (a minimal FastAPI serving sketch follows this list).


2.5. Implement observability (metrics, logs, alerts) for deployed AI pipelines.


2.6. Build secured data ingestion and processing workflows (ETL/ELT).


2.7. Optimize deployments for CPU/GPU performance, power efficiency, and memory usage.
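
To make responsibilities 2.1, 2.2, and 2.4 concrete, the sketch below wraps a serialized ONNX model in a minimal FastAPI service with a health probe suitable for a Kubernetes liveness check. The model path, request schema, and endpoint names are illustrative assumptions, not part of this specification.

# Minimal sketch of an AI model exposed as a REST microservice (2.1, 2.4).
# Model path, input shape, and endpoint names are illustrative assumptions.
from contextlib import asynccontextmanager

import numpy as np
import onnxruntime as ort
from fastapi import FastAPI
from pydantic import BaseModel


class PredictRequest(BaseModel):
    features: list[float]  # flat feature vector; the shape is an assumption


class PredictResponse(BaseModel):
    prediction: list[float]


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the ONNX model once at startup rather than on every request.
    app.state.session = ort.InferenceSession("model.onnx")  # hypothetical path
    yield


app = FastAPI(title="model-service", lifespan=lifespan)


@app.get("/healthz")
def health() -> dict:
    # Liveness/readiness probe target for containerized deployments (2.2).
    return {"status": "ok"}


@app.post("/v1/predict", response_model=PredictResponse)
def predict(req: PredictRequest) -> PredictResponse:
    session = app.state.session
    input_name = session.get_inputs()[0].name
    batch = np.asarray([req.features], dtype=np.float32)
    output = session.run(None, {input_name: batch})[0]  # assumes a (batch, outputs) result
    return PredictResponse(prediction=output[0].tolist())

In production such a service would typically sit behind TorchServe or Triton for batching and GPU scheduling, with metrics exported to the observability stack described in 2.5.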


3. Educational Qualifications:


Essential Requirements:


3.1. B.Tech/M.Tech in Computer Science, Information Technology, or Software Engineering.


3.2. Strong foundation in distributed systems, databases, and cloud computing.


3.3. Minimum 70% marks or 7.5 CGPA in relevant disciplines.


Professional Certifications:

3.4. AWS Solutions Architect/DevOps Engineer Professional


3.5. Google Cloud Professional ML Engineer or DevOps Engineer


3.6. Azure AI Engineer or DevOps Engineer Expert.


3.7. Kubernetes Administrator (CKA) or Developer (CKAD).


3.8. Docker Certified Associate


Core Skills & Tools


4. Backend Development:


4.1. Languages: Python, Go, Java, JavaScript/TypeScript (Node.js), Rust (for performance-critical components).


4.2. Web Frameworks: FastAPI, Django, Flask, Spring Boot, Express.js.


4.3. API Development: RESTful APIs, GraphQL, gRPC, WebSocket connections.


4.4. Authentication & Security: OAuth 2.0, JWT, API rate limiting, encryption protocols (a minimal JWT-protection sketch follows this list).
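
As a concrete illustration of item 4.4, the hedged sketch below protects a FastAPI route with JWT bearer authentication and a simple role check; the secret, claim names, and route are assumptions for illustration only.

# Sketch of JWT-protected API access (4.4) with a role check (see 8.4).
# Secret, claims, and route names are illustrative assumptions.
import jwt  # PyJWT
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
bearer = HTTPBearer()
JWT_SECRET = "change-me"  # assumption: injected from a secret store in practice


def current_user(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> dict:
    try:
        # Verify signature and expiry before trusting any claim.
        return jwt.decode(creds.credentials, JWT_SECRET, algorithms=["HS256"])
    except jwt.PyJWTError:
        raise HTTPException(status_code=401, detail="invalid or expired token")


@app.get("/v1/models")
def list_models(user: dict = Depends(current_user)) -> dict:
    if "mlops" not in user.get("roles", []):  # hypothetical role claim
        raise HTTPException(status_code=403, detail="insufficient role")
    return {"models": ["detector-v1", "classifier-v2"]}  # placeholder names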


5. MLOps & Model Management:


5.1. ML Platforms: MLflow, Kubeflow, Apache Airflow, Prefect (a minimal MLflow tracking sketch follows this list)


5.2. Model Serving: TensorFlow Serving, TorchServe, ONNX Runtime, NVIDIA Triton, BentoML


5.3. Experiment Tracking: Weights & Biases, Neptune, ClearML


5.4. Feature Stores: Feast, Tecton, Amazon SageMaker Feature Store


5.5. Model Monitoring: Evidently AI, Arize, Fiddler, custom monitoring solutions
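
As a small illustration of the tooling in 5.1 and 5.3, the sketch below logs a training run's parameters and metrics with MLflow; the experiment, parameter, and metric names are placeholders.

# Sketch of experiment tracking with MLflow (5.1, 5.3). Names are placeholders.
import mlflow

mlflow.set_experiment("classifier-baseline")  # hypothetical experiment name

with mlflow.start_run():
    mlflow.log_param("learning_rate", 1e-3)
    mlflow.log_param("batch_size", 64)
    mlflow.log_metric("val_accuracy", 0.93)
    # A trained model would typically be logged and registered here as well,
    # e.g. via an mlflow model flavor, then promoted through the model
    # registry for versioned serving (see section 2).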


6. Infrastructure & DevOps:


6.1. Containerization: Docker, Podman, container optimization.


6.2. Orchestration: Kubernetes, Docker Swarm, OpenShift.


6.3. Cloud Platforms: AWS, Google Cloud, Azure (multi-cloud expertise preferred).


6.4. Infrastructure as Code: Terraform, CloudFormation, Pulumi, Ansible.


6.5. CI/CD: Jenkins, GitLab CI, GitHub Actions, ArgoCD.


6.6. Reverse proxy & load balancing: NGINX.


7. Database & Storage:


7.1. Relational: PostgreSQL, MySQL, Oracle (for enterprise applications)


7.2. NoSQL: MongoDB, Cassandra, Redis, Elasticsearch


7.3. Vector Databases: Pinecone, Weaviate, Chroma, Milvus (a minimal Chroma sketch follows this list)


7.4. Data Lakes & Big Data Processing: Apache Spark, Hadoop, Delta Lake, Apache Iceberg


7.5. Object Storage: AWS S3, Google Cloud Storage, MinIO
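
As an illustration of item 7.3, the sketch below stores and queries embeddings with Chroma running in-process; the collection name, documents, and embeddings are toy assumptions.

# Sketch of vector similarity search with Chroma (7.3). Data is illustrative.
import chromadb

client = chromadb.Client()  # in-memory client; a persistent store would be used in production
collection = client.create_collection("docs")

collection.add(
    ids=["d1", "d2"],
    embeddings=[[0.1, 0.2, 0.3], [0.9, 0.8, 0.7]],  # toy 3-dimensional embeddings
    documents=["maintenance manual excerpt", "sensor log summary"],
)

result = collection.query(query_embeddings=[[0.1, 0.2, 0.25]], n_results=1)
print(result["documents"])  # nearest-neighbour document(s)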




8. Secure Deployment:


8.1. Military-grade security protocols and compliance


8.2. Air-gapped deployment capabilities


8.3. Encrypted data transmission and storage


8.4. Role-based access control (RBAC) & IDAM integration


8.5. Audit logging and compliance reporting (a minimal structured audit-log sketch follows this list)
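
As a hedged illustration of item 8.5, the sketch below emits one JSON audit record per event through the standard logging module; the field names and log sink are assumptions.

# Sketch of structured audit logging (8.5). Field names and sink are assumptions.
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("audit")
audit_logger.setLevel(logging.INFO)
audit_logger.addHandler(logging.FileHandler("audit.log"))  # hypothetical sink


def audit(user: str, action: str, resource: str, allowed: bool) -> None:
    # One JSON object per line keeps the trail machine-parseable for reviews.
    audit_logger.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "allowed": allowed,
    }))


audit("analyst01", "model:predict", "detector-v1", True)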


9. Edge Computing:


9.1. Deployment on naval vessels with air-gapped connectivity.


9.2. Optimization of applications for resource-constrained environments (a minimal model-quantization sketch follows this list).
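
As one example of the optimizations implied by 2.7 and 9.2, the sketch below applies dynamic INT8 quantization to an ONNX model with ONNX Runtime, which typically shrinks the model and reduces CPU and memory cost at some accuracy trade-off; the file names are assumptions.

# Sketch of shrinking a model for resource-constrained edge targets (9.2).
# Input and output paths are hypothetical.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="model.onnx",        # full-precision model (hypothetical path)
    model_output="model.int8.onnx",  # dynamically quantized INT8 weights
    weight_type=QuantType.QInt8,
)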


10. High Availability Systems:


10.1. Mission-critical system design with 99.9% uptime.


10.2. Disaster recovery and backup strategies.


10.3. Load balancing and auto-scaling.


10.4. Failover mechanisms for critical operations.


11. Cross-Compatibility Requirements:


11.1. Define and expose APIs in a documented, frontend-consumable format (Swagger/OpenAPI).


11.2. Develop model loaders for the AI Engineer's ONNX and other serialized models.


11.3. Provide UI developers with test environments, mock data, and endpoints (a minimal mock-endpoint sketch follows this list).


11.4. Support frontend debugging, edge deployment bundling, and user role enforcement.
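
To illustrate items 11.1 and 11.3, the sketch below stands up a mock prediction endpoint with a canned response and exercises it through FastAPI's test client; FastAPI also publishes the OpenAPI document automatically. The paths and payloads are illustrative assumptions.

# Sketch of a mock endpoint for frontend integration (11.1, 11.3).
# Paths, payloads, and the canned response are illustrative assumptions.
from fastapi import FastAPI
from fastapi.testclient import TestClient

mock_app = FastAPI(title="mock-model-service")


@mock_app.post("/v1/predict")
def mock_predict(payload: dict) -> dict:
    # Deterministic canned response so UI tests are reproducible offline.
    return {"prediction": [0.1, 0.9], "model_version": "mock-0.0.1"}


# The interactive docs (/docs) and OpenAPI schema (/openapi.json) are
# generated automatically, covering the documented API format in 11.1.
client = TestClient(mock_app)
print(client.post("/v1/predict", json={"features": [1.0, 2.0]}).json())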


12. Experience Requirements:


12.1. Production experience with cloud platforms and containerization.


12.2. Experience building and maintaining APIs serving millions of requests.


12.3. Knowledge of database optimization and performance tuning.


12.4. Experience with monitoring and alerting systems.


12.5. Architected and deployed large-scale distributed systems.


12.6. Led infrastructure migration or modernization projects.


12.7. Experience with multi-region deployments and disaster recovery.


12.8. Track record of optimizing system performance and cost.
