Position Title: Head of Artificial Intelligence (AI Head)
Location: Bengaluru / Hyderabad / Pune (or Pan-India)
Employment Type: Full-Time | Senior Leadership
Reporting To: CTO / CEO / Managing Director
Department: Artificial Intelligence & Data Science
Role Purpose
The Head of Artificial Intelligence (AI Head) will be responsible for defining, building, and implementing AI-driven solutions across the enterprise to improve operational efficiency, decision-making, customer experience, and revenue growth. This role will lead AI strategy, use-case identification, model development, deployment, and governance, ensuring scalable, ethical, and compliant AI adoption across business functions such as retail, supply chain, manufacturing, sales, marketing, finance, and customer service.
Key Responsibilities & Accountabilities
AI Strategy & Roadmap
- Define and execute the enterprise AI strategy and roadmap aligned with business objectives and digital transformation goals.
- Identify high-impact AI/ML use cases across functions (forecasting, optimization, automation, personalization, fraud detection, predictive maintenance, etc.).
- Prioritize initiatives based on business value, feasibility, and ROI.
AI Solution Development & Deployment
- Lead the end-to-end AI solution lifecycle: problem definition, data preparation, feature engineering, model development, testing, deployment, and monitoring.
- Design and implement ML, Deep Learning, NLP, Computer Vision, and Generative AI solutions using Python and industry-standard frameworks and libraries (e.g., TensorFlow, PyTorch, scikit-learn, XGBoost, LightGBM).
- Apply hands-on expertise in NLP toolkits and transformer models (Hugging Face Transformers, spaCy, BERT, GPT-family, LLaMA), embeddings, prompt engineering, and retrieval-augmented generation (RAG) to deliver LLM-based solutions.
- Apply computer vision expertise with OpenCV, Detectron2, EfficientDet, YOLO, and CNN/transformer-based architectures to image and video tasks.
- Ensure seamless integration of AI models with ERP, CRM, WMS, OMS, e-commerce platforms, and data warehouses, and expose ML capabilities via REST/gRPC APIs or microservices.
- Champion production readiness: containerization (Docker), orchestration (Kubernetes), API design, load testing, latency optimization, and edge/ONNX/TensorRT deployments where applicable.
Data, MLOps & Platforms
- Define the AI architecture, including data pipelines, model training environments, cloud platforms, feature stores, and MLOps frameworks, and provide clear patterns for training, validation, and production serving.
- Data engineering and pipeline examples: Apache Spark (batch), Kafka (streaming), Debezium (change data capture), Flink (stream processing), Airflow / Argo Workflows (orchestration), and ETL/ELT with tools like dbt.
- Feature store and data quality: Feast, Tecton; data validation and testing with Great Expectations and data monitors.
- MLOps platforms and practices: experiment tracking (Weights & Biases, MLflow, Neptune), model registry and versioning (MLflow Model Registry, DVC), CI/CD for ML (Jenkins, GitHub Actions, GitLab CI, Tekton), workflow orchestration (Kubeflow Pipelines, Argo, Prefect), and reproducible pipelines (Pachyderm).
- Model serving & inference: KServe, Seldon Core, BentoML, TorchServe, TensorFlow Serving; low-latency inference optimizations using ONNX, TensorRT, NVIDIA Triton, and edge deployments where needed.
- Distributed training & acceleration: PyTorch DDP, Horovod, DeepSpeed, TensorFlow MultiWorkerMirroredStrategy; GPU/TPU provisioning and management on cloud or on-prem clusters.
- Cloud AI platforms & infra automation: AWS (SageMaker, EKS, ECS, S3, RDS), GCP (Vertex AI, GKE, BigQuery), Azure (Azure ML, AKS); infrastructure as code using Terraform, CloudFormation, Helm charts.
- Observability, monitoring & model governance: Prometheus, Grafana, ELK stack, Evidently, WhyLabs, Fiddler for model monitoring, drift detection, data quality alerts, and automated retraining triggers.
- Vector databases & retrieval stores for embeddings: Pinecone, Milvus, Weaviate, Qdrant; typical use with RAG pipelines and semantic search architectures.
- Security, compliance & cost optimization: secrets management (HashiCorp Vault, cloud KMS), network policies, IAM best practices, and cost monitoring/rightsizing for GPUs & storage.
- Typical production ML workflow (example pattern): data ingestion (Kafka) -> ETL & feature engineering (Spark + dbt) -> experiments (PyTorch/TensorFlow + W&B) -> model registry (MLflow) -> CI/CD (GitHub Actions + Argo) -> serving (Seldon/BentoML on Kubernetes) -> monitoring (Prometheus + Evidently) -> retraining pipeline (Airflow/Kubeflow).
Business Enablement & Adoption
- Partner with business leaders to translate business problems into AI-driven solutions.
- Drive AI adoption and change management, ensuring users trust and effectively use AI outputs.
- Enable AI-powered dashboards, decision-support tools, and automation systems.
Governance, Ethics & Compliance
- Establish AI governance frameworks, including data privacy, explainability, bias mitigation, and ethical AI principles.
- Ensure compliance with applicable data protection laws, sector regulations (e.g., healthcare/pharma), and relevant ISO standards.
- Work closely with Information Security teams to ensure secure and compliant AI implementations.
Team Leadership & Capability Building
- Build and lead high-performing AI, Data Science, and ML Engineering teams (onshore/offshore).
- Define skill frameworks, hiring plans, training programs, and succession planning.
- Foster a culture of innovation, experimentation, and continuous learning.
Vendor & Partner Management
- Evaluate and manage AI vendors, platforms, cloud providers, and system integrators.
- Negotiate SLAs, performance metrics, and commercial terms.
- Ensure optimal cost-to-value outcomes for AI investments.
Key Performance Indicators (KPIs)
- Number of AI use cases deployed to production
- Business value and ROI delivered through AI initiatives
- Model accuracy, stability, and adoption rates
- Time-to-deploy AI solutions
- Compliance, security, and audit outcomes
- AI platform scalability and cost optimization
Qualifications
Required Qualifications (Must-haves)
- Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related field.
- 10+ years of overall professional experience, including a minimum of 6 years in AI/ML leadership roles managing cross-functional teams and delivering enterprise solutions.
- Proven track record of designing and implementing enterprise-scale AI/ML solutions that delivered measurable business impact (examples: forecasting, personalization, automation, predictive maintenance, fraud detection).
- Strong hands-on experience with Python and SQL, plus working knowledge of scikit-learn and at least one ML/DL framework such as TensorFlow or PyTorch.
- Operational experience deploying models to production using containerization and orchestration (Docker, Kubernetes) and exposing models via APIs/microservices.
- Familiarity with at least one major cloud provider (AWS, GCP, or Azure) and its ML services, with practical experience in model training and inference on cloud infrastructure.
- Demonstrated expertise in MLOps practices (CI/CD for ML, model versioning/registry, experiment tracking, monitoring) and the ability to implement robust production pipelines.
- Strong software engineering discipline: version control (Git), testing, code review practices, and API design.
- Proven ability to translate business problems into AI solutions and to communicate with senior stakeholders and cross-functional teams.
Preferred Qualifications (Optional / Nice-to-have)
- Master’s degree or PhD in AI, Data Science, Machine Learning, or Analytics.
- Experience with big data and data engineering technologies (Apache Spark, Kafka, Flink), cloud data warehouses (Snowflake, Redshift, BigQuery) and orchestration tools (Airflow).
- Hands-on familiarity with MLOps/platform tooling: MLflow, DVC, Kubeflow, Argo, Seldon, WhyLabs; infrastructure automation with Terraform and Helm.
- Experience with advanced ML topics: distributed training (GPU/TPU), ONNX/TensorRT optimizations, latency and cost optimization techniques.
- Practical experience with NLP/LLMs (Hugging Face, transformer models), embeddings, RAG architectures, vector databases (Milvus, Pinecone, Weaviate) and prompt engineering for production use cases.
- Experience in computer vision solutions (OpenCV, Detectron2, YOLO, ViT) where applicable.
- Knowledge of model interpretability and governance tools/methods (SHAP, LIME), bias detection & mitigation approaches, and privacy-preserving techniques (differential privacy, anonymization).
- Domain experience in healthcare/pharma, retail, manufacturing, BFSI, or supply chain is a plus.
- Track record of publications, patents, or speaking at conferences is advantageous.
Technical Skills & Keywords
This section groups concrete technical keywords and tooling to aid search and candidate matching; the examples are indicative of the expected stacks.
- Languages & Core: Python (primary), SQL, Bash; secondary: R, Scala, Java
- ML / DL Frameworks: PyTorch, TensorFlow, JAX, scikit-learn, XGBoost, LightGBM, CatBoost
- NLP & LLMs: Hugging Face Transformers, spaCy, BERT, GPT-family, LLaMA, SentenceTransformers, tokenizers, prompt engineering, RAG patterns
- Computer Vision: OpenCV, Detectron2, YOLO family, EfficientDet, Vision Transformers (ViT)
- Data & Big Data: Apache Spark, Kafka, Flink, Hive, Presto, dbt; cloud warehouses: Snowflake, Redshift, BigQuery
- Feature Stores & Data Quality: Feast, Tecton, Great Expectations
- MLOps & Experimentation: MLflow, Weights & Biases, Neptune, DVC, Guild AI; experiment tracking and model registry tools
- Orchestration & Pipelines: Airflow, Kubeflow Pipelines, Argo Workflows, Prefect
- Model Serving & Inference Platforms: Seldon Core, KServe, BentoML, TorchServe, TensorFlow Serving, NVIDIA Triton
- Vector & Embedding Stores: Pinecone, Milvus, Weaviate, Qdrant
- Containerization & Orchestration: Docker, Kubernetes (EKS/GKE/AKS), Helm
- Infra Automation & CI/CD: Terraform, CloudFormation, Jenkins, GitHub Actions, GitLab CI, Tekton
- Distributed Training & Acceleration: PyTorch DDP, Horovod, DeepSpeed; GPU/TPU management
- Optimization & Model Formats: ONNX, TensorRT, quantization, pruning
- Monitoring & Observability: Prometheus, Grafana, ELK, Evidently, WhyLabs, Fiddler
- Datastores & Caching: PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch
- Security & Governance: HashiCorp Vault, cloud KMS, IAM best practices, privacy-preserving tooling
- Testing & Quality: pytest, tox, Great Expectations (data validation), integration and end-to-end ML tests
- Example production ML stack: Python + PyTorch/TensorFlow, feature store (Feast), experiment tracking (W&B/MLflow), model registry (MLflow), CI/CD (GitHub Actions + Argo), serving (Seldon/BentoML on Kubernetes), observability (Prometheus + Grafana + Evidently), vector DB (Pinecone) for RAG scenarios.
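As a minimal sketch of how the observability step in an example stack like the one above could feed a retraining trigger, the Python below computes a Population Stability Index (PSI) for a single numeric feature. The function names, the equal-width binning, and the 0.2 threshold are illustrative assumptions rather than requirements of this role; in practice a monitoring tool such as Evidently or WhyLabs would typically supply drift metrics.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference (training) sample and a current (production) sample.

    Bins are equal-width over the reference range; values outside that range
    are clamped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps keeps empty bins from causing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def should_retrain(psi: float, threshold: float = 0.2) -> bool:
    """Hypothetical trigger: 0.2 is a common rule of thumb, not a standard."""
    return psi > threshold


if __name__ == "__main__":
    training_sample = [i / 100 for i in range(100)]
    production_sample = [0.5 + i / 200 for i in range(100)]  # shifted upward
    psi = population_stability_index(training_sample, production_sample)
    print(f"PSI={psi:.3f}, retrain={should_retrain(psi)}")
```

A scheduled job (for example an Airflow task) could run a check like this per feature and start the retraining pipeline only when the threshold is exceeded.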
Desired Competencies
- Strong business-first mindset with ability to translate AI into measurable outcomes
- Strategic thinking with hands-on technical depth
- Excellent stakeholder management and executive communication
- High ethical standards and governance orientation
- Ability to operate in fast-paced, transformation-led environments