Principal Engineer

10 years

0 Lacs

Posted: 1 week ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

You will manage a talented team of data scientists and AI engineers, driving the adoption of intelligent automation, predictive analytics, and proactive problem resolution across our complex IT landscape. This position requires a leader with a deep understanding of both data science principles and the intricacies of enterprise IT operations.

Responsibilities

  • Strategic Leadership: Define and execute the AIOps strategy and roadmap, aligning it with overall IT and business objectives. Identify opportunities to leverage AI/ML for enhanced IT observability, incident management, performance optimization, and automation.
  • Team Management & Development: Lead, mentor, and grow a high-performing team of data scientists and AI engineers. Foster a culture of innovation, continuous learning, and technical excellence.
  • Solution Design & Development: Oversee the end-to-end design, development, and deployment of AIOps solutions, including anomaly detection, predictive failure analysis, root cause analysis, intelligent alerting, and automated remediation.
  • Cross-Functional Collaboration: Partner closely with IT Operations, Site Reliability Engineering (SRE), Network Engineering, Application Development, and other stakeholders to understand operational challenges and deliver impactful AI-driven solutions.
  • Data & Platform Management: Ensure the availability, quality, and governance of operational data necessary for AI/ML model training and inference. Drive the selection, integration, and optimization of AIOps platforms and tools.
  • Model Lifecycle Management: Establish robust MLOps practices for model development, testing, deployment, monitoring, and retraining to ensure the continuous effectiveness and reliability of AI models in production.
  • Innovation & Research: Stay abreast of the latest advancements in AI/ML, AIOps, and IT operations. Drive research and experimentation to explore new techniques and technologies that can further enhance our operational intelligence.
  • Performance & Metrics: Define key performance indicators (KPIs) for AIOps initiatives and regularly report on the impact and value delivered to the organization.
  • Budget & Resource Management: Manage project budgets, resources, and timelines effectively to ensure successful delivery of AIOps programs.

Qualifications

Required Qualifications:

  • Bachelor's or Master's degree in Computer Science, Data Science, Artificial Intelligence, Engineering, or a related quantitative field.
  • 10+ years of progressive experience in data science, machine learning, and/or AI engineering.
  • 5+ years of experience in a leadership or management role, leading technical teams focused on data science or AI.
  • Proven experience in designing, developing, and deploying AI/ML models for real-world applications, particularly within IT operations or related domains (e.g., observability, security, infrastructure management).
  • Strong understanding of IT operations concepts, including monitoring, alerting, incident management, change management, and IT service management (ITSM).
  • Proficiency in programming languages commonly used in data science and AI (e.g., Python, Scala, Java).
  • Hands-on experience with big data technologies (e.g., Spark, Hadoop, Kafka) and cloud platforms (AWS, Azure, GCP).
  • Solid grasp of machine learning approaches (e.g., supervised learning, unsupervised learning, deep learning) and statistical modeling.
  • Excellent communication, interpersonal, and leadership skills with the ability to articulate complex technical concepts to non-technical stakeholders.
  • Demonstrated ability to drive strategic initiatives, manage complex projects, and deliver results in a fast-paced environment.

Preferred Qualifications:

  • Experience with specific AIOps platforms or tools (e.g., Splunk, Dynatrace, Moogsoft, PagerDuty, ServiceNow, Datadog, ELK stack).
  • Familiarity with IT service management frameworks (e.g., ITIL).
  • Experience with containerization (Docker, Kubernetes) and microservices architectures.
  • Knowledge of MLOps best practices and tools for automating and managing the ML lifecycle.
  • Experience in a large-scale enterprise environment with diverse and complex IT infrastructure.
