Posted: 21 hours ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

You will work with diverse datasets, from structured logs to unstructured events, to build intelligent systems for event correlation, root cause analysis, predictive maintenance, and autonomous remediation, ultimately driving significant operational efficiencies and improving service availability. This position requires a blend of deep technical expertise in machine learning, a strong understanding of IT operations, and a commitment to operationalizing AI solutions at scale.

Responsibilities

As a Senior Data Scientist, your responsibilities will include, but are not limited to:
  • Machine Learning Solution Development:
    • Design, develop, and implement advanced machine learning models (supervised and unsupervised) to solve complex IT Operations problems, including Event Correlation, Anomaly Detection, Root Cause Analysis, Predictive Analytics, and Auto-Remediation.
    • Leverage structured and unstructured datasets, performing extensive feature engineering and data preprocessing to optimize model performance.
    • Apply strong statistical modeling, hypothesis testing, and experimental design principles to ensure rigorous model validation and reliable insights.
  • AI/ML Product & Platform Development:
    • Lead the end-to-end development of Data Science products, from conceptualization and prototyping to deployment and maintenance.
    • Develop and deploy AI agents for automating workflows in IT operations, particularly within the network and cybersecurity domains.
    • Implement Retrieval-Augmented Generation (RAG) frameworks for state-of-the-art models to enhance contextual understanding and response generation.
    • Use AI to detect and redact sensitive data in logs, and implement centralized data tagging for all logs to improve AI model performance and governance.
  • MLOps & Deployment:
    • Drive the operationalization of machine learning models through robust MLOps/LLMOps practices, ensuring scalability, reliability, and maintainability.
    • Implement models as a service via APIs, utilizing containerization technologies (Docker, Kubernetes) for efficient deployment and management.
    • Design, build, and automate resilient data pipelines in cloud environments (GCP/Azure) using AI agents and relevant cloud services.
  • Cloud & DevOps Integration:
    • Integrate data science solutions with existing IT infrastructure and AIOps platforms (e.g., IBM Cloud Paks, Moogsoft, BigPanda, Dynatrace).
    • Enable and optimize AIOps features within Data Analytics tools, Monitoring tools, or dedicated AIOps platforms.
    • Champion DevOps practices, including CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions), infrastructure-as-code (Terraform, Ansible, CloudFormation), and automation to streamline development and deployment workflows.
  • Performance & Reliability:
    • Monitor and optimize platform performance, ensuring systems are running efficiently and meeting defined Service Level Agreements (SLAs).
    • Lead incident management efforts related to data science systems and implement continuous improvements to enhance reliability and resilience.
  • Leadership & Collaboration:
    • Translate complex business problems into data science solutions, understanding their strategic implications and potential business value.
    • Collaborate effectively with cross-functional teams including engineering, product management, and operations to define project scope, requirements, and success metrics.
    • Mentor junior data scientists and engineers, fostering a culture of technical excellence, continuous learning, and innovation.
    • Clearly articulate complex technical concepts, findings, and recommendations to both technical and non-technical audiences, influencing decision-making and driving actionable outcomes.
  • Best Practices:
    • Uphold best engineering practices, including rigorous code reviews, comprehensive testing, and thorough documentation.
    • Maintain a strong focus on building maintainable, scalable, and secure systems.

Qualifications

  • Education:
    • Bachelor's or Master's degree in Computer Science, Data Science, Artificial Intelligence, Machine Learning, Statistics, or a related quantitative field.
  • Experience:
    • 8+ years of IT experience and 5+ years of progressive experience as a Data Scientist, with a significant focus on applying ML/AI in IT Operations, AIOps, or a related domain.
    • Proven track record of building and deploying machine learning models into production environments.
    • Demonstrated experience with MLOps/LLMOps principles and tools.
    • Experience with designing and implementing microservices and serverless architectures.
    • Hands-on experience with containerization technologies (Docker, Kubernetes).
  • Technical Skills:
    • Programming: Proficiency in at least one major programming language, preferably Python, sufficient to communicate effectively with and guide engineering teams. Java is also a plus.
    • Machine Learning: Strong theoretical and practical understanding of various ML algorithms (e.g., classification, regression, clustering, time-series analysis, deep learning) and their application to IT operational data.
    • Cloud Platforms:
      • Expertise with Google Cloud Platform (GCP) services is highly preferred, including Dataflow, Pub/Sub, Cloud Logging, Compute Engine, Kubernetes Engine, Cloud Functions, BigQuery, Cloud Storage, and Vertex AI.
      • Experience with other major cloud providers (AWS, Azure) is also valuable.
    • DevOps & Tools:
      • Experience with CI/CD pipelines (e.g., Jenkins, GitLab CI, GitHub Actions).
      • Familiarity with infrastructure-as-code tools (e.g., Terraform, Ansible, CloudFormation).
    • AIOps/Observability:
      • Knowledge of AIOps platforms such as IBM Cloud Paks, Moogsoft, BigPanda, Dynatrace, etc.
      • Experience with log analytics platforms and data tagging strategies.
  • Soft Skills:
    • Exceptional analytical and problem-solving skills, with a track record of tackling ambiguous and complex challenges independently.
    • Strong communication and presentation skills, with the ability to articulate complex technical concepts and findings to diverse audiences and influence stakeholders.
    • Ability to take end-to-end ownership of data science projects.
    • Commitment to best engineering practices, including code reviews, testing, and documentation.
    • A strong desire to stay current with the latest advancements in AI, ML, and cloud technologies.
