Senior Data Scientist
Experience: 5+ years | Location: Bengaluru (Hybrid)
Akaike Technologies is a dynamic and innovative AI-driven company dedicated to building impactful solutions. Our mission is to empower businesses by harnessing the power of data and AI to drive growth, efficiency, and value. We foster a culture of collaboration, creativity, and continuous learning.
4 years of relevant experience in large-scale Data Science
Key Responsibilities:
Large-Scale Data Handling, PySpark, & Databricks Deployment
- Efficiently handle and model billions of data points using multi-cluster data processing frameworks (PySpark, Spark SQL).
- Expertise in Databricks/AWS is a must-have: ability to design, write, scale, and monitor end-to-end ML pipelines on Databricks/AWS.
- Proven expertise in running and managing Databricks data pipelines in real time for low-latency decision-making.
- Develop and implement scalable deployment pipelines using Docker and AWS services (ECR, Lambda, Step Functions).
Classical Machine Learning
- Own entire workstreams end to end: from use-case identification, through initial design and POC with custom machine learning solutions as needed, to calculating the business impact of the use case, while ensuring a modular, scalable, production-ready codebase.
- Design and implement custom models and loss functions, and handle nuanced conversations about trade-offs between modelling choices.
- Apply specialized modeling for marketing scenarios (Targeting, Budget optimisation, Churn) and data limitations (Sparse/incomplete labels, Single class learning).
Generative AI & Large Language Models
- Practical experience in building LLM-ready Data Management layers for large-scale structured and unstructured data.
- Apply foundational understanding of LLM Agents and multi-agent systems (e.g., Agent-Critique, ReAct, Agent Collaboration), advanced prompting, LLM evaluation, confidence grading, and Human-in-the-Loop systems.
Team Mentorship and Stakeholder Management
- Mentor, support and manage a cross-functional team.
- Bring structure to the client engagement, both internally and externally, through effective, top-down communication.
- Act as the primary contact for clients, translating complex data needs into tasks. Present data insights to stakeholders, highlighting business impacts. Collaborate with cross-functional teams to align AI initiatives with business goals.
Must Have Technical Skills
Data Pipelines, PySpark & Databricks
- Proficiency in Python and its data science ecosystem (NumPy, Pandas, Dask, PySpark) for large-scale data processing.
- Expert, hands-on experience with Databricks for MLOps, pipeline orchestration, and real-time deployment.
- Ability to perform effective feature engineering by understanding complex business objectives.
Core Machine Learning & Deep Learning
- In-depth knowledge of Classical ML: tree-based models, GLMs, clustering models, etc.
- Deep Learning: ANN, 1D/2D/3D Convolutional Neural Networks (ConvNets), LSTMs, Transformer models.
- Strong proficiency in PU learning, single-class learning, and representation learning, alongside traditional ML approaches.
- Advanced understanding and application of model explainability techniques (e.g., SHAP, LIME).
- Hands-on experience with ML/DL libraries such as Scikit-learn, TensorFlow/Keras, and PyTorch.
Others
- Experience utilizing large language models (GPT-4, Mistral, Llama, Claude) through prompt engineering and custom finetuning.
- Code Versioning Systems: GitHub, Git
Must Have Soft Skills
- Communication Skills: This is perhaps the most important soft skill for us. You must be able to capture the attention of your audience (usually in client calls), succinctly put across your ideas to your team members, and bring clarity of thought and next steps to the table and present them well.
- Presentation Skills: Be able to visually present your ideas on a whiteboard. Be able to build compelling presentations for CxOs in a top-down manner, with business impact in mind.
- Problem Solving Skills: Be able to leverage internal tools and client datasets to craft a problem in the shortest time possible. Be able to make trade-offs with timelines in mind.
Relevant to Have
- Background in Pharma Domain.
- Knowledge of Recommender Systems & Next Best Action Systems.
Benefits and Perks
- Competitive ESOP grants.
- Support for publishing papers and attending academic/industry conferences.
- High visibility across all functions at Akaike.