Position Summary
We are seeking a talented and driven Data Scientist to join our dynamic team at Illumina. In this role, you will collaborate with cross-functional teams of scientists, engineers, and bioinformaticians to analyze complex biological data, develop advanced models, and deliver actionable insights that propel Illumina's research, development, and commercial objectives. Your work will directly support initiatives across genomics, clinical applications, and product development, helping to shape the future of personalized medicine and health care.
Key Responsibilities
- Design, develop, and implement robust statistical models, machine learning algorithms, and analytical pipelines.
- Partner with internal teams including research, product, informatics, and engineering to define project goals, data needs, and analytical approaches that align with Illumina's strategic objectives.
- Apply data mining and predictive modeling techniques to identify patterns, trends, and correlations in diverse datasets (e.g., sequencing data, clinical outcomes, operational metrics).
- Evaluate and validate model performance, ensuring reproducibility, scalability, and reliability of analytical solutions.
- Collaborate with software engineers to deploy analytical tools and integrate models into production-grade software platforms for internal and customer-facing applications.
- Communicate complex data-driven findings clearly and effectively to both technical and non-technical stakeholders through presentations, documentation, and visualizations.
- Stay current with emerging trends, tools, and best practices in data science, machine learning, and computational biology; continuously seek opportunities to improve processes and outcomes.
- Contribute to the publication and dissemination of results in peer-reviewed journals, conferences, and internal reports as appropriate.
- Collaborate with AI systems designers to implement LLM-driven solutions in support of enterprise use cases (e.g., LLM-driven chatbots).
Required Qualifications
- Education: Bachelor's degree in Data Science, Computer Science, Statistics, Mathematics, Bioinformatics, Computational Biology, Engineering, or a closely related field. Master's degree or Ph.D. is preferred.
- Proven experience working with large and complex datasets, with a strong background in data wrangling, statistical analysis, and machine learning.
- Demonstrated proficiency in at least one major programming language used for data analysis (such as Python, R, or Julia).
- Experience with cloud computing platforms such as AWS and Microsoft Azure, as well as modern data warehousing solutions such as Snowflake.
- Familiarity with enterprise data management and data processing tools (e.g., Kubernetes, Tableau, Apache).
- Understanding of version control systems, especially Git, for collaborative code development and review.
- Excellent communication skills, including the ability to translate technical findings into actionable recommendations for diverse audiences.
- Analytical mindset and a passion for problem-solving in an interdisciplinary, fast-paced environment.
- Self-motivated, detail-oriented, and able to manage multiple projects concurrently with minimal supervision.
Preferred Qualifications
- Master's degree or Ph.D. in a relevant field (Bioinformatics, Computational Biology, Data Science, etc.).
- Typically requires a Bachelor's degree and a minimum of 2 years of related experience; or an advanced degree without experience; or equivalent work experience.
- Hands-on experience with genomic data analysis, including familiarity with next-generation sequencing (NGS) platforms, omics data types, and relevant bioinformatics tools.
- History of contributing to open-source projects or publications in scientific journals.
- Experience working in highly regulated environments and understanding of data privacy standards (e.g., HIPAA, GDPR) as applied to biological and clinical data.
- Background in healthcare, life sciences, or biotechnology industry.
Key Skills and Competencies
- Strong foundation in statistics, probability, and experimental design.
- Expertise in supervised and unsupervised machine learning techniques (e.g., regression, classification, clustering, dimensionality reduction).
- Comfortable working in a regulated environment and managing code and solution designs within its constraints.
- Proficiency in data cleaning, preprocessing, and feature engineering for structured and unstructured data.
- Ability to assess and select appropriate models, evaluate metrics, and iterate solutions using best practices.
- Capacity to work collaboratively in multidisciplinary teams and adapt to evolving project requirements.
- Innovative thinker with a desire to apply data science solutions to real-world challenges.