Overview
Keysight accelerates innovation to connect and secure the world. Our solutions span wireless communications, semiconductors, aerospace/defense, automotive, and beyond. We combine measurement science, simulation, and advanced AI to help engineers design, simulate, and validate the world's most advanced systems. The Keysight AI Labs is pioneering scientific machine learning and physics & T&M-informed AI to transform Keysight's software, simulation, and measurement products.
We are seeking a highly experienced and technically versatile Senior Data Engineer to join our AI infrastructure team. This role is critical to enabling scalable, secure, and high-performance data pipelines that support machine learning workflows. You will work closely with MLOps engineers, data scientists, and software developers to design and implement robust data systems that power model training, evaluation, and deployment at scale.
Responsibilities
Key Responsibilities
- Architect and maintain end-to-end data pipelines for structured and unstructured data using tools like Spark, Airflow, and LakeFS.
- Design and implement data versioning, lineage, and governance strategies to support reproducibility, reliability, and compliance.
- Collaborate with MLOps and ML engineering teams to integrate data workflows into model training and deployment pipelines.
- Optimize data storage layers and retrieval across hybrid environments (on-prem and AWS), ensuring performance and cost-efficiency.
- Develop APIs and SDKs to enable self-service data access for internal ML teams.
- Lead data infrastructure design reviews and mentor junior engineers on best practices.
- Contribute to the development of internal tools for data curation, labeling, and augmentation.
Qualifications
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- 7+ years of experience in data engineering, with a strong focus on ML/AI infrastructure.
- Proficiency in Python, SQL, and distributed data processing frameworks (e.g., Spark, Dask).
- Experience with data orchestration tools (e.g., Airflow, Prefect) and data version control systems (e.g., LakeFS, DVC).
- Deep understanding of data modeling, ETL/ELT pipelines, and data quality frameworks.
- Proficiency with cloud platforms (AWS preferred) and containerized environments (Docker, Kubernetes).
Preferred Qualifications
- Experience supporting MLOps workflows and integrating with ML platforms (e.g., MLflow, Kubeflow).
- Knowledge of data privacy, security, and compliance in regulated environments.
- Strong communication skills and ability to work cross-functionally in a fast-paced R&D environment.
- Advanced background in statistics and analytics to support transformation checks, quality checks, and informative data profiling.
Keysight is an Equal Opportunity Employer.