On-site
Part Time
Job Responsibilities:
Develop scalable data pipelines using AWS Glue, Lambda and other AWS services
Define workflows for data ingestion, cleansing, transformation, and storage using S3, Athena, Glue, and other AWS services
Ensure data security, compliance, and governance
Implement robust monitoring and alerting mechanisms using CloudWatch and custom metrics for pipeline health and data quality (see the sketch after this list)
Contribute to cloud optimization, cost control, and architectural design reviews
Analyze complex datasets to identify key trends, patterns, and potential data quality issues that could impact model performance or downstream analytics, working under general supervision
Develop and implement efficient data pipelines to extract, transform, and load data from various sources, ensuring data integrity, consistency, and adherence to data governance standards
Deploy machine learning and AI models in production environments and develop approaches for AI DevOps, adhering to best practices for security, scalability, and model explainability, which may involve containerization and orchestration for robust deployments
Monitor and maintain data pipelines and AI models, proactively identifying and resolving performance bottlenecks or potential issues to ensure continuous functionality and optimal model performance
Document data pipelines, models, and processes thoroughly, ensuring clarity, maintainability, and effective knowledge transfer within the team, with documentation tailored for both technical and non-technical audiences
Troubleshoot data quality problems, perform root cause analysis to identify the source of data quality issues, and collaborate with data scientists to design and implement effective solutions
Communicate technical findings and insights effectively to both technical and non-technical audiences, tailoring communication style and level of detail to the specific audience
Learn and adapt by staying up to date on the latest advancements in data engineering and AI technologies, exploring new tools and techniques to deliver efficient and impactful data solutions
Work with business teams to understand their data needs and challenges, and collaborate with data scientists to translate those needs into well-defined technical requirements and actionable data solutions
Support data scientists throughout the entire AI lifecycle by preparing data for analysis, building and optimizing data infrastructure for specific needs, and automating data workflows to streamline the model development process
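As a concrete illustration of the pipeline and monitoring responsibilities above, the following is a minimal sketch in Python using boto3. The bucket name, metric namespace, event shape, and field names are all hypothetical, not part of this role's actual stack; a production pipeline would add error handling, batching, partitioned keys, and schema validation.

# Minimal sketch of a Lambda-style pipeline step: land cleansed records
# in S3 and publish a custom data-quality metric to CloudWatch.
# All names below (bucket, namespace, fields) are hypothetical.
import json
import boto3

s3 = boto3.client("s3")
cloudwatch = boto3.client("cloudwatch")

BUCKET = "example-curated-bucket"  # hypothetical bucket name

def handler(event, context):
    records = event.get("records", [])
    # Basic cleansing: drop records missing a required field.
    clean = [r for r in records if r.get("device_id")]

    # Write the curated batch to S3 for downstream Athena/Glue access.
    s3.put_object(
        Bucket=BUCKET,
        Key="curated/batch.json",
        Body=json.dumps(clean).encode("utf-8"),
    )

    # Custom metric for pipeline health: count of rejected records,
    # which an alarm can watch for data-quality regressions.
    rejected = len(records) - len(clean)
    cloudwatch.put_metric_data(
        Namespace="DataPipeline/Quality",  # hypothetical namespace
        MetricData=[{
            "MetricName": "RejectedRecords",
            "Value": float(rejected),
            "Unit": "Count",
        }],
    )
    return {"processed": len(clean), "rejected": rejected}

Publishing rejections as a custom metric, rather than only logging them, is what makes CloudWatch alarms on data quality possible.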
Minimum required Education:
Bachelor's / Master's Degree in Computer Science, Information Management, Data Science, Econometrics, Artificial Intelligence, Applied Mathematics, Statistics or equivalent.
Minimum required Experience:
Minimum 10 years of experience with a Bachelor's degree in areas such as data handling, data analytics, AI modeling, or equivalent; OR no prior experience required with a Master's degree.
Preferred Experience:
Strong hands-on experience in AWS data engineering: S3, Lambda, Athena, API Gateway, CloudFront, ECS, and Glue
Solid understanding of data modeling, partitioning strategies, and performance tuning for large datasets (illustrated in the sketch after this list)
Familiarity with tools like CloudFormation for infrastructure automation
Experience with programming languages such as Python, R, and Java
Strong in SQL, data warehousing, and data modeling
Awareness of the latest data lake architectures (Apache Iceberg, S3 Tables, DuckDB)
Knowledge of serviceability domain use cases such as diagnostics, telemetry, and predictive service operations is a plus
Strong communication and stakeholder management skills.
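To make the partitioning and performance-tuning point above concrete, here is a minimal sketch of a partition-pruned Athena query issued through boto3. The table, database, partition column, and result-bucket names are assumptions for illustration only.

# Minimal sketch: run an Athena query that filters on the partition
# column, so Athena scans only one day's data instead of the full table.
# Table, database, and bucket names are hypothetical.
import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString=(
        "SELECT device_id, count(*) AS events "
        "FROM telemetry_events "                 # hypothetical table
        "WHERE event_date = DATE '2024-01-01' "  # partition column -> pruning
        "GROUP BY device_id"
    ),
    QueryExecutionContext={"Database": "analytics_db"},  # hypothetical
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print(response["QueryExecutionId"])

Because Athena bills per byte scanned, filtering on a partition column like this keeps query cost and latency proportional to the partitions read rather than to the size of the whole dataset.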
Preferred Certification:
Artificial Intelligence Board of America (ARTiBA) certified
Preferred Skills:
Awareness of the end-to-end AI development process
Experience working on data engineering for data science
Data governance, data privacy, and data management principles, concepts, and standards
Awareness of and experience working with healthcare data standards such as DICOM, HL7, and FHIR, and other structured data
Streaming data processing architectures
Optimizing data pipelines for GenAI and other AI solutions
Awareness of data annotation pipelines and tooling
Synthetic data generation
Data lineage, provenance
Creation of semantic data layers and graph databases
How we work together
We believe that we are better together than apart. For our office-based teams, this means working in-person at least 3 days per week.
This role is an office-based role.