4.0 - 6.0 years
7 - 10 Lacs
Hyderabad
Work from Office
What you will do

In this vital role you will be part of Research's Semantic Graph Team, which is seeking a dedicated and skilled Semantic Data Engineer to build and optimize knowledge graph-based software and data resources. This role primarily focuses on working with technologies such as RDF, SPARQL, and Python. In addition, the position involves semantic data integration and cloud-based data engineering. The ideal candidate should have experience in the pharmaceutical or biotech industry, deep technical skills, proficiency with big data technologies, and experience in semantic modeling. A deep understanding of data architecture and ETL processes is also essential. You will be responsible for constructing semantic data pipelines, integrating both relational and graph-based data sources, ensuring seamless data interoperability, and leveraging cloud platforms to scale data solutions effectively.

Roles & Responsibilities:
Develop and maintain semantic data pipelines using Python, RDF, SPARQL, and linked data technologies.
Develop and maintain semantic data models for biopharma scientific data.
Integrate relational databases (SQL, PostgreSQL, MySQL, Oracle, etc.) with semantic frameworks.
Ensure interoperability across federated data sources, linking relational and graph-based data.
Implement and optimize CI/CD pipelines using GitLab and AWS.
Leverage cloud services (AWS Lambda, S3, Databricks, etc.) to support scalable knowledge graph solutions.
Collaborate with global multi-functional teams, including research scientists, Data Architects, Business SMEs, Software Engineers, and Data Scientists, to understand data requirements, design solutions, and develop end-to-end data pipelines that meet fast-paced business needs across geographic regions.
Collaborate with data scientists, engineers, and domain experts to improve research data accessibility.
Adhere to standard processes for coding, testing, and designing reusable code/components.
Explore new tools and technologies to improve ETL platform performance.
Participate in sprint planning meetings and provide estimations on technical implementation.
Maintain comprehensive documentation of processes, systems, and solutions.
Harmonize research data to appropriate taxonomies, ontologies, and controlled vocabularies for context and reference knowledge.

Basic Qualifications and Experience:
Doctorate degree, OR
Master's degree with 4 - 6 years of experience in Computer Science, IT, Computational Chemistry, Computational Biology/Bioinformatics or related field, OR
Bachelor's degree with 6 - 8 years of experience in Computer Science, IT, Computational Chemistry, Computational Biology/Bioinformatics or related field, OR
Diploma with 10 - 12 years of experience in Computer Science, IT, Computational Chemistry, Computational Biology/Bioinformatics or related field

Preferred Qualifications and Experience:
6+ years of experience in designing and supporting biopharma scientific research data analytics (software platforms)

Functional Skills:

Must-Have Skills:
Advanced Semantic and Relational Data Skills: Proficiency in Python, RDF, SPARQL, graph databases (e.g. AllegroGraph), SQL, relational databases, ETL pipelines, big data technologies (e.g. Databricks), semantic data standards (OWL, W3C, FAIR principles), ontology development, and semantic modeling practices.
Cloud and Automation Expertise: Good experience in using cloud platforms (preferably AWS) for data engineering, along with Python for automation, data federation techniques, and model-driven architecture for scalable solutions.
Technical Problem-Solving: Excellent problem-solving skills with hands-on experience in test automation frameworks (pytest), scripting tasks, and handling large, complex datasets.

Good-to-Have Skills:
Experience in biotech/drug discovery data engineering
Experience applying knowledge graphs, taxonomy, and ontology concepts in life sciences and chemistry domains
Experience with graph databases (AllegroGraph, Neo4j, GraphDB, Amazon Neptune)
Familiarity with Cypher, GraphQL, or other graph query languages
Experience with big data tools (e.g. Databricks)
Experience in biomedical or life sciences research data management

Soft Skills:
Excellent critical-thinking and problem-solving skills
Good communication and collaboration skills
Demonstrated awareness of how to function in a team setting
Demonstrated presentation skills
Posted 3 days ago
4.0 - 6.0 years
4 - 6 Lacs
Hyderabad / Secunderabad, Telangana, India
On-site
Roles & Responsibilities:
Lead conversations with business collaborators to elucidate semantic models of pharmaceutical business concepts, aligned definitions, and relationships.
Negotiate and debate across collaborators to drive alignment and create system-independent information models, taking a data-centric approach aligned with business data domains.
Develop comprehensive business information models and ontologies that capture industry-specific concepts, including CMC, Clinical, and Operations data.
Facilitate whiteboarding sessions with business subject matter experts to elicit knowledge, drive interoperability across pharmaceutical domains, and interface between data producers and consumers.
Educate peers on the practical use and differentiating value of Linked Data and FAIR+ data principles.
Champion standards for master data & reference data.
Formalize data models in RDF as OWL and SHACL ontologies that interoperate with each other and with relevant industry standards like FHIR and IDMP for healthcare data exchange.
Build a broad semantic knowledge graph that threads data together across end-to-end business processes and enables the transformation to data-centricity and new ways of working.
Apply pragmatic semantic abstraction to simplify diverse pharmaceutical and healthcare data patterns effectively.

Basic Qualifications:
Doctorate degree, OR
Master's degree and 4 to 6 years of Data Science experience, OR
Bachelor's degree and 6 to 8 years of Data Science experience, OR
Diploma and 10 to 12 years of Data Science experience

Preferred Qualifications:

About the role
You will play a key role in a regulatory submission content automation initiative which will modernize and digitize the regulatory submission process, positioning Amgen as a leader in regulatory innovation. The initiative uses state-of-the-art technologies, including Generative AI, Structured Content Management, and integrated data to automate the creation, review, and approval of regulatory content.

Role Description:
The Sr Data Scientist is responsible for developing interconnected business information models and ontologies that capture real-world meaning of data by studying the business, our data, and the industry. With a focus on pharmaceutical industry-specific data, including Clinical, Operations, and Chemistry, Manufacturing, and Controls (CMC), this role involves creating robust semantic models based on data-centric principles to realize a connected data ecosystem that empowers consumers. The Information Modeler drives seamless cross-functional data interoperability, enables efficient decision-making, and supports digital transformation in pharmaceutical operations.

Functional Skills:

Must-Have Skills:
Proven ability to lead and develop successful teams.
Strong problem-solving, analytical, and critical thinking skills to address complex data challenges.
Deep understanding of pharmaceutical industry data, including CMC, Process Development, Manufacturing, Engineering Quality, Supply Chain, and Operations.
Advanced skills in semantic modeling, RDF, OWL, SHACL, and ontology development in TopBraid and/or Protégé.
Demonstrated experience creating knowledge graphs with semantic RDF technologies (e.g. Stardog, AllegroGraph, GraphDB, Neptune) and testing models with real data.
Highly proficient with RDF, SPARQL, Linked Data concepts, and interacting with triple stores.
Highly proficient at facilitating, capturing, and organizing collaborative discussions through tools such as Miro, Lucidspark, Lucidchart, and Confluence.
Expertise in FAIR data principles and their application in healthcare and pharmaceutical data models.

Good-to-Have Skills:
Experience in regulatory data modeling and compliance requirements in the pharmaceutical domain.
Familiarity with pharmaceutical lifecycle data (PLM), including product development and regulatory submissions.
Knowledge of supply chain and operations data modeling in the pharmaceutical industry.
Proficiency in integrating data from various sources, such as LIMS, EDC systems, and MES.
Hands-on data analysis and wrangling experience, including SQL-based data transformation and solving integration challenges arising from differences in data structure, meaning, or terminology.
Expertise in FHIR data standards and their application in healthcare and pharmaceutical data models.

Soft Skills:
Exceptional interpersonal, business analysis, facilitation, and communication skills.
Ability to interpret complex regulatory and operational requirements into data models.
Analytical thinking for problem-solving in a highly regulated environment.
Adaptability to manage and prioritize multiple projects in a dynamic setting.
Strong appreciation for customer- and user-centric product design thinking.
Posted 2 weeks ago