Jobs

Interviews
Job Alerts
Tools

Upskill and Grow with AI

Mock Interview Practice interviews in realistic simulations

Coding Practice Improve your coding skills with challenges

Certification Earn certifications to validate your skills

AI Learning Get trained with AI expert sessions

Career Path AI insights for smarter career decisions

AI Job Match Score AI-Powered Job Match Against Your Resume and Optimize Your Resume

Career Tools and Resources

Resume Builder Build Professional Resume with Ease

ATS Friendliness Check Check Resume Friendliness for Applicant Tracking Systems

Auto Apply Apply to hundreds of jobs on any platform effortlessly

Co-Pilot (Chrome Extension) Your AI Assistant for Seamless Browsing Efficiency

Interview Questions Streamline interviews with ready-to-use questions

Salaries Discover market-driven salary insights across skillsets and geographies

Companies Explore leading companies actively hiring talent
For Employers

Home
>
Jobs in chennai
>
UST
>
Lead I - Data Engineering

Lead I - Data Engineering

UST

5 - 7 years

5 Lacs

chennai bengaluru thiruvananthapuram

Posted:1 day ago| Platform:

Apply

Skills Required

python development pyspark data warehousing microsoft azure data pipeline tools data engineering sql analytics tensorflow etl tool gcp collaboration design optimization techniques keras bigquery aws etl scripting languages data integration programming communication skills

Work Mode

Work from Office

Job Type

Full Time

Job Description

Role Proficiency:

This role requires proficiency in data pipeline development including coding and testing data pipelines for ingesting wrangling transforming and joining data from various sources. Must be skilled in ETL tools such as Informatica Glue Databricks and DataProc with coding expertise in Python PySpark and SQL. Works independently and has a deep understanding of data warehousing solutions including Snowflake BigQuery Lakehouse and Delta Lake. Capable of calculating costs and understanding performance issues related to data solutions.

Outcomes:

Act creatively to develop pipelines and applications by selecting appropriate technical options optimizing application development maintenance and performance using design patterns and reusing proven solutions.rnInterpret requirements to create optimal architecture and design developing solutions in accordance with specifications.
Document and communicate milestones/stages for end-to-end delivery.
Code adhering to best coding standards debug and test solutions to deliver best-in-class quality.
Perform performance tuning of code and align it with the appropriate infrastructure to optimize efficiency.
Validate results with user representatives integrating the overall solution seamlessly.
Develop and manage data storage solutions including relational databases NoSQL databases and data lakes.
Stay updated on the latest trends and best practices in data engineering cloud technologies and big data tools.
Influence and improve customer satisfaction through effective data solutions.

Measures of Outcomes:

Adherence to engineering processes and standards
Adherence to schedule / timelines
Adhere to SLAs where applicable
# of defects post delivery
# of non-compliance issues
Reduction of reoccurrence of known defects
Quickly turnaround production bugs
Completion of applicable technical/domain certifications
Completion of all mandatory training requirements
Efficiency improvements in data pipelines (e.g. reduced resource consumption faster run times).
Average time to detect respond to and resolve pipeline failures or data issues.
Number of data security incidents or compliance breaches.

Outputs Expected:

Code Development:

Develop data processing code independently
ensuring it meets performance and scalability requirements.
Define coding standards
templatesand checklists.
Review code for team members and peers.

Documentation:

Create and review templates
checklistsguidelinesand standards for designprocessesand development.
Create and review deliverable documents
including design documentsarchitecture documentsinfrastructure costingbusiness requirementssource-target mappingstest casesand results.

Configuration:

Define and govern the configuration management plan.
Ensure compliance within the team.

Testing:

Review and create unit test cases
scenariosand execution plans.
Review the test plan and test strategy developed by the testing team.
Provide clarifications and support to the testing team as needed.

Domain Relevance:

Advise data engineers on the design and development of features and components
demonstrating a deeper understanding of business needs.
Learn about customer domains to identify opportunities for value addition.
Complete relevant domain certifications to enhance expertise.

Project Management:

Manage the delivery of modules effectively.

Defect Management:

Perform root cause analysis (RCA) and mitigation of defects.
Identify defect trends and take proactive measures to improve quality.

Estimation:

Create and provide input for effort and size estimation for projects.

Knowledge Management:

Consume and contribute to project-related documents
SharePointlibrariesand client universities.
Review reusable documents created by the team.

Release Management:

Execute and monitor the release process to ensure smooth transitions.

Design Contribution:

Contribute to the creation of high-level design (HLD)
low-level design (LLD)and system architecture for applicationsbusiness componentsand data models.

Customer Interface:

Clarify requirements and provide guidance to the development team.
Present design options to customers and conduct product demonstrations.

Team Management:

Set FAST goals and provide constructive feedback.
Understand team members' aspirations and provide guidance and opportunities for growth.
Ensure team engagement in projects and initiatives.

Certifications:

Obtain relevant domain and technology certifications to stay competitive and informed.

Skill Examples:

Proficiency in SQL Python or other programming languages used for data manipulation.
Experience with ETL tools such as Apache Airflow Talend Informatica AWS Glue Dataproc and Azure ADF.
Hands-on experience with cloud platforms like AWS Azure or Google Cloud particularly with data-related services (e.g. AWS Glue BigQuery).
Conduct tests on data pipelines and evaluate results against data quality and performance specifications.
Experience in performance tuning of data processes.
Expertise in designing and optimizing data warehouses for cost efficiency.
Ability to apply and optimize data models for efficient storage retrieval and processing of large datasets.
Capacity to clearly explain and communicate design and development aspects to customers.
Ability to estimate time and resource requirements for developing and debugging features or components.

Knowledge Examples:

Knowledge Examples

Knowledge of various ETL services offered by cloud providers including Apache PySpark AWS Glue GCP DataProc/DataFlow Azure ADF and ADLF.
Proficiency in SQL for analytics including windowing functions.
Understanding of data schemas and models relevant to various business contexts.
Familiarity with domain-related data and its implications.
Expertise in data warehousing optimization techniques.
Knowledge of data security concepts and best practices.
Familiarity with design patterns and frameworks in data engineering.

Additional Comments:

As a Data Engineer, you will be responsible for designing, building and maintaining data pipelines and infrastructure that support the client Data Science team. You will work with Google Cloud Platform (GCP) tools and technologies to collect, store and process data from various sources and formats. You will also write scripts to transform and load data explicitly for use in the machine learning (ML) pipelines process. You will optimize data access and performance for ML applications and ensure data quality and availability for analytics/ML teams. You will collaborate with data scientists, analysts, engineers and other stakeholders to deliver data solutions that enable data-driven decision making and innovation. Key Responsibilities As a Data Engineer for the Data Science team, you will perform the following tasks: - Architect, build and maintain scalable, reliable and secure data pipelines and infrastructure using GCP tools such as BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, etc. - Write scripts to transform and load data for use in ML pipelines using languages such as Python, SQL, etc. - Optimize data access and performance for ML applications using techniques such as partitioning, indexing, caching, etc. - Ensure data quality and availability for analytics/ML teams using tools such as Data Catalog, Data Studio, Data Quality, etc. - Monitor, troubleshoot and debug data issues and errors using tools such as Stackdriver, Cloud Logging, Cloud Monitoring, etc. - Document and maintain data pipelines and infrastructure specifications, standards and best practices using tools such as Git, Jupyter, etc. - Collaborate with data scientists, analysts, engineers, and other stakeholders to understand data requirements, provide data solutions and support data-driven projects and initiatives. - Support generative AI projects by providing data for training, testing and evaluation of generative models such as Gemini, Claude 3, etc. - Implement data augmentation, data anonymization and data synthesis techniques to enhance data quality and diversity for generative AI projects. Qualifications To be successful in this role, you will need to have the following qualifications: - Bachelor's degree in Computer Science, Engineering, Mathematics, Statistics or related field. - At least years of experience in data engineering, data warehousing, data integration or related field. - Experience 6 in a health care-related field or working with health care data is preferred. - Proficient in GCP tools and technologies for data engineering, such as BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, etc., or equivalents in other cloud platforms - Proficient in scripting languages such as Python, SQL, etc. for data transformation and loading. - Knowledge of data modeling, data quality, data governance and data security principles and practices. - Knowledge of ML concepts, frameworks and tools such as TensorFlow, Keras, Scikit-learn, etc. - Knowledge of generative AI concepts, frameworks and tools. - Excellent communication, collaboration and problem-solving skills.

Required Skills

Gcp,Machine Learning,Architect

More Jobs at UST

Architect II - Technical SAP ABAP

Bengaluru

12 - 15 yrs

INR 0 - 0 Lacs

Lead I - SAP FICO with PS (Project Systems) and RAR

Bengaluru

5 - 7 yrs

INR 0 - 0 Lacs

Lead II - SAP ABAP

Bengaluru

7 - 9 yrs

INR 0 - 0 Lacs

Procurement

Bengaluru, Karnataka

Experience: Not specified

Salary: Not disclosed

Manual & Python Automation

Bengaluru, Karnataka

Experience: Not specified

Salary: Not disclosed

Mock Interview

Practice Video Interview with JobPe AI

Start Python Interview

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Enhance Your Python Skills

Practice Python coding challenges to boost your skills

Start Practicing Python Now

UST

IT Services and IT Consulting

Aliso Viejo CA

Login to

Please Verify Your Phone or Email

Confirm Action

Lead I - Data Engineering