Must Have Skills
- GCP: GCS, Pub/Sub, Dataflow or Dataproc, BigQuery, Airflow/Cloud Composer, Python (preferred) or Java
- ETL on GCP: building pipelines (Python/Java) and scripting, with knowledge of best practices and common challenges
- Knowledge of batch and streaming data ingestion; ability to build end-to-end data pipelines on GCP
- Knowledge of databases (SQL and NoSQL), on-premises and in the cloud; SQL vs. NoSQL; types of NoSQL databases (at least 2 databases)
- Data warehouse concepts (beginner to intermediate level)
Role & Responsibilities
- Work with business users and other stakeholders to understand business processes.
- Design and implement dimension and fact tables
- Identify and implement data transformation/cleansing requirements
- Develop a highly scalable, reliable, and high-performance data processing pipeline to extract, transform, and load data
from various systems into the Enterprise Data Warehouse
- Develop conceptual, logical, and physical data models with associated metadata including data lineage and technical
data definitions
- Design, develop and maintain ETL workflows and mappings using the appropriate data load technique
- Provide research, high-level design, and estimates for data transformation and data integration from source
applications to end-user BI solutions.
- Provide production support of ETL processes to ensure timely completion and availability of data in the data
warehouse for reporting use.
- Analyze and resolve problems and provide technical assistance as necessary.
- Partner with the BI team to evaluate, design, and develop BI reports and dashboards according to functional
specifications while maintaining data integrity and data quality.
- Work collaboratively with key stakeholders to translate business information needs into well-defined data
requirements to implement the BI solutions.
- Leverage transactional information and data from ERP, CRM, and HRIS applications: model, extract, and transform it for
reporting and analytics.
- Define and document the use of BI through user experience/use cases and prototypes; test and deploy BI solutions.
- Develop and support data governance processes; analyze data to identify and articulate trends, patterns, outliers, and
quality issues; continuously validate reports and dashboards and suggest improvements.
- Train business end-users, IT analysts, and developers.
Skills: Google Cloud Platform (GCP), ETL, Python, Big Data, SQL, Data Integration, Dataproc, Apache Airflow, and BigQuery