Job
Description
Not Applicable Specialism SAP & Summary At PwC, our people in data and analytics engineering focus on leveraging advanced technologies and techniques to design and develop robust data solutions for clients. They play a crucial role in transforming raw data into actionable insights, enabling informed decisionmaking and driving business growth. In data engineering at PwC, you will focus on designing and building data infrastructure and systems to enable efficient data processing and analysis. You will be responsible for developing and implementing data pipelines, data integration, and data transformation solutions. At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities. This purposeled and valuesdriven work, powered by technology in an environment that drives innovation, will enable you to make a tangible impact in the real world. We reward your contributions, support your wellbeing, and offer inclusive benefits, flexibility programmes and mentorship that will help you thrive in work and life. Together, we grow, learn, care, collaborate, and create a future of infinite experiences for each other. Learn more about us . & Summary A career within Data and Analytics services will provide you with the opportunity to help organisations uncover enterprise insights and drive business results using smarter data analytics. We focus on a collection of organisational technology capabilities, including business intelligence, data management, and data assurance that help our clients drive innovation, growth, and change within their organisations in order to keep up with the changing nature of customers and technology. We make impactful decisions by mixing mind and machine to leverage data, understand and navigate risk, and help our clients gain a competitive edge. Responsibilities Design, develop, and optimize data pipelines and ETL processes using PySpark or Scala to extract, transform, and load large volumes of structured and unstructured data from diverse sources. Implement data ingestion, processing, and storage solutions on Azure cloud platform, leveraging services such as Azure Databricks, Azure Data Lake Storage, and Azure Synapse Analytics. Develop and maintain data models, schemas, and metadata to support efficient data access, query performance, and analytics requirements. Monitor pipeline performance, troubleshoot issues, and optimize data processing workflows for scalability, reliability, and costeffectiveness. Implement data security and compliance measures to protect sensitive information and ensure regulatory compliance. Requirement Proven experience as a Data Engineer, with expertise in building and optimizing data pipelines using PySpark, Scala, and Apache Spark. Handson experience with cloud platforms, particularly Azure, and proficiency in Azure services such as Azure Databricks, Azure Data Lake Storage, Azure Synapse Analytics, and Azure SQL Database. Strong programming skills in Python and Scala, with experience in software development, version control, and CI/CD practices. Familiarity with data warehousing concepts, dimensional modeling, and relational databases (e.g., SQL Server, PostgreSQL, MySQL). Experience with big data technologies and frameworks (e.g., Hadoop, Hive, HBase) is a plus. Mandatory skill sets Spark, Pyspark, Azure Preferred skill sets Spark, Pyspark, Azure Years of experience required 4 8 Education qualification B.Tech / M.Tech / MBA / MCA Education Degrees/Field of Study required Bachelor of Engineering, Master of Business Administration, Master of Engineering Degrees/Field of Study preferred Required Skills Python (Programming Language) Accepting Feedback, Accepting Feedback, Active Listening, Agile Scalability, Amazon Web Services (AWS), Analytical Thinking, Apache Airflow, Apache Hadoop, Azure Data Factory, Coaching and Feedback, Communication, Creativity, Data Anonymization, Data Architecture, Database Administration, Database Management System (DBMS), Database Optimization, Database Security Best Practices, Databricks Unified Data Analytics Platform, Data Engineering, Data Engineering Platforms, Data Infrastructure, Data Integration, Data Lake, Data Modeling {+ 32 more} No