Data Platform Engineer

3.0 years

0 Lacs

Andhra Pradesh

Posted:2 days ago| Platform:

Apply Now

Skills Required

data pyspark developer design hadoop test spark python nosql pipeline testing sql unix scripting hive tokenization obfuscation processing integration regression jenkins git agile development scrum documentation triage code deployment statistics writing exporting etl informatica leadership communication teamwork collaboration

Work Mode

On-site

Job Type

Part Time

Job Description

We are looking for a PySpark solutions developer and data engineer who can design and build solutions for one of our Fortune 500 Client programs, which aims towards building a data standardized and curation needs on Hadoop cluster. This is high visibility, fast-paced key initiative will integrate data across internal and external sources, provide analytical insights, and integrate with the customers critical systems. Key Responsibilities Ability to design, build and unit test applications on Spark framework on Python. Build PySpark based applications for both batch and streaming requirements, which will require in-depth knowledge on majority of Hadoop and NoSQL databases as well. Develop and execute data pipeline testing processes and validate business rules and policies. Optimize performance of the built Spark applications in Hadoop using configurations around Spark Context, Spark-SQL, Data Frame, and Pair RDD's. Optimize performance for data access requirements by choosing the appropriate native Hadoop file formats (Avro, Parquet, ORC etc) and compression codec respectively. Build integrated solutions leveraging Unix shell scripting, RDBMS, Hive, HDFS File System, HDFS File Types, HDFS compression codec. Build data tokenization libraries and integrate with Hive & Spark for column-level obfuscation. Experience in processing large amounts of structured and unstructured data, including integrating data from multiple sources. Create and maintain integration and regression testing framework on Jenkins integrated with Bit Bucket and/or GIT repositories. Participate in the agile development process, and document and communicate issues and bugs relative to data standards in scrum meetings. Work collaboratively with onsite and offshore team. Develop & review technical documentation for artifacts delivered. Ability to solve complex data-driven scenarios and triage towards defects and production issues. Ability to learn-unlearn-relearn concepts with an open and analytical mindset. Participate in code release and production deployment. Challenge and inspire team members to achieve business results in a fast paced and quickly changing environment. Preferred Qualifications BE/B.Tech/ B.Sc. in Computer Science/ Statistics from an accredited college or university. Minimum 3 years of extensive experience in design, build and deployment of PySpark-based applications. Expertise in handling complex large-scale Big Data environments preferably (20Tb+). Minimum 3 years of experience in the following: HIVE, YARN, HDFS. Hands-on experience writing complex SQL queries, exporting, and importing large amounts of data using utilities. Ability to build abstracted, modularized reusable code components. Prior experience on ETL tools preferably Informatica PowerCenter is advantageous. Able to quickly adapt and learn. Able to jump into an ambiguous situation and take the lead on resolution. Able to communicate and coordinate across various teams. Are comfortable tackling new challenges and new ways of working Are ready to move from traditional methods and adapt into agile ones Comfortable challenging your peers and leadership team. Can prove yourself quickly and decisively. Excellent communication skills and Good Customer Centricity. Strong Target & High Solution Orientation. About Virtusa Teamwork, quality of life, professional and personal development: values that Virtusa is proud to embody. When you join us, you join a team of 27,000 people globally that cares about your growth — one that seeks to provide you with exciting projects, opportunities and work with state of the art technologies throughout your career with us. Great minds, great potential: it all comes together at Virtusa. We value collaboration and the team environment of our company, and seek to provide great minds with a dynamic place to nurture new ideas and foster excellence. Virtusa was founded on principles of equal opportunity for all, and so does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status or any other basis covered by appropriate law. All employment is decided on the basis of qualifications, merit, and business need.

Mock Interview

Boost Confidence & Sharpen Skills

Start Data Interview Now
Virtusa
Virtusa

Information Technology and Services

Southborough

20,000+ Employees

2370 Jobs

    Key People

  • Kris Canekeratne

    Chairman and CEO
  • Sanjay Singh

    President and COO

RecommendedJobs for You

Indore, Madhya Pradesh, India