
170 Impala Jobs - Page 7

JobPe aggregates listings for easy access, but you apply directly on the original job portal.

2.0 - 5.0 years

14 - 17 Lacs

Mumbai

Work from Office

As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities such as creating source-to-target pipelines/workflows and implementing solutions that address the client's needs.

Your primary responsibilities include:
- Design, build, optimize, and support new and existing data models and ETL processes based on our clients' business requirements.
- Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing data-driven organization.
- Coordinate data access and security so that data scientists and analysts can easily access data whenever they need it.

Required education: Bachelor's Degree. Preferred education: Master's Degree.

Required technical and professional expertise:
- Must have 5+ years of experience in Big Data: Hadoop, Spark (Scala, Python), HBase, Hive.
- Good to have: AWS (S3, Athena, DynamoDB, Lambda), Jenkins, Git.
- Experience developing Python and PySpark programs for data analysis.
- Good working experience using Python to develop a custom framework for generating rules (similar to a rules engine).
- Experience developing Python code to gather data from HBase and implementing solutions with PySpark.
- Experience applying business transformations with Apache Spark DataFrames/RDDs and using HiveContext objects for read/write operations (see the sketch below).

Preferred technical and professional experience:
- Understanding of DevOps.
- Experience building scalable end-to-end data ingestion and processing solutions.
- Experience with object-oriented and/or functional programming languages such as Python, Java, and Scala.
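
A minimal PySpark sketch of the kind of work this listing describes: reading a Hive-backed table, applying a business transformation with the DataFrame API, and writing the result back. The table names and the high-value rule are hypothetical, for illustration only.

# Read a Hive table, transform it, and write the result back (sketch).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-enrichment")
    .enableHiveSupport()   # exposes the Hive metastore to Spark SQL
    .getOrCreate()
)

# Source table registered in the Hive metastore (hypothetical name).
orders = spark.table("raw_db.orders")

# Business transformation: flag high-value orders, then aggregate per customer.
enriched = (
    orders
    .withColumn("is_high_value", F.col("amount") > 1000)  # assumed rule
    .groupBy("customer_id", "is_high_value")
    .agg(F.sum("amount").alias("total_amount"),
         F.count("*").alias("order_count"))
)

# Write the result to a curated Hive table (hypothetical target).
enriched.write.mode("overwrite").saveAsTable("curated_db.customer_order_summary")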

Posted 2 months ago

Apply

3.0 - 7.0 years

10 - 20 Lacs

Bengaluru, Mumbai (All Areas)

Work from Office

Responsibilities:
- Design and develop a new automation framework for ETL processing.
- Support the existing framework and act as the technical point of contact for all related teams.
- Enhance the existing ETL automation framework per user requirements.
- Performance-tune Spark and Snowflake ETL jobs.
- Run proofs of concept on new technologies and analyze their suitability for cloud migration.
- Optimize processes through automation and new utility development.
- Support any batch issues, and support application teams with any queries.

Required skills:
- Strong UNIX shell and Python scripting knowledge.
- Strong in Spark, with strong knowledge of SQL.
- Hands-on knowledge of how HDFS/Hive/Impala/Spark work.
- Strong logical reasoning capabilities.
- Working knowledge of GitHub, DevOps, CI/CD, and enterprise code-management tools.
- Strong collaboration and communication skills; a team player with excellent written and verbal communication.
- Ability to create and maintain a positive environment of shared success.
- Ability to execute and prioritize tasks and resolve issues without aid from a direct manager or project sponsor.
- Good to have: working experience with Snowflake and any data integration tool (e.g., Informatica Cloud).

Primary skills: Apache Hadoop, Apache Spark, UNIX shell scripting, Python, SQL.
Good-to-have skills: Snowflake/Azure/AWS (any cloud), IDMC or any ETL tool.
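
A small illustration of the Spark performance-tuning levers this role mentions: right-sizing shuffle parallelism, enabling adaptive execution, and broadcasting a small join side. The configuration values and paths are hypothetical starting points, not recommendations for any specific cluster.

# Common Spark ETL tuning knobs (sketch).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("etl-tuning-sketch")
    # Right-size shuffle parallelism instead of the 200-partition default.
    .config("spark.sql.shuffle.partitions", "64")
    # Let Spark's adaptive execution coalesce small shuffle partitions.
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

facts = spark.read.parquet("/data/facts")  # large table (assumed path)
dims = spark.read.parquet("/data/dims")    # small lookup table (assumed path)

# Broadcast the small side to avoid a shuffle-heavy sort-merge join.
joined = facts.join(F.broadcast(dims), "dim_id")

# Cache only when the result is reused by several downstream actions.
joined.cache()
print(joined.count())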

Posted 2 months ago

Apply

5.0 - 8.0 years

4 - 8 Lacs

Pune

Work from Office

Role Purpose: The purpose of the role is to support process delivery by ensuring the daily performance of the Production Specialists, resolving technical escalations, and developing technical capability within the Production Specialists.

Do:
- Oversee and support the process by reviewing daily transactions on performance parameters.
- Review the performance dashboard and the scores for the team.
- Support the team in improving performance parameters by providing technical support and process guidance.
- Record, track, and document all queries received, problem-solving steps taken, and total successful and unsuccessful resolutions.
- Ensure standard processes and procedures are followed to resolve all client queries.
- Resolve client queries per the SLAs defined in the contract.
- Develop an understanding of the process/product for team members to facilitate better client interaction and troubleshooting.
- Document and analyze call logs to spot the most frequent trends and prevent future problems.
- Identify red flags and escalate serious client issues to the team leader in cases of untimely resolution.
- Ensure all product information and disclosures are given to clients before and after the call/email requests.
- Avoid legal challenges by monitoring compliance with service agreements.
- Handle technical escalations through effective diagnosis and troubleshooting of client queries.
- Manage and resolve technical roadblocks/escalations per SLA and quality requirements; if unable to resolve an issue, escalate it to TA & SES in a timely manner.
- Provide product support and resolution to clients by performing question diagnosis while guiding users through step-by-step solutions.
- Troubleshoot all client queries in a user-friendly, courteous, and professional manner.
- Offer alternative solutions to clients (where appropriate) with the objective of retaining customers' and clients' business.
- Organize ideas and effectively communicate oral messages appropriate to listeners and situations.
- Follow up and make scheduled callbacks to customers to record feedback and ensure compliance with contract SLAs.
- Build people capability to ensure operational excellence and maintain superior customer service levels for the existing account/client.
- Mentor and guide Production Specialists on improving technical knowledge.
- Collate trainings to be conducted as triage to bridge skill gaps identified through interviews with the Production Specialists.
- Develop and conduct trainings (triages) within products for Production Specialists as per target, and inform the client about the triages being conducted.
- Undertake product trainings to stay current with product features, changes, and updates; enroll in product-specific and any other trainings per client requirements/recommendations.
- Identify and document the most common problems and recommend appropriate resolutions to the team.
- Update job knowledge by participating in self-learning opportunities and maintaining personal networks.

Deliver:
No. | Performance Parameter | Measure
1 | Process | No. of cases resolved per day; compliance to process and quality standards; meeting process-level SLAs; Pulse score; customer feedback; NSAT/ESAT
2 | Team Management | Productivity, efficiency, absenteeism
3 | Capability Development | Triages completed; technical test performance

Mandatory Skills: Hadoop. Experience: 5-8 years.

Posted 2 months ago

Apply

3.0 - 6.0 years

9 - 14 Lacs

Mumbai

Work from Office

Role Overview: We are looking for a Talend Data Catalog Specialist to drive enterprise data governance initiatives by implementing Talend Data Catalog and integrating it with Apache Atlas for unified metadata management within a Cloudera-based data lakehouse. The role involves establishing metadata lineage, glossary harmonization, and governance policies to enhance trust, discovery, and compliance across the data ecosystem.

Key Responsibilities:
- Set up and configure Talend Data Catalog to ingest and manage metadata from source systems, the data lake (HDFS), Iceberg tables, the Hive metastore, and external data sources.
- Develop and maintain business glossaries, data classifications, and metadata models.
- Design and implement bi-directional integration between Talend Data Catalog and Apache Atlas to enable metadata synchronization, lineage capture, and policy alignment across the Cloudera stack.
- Map technical metadata from Hive/Impala to business metadata defined in Talend.
- Capture end-to-end lineage of data pipelines (e.g., from ingestion in PySpark to consumption in BI tools) using Talend and Atlas.
- Provide impact analysis for schema changes, data transformations, and governance rule enforcement.
- Support the definition and rollout of enterprise data governance policies (e.g., ownership, stewardship, access control).
- Enable role-based metadata access, tagging, and data sensitivity classification.
- Work with data owners, stewards, and architects to ensure data assets are well documented, governed, and discoverable.
- Provide training to users on leveraging the catalog for search, understanding, and reuse.

Required education: Bachelor's Degree. Preferred education: Master's Degree.

Required technical and professional expertise:
- 6-12 years in data governance or metadata management, with at least 2-3 years in Talend Data Catalog.
- Talend Data Catalog, Apache Atlas, Cloudera CDP, Hive/Impala, Spark, HDFS, SQL.
- Business glossary, metadata enrichment, lineage tracking, stewardship workflows.
- Hands-on experience in Talend-Atlas integration, through REST APIs, Kafka hooks, or metadata bridges (see the sketch below for the REST route).
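
A minimal sketch of the REST route for the Talend-Atlas integration mentioned above: pulling lineage for an entity from Apache Atlas (v2 REST API, /api/atlas/v2/lineage/{guid}) so it can be reconciled with the catalog. The host, credentials, and GUID are placeholders.

# Fetch entity lineage from Apache Atlas over REST (sketch).
import requests

ATLAS_URL = "http://atlas.example.com:21000"   # placeholder host
AUTH = ("admin", "admin")                      # placeholder credentials

def fetch_lineage(guid: str, depth: int = 3) -> dict:
    """Fetch upstream/downstream lineage for an Atlas entity GUID."""
    resp = requests.get(
        f"{ATLAS_URL}/api/atlas/v2/lineage/{guid}",
        params={"depth": depth, "direction": "BOTH"},
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    lineage = fetch_lineage("hypothetical-guid-1234")
    # "guidEntityMap" and "relations" are the two key fields in the response.
    for rel in lineage.get("relations", []):
        print(rel.get("fromEntityId"), "->", rel.get("toEntityId"))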

Posted 2 months ago

Apply

3.0 - 5.0 years

5 - 7 Lacs

Mumbai, Delhi / NCR, Bengaluru

Work from Office

We are seeking a skilled Big Data Developer with 3+ years of experience to develop, maintain, and optimize large-scale data pipelines using frameworks like Spark, PySpark, and Airflow. The role involves working with SQL, Impala, Hive, and PL/SQL for advanced data transformations and analytics, designing scalable data storage systems, and integrating structured and unstructured data using tools like Sqoop. The ideal candidate will collaborate with cross-functional teams to implement data warehousing strategies and leverage BI tools for insights. Proficiency in Python programming, workflow orchestration with Airflow, and Unix/Linux environments is essential. Locations: Mumbai, Delhi / NCR, Bengaluru, Kolkata, Chennai, Hyderabad, Ahmedabad, Pune, Remote.
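
A minimal Airflow sketch of the orchestration pattern this listing describes: a daily DAG that submits a PySpark job. The DAG name, connection ID, and script path are placeholders; SparkSubmitOperator comes from the apache-airflow-providers-apache-spark package, and the schedule parameter assumes Airflow 2.4+ (older versions use schedule_interval).

# Daily Airflow DAG that submits a PySpark job (sketch).
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="daily_orders_pipeline",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    transform = SparkSubmitOperator(
        task_id="transform_orders",
        conn_id="spark_default",                      # assumed Spark connection
        application="/opt/jobs/transform_orders.py",  # placeholder script
        application_args=["--date", "{{ ds }}"],      # Airflow-templated run date
    )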

Posted 2 months ago

Apply

3.0 - 5.0 years

5 - 9 Lacs

New Delhi, Ahmedabad

Work from Office

We are seeking a skilled Big Data Developer with 3+ years of experience to develop, maintain, and optimize large-scale data pipelines using frameworks like Spark, PySpark, and Airflow. The role involves working with SQL, Impala, Hive, and PL/SQL for advanced data transformations and analytics, designing scalable data storage systems, and integrating structured and unstructured data using tools like Sqoop. The ideal candidate will collaborate with cross-functional teams to implement data warehousing strategies and leverage BI tools for insights. Proficiency in Python programming, workflow orchestration with Airflow, and Unix/Linux environments is essential. Location: Remote - Delhi / NCR, Bangalore/Bengaluru, Hyderabad/Secunderabad, Chennai, Pune, Kolkata, Ahmedabad, Mumbai.

Posted 2 months ago

Apply

6.0 - 10.0 years

10 - 16 Lacs

Mumbai

Work from Office

Responsibilities: Design and implement Big Data solutions, complex ETL pipelines, and data modernization projects.

Required past experience:
- 6+ years of overall experience developing, testing, and implementing big data projects using Hadoop, Spark, Hive, and Sqoop.
- Hands-on experience playing a lead role in big data projects: implementing one or more tracks within a project, identifying and assigning tasks within the team, and providing technical guidance to team members.
- Experience setting up Hadoop services, implementing ETL/ELT (extract-transform-load / extract-load-transform) pipelines, and working with terabytes/petabytes of data ingestion and processing from varied systems.
- Experience working in an onshore/offshore model, leading technical discussions with customers, mentoring and guiding teams on technology, and preparing High-Level Design and Low-Level Design (HLD & LLD) documents.

Required skills and abilities:
- Mandatory skills: Spark, Scala/PySpark, the Hadoop ecosystem including Hive, Sqoop, Impala, Oozie, Hue, Java, Python, SQL, Flume, bash (shell scripting).
- Secondary skills: Apache Kafka, Storm, distributed systems, a good understanding of networking and security (platform & data) concepts, Kerberos, Kubernetes.
- Understanding of data governance concepts and experience implementing metadata capture, lineage capture, and a business glossary.
- Experience implementing CI/CD pipelines and working with source code management (SCM) tools such as Git, Bitbucket, etc.
- Ability to assign and manage tasks for team members, provide technical guidance, and work with architects on High-Level Design, Low-Level Design, and proofs of concept.
- Hands-on experience writing data ingestion and data processing pipelines using Spark and SQL; experience implementing slowly changing dimensions (SCD) types 1 & 2 (see the sketch below), auditing, and exception-handling mechanisms.
- Data warehousing project implementation with a Java- or Scala-based Hadoop programming background.
- Proficient with development methodologies such as waterfall and agile/scrum.
- Exceptional communication, organization, and time management skills.
- Collaborative approach to decision-making and strong analytical skills.
- Ability to work on multiple projects simultaneously, prioritizing appropriately.
- Good to have: certifications in any of GCP, AWS, Azure, or Cloudera.
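
A compact PySpark sketch of the SCD Type 2 pattern referenced above: current dimension rows whose tracked attribute changed are expired, and new current versions are inserted. Table and column names are hypothetical, and unionByName with allowMissingColumns assumes Spark 3.1+.

# SCD Type 2 over a Hive dimension table (sketch, changed customers only;
# a full implementation would also insert never-before-seen customers).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("scd2-sketch").enableHiveSupport().getOrCreate()

dim = spark.table("dw.customer_dim")        # assumed: customer_id, address, is_current, start_date, end_date
stg = spark.table("stg.customer_updates")   # assumed: customer_id, address

current = dim.filter(F.col("is_current") == True)

# Current rows whose tracked attribute changed in today's batch.
changed = (
    current.alias("d")
    .join(stg.alias("s"), "customer_id")
    .filter(F.col("d.address") != F.col("s.address"))
)

# Expire the old current versions.
expired = (
    changed.select("d.*")
    .withColumn("is_current", F.lit(False))
    .withColumn("end_date", F.current_date())
)

# Insert new current versions for the changed customers.
new_versions = (
    stg.join(changed.select("customer_id"), "customer_id", "left_semi")
    .withColumn("is_current", F.lit(True))
    .withColumn("start_date", F.current_date())
    .withColumn("end_date", F.lit(None).cast("date"))
)

# Everything else (history plus unchanged rows) passes through untouched.
untouched = dim.exceptAll(changed.select("d.*"))

result = untouched.unionByName(expired).unionByName(new_versions, allowMissingColumns=True)

# Write to a new table; Spark cannot overwrite a table it is reading from.
result.write.mode("overwrite").saveAsTable("dw.customer_dim_new")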

Posted 2 months ago

Apply

5.0 - 10.0 years

10 - 20 Lacs

Pune, Bengaluru, Delhi / NCR

Hybrid

Job Description - Data Engineering:
Location: PAN India
- 7-14 years of experience in data engineering engagements.
- Experience with Cloudera, Hadoop, and Snowflake.
- Worked with the Impala and Kudu systems; able to write code in Spark and PySpark (see the sketch below).
- Experience setting up Oozie workflows.
- Very good SQL skills.
- Experience in performance tuning.
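
A minimal PySpark sketch of reading an Impala/Kudu table through the kudu-spark connector, as this role calls for. The master address and table name are placeholders, and the kudu-spark package is assumed to be on the Spark classpath (e.g., via --packages org.apache.kudu:kudu-spark3_2.12:&lt;version&gt;).

# Read a Kudu table from Spark and filter it (sketch).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kudu-read-sketch").getOrCreate()

events = (
    spark.read
    .format("org.apache.kudu.spark.kudu")
    .option("kudu.master", "kudu-master.example.com:7051")  # placeholder master
    .option("kudu.table", "impala::default.events")         # placeholder table
    .load()
)

# Simple predicate; Kudu can prune on it server-side.
events.filter(events.event_type == "click").show(10)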

Posted 2 months ago

Apply

10.0 - 15.0 years

25 - 35 Lacs

Pune

Work from Office

Education and Qualifications:
- Bachelor's degree in IT, Computer Science, Software Engineering, Business Analytics, or equivalent.

Work Experience:
- Minimum 10 years of experience in the data analytics field.
- Minimum 6 years of experience running operations and support in a cloud data lakehouse environment.
- Experience with Azure Databricks.
- Experience building and optimizing data pipelines, architectures, and data sets.
- Excellent experience in Scala or Python.
- Ability to troubleshoot and optimize complex queries on the Spark platform.
- Knowledgeable on structured and unstructured data design/modeling, data access, and data storage techniques.
- Experience with DevOps tools and environments.

Technical / professional skills (please provide at least 3):
- Azure Databricks
- Python / Scala / Java
- Hive / HBase / Impala / Parquet
- Sqoop, Kafka, Flume
- SQL and RDBMS
- Airflow
- Jenkins / Bamboo
- GitHub / Bitbucket
- Nexus

Screening questions:
- Have you sized clusters for Databricks in an Azure cloud environment?
- Have you done hands-on configuration and administration of the Databricks platform on Azure?
- Do you have experience in cluster management, storage management, workspace management, key management, etc.?
- Have you done cost-optimization exercises to reduce the consumption cost of Databricks clusters?
- Have you done cost forecasting for the Databricks platform on Azure?
- How do you monitor cost anomalies, identify cost drivers, and come up with recommendations?
- Have you done any RBAC configuration in the Databricks platform on Azure?
- Have you configured connectivity from Databricks to internal/external sources/applications such as Power BI, Google Analytics, SharePoint, etc.?
- What have you implemented, and how do you monitor the health of the Databricks platform, its services, the ETL pipelines, and the endpoints?
- What kind of proactive or self-healing processes are put in place to ensure service availability?
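
A minimal sketch of the platform-administration automation the questions above probe for: listing workspace clusters and their states via the Databricks Clusters REST API (/api/2.0/clusters/list). The workspace URL is a placeholder, and the token is assumed to be set in the environment.

# List Databricks clusters and states, a common cost-control check (sketch).
import os
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = os.environ["DATABRICKS_TOKEN"]  # personal access token (assumed set)

resp = requests.get(
    f"{WORKSPACE_URL}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()

# Surface anything unexpectedly running.
for cluster in resp.json().get("clusters", []):
    print(cluster["cluster_name"], cluster["state"])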

Posted 2 months ago

Apply

5 - 8 years

5 - 9 Lacs

Bengaluru

Work from Office

Wipro Limited (NYSE: WIT, BSE: 507685, NSE: WIPRO) is a leading technology services and consulting company focused on building innovative solutions that address clients' most complex digital transformation needs. Leveraging our holistic portfolio of capabilities in consulting, design, engineering, and operations, we help clients realize their boldest ambitions and build future-ready, sustainable businesses. With over 230,000 employees and business partners across 65 countries, we deliver on the promise of helping our customers, colleagues, and communities thrive in an ever-changing world. For additional information, visit us at www.wipro.com.

About The Role

Role Purpose: The purpose of the role is to support process delivery by ensuring the daily performance of the Production Specialists, resolving technical escalations, and developing technical capability within the Production Specialists.

Do:
- Oversee and support the process by reviewing daily transactions on performance parameters.
- Review the performance dashboard and the scores for the team.
- Support the team in improving performance parameters by providing technical support and process guidance.
- Record, track, and document all queries received, problem-solving steps taken, and total successful and unsuccessful resolutions.
- Ensure standard processes and procedures are followed to resolve all client queries.
- Resolve client queries per the SLAs defined in the contract.
- Develop an understanding of the process/product for team members to facilitate better client interaction and troubleshooting.
- Document and analyze call logs to spot the most frequent trends and prevent future problems.
- Identify red flags and escalate serious client issues to the team leader in cases of untimely resolution.
- Ensure all product information and disclosures are given to clients before and after the call/email requests.
- Avoid legal challenges by monitoring compliance with service agreements.
- Handle technical escalations through effective diagnosis and troubleshooting of client queries.
- Manage and resolve technical roadblocks/escalations per SLA and quality requirements; if unable to resolve an issue, escalate it to TA & SES in a timely manner.
- Provide product support and resolution to clients by performing question diagnosis while guiding users through step-by-step solutions.
- Troubleshoot all client queries in a user-friendly, courteous, and professional manner.
- Offer alternative solutions to clients (where appropriate) with the objective of retaining customers' and clients' business.
- Organize ideas and effectively communicate oral messages appropriate to listeners and situations.
- Follow up and make scheduled callbacks to customers to record feedback and ensure compliance with contract SLAs.
- Build people capability to ensure operational excellence and maintain superior customer service levels for the existing account/client.
- Mentor and guide Production Specialists on improving technical knowledge.
- Collate trainings to be conducted as triage to bridge skill gaps identified through interviews with the Production Specialists.
- Develop and conduct trainings (triages) within products for Production Specialists as per target, and inform the client about the triages being conducted.
- Undertake product trainings to stay current with product features, changes, and updates; enroll in product-specific and any other trainings per client requirements/recommendations.
- Identify and document the most common problems and recommend appropriate resolutions to the team.
- Update job knowledge by participating in self-learning opportunities and maintaining personal networks.

Deliver:
No. | Performance Parameter | Measure
1 | Process | No. of cases resolved per day; compliance to process and quality standards; meeting process-level SLAs; Pulse score; customer feedback; NSAT/ESAT
2 | Team Management | Productivity, efficiency, absenteeism
3 | Capability Development | Triages completed; technical test performance

Mandatory Skills: Hadoop. Experience: 5-8 years.

Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention: of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA: as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention. Come to Wipro. Realize your ambitions. Applications from people with disabilities are explicitly welcome.

Posted 2 months ago

Apply

5 - 8 years

6 - 10 Lacs

Bengaluru

Work from Office

About The Role

Role Purpose: The purpose of this role is to design, test, and maintain software programs for operating systems or applications to be deployed at a client end, and to ensure they meet 100% quality assurance parameters.

Big Data Developer - Spark, Scala, PySpark (coding & scripting)
Years of Experience: 5 to 12 years. Location: Bangalore. Notice Period: 0 to 30 days.

Key Skills:
- Proficient in Spark, Scala, and PySpark coding & scripting
- Fluent in big data engineering development using the Hadoop/Spark ecosystem
- Hands-on experience in Big Data
- Good knowledge of the Hadoop ecosystem
- Knowledge of AWS cloud architecture
- Data ingestion and integration into the data lake using Hadoop-ecosystem tools such as Sqoop, Spark, Impala, Hive, Oozie, Airflow, etc.
- Fluent in the Python and/or Scala language
- Strong communication skills

2. Perform coding and ensure optimal software/module development
- Determine operational feasibility by evaluating analysis, problem definition, requirements, software development, and proposed software.
- Develop and automate processes for software validation by setting up and designing test cases/scenarios/usage cases and executing them.
- Modify software to fix errors, adapt it to new hardware, improve its performance, or upgrade interfaces.
- Analyze information to recommend and plan the installation of new systems or modifications of an existing system.
- Ensure that code is error-free, with no bugs or test failures.
- Prepare reports on programming project specifications, activities, and status.
- Ensure all defects are raised per the norms defined for the project/program/account, with clear descriptions and replication patterns.
- Compile timely, comprehensive, and accurate documentation and reports as requested.
- Coordinate with the team on daily project status and progress, and document it.
- Provide feedback on usability and serviceability, trace results to quality risk, and report them to concerned stakeholders.

3. Status reporting and customer focus on an ongoing basis with respect to the project and its execution
- Capture all requirements and clarifications from the client for better-quality work.
- Take feedback on a regular basis to ensure smooth and on-time delivery.
- Participate in continuing education and training to remain current on best practices, learn new programming languages, and better assist other team members.
- Consult with engineering staff to evaluate software-hardware interfaces and develop specifications and performance requirements.
- Document and demonstrate solutions by developing documentation, flowcharts, layouts, diagrams, charts, code comments, and clear code.
- Document necessary details and reports formally for a proper understanding of the software from client proposal to implementation.
- Ensure good quality of interaction with the customer with respect to e-mail content, fault report tracking, voice calls, business etiquette, etc.
- Respond to customer requests in a timely manner, with no instances of complaints either internally or externally.

Deliver:
No. | Performance Parameter | Measure
1 | Continuous integration, deployment & monitoring of software | 100% error-free onboarding & implementation; throughput %; adherence to the schedule/release plan
2 | Quality & CSAT | On-time delivery; manage software; troubleshoot queries; customer experience; completion of assigned certifications for skill upgradation
3 | MIS & Reporting | 100% on-time MIS & report generation

Mandatory Skills: Python for Insights. Experience: 5-8 years.

Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention: of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA: as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention. Come to Wipro. Realize your ambitions. Applications from people with disabilities are explicitly welcome.

Posted 2 months ago

Apply

5 - 8 years

6 - 10 Lacs

Pune

Work from Office

About The Role

Role Purpose: The purpose of the role is to support process delivery by ensuring the daily performance of the Production Specialists, resolving technical escalations, and developing technical capability within the Production Specialists.

Do:
- Oversee and support the process by reviewing daily transactions on performance parameters.
- Review the performance dashboard and the scores for the team.
- Support the team in improving performance parameters by providing technical support and process guidance.
- Record, track, and document all queries received, problem-solving steps taken, and total successful and unsuccessful resolutions.
- Ensure standard processes and procedures are followed to resolve all client queries.
- Resolve client queries per the SLAs defined in the contract.
- Develop an understanding of the process/product for team members to facilitate better client interaction and troubleshooting.
- Document and analyze call logs to spot the most frequent trends and prevent future problems.
- Identify red flags and escalate serious client issues to the team leader in cases of untimely resolution.
- Ensure all product information and disclosures are given to clients before and after the call/email requests.
- Avoid legal challenges by monitoring compliance with service agreements.
- Handle technical escalations through effective diagnosis and troubleshooting of client queries.
- Manage and resolve technical roadblocks/escalations per SLA and quality requirements; if unable to resolve an issue, escalate it to TA & SES in a timely manner.
- Provide product support and resolution to clients by performing question diagnosis while guiding users through step-by-step solutions.
- Troubleshoot all client queries in a user-friendly, courteous, and professional manner.
- Offer alternative solutions to clients (where appropriate) with the objective of retaining customers' and clients' business.
- Organize ideas and effectively communicate oral messages appropriate to listeners and situations.
- Follow up and make scheduled callbacks to customers to record feedback and ensure compliance with contract SLAs.
- Build people capability to ensure operational excellence and maintain superior customer service levels for the existing account/client.
- Mentor and guide Production Specialists on improving technical knowledge.
- Collate trainings to be conducted as triage to bridge skill gaps identified through interviews with the Production Specialists.
- Develop and conduct trainings (triages) within products for Production Specialists as per target, and inform the client about the triages being conducted.
- Undertake product trainings to stay current with product features, changes, and updates; enroll in product-specific and any other trainings per client requirements/recommendations.
- Identify and document the most common problems and recommend appropriate resolutions to the team.
- Update job knowledge by participating in self-learning opportunities and maintaining personal networks.

Deliver:
No. | Performance Parameter | Measure
1 | Process | No. of cases resolved per day; compliance to process and quality standards; meeting process-level SLAs; Pulse score; customer feedback; NSAT/ESAT
2 | Team Management | Productivity, efficiency, absenteeism
3 | Capability Development | Triages completed; technical test performance

Mandatory Skills: Big Data. Experience: 5-8 years.

Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention: of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA: as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention. Come to Wipro. Realize your ambitions. Applications from people with disabilities are explicitly welcome.

Posted 2 months ago

Apply

5 - 10 years

20 - 35 Lacs

Bengaluru

Work from Office

Position: PySpark Developer. Experience: 5+ years. Location: Bangalore. Notice Period: Immediate to 30 days.

Roles & Responsibilities:
- 5+ years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform.
- PySpark: advanced proficiency, including working with RDDs, DataFrames, and optimization techniques.
- Cloudera Data Platform: strong experience with CDP components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
- Data warehousing: knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
- Big data technologies: familiarity with Hadoop, Kafka, and other distributed computing tools.
- Orchestration and scheduling: experience with Apache Oozie, Airflow, or similar orchestration frameworks.
- Scripting and automation: strong scripting skills in Linux.
- Data pipeline development: design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy.
- Data ingestion: implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP.
- Data transformation and processing: use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements.
- Performance optimization: conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing the runtime of ETL processes.
- Data quality and validation: implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline (see the sketch below).
- Automation and orchestration: automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem.
- Monitoring and maintenance: monitor pipeline performance, troubleshoot issues, and perform routine maintenance on the Cloudera Data Platform and associated data processes.
- Collaboration: work closely with other data engineers, analysts, product managers, and other stakeholders to understand data requirements and support various data-driven initiatives.
- Documentation: maintain thorough documentation of data engineering processes, code, and pipeline configurations.
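
A small PySpark sketch of the inline data-quality checks described above: validate a batch before it is published, failing fast on violations. The table names, key column, and rules are hypothetical.

# Gate a batch behind simple integrity checks before publishing (sketch).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-sketch").enableHiveSupport().getOrCreate()

batch = spark.table("staging.daily_transactions")  # assumed staging table

total = batch.count()
null_keys = batch.filter(F.col("transaction_id").isNull()).count()
dupes = total - batch.dropDuplicates(["transaction_id"]).count()

# Fail the pipeline run if integrity rules are violated.
assert total > 0, "empty batch: upstream ingestion may have failed"
assert null_keys == 0, f"{null_keys} rows with null transaction_id"
assert dupes == 0, f"{dupes} duplicate transaction_id values"

# Publish only after the checks pass.
batch.write.mode("append").saveAsTable("curated.daily_transactions")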

Posted 2 months ago

Apply

12 - 16 years

35 - 40 Lacs

Bengaluru

Work from Office

As an AWS Data Engineer, you will play a crucial role in the design, development, and maintenance of our data infrastructure. Your work will empower data-driven decision-making and contribute to the success of our data-driven initiatives. You will design and maintain scalable data pipelines using AWS data analytics services, enabling efficient data processing and analytics.

Key Responsibilities:
- Highly experienced in developing ETL pipelines using AWS Glue and EMR with PySpark/Scala.
- Utilize AWS services (S3, Glue, Lambda, EMR, Step Functions) for data solutions.
- Design scalable data models for analytics and reporting.
- Implement data validation, quality, and governance practices.
- Optimize Spark jobs for cost and performance efficiency.
- Automate ETL workflows with AWS Step Functions and Lambda.
- Collaborate with data scientists and analysts on data needs.
- Maintain documentation for data architecture and pipelines.
- Experience with open-source big data table formats such as Iceberg, Delta, or Hudi.
- Desirable: experience provisioning AWS data analytics resources with Terraform.

Must-Have Skills: AWS (S3, Glue, Lambda, EMR), PySpark or Scala, SQL, ETL development.
Good-to-Have Skills: Snowflake, Cloudera Hadoop (HDFS, Hive, Impala), Iceberg.
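
A minimal sketch of a Glue ETL job of the kind this listing (and its duplicates for other cities below) describes, using the awsglue library available inside the Glue runtime. The S3 paths are placeholders.

# Skeleton of an AWS Glue PySpark job (sketch).
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw data from S3 (placeholder bucket/prefix).
raw = spark.read.json("s3://example-raw-bucket/events/")

# Simple transformation step: keep valid records only.
clean = raw.filter(raw.event_id.isNotNull())

# Write curated output as Parquet (placeholder destination).
clean.write.mode("overwrite").parquet("s3://example-curated-bucket/events/")

job.commit()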

Posted 2 months ago

Apply

12 - 16 years

35 - 40 Lacs

Chennai

Work from Office

As an AWS Data Engineer, you will play a crucial role in the design, development, and maintenance of our data infrastructure. Your work will empower data-driven decision-making and contribute to the success of our data-driven initiatives. You will design and maintain scalable data pipelines using AWS data analytics services, enabling efficient data processing and analytics.

Key Responsibilities:
- Highly experienced in developing ETL pipelines using AWS Glue and EMR with PySpark/Scala.
- Utilize AWS services (S3, Glue, Lambda, EMR, Step Functions) for data solutions.
- Design scalable data models for analytics and reporting.
- Implement data validation, quality, and governance practices.
- Optimize Spark jobs for cost and performance efficiency.
- Automate ETL workflows with AWS Step Functions and Lambda.
- Collaborate with data scientists and analysts on data needs.
- Maintain documentation for data architecture and pipelines.
- Experience with open-source big data table formats such as Iceberg, Delta, or Hudi.
- Desirable: experience provisioning AWS data analytics resources with Terraform.

Must-Have Skills: AWS (S3, Glue, Lambda, EMR), PySpark or Scala, SQL, ETL development.
Good-to-Have Skills: Snowflake, Cloudera Hadoop (HDFS, Hive, Impala), Iceberg.

Posted 2 months ago

Apply

12 - 16 years

35 - 40 Lacs

Mumbai

Work from Office

As an AWS Data Engineer, you will play a crucial role in the design, development, and maintenance of our data infrastructure. Your work will empower data-driven decision-making and contribute to the success of our data-driven initiatives. You will design and maintain scalable data pipelines using AWS data analytics services, enabling efficient data processing and analytics.

Key Responsibilities:
- Highly experienced in developing ETL pipelines using AWS Glue and EMR with PySpark/Scala.
- Utilize AWS services (S3, Glue, Lambda, EMR, Step Functions) for data solutions.
- Design scalable data models for analytics and reporting.
- Implement data validation, quality, and governance practices.
- Optimize Spark jobs for cost and performance efficiency.
- Automate ETL workflows with AWS Step Functions and Lambda.
- Collaborate with data scientists and analysts on data needs.
- Maintain documentation for data architecture and pipelines.
- Experience with open-source big data table formats such as Iceberg, Delta, or Hudi.
- Desirable: experience provisioning AWS data analytics resources with Terraform.

Must-Have Skills: AWS (S3, Glue, Lambda, EMR), PySpark or Scala, SQL, ETL development.
Good-to-Have Skills: Snowflake, Cloudera Hadoop (HDFS, Hive, Impala), Iceberg.

Posted 2 months ago

Apply

12 - 16 years

35 - 40 Lacs

Kolkata

Work from Office

As an AWS Data Engineer, you will play a crucial role in the design, development, and maintenance of our data infrastructure. Your work will empower data-driven decision-making and contribute to the success of our data-driven initiatives. You will design and maintain scalable data pipelines using AWS data analytics services, enabling efficient data processing and analytics.

Key Responsibilities:
- Highly experienced in developing ETL pipelines using AWS Glue and EMR with PySpark/Scala.
- Utilize AWS services (S3, Glue, Lambda, EMR, Step Functions) for data solutions.
- Design scalable data models for analytics and reporting.
- Implement data validation, quality, and governance practices.
- Optimize Spark jobs for cost and performance efficiency.
- Automate ETL workflows with AWS Step Functions and Lambda.
- Collaborate with data scientists and analysts on data needs.
- Maintain documentation for data architecture and pipelines.
- Experience with open-source big data table formats such as Iceberg, Delta, or Hudi.
- Desirable: experience provisioning AWS data analytics resources with Terraform.

Must-Have Skills: AWS (S3, Glue, Lambda, EMR), PySpark or Scala, SQL, ETL development.
Good-to-Have Skills: Snowflake, Cloudera Hadoop (HDFS, Hive, Impala), Iceberg.

Posted 2 months ago

Apply

7 - 11 years

50 - 60 Lacs

Mumbai, Delhi / NCR, Bengaluru

Work from Office

Role: Resident Solution Architect
Location: Remote

The Solution Architect at Koantek builds secure, highly scalable big data solutions to achieve tangible, data-driven outcomes, all while keeping simplicity and operational effectiveness in mind. This role collaborates with teammates, product teams, and cross-functional project teams to lead the adoption and integration of the Databricks Lakehouse Platform into the enterprise ecosystem and AWS/Azure/GCP architecture. The role is responsible for implementing securely architected big data solutions that are operationally reliable and performant and that deliver on strategic initiatives.

Specific requirements for the role include:
- Expert-level knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake (see the Delta sketch below).
- Expert-level hands-on coding experience in Python, SQL, Spark/Scala, or PySpark.
- In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib.
- IoT/event-driven/microservices in the cloud: experience with private and public cloud architectures, their pros/cons, and migration considerations.
- Extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services.
- Extensive hands-on experience with the industry technology stack for data management, ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.
- Experience using Azure DevOps and CI/CD as well as Agile tools and processes, including Git, Jenkins, Jira, and Confluence.
- Experience creating tables, partitioning, bucketing, and loading and aggregating data using Spark SQL/Scala.
- Able to build ingestion to ADLS and enable a BI layer for analytics, with a strong understanding of data modeling and defining conceptual, logical, and physical data models.
- Proficient-level experience with architecture design, build, and optimization of big data collection, ingestion, storage, processing, and visualization.

Responsibilities:
- Work closely with team members to lead and drive enterprise solutions, advising on key decision points, trade-offs, best practices, and risk mitigation.
- Guide customers in transforming big data projects, including development and deployment of big data and AI applications.
- Promote, emphasize, and leverage big data solutions to deploy performant systems that appropriately auto-scale, are highly available, fault-tolerant, self-monitoring, and serviceable.
- Use a defense-in-depth approach in designing data solutions and AWS/Azure/GCP infrastructure.
- Assist and advise data engineers in the preparation and delivery of raw data for prescriptive and predictive modeling.
- Aid developers in identifying, designing, and implementing process improvements with automation tools to optimize data delivery.
- Implement processes and systems to monitor data quality and security, ensuring production data is accurate and available for key stakeholders and the business processes that depend on it.
- Employ change-management best practices to ensure that data remains readily accessible to the business.
- Implement reusable design templates and solutions to integrate, automate, and orchestrate cloud operational needs; experience with MDM using data governance solutions.

Qualifications:
- Overall experience of 12+ years in the IT field.
- Hands-on experience designing and implementing multi-tenant solutions using Azure Databricks for data governance, data pipelines for near-real-time data warehouses, and machine learning solutions.
- Design and development experience with scalable and cost-effective Microsoft Azure/AWS/GCP data architecture and related solutions.
- Experience in software development, data engineering, or data analytics using Python, Scala, Spark, Java, or equivalent technologies.
- Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience.

Good to have (advanced technical certifications):
- Azure Solutions Architect Expert
- AWS Certified Data Analytics; DASCA Big Data Engineering and Analytics
- AWS Certified Cloud Practitioner, Solutions Architect
- Professional Google Cloud Certified

Location: Mumbai, Delhi / NCR, Bengaluru, Kolkata, Chennai, Hyderabad, Ahmedabad, Pune, Remote
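
A brief sketch of the Delta Lake upsert pattern relevant to the Lakehouse work above, using the delta-spark package's DeltaTable API. The paths and column names are placeholders.

# Atomic upsert into a Delta table with MERGE (sketch).
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("delta-merge-sketch")
    # These two configs enable Delta Lake on a stock PySpark session.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

updates = spark.read.parquet("/data/customer_updates")  # placeholder source

target = DeltaTable.forPath(spark, "/lake/customers")   # placeholder Delta path

# Update matching rows, insert new ones, in a single atomic transaction.
(
    target.alias("t")
    .merge(updates.alias("u"), "t.customer_id = u.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)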

Posted 2 months ago

Apply

6 - 11 years

20 - 25 Lacs

Hyderabad

Hybrid

- 6+ years of total IT experience.
- 3+ years of experience with Hadoop (Cloudera) / big data technologies.
- Knowledge of the Hadoop ecosystem and big data technologies.
- Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr).
- Experience designing and developing data pipelines for data ingestion or transformation using Java, Scala, or Python (see the streaming sketch below).
- Experience with Spark programming (PySpark, Scala, or Java).
- Hands-on experience with Python/PySpark/Scala and basic libraries for machine learning is required.
- Proficient in programming in Java or Python; prior Apache Beam/Spark experience is a plus.
- Hands-on experience with CI/CD, scheduling, and scripting; ensure automation through CI/CD across platforms, both in the cloud and on-premises.
- System-level understanding of data structures, algorithms, and distributed storage & compute.
- A can-do attitude toward solving complex business problems, with good interpersonal and teamwork skills.
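
A minimal sketch of a Spark Structured Streaming ingestion pipeline from Kafka, the kind of Hadoop-ecosystem pipeline this listing describes. The broker address, topic, and checkpoint/output paths are placeholders, and the spark-sql-kafka connector package is assumed to be available on the classpath.

# Stream Kafka events into a Parquet landing zone (sketch).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

stream = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker.example.com:9092")  # placeholder
    .option("subscribe", "events")                                 # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; decode before processing.
decoded = stream.select(
    F.col("key").cast("string"),
    F.col("value").cast("string"),
    "timestamp",
)

query = (
    decoded.writeStream
    .format("parquet")
    .option("path", "/lake/raw/events")            # placeholder sink
    .option("checkpointLocation", "/lake/chk/events")
    .start()
)
query.awaitTermination()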

Posted 2 months ago

Apply

2 - 5 years

14 - 17 Lacs

Hyderabad

Work from Office

As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities such as creating source-to-target pipelines/workflows and implementing solutions that address the client's needs.

Your primary responsibilities include:
- Design, build, optimize, and support new and existing data models and ETL processes based on our clients' business requirements.
- Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing data-driven organization.
- Coordinate data access and security so that data scientists and analysts can easily access data whenever they need it.

Required education: Bachelor's Degree. Preferred education: Master's Degree.

Required technical and professional expertise:
- Must have 5+ years of experience in Big Data: Hadoop, Spark (Scala, Python), HBase, Hive.
- Good to have: AWS (S3, Athena, DynamoDB, Lambda), Jenkins, Git.
- Experience developing Python and PySpark programs for data analysis.
- Good working experience using Python to develop a custom framework for generating rules (similar to a rules engine).
- Experience developing Python code to gather data from HBase and implementing solutions with PySpark.
- Experience applying business transformations with Apache Spark DataFrames/RDDs and using HiveContext objects for read/write operations.

Preferred technical and professional experience:
- Understanding of DevOps.
- Experience building scalable end-to-end data ingestion and processing solutions.
- Experience with object-oriented and/or functional programming languages such as Python, Java, and Scala.

Posted 2 months ago

Apply