
8521 PySpark Jobs - Page 27

JobPe aggregates listings for easy access, but you apply directly on the original job portal.

1.0 years

4 - 8 Lacs

Bengaluru

On-site

Department: Command Centre
Job posted on: Jul 28, 2025
Employee Type: Permanent
Experience range (Years): 1 year - 2 years

Job Description: Data Analyst
Location: Bengaluru

Job Summary: We are seeking a highly motivated Data Analyst to join our growing team. This role involves analyzing complex datasets, generating actionable insights, and supporting data-driven decision-making across various business functions. You will work closely with cross-functional teams to help optimize performance and improve business outcomes.

Key Responsibilities:
- Work closely with big data engineering teams on data availability and quality.
- Analyze large and complex datasets to identify trends, patterns, and actionable insights.
- Track KPIs and performance metrics to support operational and strategic decision-making.
- Translate business needs into data analysis problems and deliver clear, actionable insights.
- Conduct root cause analysis on business challenges using structured data approaches.
- Communicate data insights effectively through presentations and reports.
- Identify gaps in data and opportunities for process automation.
- Develop and maintain documentation of reports, dashboards, and analytics processes.

Qualifications:
- Bachelor's degree in Engineering, Statistics, Computer Science, Business, Economics, or a related field.
- 1-2+ years of professional experience in a Data Analyst or Business Analyst role.
- Proficiency in SQL is mandatory.
- Experience with Python (Pandas, NumPy)/R for data analysis is mandatory.
- Strong Excel/Google Sheets skills.
- Experience with data visualization tools (Tableau, Power BI, Looker, or Superset) is an additional plus.
- Basic understanding of statistical methods (descriptive stats, hypothesis testing).
- Knowledge of PySpark is an additional plus.

Key Skills: Analytical Thinking & Problem-Solving, Communication & Presentation Skills, Data Storytelling, Attention to Detail
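Purely as an illustration of the Pandas-based KPI analysis this listing asks for (not part of the posting): the sketch below assumes a hypothetical orders.csv with order_date, region, and revenue columns.

```python
# Illustrative only: a minimal Pandas KPI roll-up of the kind described above.
# The file name and columns (order_date, region, revenue) are hypothetical.
import pandas as pd

orders = pd.read_csv("orders.csv", parse_dates=["order_date"])

# Monthly revenue per region, a typical KPI tracked for operational reviews
monthly_kpi = (
    orders
    .assign(month=orders["order_date"].dt.to_period("M"))
    .groupby(["month", "region"], as_index=False)["revenue"]
    .sum()
    .rename(columns={"revenue": "monthly_revenue"})
)

print(monthly_kpi.head())
```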

Posted 1 week ago

Apply

4.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Workmode: Hybrid
Work location: PAN INDIA
Work Timing: 2 PM to 11 PM
Primary Skill: Data Engineer

- Experience in data engineering, with a proven focus on data ingestion and extraction using Python/PySpark.
- Extensive AWS experience is mandatory, with proficiency in Glue, Lambda, SQS, SNS, AWS IAM, AWS Step Functions, S3, and RDS (Oracle, Aurora Postgres).
- 4+ years of experience working with both relational and non-relational/NoSQL databases is required.
- Strong SQL experience is necessary, demonstrating the ability to write complex queries from scratch. Experience in Redshift is also required, along with other SQL DB experience.
- Strong scripting experience with the ability to build intricate data pipelines using AWS serverless architecture.
- Understanding of building an end-to-end data pipeline.

Secondary Skills:
- Strong understanding of Kinesis, Kafka, CDK. Experience with Kafka and ECS is also required.
- A strong understanding of data concepts related to data warehousing, business intelligence (BI), data security, data quality, and data profiling is required.
- Experience in Node.js and CDK.

JD Responsibilities:
- Lead the architectural design and development of a scalable, reliable, and flexible metadata-driven data ingestion and extraction framework on AWS using Python/PySpark.
- Design and implement a customizable data processing framework using Python/PySpark. This framework should be capable of handling diverse scenarios and evolving data processing requirements.
- Implement data pipelines for data ingestion, transformation, and extraction leveraging AWS Cloud Services.
- Seamlessly integrate a variety of AWS services, including S3, Glue, Kafka, Lambda, SQL, SNS, Athena, EC2, RDS (Oracle, Postgres, MySQL), and AWS Crawler, to construct a highly scalable and reliable data ingestion and extraction pipeline.
- Facilitate configuration and extensibility of the framework to adapt to evolving data needs and processing scenarios.
- Develop and maintain rigorous data quality checks and validation processes to safeguard the integrity of ingested data.
- Implement robust error handling, logging, monitoring, and alerting mechanisms to ensure the reliability of the entire data pipeline.

Qualifications
Must have:
- Over 6 years of hands-on experience in data engineering, with a proven focus on data ingestion and extraction using Python/PySpark.
- Extensive AWS experience is mandatory, with proficiency in Glue, Lambda, SQS, SNS, AWS IAM, AWS Step Functions, S3, and RDS (Oracle, Aurora Postgres).
- 4+ years of experience working with both relational and non-relational/NoSQL databases is required.
- Strong SQL experience is necessary, demonstrating the ability to write complex queries from scratch. Strong working experience in Redshift is required, along with other SQL DB experience.
- Strong scripting experience with the ability to build intricate data pipelines using AWS serverless architecture.
- Complete understanding of building an end-to-end data pipeline.

Nice to have:
- Strong understanding of Kinesis, Kafka, CDK.
- A strong understanding of data concepts related to data warehousing, business intelligence (BI), data security, data quality, and data profiling.
- Experience in Node.js and CDK.
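As a hedged illustration of the metadata-driven ingestion this posting describes, the sketch below shows one configuration record driving a PySpark read from S3 and a partitioned Parquet write. Every bucket, path, and column name is a placeholder, not a detail of the actual framework.

```python
# Hypothetical sketch of one metadata-driven ingestion step. In a real framework
# the config record would come from a metadata store (e.g. RDS or DynamoDB).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("metadata-driven-ingestion").getOrCreate()

source_config = {
    "format": "csv",
    "path": "s3://raw-bucket/sales/",
    "options": {"header": "true", "inferSchema": "true"},
    "target_path": "s3://curated-bucket/sales/",
    "partition_by": "ingest_date",
}

# Read the source as described by the config record
df = (
    spark.read.format(source_config["format"])
    .options(**source_config["options"])
    .load(source_config["path"])
)

# Stamp an ingestion date so the write can be partitioned on it
df = df.withColumn("ingest_date", F.current_date())

(
    df.write.mode("overwrite")
    .partitionBy(source_config["partition_by"])
    .parquet(source_config["target_path"])
)
```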

Posted 1 week ago

Apply

5.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Overview: TekWissen is a global workforce management provider throughout India and many other countries in the world. The client below is a global company with shared ideals and a deep sense of family. From our earliest days as a pioneer of modern transportation, we have sought to make the world a better place – one that benefits lives, communities and the planet.

Job Title: Software Engineer Senior
Location: Chennai
Work Type: Hybrid

Position Description: As part of the client's DP&E Platform Observability team, you'll help build a top-tier monitoring platform focused on latency, traffic, errors, and saturation. You'll design, develop, and maintain a scalable, reliable platform, improving MTTR/MTTX, creating dashboards, and optimizing costs. Experience with large systems, monitoring tools (Prometheus, Grafana, etc.), and cloud platforms (AWS, Azure, GCP) is ideal. The focus is a centralized observability source for data-driven decisions and faster incident response.

Skills Required: Spring Boot, Angular, Cloud Computing
Skills Preferred: Google Cloud Platform - BigQuery, Dataflow, Dataproc, Data Fusion, Terraform, Tekton, Cloud SQL, Airflow, Postgres, PySpark, Python, API

Experience Required:
- 5+ years of overall experience with proficiency in Java, Angular or any JavaScript technology, with experience in designing and deploying cloud-based data pipelines and microservices using GCP tools like BigQuery, Dataflow, and Dataproc.
- Ability to leverage best-in-class data platform technologies (Apache Beam, Kafka, ...) to deliver platform features, and design & orchestrate platform services to deliver data platform capabilities.
- Service-Oriented Architecture and Microservices: Strong understanding of SOA, microservices, and their application within a cloud data platform context. Develop robust, scalable services using Java Spring Boot, Python, Angular, and GCP technologies.
- Full-Stack Development: Knowledge of front-end and back-end technologies, enabling collaboration on data access and visualization layers (e.g., React, Node.js). Design and develop RESTful APIs for seamless integration across platform services. Implement robust unit and functional tests to maintain high standards of test coverage and quality.
- Database Management: Experience with relational (e.g., PostgreSQL, MySQL) and NoSQL databases, as well as columnar databases like BigQuery.
- Data Governance and Security: Understanding of data governance frameworks and implementing RBAC, encryption, and data masking in cloud environments.
- CI/CD and Automation: Familiarity with CI/CD pipelines, Infrastructure as Code (IaC) tools like Terraform, and automation frameworks. Manage code changes with GitHub and troubleshoot and resolve application defects efficiently. Ensure adherence to SDLC best practices, independently managing feature design, coding, testing, and production releases.
- Problem-Solving: Strong analytical skills with the ability to troubleshoot complex data platform and microservices issues.

Experience Preferred: GCP Data Engineer, GCP Professional Cloud
Education Required: Bachelor's Degree

TekWissen® Group is an equal opportunity employer supporting workforce diversity.

Posted 1 week ago

Apply

0 years

6 - 7 Lacs

Noida

On-site

Posted On: 27 Jul 2025
Location: Noida, UP, India
Company: Iris Software

Why Join Us? Are you inspired to grow your career at one of India's Top 25 Best Workplaces in the IT industry? Do you want to do the best work of your life at one of the fastest-growing IT services companies? Do you aspire to thrive in an award-winning work culture that values your talent and career aspirations? It's happening right here at Iris Software.

About Iris Software: At Iris Software, our vision is to be our client's most trusted technology partner, and the first choice for the industry's top professionals to realize their full potential. With over 4,300 associates across India, U.S.A., and Canada, we help our enterprise clients thrive with technology-enabled transformation across financial services, healthcare, transportation & logistics, and professional services. Our work covers complex, mission-critical applications with the latest technologies, such as high-value complex Application & Product Engineering, Data & Analytics, Cloud, DevOps, Data & MLOps, Quality Engineering, and Business Automation.

Working at Iris: Be valued, be inspired, be your best. At Iris Software, we invest in and create a culture where colleagues feel valued, can explore their potential, and have opportunities to grow. Our employee value proposition (EVP) is about "Being Your Best" – as a professional and person. It is about being challenged by work that inspires us, being empowered to excel and grow in your career, and being part of a culture where talent is valued. We're a place where everyone can discover and be their best version.

Job Description: Key skills needed are: 1. SQL, 2. Python, 3. PySpark and Machine Learning.

Mandatory Competencies:
- Data Science and Machine Learning - Data Science and Machine Learning - Amazon Machine Learning
- Data Science and Machine Learning - Data Science and Machine Learning - Python
- Beh - Communication
- Big Data - Big Data - PySpark
- Database - Database Programming - SQL

Perks and Benefits for Irisians: At Iris Software, we offer world-class benefits designed to support the financial, health and well-being needs of our associates to help achieve harmony between their professional and personal growth. From comprehensive health insurance and competitive salaries to flexible work arrangements and ongoing learning opportunities, we're committed to providing a supportive and rewarding work environment. Join us and experience the difference of working at a company that values its employees' success and happiness.

Posted 1 week ago

Apply

6.0 years

14 - 24 Lacs

India

On-site

6+ years of experience as a Data Engineer. Strong proficiency in SQL. Hands-on experience with modern cloud data warehousing solutions (Snowflake, BigQuery, Redshift). Expertise in ETL/ELT processes, batch, and streaming data processing. Proven ability to troubleshoot data issues and propose effective solutions. Knowledge of AWS services (S3, DMS, Glue, Athena). Familiarity with DBT for data transformation and modeling. Must be fluent in English communication.

Desired Experience:
- 3 years of experience with additional AWS services (EC2, ECS, EKS, VPC, IAM).
- Knowledge of Infrastructure as Code (IaC) tools like Terraform and Terragrunt.
- Proficiency in Python for data engineering tasks.
- Experience with orchestration tools like Dagster, Airflow, or AWS Step Functions.
- Familiarity with pub-sub, queuing, and streaming frameworks (AWS Kinesis, Kafka, SQS, SNS).
- Experience with CI/CD pipelines and automation for data processes.

Skills: Python, Glue, automation, cloud data warehousing (Snowflake, BigQuery, Redshift), pub-sub frameworks (AWS Kinesis, Kafka, SQS, SNS), AWS (S3, DMS, Glue, Athena), ETL/ELT processes, orchestration tools (Dagster, Airflow, AWS Step Functions), batch processing, CI/CD pipelines, streaming data processing, Infrastructure as Code (IaC), SQL, DBT, PySpark, Lambda, Terragrunt, AWS services (EC2, ECS, EKS, VPC, IAM), Terraform.

Posted 1 week ago

Apply

3.0 years

0 Lacs

Mumbai, Maharashtra, India

On-site

Line of Service: Advisory
Industry/Sector: FS X-Sector
Specialism: Data, Analytics & AI
Management Level: Senior Associate

Job Description & Summary: At PwC, our people in data and analytics focus on leveraging data to drive insights and make informed business decisions. They utilise advanced analytics techniques to help clients optimise their operations and achieve their strategic goals. In business intelligence at PwC, you will focus on leveraging data and analytics to provide strategic insights and drive informed decision-making for clients. You will develop and implement innovative solutions to optimise business performance and enhance competitive advantage.

Why PwC: At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities. This purpose-led and values-driven work, powered by technology in an environment that drives innovation, will enable you to make a tangible impact in the real world. We reward your contributions, support your wellbeing, and offer inclusive benefits, flexibility programmes and mentorship that will help you thrive in work and life. Together, we grow, learn, care, collaborate, and create a future of infinite experiences for each other. Learn more about us.

At PwC, we believe in providing equal employment opportunities, without any discrimination on the grounds of gender, ethnic background, age, disability, marital status, sexual orientation, pregnancy, gender identity or expression, religion or other beliefs, perceived differences and status protected by law. We strive to create an environment where each one of our people can bring their true selves and contribute to their personal growth and the firm's growth. To enable this, we have zero tolerance for any discrimination and harassment based on the above considerations.

Responsibilities:
- Design, build, and maintain scalable data pipelines for a variety of cloud platforms including AWS, Azure, and Databricks.
- Implement data ingestion and transformation processes to facilitate efficient data warehousing.
- Utilize cloud services to enhance data processing capabilities:
  - AWS: Glue, Athena, Lambda, Redshift, Step Functions, DynamoDB, SNS.
  - Azure: Data Factory, Synapse Analytics, Functions, Cosmos DB, Event Grid, Logic Apps, Service Bus.
- Optimize Spark job performance to ensure high efficiency and reliability.
- Stay proactive in learning and implementing new technologies to improve data processing frameworks.
- Collaborate with cross-functional teams to deliver robust data solutions.
- Work on Spark Streaming for real-time data processing as necessary.

Qualifications:
- 3-8 years of experience in data engineering with a strong focus on cloud environments.
- Proficiency in PySpark or Spark is mandatory.
- Proven experience with data ingestion, transformation, and data warehousing.
- In-depth knowledge and hands-on experience with cloud services (AWS/Azure).
- Demonstrated ability in performance optimization of Spark jobs.
- Strong problem-solving skills and the ability to work independently as well as in a team.
- Cloud Certification (AWS, Azure) is a plus.
- Familiarity with Spark Streaming is a bonus.

Mandatory skill sets: PL/SQL Developer
Preferred skill sets: PL/SQL Developer
Years of experience required: 7+
Education qualification: BE/BTech/MBA/MCA
Education (if blank, degree and/or field of study not specified)
Degrees/Field of Study required: Bachelor of Technology, Master of Business Administration
Degrees/Field of Study preferred:
Certifications (if blank, certifications not specified)
Required Skills: Business Analyzer
Optional Skills: Accepting Feedback, Active Listening, Analytical Thinking, Business Case Development, Business Data Analytics, Business Intelligence and Reporting Tools (BIRT), Business Intelligence Development Studio, Communication, Competitive Advantage, Continuous Process Improvement, Creativity, Data Analysis and Interpretation, Data Architecture, Database Management System (DBMS), Data Collection, Data Pipeline, Data Quality, Data Science, Data Visualization, Embracing Change, Emotional Regulation, Empathy, Inclusion, Industry Trend Analysis {+ 16 more}
Desired Languages (If blank, desired languages not specified)
Travel Requirements: Not Specified
Available for Work Visa Sponsorship? No
Government Clearance Required? No
Job Posting End Date
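To make the "performance optimization of Spark jobs" requirement above concrete, here is a minimal, assumption-laden PySpark sketch: broadcasting a small dimension table avoids a shuffle-heavy join, and repartitioning on the write key keeps output skew in check. Paths and column names are invented for the example.

```python
# Hedged illustration of Spark join tuning; not taken from the posting.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("spark-tuning-example").getOrCreate()

facts = spark.read.parquet("/mnt/data/fact_transactions")   # large fact table
dims = spark.read.parquet("/mnt/data/dim_merchants")        # small lookup table

# Broadcasting the small side replaces a full shuffle join with a map-side join
joined = facts.join(F.broadcast(dims), on="merchant_id", how="left")

# Repartition on the write key so partition files stay balanced
(
    joined.repartition("txn_date")
    .write.mode("overwrite")
    .partitionBy("txn_date")
    .parquet("/mnt/data/curated_transactions")
)
```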

Posted 1 week ago

Apply

5.0 - 7.0 years

0 Lacs

Vadodara, Gujarat, India

On-site

We're reinventing the market research industry. Let's reinvent it together. At Numerator, we believe tomorrow's success starts with today's market intelligence. We empower the world's leading brands and retailers with unmatched insights into consumer behavior and the influencers that drive it.

We are seeking a highly skilled Senior Data Engineer with extensive experience in designing, building, and optimizing high-volume data pipelines. The ideal candidate will have strong expertise in Python, Databricks on Azure Cloud services, DevOps, and CI/CD tools, along with a solid understanding of AI/ML techniques and big data processing frameworks like Apache Spark and PySpark.

Responsibilities:
- Adhere to coding and Numerator technology standards
- Build suitable automation test suites within Azure DevOps
- Maintain and update automation test suites as required
- Carry out manual testing, load testing, and exploratory testing as required
- Work closely with Business Analysts and Senior Developers to consistently achieve sprint goals
- Assist in estimation of sprint-by-sprint stories and tasks
- Proactively take a responsible approach to product delivery

What You'll Bring to Numerator
Requirements:
- 5-7 years of experience in data engineering roles
- Good Python skills
- Experience working with Microsoft Azure Cloud
- Experience in Agile methodologies (Scrum/Kanban)
- Experience with Apache Spark, PySpark, Databricks
- Experience working with DevOps pipelines, preferably Azure DevOps

Preferred Qualifications:
- Bachelor's or master's degree in Computer Science, Information Technology, Data Science, or a related field
- Experience working in a support-focused role
- Certification in a relevant Data Engineering discipline or related field

Posted 1 week ago

Apply

7.0 - 11.0 years

0 Lacs

India

Remote

JD: AWS Data Engineer
Exp Range: 7 to 11 Years
Location: Remote
Shift Timings: 12 PM to 9 PM
Primary Skills: Python, PySpark, SQL, AWS

Responsibilities:
- Data Architecture: Develop and maintain the overall data architecture, ensuring scalability, performance, and data quality.
- AWS Data Services: Expertise in using AWS data services such as AWS Glue, S3, SNS, SES, DynamoDB, Redshift, CloudFormation, CloudWatch, IAM, DMS, EventBridge Scheduler, etc.
- Data Warehousing: Design and implement data warehouses on AWS, leveraging AWS Redshift or other suitable options.
- Data Lakes: Build and manage data lakes on AWS using AWS S3 and other relevant services.
- Data Pipelines: Design and develop efficient data pipelines to extract, transform, and load data from various sources.
- Data Quality: Implement data quality frameworks and best practices to ensure data accuracy, completeness, and consistency.
- Cloud Optimization: Optimize data engineering solutions for performance, cost-efficiency, and scalability on the AWS cloud.
- Team Leadership: Mentor and guide data engineers, ensuring they adhere to best practices and meet project deadlines.

Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 6-7 years of experience in data engineering roles, with a focus on AWS cloud platforms.
- Strong understanding of data warehousing and data lake concepts.
- Proficiency in SQL and at least one programming language (Python/PySpark).
- Good to have: experience with big data technologies like Hadoop, Spark, and Kafka.
- Knowledge of data modeling and data quality best practices.
- Excellent problem-solving, analytical, and communication skills.
- Ability to work independently and as part of a team.

Preferred Qualifications:
- Certifications in AWS Certified Data Analytics - Specialty or AWS Certified Solutions Architect - Data.

If interested, please submit your CV to Khushboo@Sourcebae.com or share it via WhatsApp at 8827565832. Stay updated with our latest job opportunities and company news by following us on LinkedIn: https://www.linkedin.com/company/sourcebae
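For readers unfamiliar with the AWS Glue work listed above, a minimal Glue PySpark job skeleton is sketched below. The catalog database, table name, and S3 output path are assumptions, not details from the posting.

```python
# Hedged sketch of an AWS Glue PySpark job: read from the Glue Data Catalog,
# deduplicate with Spark, and land curated Parquet in S3.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Source table registered in the Glue Data Catalog (names are placeholders)
dyf = glue_context.create_dynamic_frame.from_catalog(database="raw_db", table_name="orders")
df = dyf.toDF().dropDuplicates(["order_id"])

df.write.mode("overwrite").parquet("s3://curated-bucket/orders/")

job.commit()
```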

Posted 1 week ago

Apply

7.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Bangalore/Gurugram/Hyderabad YOE - 7+ years We are seeking a talented Data Engineer with strong expertise in Databricks, specifically in Unity Catalog, PySpark, and SQL, to join our data team. You’ll play a key role in building secure, scalable data pipelines and implementing robust data governance strategies using Unity Catalog. Key Responsibilities: Design and implement ETL/ELT pipelines using Databricks and PySpark. Work with Unity Catalog to manage data governance, access controls, lineage, and auditing across data assets. Develop high-performance SQL queries and optimize Spark jobs. Collaborate with data scientists, analysts, and business stakeholders to understand data needs. Ensure data quality and compliance across all stages of the data lifecycle. Implement best practices for data security and lineage within the Databricks ecosystem. Participate in CI/CD, version control, and testing practices for data pipelines Required Skills: Proven experience with Databricks and Unity Catalog (data permissions, lineage, audits). Strong hands-on skills with PySpark and Spark SQL. Solid experience writing and optimizing complex SQL queries. Familiarity with Delta Lake, data lakehouse architecture, and data partitioning. Experience with cloud platforms like Azure or AWS. Understanding of data governance, RBAC, and data security standards. Preferred Qualifications: Databricks Certified Data Engineer Associate or Professional. Experience with tools like Airflow, Git, Azure Data Factory, or dbt. Exposure to streaming data and real-time processing. Knowledge of DevOps practices for data engineering.
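A short, hypothetical example of the Unity Catalog tasks this role mentions: reading and writing through the three-level namespace and managing access with SQL GRANTs. Catalog, schema, table, and group names below are made up.

```python
# Illustrative Unity Catalog usage on Databricks; names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` already exists

# Three-level namespace: catalog.schema.table
df = spark.table("main.sales.orders")
(
    df.filter("order_status = 'CLOSED'")
    .write.mode("overwrite")
    .saveAsTable("main.sales.closed_orders")
)

# Access control handled in-place with SQL GRANTs on the governed table
spark.sql("GRANT SELECT ON TABLE main.sales.closed_orders TO `data-analysts`")
```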

Posted 1 week ago

Apply

4.0 - 9.0 years

4 - 8 Lacs

Pune

Work from Office

Experience: 4+ Years.
- Expertise in the Python language is a MUST.
- SQL (should be able to write complex SQL queries) is a MUST.
- Hands-on experience in Apache Flink Streaming or Spark Streaming is a MUST.
- Hands-on expertise in Apache Kafka is a MUST.
- Data Lake development experience.
- Orchestration (Apache Airflow is preferred).
- Spark and Hive: optimization of Spark/PySpark and Hive apps.
- Trino/(AWS Athena) (good to have).
- Snowflake (good to have).
- Data Quality (good to have).
- File Storage (S3 is good to have).

Our Offering:
- Global cutting-edge IT projects that shape the future of digital and have a positive impact on the environment.
- Wellbeing programs & work-life balance - integration and passion-sharing events.
- Attractive salary and company initiative benefits.
- Courses and conferences.
- Hybrid work culture.
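The Kafka plus Spark Structured Streaming combination this listing treats as mandatory typically looks like the following hedged sketch; the broker address, topic name, and event schema are placeholders.

```python
# Assumption-laden example: windowed aggregation over a Kafka topic with PySpark.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-streaming-example").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "payments")
    .load()
)

# Kafka delivers bytes; parse the JSON value into typed columns
events = raw.select(
    F.from_json(F.col("value").cast("string"), event_schema).alias("e")
).select("e.*")

# 5-minute tumbling windows with a watermark to bound late data
totals = (
    events.withWatermark("event_time", "10 minutes")
    .groupBy(F.window("event_time", "5 minutes"))
    .agg(F.sum("amount").alias("total_amount"))
)

query = totals.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```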

Posted 1 week ago

Apply

7.0 - 11.0 years

15 - 25 Lacs

Hyderabad

Hybrid

Role Purpose: The Senior Data Engineer will support and enable the Data Architecture and the Data Strategy, supporting solution architecture and engineering for data ingestion and modelling challenges. The role will support the deduplication of enterprise data tools, working with the Lonza Data Governance Board, Digital Council and IT to drive towards a single Data and Information Architecture. This will be a hands-on engineering role with a focus on business and digital transformation. The role will be responsible for managing and maintaining the Data Architecture and the solutions that deliver the platform, with operational support and troubleshooting. The Senior Data Engineer will also manage (no reporting-line changes, but from a day-to-day delivery perspective) and coordinate the Data Engineering team members (internal and external) working on the various project implementations.

Experience:
- 7-10 years' experience with digital transformation and data projects.
- Experience in designing, delivering and managing data infrastructures.
- Proficiency in using cloud services (Azure) for data engineering, storage and analytics.
- Strong SQL and NoSQL experience.
- Data modelling.
- Hands-on experience developing pipelines and setting up architectures in Azure Fabric.
- Team management experience (internal and external resources).
- Good understanding of data warehousing, data virtualization and analytics.
- Experience in working with data analysts, data scientists and BI teams to deliver on data requirements.
- Data Catalogue experience is a plus.
- ETL pipeline design is a plus.
- Python development skills are a plus.
- Real-time data ingestion (e.g. Kafka).

Licenses or Certifications: Beneficial: ITIL, PM, CSM, Six Sigma, Lean.

Knowledge:
- Good understanding of integration, ETL, API and data-sharing concepts.
- Understanding/awareness of visualization tools is a plus.
- Knowledge and understanding of relevant legal and regulatory requirements, such as CFR 21 Part 11, the EU General Data Protection Regulation, the Health Insurance Portability and Accountability Act (HIPAA) and the GxP validation process, would be a plus.

Skills: The position requires a pragmatic leader with sound knowledge of data, integration and analytics. Excellent written and verbal communication skills, interpersonal and collaborative skills, and the ability to communicate technical concepts to non-technical audiences. Exhibit excellent analytical skills, the ability to manage and contribute to multiple projects under strict timelines, and the ability to work well in a demanding, dynamic environment and meet overall objectives. Project management skills: scheduling and resource management are a plus. Ability to motivate cross-functional, interdisciplinary teams to achieve tactical and strategic goals. Data Catalogue, project and team management skills are a plus. Strong SAP skills are a plus.

Posted 1 week ago

Apply

5.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Qualification
Skills: 5+ years of experience with Java + Big Data as the minimum required skill. Java, Microservices, Spring Boot, API, Big Data (Hive, Spark, PySpark).

Role
Skills: 5+ years of experience with Java + Big Data as the minimum required skill. Java, Microservices, Spring Boot, API, Big Data (Hive, Spark, PySpark).

Experience: 5 to 7 years
Job Reference Number: 13049

Posted 1 week ago

Apply

6.0 - 11.0 years

22 - 25 Lacs

Hyderabad

Hybrid

Proficiency in Python and SQL Hands-on experience with big data processing using PySpark/Spark Experience with Snowflake or Databricks Familiarity with AWS services, particularly in data pipeline development (Glue, Athena, Crawler, S3, Lambda, Redshift, EMR) Strong preference will be given to candidates with experience in: End-to-end data pipeline design Operationalizing machine learning models

Posted 1 week ago

Apply

2.0 - 5.0 years

0 Lacs

India

Remote

Data Engineer (Remote) Experience Required: 2 to 5 years Location: Remote Budget: ~1.15 Lakh per month (for 4–5 years experience) ~1.00 Lakh per month (for 2 years experience) Key Skills: Azure Data Factory (ADF) Azure Databricks PySpark Job Description: We are looking for a skilled Data Engineer with 2–5 years of experience to join our remote team. The ideal candidate should have hands-on experience working with Azure Data Factory, Azure Databricks, and PySpark, and must be capable of building scalable and efficient data pipelines. Roles & Responsibilities: Design and build data pipelines using ADF and Databricks Optimize ETL processes for performance and scalability Work with cross-functional teams to gather requirements and deliver data solutions Ensure data quality and implement data governance practices Troubleshoot and debug pipeline issues
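As an illustrative (not prescriptive) sketch of the pipeline and data-quality work described above, the snippet below shows a simple PySpark quality gate of the sort that might run in a Databricks notebook triggered by an ADF pipeline; paths and column names are hypothetical.

```python
# Hedged example of a bronze-to-silver step with basic quality checks.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.read.format("delta").load("/mnt/bronze/customers")

# Simple quality rules: required key present and no duplicate customer_id
null_keys = df.filter(F.col("customer_id").isNull()).count()
duplicates = df.count() - df.dropDuplicates(["customer_id"]).count()

if null_keys or duplicates:
    raise ValueError(f"Data quality failed: {null_keys} null keys, {duplicates} duplicates")

df.write.format("delta").mode("overwrite").save("/mnt/silver/customers")
```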

Posted 1 week ago

Apply

6.0 - 8.0 years

0 Lacs

Indore, Madhya Pradesh, India

On-site

Qualification
- 6-8 years of good hands-on exposure with Big Data technologies – PySpark (DataFrame and SparkSQL), Hadoop, and Hive
- Good hands-on experience with Python and Bash scripts
- Good understanding of SQL and data warehouse concepts
- Strong analytical, problem-solving, data analysis and research skills
- Demonstrable ability to think outside of the box and not be dependent on readily available tools
- Excellent communication, presentation and interpersonal skills are a must
- Hands-on experience with using Cloud Platform provided Big Data technologies (i.e. IAM, Glue, EMR, Redshift, S3, Kinesis)
- Orchestration with Airflow and any job scheduler experience
- Experience in migrating workloads from on-premises to cloud and cloud-to-cloud migrations
- Good to have:

Role
- Develop efficient ETL pipelines as per business requirements, following the development standards and best practices.
- Perform integration testing of the different created pipelines in the AWS environment.
- Provide estimates for development, testing & deployments on different environments.
- Participate in code peer reviews to ensure our applications comply with best practices.
- Create cost-effective AWS pipelines with the required AWS services, i.e. S3, IAM, Glue, EMR, Redshift, etc.

Experience: 6 to 8 years
Job Reference Number: 13024
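A minimal sketch of the PySpark DataFrame/SparkSQL-on-Hive work this role lists, assuming a hypothetical sales_db Hive database and transactions table.

```python
# Hedged illustration: SparkSQL over a Hive table, then a partitioned write back.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("hive-etl-example")
    .enableHiveSupport()
    .getOrCreate()
)

# Aggregate daily store totals with SparkSQL against the Hive metastore
daily = spark.sql("""
    SELECT txn_date, store_id, SUM(amount) AS total_amount
    FROM sales_db.transactions
    GROUP BY txn_date, store_id
""")

# Persist the result as a partitioned Hive table
daily.write.mode("overwrite").partitionBy("txn_date").saveAsTable("sales_db.daily_store_totals")
```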

Posted 1 week ago

Apply

6.0 years

0 Lacs

Pune, Maharashtra, India

On-site

We are hiring a Data Engineer for Pune/Hyderabad/Bangalore. Experience: 6+ Years Designation: Senior Software Engineer/Lead Software Engineer –Data Engineer Skill Tech stack: AWS Data Engineer, Python, PySpark, SQL, Data Pipeline, AWS, AWS Glue, Lambda JD: 6+ years of experience in data engineering, specifically in cloud environments like AWS. Proficiency in Python and PySpark for data processing and transformation tasks. Solid experience with AWS Glue for ETL jobs and managing data workflows. Hands-on experience with AWS Data Pipeline (DPL) for workflow orchestration. Strong experience with AWS services such as S3, Lambda, Redshift, RDS, and EC2. Technical Skills: Deep understanding of ETL concepts and best practices. Strong knowledge of SQL for querying and manipulating relational and semi-structured data. Experience with Data Warehousing and Big Data technologies, specifically within AWS. Additional Skills: Experience with AWS Lambda for serverless data processing and orchestration. Understanding of AWS Redshift for data warehousing and analytics. Familiarity with Data Lakes, Amazon EMR, and Kinesis for streaming data processing. Knowledge of data governance practices, including data lineage and auditing. Familiarity with CI/CD pipelines and Git for version control. Experience with Docker and containerization for building and deploying applications. Design and Build Data Pipelines: Design, implement, and optimize data pipelines on AWS using PySpark, AWS Glue, and AWS Data Pipeline to automate data integration, transformation, and storage processes. ETL Development: Develop and maintain Extract, Transform, and Load (ETL) processes using AWS Glue and PySpark to efficiently process large datasets. Data Workflow Automation: Build and manage automated data workflows using AWS Data Pipeline, ensuring seamless scheduling, monitoring, and management of data jobs. Data Integration: Work with different AWS data storage services (e.g., S3, Redshift, RDS) to ensure smooth integration and movement of data across platforms. Optimization and Scaling: Optimize and scale data pipelines for high performance and cost efficiency, utilizing AWS services like Lambda, S3, and EC2. Interested or know someone who fits? Send your resume to gautam@mounttalent.com.
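One common pattern behind the "AWS Lambda for serverless data processing and orchestration" point above is a Lambda handler that starts a Glue job when a file lands in S3. The sketch below is illustrative only; the Glue job name and argument are placeholders.

```python
# Hypothetical Lambda handler: trigger a Glue ETL run from an S3 put event.
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    # S3 put-event records carry the bucket and object key of the new file
    record = event["Records"][0]["s3"]
    source_path = f's3://{record["bucket"]["name"]}/{record["object"]["key"]}'

    response = glue.start_job_run(
        JobName="curate-sales-data",                 # placeholder job name
        Arguments={"--source_path": source_path},
    )
    return {"JobRunId": response["JobRunId"]}
```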

Posted 1 week ago

Apply

0 years

0 Lacs

Coimbatore, Tamil Nadu, India

On-site

Company: IT Services Organization
Key Skills: Spark, Azure Databricks, Azure, Python, PySpark

Roles and Responsibilities:
- Develop and maintain scalable data processing systems using Apache Spark and Azure Databricks.
- Implement data integration from various sources including RDBMS, ERP systems, and files.
- Design and optimize SQL queries, stored procedures, and relational schemas.
- Build stream-processing systems using technologies such as Apache Storm or Spark Streaming.
- Utilize messaging systems like Kafka or RabbitMQ for data ingestion.
- Ensure performance tuning of Spark jobs for optimal efficiency.
- Collaborate with cross-functional teams to deliver high-quality data solutions.
- Lead and mentor a team of data engineers, fostering a culture of continuous improvement and Agile practices.

Skills Required:
- Proficient in Apache Spark and Azure Databricks
- Strong experience with the Azure ecosystem and Python
- Working knowledge of PySpark (nice to have)
- Experience in data integration from varied sources
- Expertise in SQL optimization and stream-processing systems
- Familiarity with Kafka or RabbitMQ
- Ability to lead and mentor engineering teams
- Strong understanding of distributed computing principles

Education: Bachelor's degree in Computer Science, Information Technology, or a related field.

Posted 1 week ago

Apply

7.5 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Project Role : Application Lead Project Role Description : Lead the effort to design, build and configure applications, acting as the primary point of contact. Must have skills : PySpark Good to have skills : NA Minimum 7.5 Year(s) Of Experience Is Required Educational Qualification : 15 years full time education Summary: As an Application Lead, you will lead the effort to design, build, and configure applications, acting as the primary point of contact. Your typical day will involve collaborating with various teams to ensure project milestones are met, facilitating discussions to address challenges, and guiding your team in implementing effective solutions. You will also engage in strategic planning sessions to align project goals with organizational objectives, ensuring that all stakeholders are informed and involved in the development process. Your role will require you to balance technical oversight with team management, fostering an environment of innovation and collaboration. Roles & Responsibilities: - Expected to be an SME. - Collaborate and manage the team to perform. - Responsible for team decisions. - Engage with multiple teams and contribute on key decisions. - Provide solutions to problems for their immediate team and across multiple teams. - Mentor junior team members to enhance their skills and knowledge. - Facilitate regular team meetings to discuss progress and address any roadblocks. Professional & Technical Skills: - Must To Have Skills: Proficiency in PySpark. - Strong understanding of data processing frameworks and distributed computing. - Experience with data integration and ETL processes. - Familiarity with cloud platforms and services related to data processing. - Ability to troubleshoot and optimize performance of applications. Additional Information: - The candidate should have minimum 7.5 years of experience in PySpark. - This position is based in Chennai. - A 15 years full time education is required.

Posted 1 week ago

Apply

175.0 years

0 Lacs

Bengaluru South, Karnataka, India

On-site

At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career. Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express. How will you make an impact in this role? As a Data Engineer, you will be responsible for designing, developing, and maintaining robust and scalable framework/services/application/pipelines for processing huge volume of data. You will work closely with cross-functional teams to deliver high-quality software solutions that meet our organizational needs. Key Responsibilities: Design and develop solutions using Bigdata tools and technologies like Bigquery, Hive, Spark etc. Extensive hands-on experience in object-oriented programming using Python, PySpark APIs etc. Experience in building data pipelines for huge volume of data. Experience in designing, implementing, and managing various ETL job execution flows. Experience in implementing and maintaining Data Ingestion process. Hands on experience in writing basic to advance level of optimized queries using HQL, SQL & Spark. Hands on experience in designing, implementing, and maintaining Data Transformation jobs using most efficient tools/technologies. Ensure the performance, quality, and responsiveness of solutions. Participate in code reviews to maintain code quality. Should be able to write shell scripts. Utilize Git for source version control. Set up and maintain CI/CD pipelines. Troubleshoot, debug, and upgrade existing application & ETL job chains. Required Skills and Qualifications: Bachelor’s degree in Computer Science Engineering, or a related field. Proven experience as Data Engineer or similar role. Strong proficiency in Object Oriented programming using Python. Experience with ETL jobs design principles. Solid understanding of HQL, SQL and data modelling. Knowledge on Unix/Linux and Shell scripting principles. Familiarity with Git and version control systems. Experience with Jenkins and CI/CD pipelines. Knowledge of software development best practices and design patterns. Excellent problem-solving skills and attention to detail. Strong communication and collaboration skills. Experience with cloud platforms such as Google Cloud. We back you with benefits that support your holistic well-being so you can be and deliver your best. 
This means caring for you and your loved ones' physical, financial, and mental health, as well as providing the flexibility you need to thrive personally and professionally: Competitive base salaries Bonus incentives Support for financial-well-being and retirement Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location) Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need Generous paid parental leave policies (depending on your location) Free access to global on-site wellness centers staffed with nurses and doctors (depending on location) Free and confidential counseling support through our Healthy Minds program Career development and training opportunities American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, disability status, age, or any other status protected by law. Offer of employment with American Express is conditioned upon the successful completion of a background verification check, subject to applicable laws and regulations.

Posted 1 week ago

Apply

170.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Area(s) of responsibility

About Us: Birlasoft, a global leader at the forefront of Cloud, AI, and Digital technologies, seamlessly blends domain expertise with enterprise solutions. The company's consultative and design-thinking approach empowers societies worldwide, enhancing the efficiency and productivity of businesses. As part of the multibillion-dollar diversified CK Birla Group, Birlasoft, with its 12,000+ professionals, is committed to continuing the Group's 170-year heritage of building sustainable communities.

Azure Data Engineer with Databricks (7+ Years)
Experience: 7+ Years

Job Description:
- Experience performing design, development & deployment using Azure services (Data Factory, Azure Data Lake Storage, Databricks, PySpark, SQL).
- Develop and maintain scalable data pipelines and build out new data source integrations to support continuing increases in data volume and complexity.
- Experience creating the Technical Specification Design and Application Interface Design.
- File processing: XML, CSV, Excel, ORC, and Parquet file formats.
- Develop batch processing, streaming and integration solutions and process structured and non-structured data.
- Good to have: experience with ETL development both on-premises and in the cloud using SSIS, Data Factory, and related Microsoft and other ETL technologies (Informatica preferred).
- Demonstrated in-depth skills with Azure Data Factory, Azure Databricks, PySpark, and ADLS (must have), with the ability to configure and administrate all aspects of Azure SQL DB.
- Collaborate and engage with BI & analytics and business teams.
- Deep understanding of the operational dependencies of applications, networks, systems, security and policy (both on-premises and in the cloud): VMs, networking, VPN (Express Route), Active Directory, storage (Blob, etc.).
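To illustrate the ADLS file-format processing this posting mentions, here is a hedged CSV-to-Parquet PySpark sketch; the storage account, containers, and folder layout are assumptions.

```python
# Hypothetical Databricks/ADLS Gen2 example: read raw CSV, write partitioned Parquet.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

source = "abfss://raw@mystorageaccount.dfs.core.windows.net/sales/csv/"
target = "abfss://curated@mystorageaccount.dfs.core.windows.net/sales/parquet/"

df = (
    spark.read.option("header", "true")
    .option("inferSchema", "true")
    .csv(source)
    .withColumn("load_date", F.current_date())   # stamp the load date for partitioning
)

df.write.mode("append").partitionBy("load_date").parquet(target)
```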

Posted 1 week ago

Apply

6.0 years

0 Lacs

Delhi, India

On-site

Key Responsibilities:
- Lead and mentor a team of data engineers across projects and ensure high-quality delivery.
- Design, build, and optimize large-scale data pipelines and data integration workflows using ADF and Synapse Analytics.
- Architect and implement scalable data solutions on Azure cloud, including Databricks and Microsoft Fabric.
- Write efficient and maintainable code using PySpark and SQL for data transformations and processing.
- Collaborate with data architects, analysts, and business stakeholders to define data strategies and requirements.
- Implement and advocate for Data Mesh principles within the organization.
- Provide architectural guidance and perform solutioning for new and existing data projects on Azure.
- Ensure data quality, governance, and security best practices are followed.
- Stay updated with evolving Azure services and data technologies.

Required Skills & Experience:
- 6+ years of professional experience in data engineering and solution architecture.
- Expertise in Azure Data Factory (ADF) and Azure Synapse Analytics.
- Strong hands-on experience with Databricks, PySpark, and advanced SQL.
- Good knowledge of Microsoft Fabric and its use cases.
- Deep understanding of Azure cloud services related to data storage, processing, and integration.
- Familiarity with Data Mesh architecture and distributed data product ownership.
- Strong problem-solving and debugging skills.
- Excellent communication and stakeholder management abilities.

Good to Have:
- Experience with CI/CD pipelines for data solutions.
- Knowledge of data security and compliance practices on Azure.
- Certification in Azure Data Engineering or Solution Architecture.

Posted 1 week ago

Apply

5.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

UWorld is a worldwide leader in online test prep for college entrance, undergraduate, graduate, and professional licensing exams throughout the United States. Since 2003, over 2 million students have trusted us to help them prepare for high-stakes examinations. We are seeking a Data Engineer who is passionate about creating an excellent user experience and enjoys taking on new challenges. The Data Engineer will be responsible for the design, development, testing, deployment, and support of our Data Analytics and Data Warehouse platform.

Requirements:
- Minimum Experience: Master's/Bachelor's degree in computer science or a related field.
- 5+ years of experience as a Data Engineer, with experience in data analysis, ingestion, cleansing, validation, verification, and presentation (reports and dashboards).
- 3+ years of working knowledge and experience utilizing the following: Python, Spark/PySpark, big data platforms (Databricks/Delta Lake), REST services, MS SQL Server/MySQL, MongoDB, and Azure Cloud.
- Experience with SQL, PL/SQL, and relational databases (MS SQL Server/MySQL/Oracle).
- Experience with Tableau/Power BI, NoSQL (MongoDB), and Kafka is a plus.
- Experience with REST APIs, web services, JSON, build and deployment pipelines (Maven, Ansible, Git), and cloud environments (Azure, AWS, GCP) is desirable.

Job Responsibilities: The software developer will perform the following duties:
- Understand data services and analytics needs across the organization and work on the data warehouse and reporting infrastructure to empower teams with accurate information for decision-making.
- Develop and maintain a data warehouse that aggregates data from multiple content sources, including NoSQL DBs, RDBMS, BigQuery, Salesforce, social media, other third-party web services (RESTful, JSON), flat-file stores, and application databases (OLTPs).
- Use Python, Spark/PySpark, Databricks, Delta Lake, SQL Server, MongoDB, Jira, Git/Bitbucket, Confluence, REST services, Tableau, Unix/Linux shell scripting, and Azure Cloud for data ingestion, processing, transformations, warehousing, and reporting.
- Develop scalable data pipelines using data connectors, distributed processing transformations, schedulers, and the data warehouse.
- Apply understanding of data structures, analytics, data modeling, and software architecture to problem solving.
- Develop, modify, and test algorithms that can be used in scripts to store, locate, cleanse, verify, validate, and retrieve specific documents, data, and information.
- Develop analytics to understand product sales, marketing impact, and application usage for UWorld products and applications.
- Employ best practices for code sharing and development to ensure a common code base abstraction across all applications.
- Continuously stay up to date on industry-standard practices in big data and analytics and adopt solutions into the UWorld data warehousing platform.
- Work with QA engineers to ensure the quality and reliability of all reports, extracts, and dashboards through a process of continuous improvement.
- Collaborate with technical architects, developers, subject matter experts, the QA team, and the customer care team to drive new enhancements or fix bugs promptly.
- Work in an agile environment such as Scrum.

Soft Skills:
- Working proficiency and communication skills in verbal and written English.
- Excellent attention to detail and organization skills, and the ability to articulate ideas clearly and concisely.
- Ability to work effectively within a changing environment that is going through high growth.
- Exceptional follow-through, personal drive, and ability to understand direction and feedback.
- Positive attitude with a willingness to put aside ego for the sake of what is best for the team.
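As a hedged example of the routine Databricks/Delta Lake warehouse maintenance implied by this listing, the sketch below performs a Delta Lake upsert (MERGE); paths and join keys are invented for illustration.

```python
# Hypothetical Delta Lake upsert: merge staged records into a warehouse table.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Staged batch of new/changed rows (placeholder path and schema)
updates = spark.read.parquet("/mnt/staging/user_activity/")

target = DeltaTable.forPath(spark, "/mnt/warehouse/user_activity")

(
    target.alias("t")
    .merge(
        updates.alias("u"),
        "t.user_id = u.user_id AND t.activity_date = u.activity_date",
    )
    .whenMatchedUpdateAll()      # overwrite changed rows
    .whenNotMatchedInsertAll()   # insert brand-new rows
    .execute()
)
```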

Posted 1 week ago

Apply

0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Cloud and AWS Expertise: In-depth knowledge of AWS services related to data engineering: EC2, S3, RDS, DynamoDB, Redshift, Glue, Lambda, Step Functions, Kinesis, Iceberg, EMR, and Athena. Strong understanding of cloud architecture and best practices for high availability and fault tolerance.

Data Engineering Concepts: Expertise in ETL/ELT processes, data modeling, and data warehousing. Knowledge of data lakes, data warehouses, and big data processing frameworks like Apache Hadoop and Spark. Proficiency in handling structured and unstructured data.

Programming and Scripting: Proficiency in Python, PySpark, and SQL for data manipulation and pipeline development. Expertise in working with data warehousing solutions like Redshift.

Posted 1 week ago

Apply

10.0 years

0 Lacs

India

Remote

🚀 We're Hiring: Senior Data Engineer (Remote – India | Full-time or Contract)
10+ Years (Willing to work U.S. overlapping hours)

💼 Position: Senior Data Engineer
🌍 Location: Remote (India)
📅 Type: Full-Time / Contract
📊 Experience: 10+ Years

We are helping our client hire a Senior Data Engineer with over 10 years of experience in modern data platforms. This is a remote role open across India, available on both a full-time and a contract basis.

🔧 Must-Have Skills:
- Data Engineering, Data Warehousing, ETL
- Azure Databricks
- PySpark, SparkSQL
- Python, SQL

👀 What We're Looking For: We are hiring for two different positions as follows:
- Lead Developer (PySpark, Azure Databricks)
- Databricks Admin

- A strong background in building and managing data pipelines
- Hands-on experience in cloud platforms, especially Azure
- Ability to work independently and collaborate in distributed teams

📩 How to Apply: Please send your resume to [your email] with the subject line: "Senior Data Engineer – Remote India"

⚠️ Along with your resume, kindly include the following details: Full Name, Mobile Number, Total Experience, Relevant Experience, Current CTC, Expected CTC, Notice Period, Current Location, whether you are fine with Contract or Full-time or both, willingness to work IST/US overlapping hours (Yes/No), and whether you have a PF account (Yes/No).

🔔 Follow our company page to stay updated on future job openings!

#DataEngineer #AzureDatabricks #ADF #PySpark #SQL #RemoteJobsIndia #HiringNow #ContractJobs #IndiaJobs

Posted 1 week ago

Apply

4.0 years

0 Lacs

Greater Delhi Area

On-site

Experience: 4+ Years
Location: Delhi
NP: Immediate – 15 Days

Develop and implement machine learning models to solve business problems and improve decision-making. Perform data analysis, feature engineering, and model evaluation using statistical and analytical techniques. Collaborate with cross-functional teams to understand requirements and deliver data-driven solutions. Communicate insights and model outcomes effectively to both technical and non-technical stakeholders. Work with tools and technologies including Python, PySpark, SQL, and machine learning libraries.

Posted 1 week ago

Apply