8.0 - 12.0 years
0 Lacs
Karnataka
On-site
If working with data day to day excites you, and you want to build robust data architecture that identifies data patterns and optimizes data consumption for customers who forecast and predict actions based on data, then our intelligent automation team at Schneider AI Hub is the right fit for you.

As a Lead Data Engineer at Schneider AI Hub, you will play a crucial role in the AI transformation of Schneider Electric by developing AI-powered solutions. Your responsibilities will include expanding and optimizing data and data pipeline architecture, ensuring optimal data flow and collection for cross-functional teams, and supporting software engineers, data analysts, and data scientists on data initiatives. You will create and maintain data pipeline architecture, design schemas that support functional requirements, and build production data pipelines from ingestion to consumption. You will also create preprocessing and postprocessing for various forms of data, develop data visualization and business intelligence tools, and implement internal process improvements that automate manual data processes.

To qualify for this role, you should hold a bachelor's or master's degree in computer science, information technology, or another quantitative field, and have at least 8 years of experience as a data engineer supporting large data transformation initiatives related to machine learning. Strong analytical skills, experience with Azure cloud services, ETL using Spark, and proficiency in Python and PySpark are essential. As a team player committed to the success of the team and its projects, you will collaborate with various stakeholders to keep the data delivery architecture consistent and secure across multiple data centers.
Join us at Schneider Electric, where we create connected technologies that reshape industries, transform cities, and enrich lives, with a diverse and inclusive culture that values the contribution of every individual. If you are passionate about success and eager to contribute to cutting-edge projects, we invite you to be part of our dynamic team at Schneider Electric in Bangalore, India.
Posted 1 week ago
6.0 - 10.0 years
30 - 35 Lacs
Bengaluru
Work from Office
We are seeking an experienced PySpark Developer / Data Engineer to design, develop, and optimize big data processing pipelines using Apache Spark and Python (PySpark). The ideal candidate should have expertise in distributed computing, ETL workflows, data lake architectures, and cloud-based big data solutions.

Key Responsibilities:
- Develop and optimize ETL/ELT data pipelines using PySpark on distributed computing platforms (Hadoop, Databricks, EMR, HDInsight).
- Work with structured and unstructured data to perform data transformation, cleansing, and aggregation.
- Implement data lake and data warehouse solutions on AWS (S3, Glue, Redshift), Azure (ADLS, Synapse), or GCP (BigQuery, Dataflow).
- Optimize PySpark jobs through performance tuning, partitioning, and caching strategies.
- Design and implement real-time and batch data processing solutions.
- Integrate data pipelines with Kafka, Delta Lake, Iceberg, or Hudi for streaming and incremental updates.
- Ensure data security, governance, and compliance with industry best practices.
- Work with data scientists and analysts to prepare and process large-scale datasets for machine learning models.
- Collaborate with DevOps teams to deploy, monitor, and scale PySpark jobs using CI/CD pipelines, Kubernetes, and containerization.
- Perform unit testing and validation to ensure data integrity and reliability.

Required Skills & Qualifications:
- 6+ years of experience in big data processing, ETL, and data engineering.
- Strong hands-on experience with PySpark (Apache Spark with Python).
- Expertise in SQL, DataFrame API, and RDD transformations.
- Experience with big data platforms (Hadoop, Hive, HDFS, Spark SQL).
- Knowledge of cloud data processing services (AWS Glue, EMR, Databricks, Azure Synapse, GCP Dataflow).
- Proficiency in writing optimized queries, partitioning, and indexing for performance tuning.
- Experience with workflow orchestration tools like Airflow, Oozie, or Prefect.
- Familiarity with containerization and deployment using Docker, Kubernetes, and CI/CD pipelines.
- Strong understanding of data governance, security, and compliance (GDPR, HIPAA, CCPA, etc.).
- Excellent problem-solving, debugging, and performance optimization skills.
Posted 1 month ago
Accenture
39581 Jobs | Dublin
Wipro
19070 Jobs | Bengaluru
Accenture in India
14409 Jobs | Dublin 2
EY
14248 Jobs | London
Uplers
10536 Jobs | Ahmedabad
Amazon
10262 Jobs | Seattle, WA
IBM
9120 Jobs | Armonk
Oracle
8925 Jobs | Redwood City
Capgemini
7500 Jobs | Paris, France
Virtusa
7132 Jobs | Southborough