3.0 years
0 Lacs
Greater Kolkata Area
On-site
Primary Roles And Responsibilities
- Develop Modern Data Warehouse solutions using Snowflake, Databricks and ADF.
- Provide forward-thinking solutions in the data engineering and analytics space.
- Collaborate with DW/BI leads to understand new ETL pipeline development requirements.
- Triage issues to find gaps in existing pipelines and fix them.
- Work with the business to understand reporting-layer needs and develop a data model to fulfill reporting requirements.
- Help junior team members resolve issues and technical challenges.
- Drive technical discussions with client architects and team members.
- Orchestrate the data pipelines in the scheduler via Airflow.

Skills And Qualifications
- Skills: SQL, PL/SQL, Spark, star and snowflake dimensional modeling, Databricks, Snowsight, Terraform, Git, Unix shell scripting, SnowSQL, Cassandra, CircleCI, Azure, PySpark, Snowpipe, MongoDB, Neo4j, Azure Data Factory, Snowflake, Python.
- Bachelor's and/or master's degree in computer science or equivalent experience.
- Must have 6+ years of total IT experience and 3+ years of experience in data warehouse/ETL projects.
- Expertise in Snowflake security, Snowflake SQL and designing/implementing other Snowflake objects.
- Hands-on experience with Snowflake utilities: SnowSQL, Snowpipe, Snowsight and Snowflake connectors.
- Deep understanding of star and snowflake dimensional modeling.
- Strong knowledge of data management principles.
- Good understanding of the Databricks Data & AI platform and Databricks Delta Lake architecture.
- Hands-on experience in SQL and Spark (PySpark).
- Experience in building ETL/data warehouse transformation processes.
- Experience with open-source non-relational/NoSQL data repositories (incl. MongoDB, Cassandra, Neo4j).
- Experience working with structured and unstructured data, including imaging and geospatial data.
- Experience working in a DevOps environment with tools such as Terraform, CircleCI and Git.
- Proficiency in RDBMS, complex SQL, PL/SQL, Unix shell scripting, performance tuning, troubleshooting and query optimization.
- Databricks Certified Data Engineer Associate/Professional certification (desirable).
- Comfortable working in a dynamic, fast-paced, innovative environment with several concurrent projects.
- Experience working in Agile methodology.
- Strong verbal and written communication skills.
- Strong analytical and problem-solving skills with high attention to detail.

Mandatory Skills: Snowflake / Azure Data Factory / PySpark / Databricks
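As one hedged illustration of the Airflow orchestration responsibility listed above, a minimal DAG chaining an extract step and a PySpark/Databricks transform might look like the sketch below; the task names and callables are hypothetical and not part of the role description.

```python
# Minimal Airflow DAG sketch: two placeholder tasks chained into a daily pipeline.
# Task names and callables are illustrative only.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_to_stage(**context):
    # Placeholder: land source files into a staging area (e.g. ADLS or S3).
    print("extracting source data for", context["ds"])


def run_spark_transform(**context):
    # Placeholder: trigger a Databricks/PySpark job that builds warehouse tables.
    print("running transform for", context["ds"])


with DAG(
    dag_id="dw_daily_load",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_to_stage", python_callable=extract_to_stage)
    transform = PythonOperator(task_id="run_spark_transform", python_callable=run_spark_transform)

    extract >> transform
```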
Posted 1 week ago
3.0 years
0 Lacs
Bengaluru, Karnataka, India
On-site
Primary Roles And Responsibilities
- Develop Modern Data Warehouse solutions using Snowflake, Databricks and ADF.
- Provide forward-thinking solutions in the data engineering and analytics space.
- Collaborate with DW/BI leads to understand new ETL pipeline development requirements.
- Triage issues to find gaps in existing pipelines and fix them.
- Work with the business to understand reporting-layer needs and develop a data model to fulfill reporting requirements.
- Help junior team members resolve issues and technical challenges.
- Drive technical discussions with client architects and team members.
- Orchestrate the data pipelines in the scheduler via Airflow.

Skills And Qualifications
- Skills: SQL, PL/SQL, Spark, star and snowflake dimensional modeling, Databricks, Snowsight, Terraform, Git, Unix shell scripting, SnowSQL, Cassandra, CircleCI, Azure, PySpark, Snowpipe, MongoDB, Neo4j, Azure Data Factory, Snowflake, Python.
- Bachelor's and/or master's degree in computer science or equivalent experience.
- Must have 6+ years of total IT experience and 3+ years of experience in data warehouse/ETL projects.
- Expertise in Snowflake security, Snowflake SQL and designing/implementing other Snowflake objects.
- Hands-on experience with Snowflake utilities: SnowSQL, Snowpipe, Snowsight and Snowflake connectors.
- Deep understanding of star and snowflake dimensional modeling.
- Strong knowledge of data management principles.
- Good understanding of the Databricks Data & AI platform and Databricks Delta Lake architecture.
- Hands-on experience in SQL and Spark (PySpark).
- Experience in building ETL/data warehouse transformation processes.
- Experience with open-source non-relational/NoSQL data repositories (incl. MongoDB, Cassandra, Neo4j).
- Experience working with structured and unstructured data, including imaging and geospatial data.
- Experience working in a DevOps environment with tools such as Terraform, CircleCI and Git.
- Proficiency in RDBMS, complex SQL, PL/SQL, Unix shell scripting, performance tuning, troubleshooting and query optimization.
- Databricks Certified Data Engineer Associate/Professional certification (desirable).
- Comfortable working in a dynamic, fast-paced, innovative environment with several concurrent projects.
- Experience working in Agile methodology.
- Strong verbal and written communication skills.
- Strong analytical and problem-solving skills with high attention to detail.

Mandatory Skills: Snowflake / Azure Data Factory / PySpark / Databricks
Posted 1 week ago
6.0 years
0 Lacs
Mumbai, Maharashtra, India
On-site
This role is for one of our clients.

Industry: Technology, Information and Media
Seniority level: Mid-Senior level
Min Experience: 6 years
Location: Mumbai
Job Type: Full-time

We are looking for a seasoned Lead Data Visualization Specialist to join our growing Data & Analytics team. This role is perfect for a visualization expert who thrives at the intersection of design and data. You will lead the development of high-impact dashboards and data storytelling experiences that help stakeholders make strategic business decisions. Working closely with data engineers, analysts, and business users, you will transform complex data into actionable insights using tools like Power BI, Tableau, and Qlik Sense, while also leveraging Azure Databricks, PySpark, and SQL to manage the backend data workflows.

What You'll Do:

Data Visualization & Dashboard Development
- Build sleek, interactive, and scalable dashboards using Power BI, Tableau, and Qlik Sense.
- Develop intuitive layouts and user journeys for business users to explore KPIs and trends.
- Embed visual storytelling principles to make data interpretation easy and insightful.

Data Integration & Modeling
- Collaborate with data engineers to clean, shape, and model data from diverse sources.
- Use SQL and PySpark to query, transform, and enrich data pipelines.
- Manage complex datasets within Microsoft Azure cloud environments, including Databricks and Data Factory.

Performance & Optimization
- Design dashboards optimized for speed, usability, and enterprise-scale data volumes.
- Troubleshoot performance bottlenecks and enhance backend queries or models accordingly.

Stakeholder Engagement & Insight Delivery
- Work with cross-functional teams to understand business needs and translate them into analytical visuals.
- Present data-driven insights to non-technical audiences, tailoring messaging to various stakeholders.

Governance, Standards & Mentorship
- Champion visualization standards and data governance best practices.
- Mentor junior visualizers and analysts on tools, techniques, and storytelling principles.
- Help define scalable templates and reusable components for the organization.

What You Bring:
- 6+ years of experience in data visualization or business intelligence roles.
- Mastery of at least two of the following tools: Power BI, Tableau, Qlik Sense.
- Strong SQL capabilities and hands-on experience with PySpark for large-scale data processing.
- Deep knowledge of the Azure data ecosystem, including Databricks, Azure Synapse, and Data Factory.
- Proven ability to translate raw data into powerful, intuitive stories through visuals.
- Strong grasp of UX principles as applied to data dashboards.
- Ability to work autonomously and manage multiple stakeholders and priorities.
- Excellent verbal and written communication skills.

Bonus Points for:
- Certifications in Power BI, Tableau, or Microsoft Azure.
- Experience in predictive modeling, trend analysis, or machine learning environments.
- Exposure to agile methodologies and product-based data teams.
Posted 1 week ago
5.0 - 8.0 years
15 - 18 Lacs
Hyderabad, Bengaluru
Hybrid
Cloud and AWS Expertise:
- In-depth knowledge of AWS services related to data engineering: EC2, S3, RDS, DynamoDB, Redshift, Glue, Lambda, Step Functions, Kinesis, Iceberg, EMR, and Athena.
- Strong understanding of cloud architecture and best practices for high availability and fault tolerance.

Data Engineering Concepts:
- Expertise in ETL/ELT processes, data modeling, and data warehousing.
- Knowledge of data lakes, data warehouses, and big data processing frameworks like Apache Hadoop and Spark.
- Proficiency in handling structured and unstructured data.

Programming and Scripting:
- Proficiency in Python, PySpark and SQL for data manipulation and pipeline development.
- Expertise in working with data warehousing solutions like Redshift.
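As a rough, hedged illustration of the ETL work these requirements describe, a PySpark job reading raw CSVs from S3 and writing partitioned Parquet might look like the sketch below; the bucket names and columns are placeholders, not details from the posting.

```python
# Sketch of a simple ETL step on AWS: read raw CSV from S3, clean it,
# and write partitioned Parquet back to a curated bucket. Paths/columns are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("s3_etl_demo").getOrCreate()

raw = (
    spark.read.option("header", True)
    .csv("s3://example-raw-bucket/orders/")  # placeholder bucket
)

cleaned = (
    raw.dropDuplicates(["order_id"])
    .withColumn("order_date", F.to_date("order_ts"))
    .filter(F.col("amount").cast("double") > 0)
)

(
    cleaned.write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-bucket/orders/")  # placeholder bucket
)
```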
Posted 1 week ago
0 years
0 Lacs
Thiruvananthapuram, Kerala, India
Remote
Brief Description
The Cloud Data Engineer will play a critical implementation role on the Data Engineering and Data Products team and be responsible for data pipeline solution design and development, troubleshooting, and optimization tuning on the next-generation data and analytics platform being developed with leading-edge big data technologies in a highly secure cloud infrastructure. The Cloud Data Engineer will serve as a liaison to platform user groups, ensuring successful implementation of capabilities on the new platform.

Data Engineer Responsibilities:
- Deliver end-to-end data and analytics capabilities, including data ingest, data transformation, data science, and data visualization, in collaboration with Data and Analytics stakeholder groups.
- Design and deploy databases and data pipelines to support analytics projects.
- Develop scalable and fault-tolerant workflows.
- Clearly document issues, solutions, findings and recommendations to be shared internally and externally.
- Learn and apply tools and technologies proficiently, including:
  - Languages: Python, PySpark, ANSI SQL, Python ML libraries
  - Frameworks/Platforms: Spark, Snowflake, Airflow, Hadoop, Kafka
  - Cloud Computing: AWS
  - Tools/Products: PyCharm, Jupyter, Tableau, Power BI
- Performance optimization for queries and dashboards.
- Develop and deliver clear, compelling briefings to internal and external stakeholders on findings, recommendations, and solutions.
- Analyze client data and systems to determine whether requirements can be met.
- Test and validate data pipelines, transformations, datasets, reports, and dashboards built by the team.
- Develop and communicate solution architectures and present solutions to both business and technical stakeholders.
- Provide end-user support to other data engineers and analysts.

Candidate Requirements:
- Expert experience in the following (should have / good to have):
  - SQL, Python, PySpark, Python ML libraries. Other programming languages (R, Scala, SAS, Java, etc.) are a plus.
  - Data and analytics technologies including SQL/NoSQL/graph databases, ETL, and BI.
  - Knowledge of CI/CD and related tools such as GitLab, AWS CodeCommit, etc.
  - AWS services including EMR, Glue, Athena, Batch, Lambda, CloudWatch, DynamoDB, EC2, CloudFormation, IAM and EDS.
  - Exposure to Snowflake and Airflow.
  - Solid scripting skills (e.g., bash/shell scripts, Python).
- Proven work experience in the following:
  - Data streaming technologies.
  - Big data technologies including Hadoop, Spark, Hive, Teradata, etc.
  - Linux command-line operations.
  - Networking knowledge (OSI network layers, TCP/IP, virtualization).
- Candidate should be able to lead the team, communicate with the business, and gather and interpret business requirements.
- Experience with agile delivery methodologies using Jira or similar tools.
- Experience working with remote teams.
- AWS Solutions Architect / Developer / Data Analytics Specialty certifications; professional certification is a plus.
- Bachelor's degree in Computer Science or a relevant field; Master's degree is a plus.
Posted 1 week ago
4.0 years
0 Lacs
Mumbai, Maharashtra, India
On-site
Greetings from TCS!

We have an opportunity for Big Data.

Major Skill: PySpark and Hive
Experience: 4+ Years
Work Mode: Work from office
Location: Chennai/Mumbai/Pune

JD:
- Ingest data from disparate sources (structured, unstructured and semi-structured) and develop ETL jobs using the above skills.
- Do impact analysis and come up with estimates.
- Take responsibility for end-to-end deliverables.
- Create the project plan and work on the implementation strategy.
- Need a comprehensive understanding of ETL concepts and cross-environment data transfers.
- Handle customer communications and management reporting.
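A hedged sketch of the kind of ingestion the JD describes: one structured and one semi-structured source read with PySpark and landed as Hive tables. The source paths and table names are illustrative assumptions, not details from the posting.

```python
# Illustrative ingestion: a CSV (structured) and a JSON feed (semi-structured)
# loaded with PySpark and saved as Hive tables. Paths and names are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("ingest_demo")
    .enableHiveSupport()
    .getOrCreate()
)

customers = spark.read.option("header", True).csv("/landing/customers.csv")
events = spark.read.json("/landing/events/")  # semi-structured JSON lines

customers.write.mode("overwrite").saveAsTable("staging.customers")
events.write.mode("overwrite").saveAsTable("staging.events")
```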
Posted 1 week ago
4.0 - 6.0 years
0 Lacs
Mumbai Metropolitan Region
On-site
Responsible for developing, optimizing, and maintaining business intelligence and data warehouse systems, ensuring secure, efficient data storage and retrieval, enabling self-service data exploration, and supporting stakeholders with insightful reporting and analysis.

Grade: T5

Please note that the job will close at 12am on the posting close date, so please submit your application prior to the close date.

Accountabilities
What your main responsibilities are:
- Data Pipeline - Develop and maintain scalable data pipelines and build out new API integrations to support continuing increases in data volume and complexity.
- Data Integration - Connect offline and online data to continuously improve overall understanding of customer behavior and journeys for personalization. Data pre-processing including collecting, parsing, managing, analyzing and visualizing large sets of data.
- Data Quality Management - Cleanse the data and improve data quality and readiness for analysis. Drive standards, define and implement/improve data governance strategies and enforce best practices to scale data analysis across platforms.
- Data Transformation - Process data by cleansing it and transforming it into the proper storage structure for querying and analysis using ETL and ELT processes.
- Data Enablement - Ensure data is accessible and usable to the wider enterprise to enable a deeper and more timely understanding of operations.

Qualifications & Specifications
- Master's/Bachelor's degree in Engineering, Computer Science, Math, Statistics or equivalent.
- Strong programming skills in Python/PySpark/SAS.
- Proven experience with large data sets and related technologies - Hadoop, Hive, distributed computing systems, Spark optimization.
- Experience on cloud platforms (preferably Azure) and their services: Azure Data Factory (ADF), ADLS Storage, Azure DevOps.
- Hands-on experience with Databricks, Delta Lake, Workflows.
- Knowledge of DevOps processes and tools like Docker, CI/CD, Kubernetes, Terraform, Octopus.
- Hands-on experience with SQL and data modeling to support the organization's data storage and analysis needs.
- Experience with a BI tool like Power BI (good to have).
- Cloud migration experience (good to have).
- Cloud and data engineering certification (good to have).
- Working in an Agile environment.
- 4-6 years of relevant work experience is required.
- Experience with stakeholder management is an added advantage.

What We Are Looking For
Education: Bachelor's degree or equivalent in Computer Science, MIS, Mathematics, Statistics, or a similar discipline. Master's degree or PhD preferred.

Knowledge, Skills And Abilities
- Fluency in English
- Analytical Skills
- Accuracy & Attention to Detail
- Numerical Skills
- Planning & Organizing Skills
- Presentation Skills
- Data Modeling and Database Design
- ETL (Extract, Transform, Load) Skills
- Programming Skills

FedEx was built on a philosophy that puts people first, one we take seriously. We are an equal opportunity/affirmative action employer and we are committed to a diverse, equitable, and inclusive workforce in which we enforce fair treatment, and provide growth opportunities for everyone. All qualified applicants will receive consideration for employment regardless of age, race, color, national origin, genetics, religion, gender, marital status, pregnancy (including childbirth or a related medical condition), physical or mental disability, or any other characteristic protected by applicable laws, regulations, and ordinances.
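One hedged example of the cleansing and transformation work listed under Data Quality Management and Data Transformation: a PySpark snippet that trims, standardizes and de-duplicates records before loading. Column names and paths are assumptions made for the example only.

```python
# Illustrative cleansing step: trim strings, normalise case, drop duplicates,
# and keep only records with a valid customer id. Columns/paths are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cleanse_demo").getOrCreate()

raw = spark.read.parquet("/lake/raw/customer_events")  # placeholder path

cleansed = (
    raw.withColumn("email", F.lower(F.trim("email")))
    .withColumn("country", F.upper(F.trim("country")))
    .filter(F.col("customer_id").isNotNull())
    .dropDuplicates(["customer_id", "event_ts"])
)

cleansed.write.mode("overwrite").parquet("/lake/curated/customer_events")
```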
Our Company
FedEx is one of the world's largest express transportation companies and has consistently been selected as one of the top 10 World's Most Admired Companies by "Fortune" magazine. Every day FedEx delivers for its customers with transportation and business solutions, serving more than 220 countries and territories around the globe. We can serve this global network due to our outstanding team of FedEx team members, who are tasked with making every FedEx experience outstanding.

Our Philosophy
The People-Service-Profit philosophy (P-S-P) describes the principles that govern every FedEx decision, policy, or activity. FedEx takes care of our people; they, in turn, deliver the impeccable service demanded by our customers, who reward us with the profitability necessary to secure our future. The essential element in making the People-Service-Profit philosophy such a positive force for the company is where we close the circle, return these profits back into the business, and invest back in our people. Our success in the industry is attributed to our people. Through our P-S-P philosophy, we have a work environment that encourages team members to be innovative in delivering the highest possible quality of service to our customers. We care for their well-being, and value their contributions to the company.

Our Culture
Our culture is important for many reasons, and we intentionally bring it to life through our behaviors, actions, and activities in every part of the world. The FedEx culture and values have been a cornerstone of our success and growth since we began in the early 1970s. While other companies can copy our systems, infrastructure, and processes, our culture makes us unique and is often a differentiating factor as we compete and grow in today's global marketplace.
Posted 1 week ago
6.0 years
0 Lacs
Pune, Maharashtra, India
On-site
At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world's most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities. Where you can make a difference. Where no two days are the same.

Your Role
As a senior software engineer with Capgemini, you will have 6+ years of experience in Azure technology with a strong project track record. In this role you will play a key part in:
- Strong customer orientation, decision making, problem solving, communication and presentation skills
- Very good judgement skills and the ability to shape compelling solutions and solve unstructured problems with assumptions
- Very good collaboration skills and the ability to interact with multi-cultural and multi-functional teams spread across geographies
- Strong executive presence and entrepreneurial spirit
- Superb leadership and team-building skills with the ability to build consensus and achieve goals through collaboration rather than direct line authority

Your Profile
- Experience with Azure Databricks and Data Factory
- Experience with Azure Data components such as Azure SQL Database, Azure SQL Warehouse, Synapse Analytics
- Experience in Python/PySpark/Scala/Hive programming
- Experience with Azure Databricks/ADB is a must-have
- Experience with building CI/CD pipelines in data environments

Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong over 55-year heritage, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.
Posted 1 week ago
3.0 years
0 Lacs
Hyderabad, Telangana, India
On-site
CACI India, RMZ Nexity, Tower 30, 4th Floor, Survey No. 83/1, Knowledge City, Raidurg Village, Silpa Gram Craft Village, Madhapur, Serilingampalle (M), Hyderabad, Telangana 500081, India

Req #1097 | 02 May 2025

CACI International Inc is an American multinational professional services and information technology company headquartered in Northern Virginia. CACI provides expertise and technology to enterprise and mission customers in support of national security missions and government transformation for defense, intelligence, and civilian customers. CACI has approximately 23,000 employees worldwide. Headquartered in London, CACI Ltd is a wholly owned subsidiary of CACI International Inc., a publicly listed company on the NYSE with annual revenue in excess of US $6.2bn. Founded in 2022, CACI India is an exciting, growing and progressive business unit of CACI Ltd. CACI Ltd currently has over 2000 intelligent professionals and is now adding many more from our Hyderabad and Pune offices. Through a rigorous emphasis on quality, CACI India has grown considerably to become one of the UK's most well-respected technology centres.

About Data Platform
The Data Platform will be built and managed "as a Product" to support a Data Mesh organization. The Data Platform focuses on enabling decentralized management, processing, analysis and delivery of data, while enforcing corporate-wide federated governance on data and project environments across business domains. The goal is to empower multiple teams to create and manage high-integrity data and data products that are analytics and AI ready, and consumed internally and externally.

What does a Data Infrastructure Engineer do?
A Data Infrastructure Engineer will be responsible for developing, maintaining and monitoring the data platform infrastructure and operations. The infrastructure and pipelines you build will support data processing, data analytics, data science and data management across the CACI business. The data platform infrastructure will conform to a zero-trust, least-privilege architecture, with strict adherence to data and infrastructure governance and control in a multi-account, multi-region AWS environment. You will use Infrastructure as Code and CI/CD to continuously improve, evolve and repair the platform. You will be able to design architectures and create reusable solutions that reflect the business needs.

Responsibilities Will Include
- Collaborating across CACI departments to develop and maintain the data platform
- Building infrastructure and data architectures in CloudFormation and SAM
- Designing and implementing data processing environments and integrations using AWS PaaS such as Glue, EMR, SageMaker, Redshift, Aurora and Snowflake
- Building data processing and analytics pipelines as code, using Python, SQL, PySpark, Spark, CloudFormation, Lambda, Step Functions and Apache Airflow
- Monitoring and reporting on the data platform performance, usage and security
- Designing and applying security and access control architectures to secure sensitive data

You Will Have
- 3+ years of experience in a Data Engineering role
- Strong experience and knowledge of data architectures implemented in AWS using native AWS services such as S3, DataZone, Glue, EMR, SageMaker, Aurora and Redshift
- Experience administering databases and data platforms
- Good coding discipline in terms of style, structure, versioning, documentation and unit tests
- Strong proficiency in CloudFormation, Python and SQL
- Knowledge and experience of relational databases such as Postgres and Redshift
- Experience using Git for code versioning and lifecycle management
- Experience operating to Agile principles and ceremonies
- Hands-on experience with CI/CD tools such as GitLab
- Strong problem-solving skills and the ability to work independently or in a team environment
- Excellent communication and collaboration skills
- A keen eye for detail, and a passion for accuracy and correctness in numbers

Whilst not essential, the following skills would also be useful:
- Experience using Jira, or other agile project management and issue tracking software
- Experience with Snowflake
- Experience with spatial data processing

More About The Opportunity
The Data Engineer is an excellent opportunity, and CACI Services India reward their staff well with a competitive salary and an impressive benefits package which includes:
- Learning: budget for conferences, training courses and other materials
- Health benefits: family plan with 4 children and parents covered
- Future you: matched pension and health care package

We understand the importance of getting to know your colleagues. Company meetings are held every quarter, and a training/work brief weekend is held once a year, amongst many other social events.

CACI is an equal opportunities employer. Therefore, we embrace diversity and are committed to a working environment where no one will be treated less favourably on the grounds of their sex, race, disability, sexual orientation, religion, belief or age. We have a Diversity & Inclusion Steering Group and we always welcome new people with fresh perspectives from any background to join the group. An inclusive and equitable environment enables us to draw on expertise and unique experiences and bring out the best in each other. We champion diversity, inclusion and wellbeing, and we are supportive of Veterans and people from a military background. We believe that by embracing diverse experiences and backgrounds, we can collaborate to create better outcomes for our people, our customers and our society.

Other details: Pay Type - Salary
Posted 1 week ago
7.0 years
0 Lacs
India
On-site
Job Description:
We are seeking a skilled and experienced Azure Data Engineer to join our data engineering team. The ideal candidate will have a strong background in building and optimizing data pipelines and data sets, utilizing Azure Data Factory, Databricks, PySpark, and SQL. You will work closely with data architects, data scientists, and business stakeholders to design and implement scalable, reliable, and high-performance data solutions on the Azure platform.

Required Skills and Qualifications:
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field.
- 7+ years of experience as a Data Engineer or in a similar role.
- Strong experience with Azure Data Factory for ETL/ELT operations.
- Proficiency in Databricks and PySpark for big data processing and transformation.
- Advanced SQL skills for data manipulation and reporting.
- Hands-on experience with data modeling, ETL development, and data warehousing.
- Experience with Azure services like Azure Synapse, Azure Blob Storage, and Azure SQL Database.
- Understanding of data governance principles and best practices.
- Strong analytical and problem-solving skills.
- Familiarity with Python or other scripting languages.
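As a hedged sketch of the Databricks/PySpark transformation work this description refers to, a notebook-style job might read raw files from ADLS, derive columns, and write a Delta table. The storage account, container, columns and table name below are placeholder assumptions, and Delta output assumes a Databricks (or Delta-enabled) runtime.

```python
# Sketch of a transformation as it might run on Databricks: read raw CSVs from ADLS,
# derive typed columns, and write a managed Delta table. All names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("adls_to_delta_demo").getOrCreate()

raw_path = "abfss://raw@examplestorageacct.dfs.core.windows.net/sales/"  # placeholder

raw = spark.read.option("header", True).csv(raw_path)

sales = (
    raw.withColumn("amount", F.col("amount").cast("double"))
    .withColumn("sale_date", F.to_date("sale_ts"))
)

(
    sales.write.format("delta")  # Delta Lake available by default on Databricks
    .mode("overwrite")
    .saveAsTable("analytics.sales_daily")  # managed table in the metastore
)
```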
Posted 1 week ago
3.0 years
0 Lacs
Andhra Pradesh, India
On-site
We are looking for a PySpark solutions developer and data engineer who can design and build solutions for one of our Fortune 500 client programs, which aims at building data standardization and curation capabilities on a Hadoop cluster. This high-visibility, fast-paced key initiative will integrate data across internal and external sources, provide analytical insights, and integrate with the customer's critical systems.

Key Responsibilities
- Design, build and unit test applications on the Spark framework on Python.
- Build PySpark-based applications for both batch and streaming requirements, which requires in-depth knowledge of the majority of Hadoop and NoSQL databases as well.
- Develop and execute data pipeline testing processes and validate business rules and policies.
- Build integrated solutions leveraging Unix shell scripting, RDBMS, Hive, the HDFS file system, HDFS file types, and HDFS compression codecs.
- Create and maintain an integration and regression testing framework on Jenkins integrated with Bitbucket and/or Git repositories.
- Participate in the agile development process, and document and communicate issues and bugs relative to data standards in scrum meetings.
- Work collaboratively with onsite and offshore teams.
- Develop and review technical documentation for artifacts delivered.
- Solve complex data-driven scenarios and triage defects and production issues.
- Learn-unlearn-relearn concepts with an open and analytical mindset.
- Participate in code releases and production deployments.

Preferred Qualifications
- BE/B.Tech/B.Sc. in Computer Science/Statistics from an accredited college or university.
- Minimum 3 years of extensive experience in design, build and deployment of PySpark-based applications.
- Expertise in handling complex large-scale big data environments (preferably 20 TB+).
- Minimum 3 years of experience with Hive, YARN and HDFS.
- Hands-on experience writing complex SQL queries, and exporting and importing large amounts of data using utilities.
- Ability to build abstracted, modularized, reusable code components.
- Prior experience with ETL tools, preferably Informatica PowerCenter, is advantageous.
- Able to quickly adapt and learn.
- Able to jump into an ambiguous situation and take the lead on resolution.
- Able to communicate and coordinate across various teams.
- Comfortable tackling new challenges and new ways of working.
- Ready to move from traditional methods and adapt to agile ones.
- Comfortable challenging your peers and leadership team.
- Can prove yourself quickly and decisively.
- Excellent communication skills and good customer centricity.
- Strong target and high solution orientation.
Posted 1 week ago
3.0 years
0 Lacs
Andhra Pradesh, India
On-site
We are looking for a PySpark solutions developer and data engineer who can design and build solutions for one of our Fortune 500 client programs, which aims at building data standardization and curation capabilities on a Hadoop cluster. This high-visibility, fast-paced key initiative will integrate data across internal and external sources, provide analytical insights, and integrate with the customer's critical systems.

Key Responsibilities
- Design, build and unit test applications on the Spark framework on Python.
- Build PySpark-based applications for both batch and streaming requirements, which requires in-depth knowledge of the majority of Hadoop and NoSQL databases as well.
- Develop and execute data pipeline testing processes and validate business rules and policies.
- Optimize performance of the built Spark applications in Hadoop using configurations around Spark Context, Spark-SQL, DataFrames, and Pair RDDs.
- Optimize performance for data access requirements by choosing the appropriate native Hadoop file formats (Avro, Parquet, ORC, etc.) and compression codecs.
- Build integrated solutions leveraging Unix shell scripting, RDBMS, Hive, the HDFS file system, HDFS file types, and HDFS compression codecs.
- Build data tokenization libraries and integrate with Hive and Spark for column-level obfuscation.
- Process large amounts of structured and unstructured data, including integrating data from multiple sources.
- Create and maintain an integration and regression testing framework on Jenkins integrated with Bitbucket and/or Git repositories.
- Participate in the agile development process, and document and communicate issues and bugs relative to data standards in scrum meetings.
- Work collaboratively with onsite and offshore teams.
- Develop and review technical documentation for artifacts delivered.
- Solve complex data-driven scenarios and triage defects and production issues.
- Learn-unlearn-relearn concepts with an open and analytical mindset.
- Participate in code releases and production deployments.
- Challenge and inspire team members to achieve business results in a fast-paced and quickly changing environment.

Preferred Qualifications
- BE/B.Tech/B.Sc. in Computer Science/Statistics from an accredited college or university.
- Minimum 3 years of extensive experience in design, build and deployment of PySpark-based applications.
- Expertise in handling complex large-scale big data environments (preferably 20 TB+).
- Minimum 3 years of experience with Hive, YARN and HDFS.
- Hands-on experience writing complex SQL queries, and exporting and importing large amounts of data using utilities.
- Ability to build abstracted, modularized, reusable code components.
- Prior experience with ETL tools, preferably Informatica PowerCenter, is advantageous.
- Able to quickly adapt and learn.
- Able to jump into an ambiguous situation and take the lead on resolution.
- Able to communicate and coordinate across various teams.
- Comfortable tackling new challenges and new ways of working.
- Ready to move from traditional methods and adapt to agile ones.
- Comfortable challenging your peers and leadership team.
- Can prove yourself quickly and decisively.
- Excellent communication skills and good customer centricity.
- Strong target and high solution orientation.
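A brief, hedged illustration of the file-format and compression choices mentioned above: writing the same DataFrame as snappy-compressed Parquet and as ORC, plus one common Spark-SQL tuning knob. The paths, source and partition count are arbitrary values chosen for the example.

```python
# Illustrative storage/format optimisation: tune shuffle partitions, then persist
# the same DataFrame as snappy Parquet and as ORC. Paths/values are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format_demo").getOrCreate()

# Reduce shuffle partitions for a modest-sized dataset (default is 200).
spark.conf.set("spark.sql.shuffle.partitions", "64")

df = spark.read.json("/landing/clickstream/")  # placeholder source

df.write.mode("overwrite").option("compression", "snappy").parquet("/lake/clickstream_parquet")
df.write.mode("overwrite").option("compression", "zlib").orc("/lake/clickstream_orc")
```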
Posted 1 week ago
5.0 years
0 Lacs
Trivandrum, Kerala, India
On-site
Job Family: Data Science & Analysis (India)
Travel Required: None
Clearance Required: None

What You Will Do
- Design, develop, and maintain robust, scalable, and efficient data pipelines and ETL/ELT processes.
- Lead and execute data engineering projects from inception to completion, ensuring timely delivery and high quality.
- Build and optimize data architectures for operational and analytical purposes.
- Collaborate with cross-functional teams to gather and define data requirements.
- Implement data quality, data governance, and data security practices.
- Manage and optimize cloud-based data platforms (Azure/AWS).
- Develop and maintain Python/PySpark libraries for data ingestion, processing and integration with both internal and external data sources.
- Design and optimize scalable data pipelines using Azure Data Factory and Spark (Databricks).
- Work with stakeholders, including the Executive, Product, Data and Design teams, to assist with data-related technical issues and support their data infrastructure needs.
- Develop frameworks for data ingestion, transformation, and validation.
- Mentor junior data engineers and guide best practices in data engineering.
- Evaluate and integrate new technologies and tools to improve data infrastructure.
- Ensure compliance with data privacy regulations (HIPAA, etc.).
- Monitor performance and troubleshoot issues across the data ecosystem.
- Automate deployment of data pipelines using GitHub Actions / Azure DevOps.

What You Will Need
- Bachelor's or master's degree in Computer Science, Information Systems, Statistics, Math, Engineering, or a related discipline.
- Minimum 5+ years of solid hands-on experience in data engineering and cloud services.
- Extensive working experience with advanced SQL and a deep understanding of SQL.
- Good experience in Azure Data Factory (ADF), Databricks, Python and PySpark.
- Good experience in modern data storage concepts such as data lake and lakehouse.
- Experience with other cloud services (AWS) and data processing technologies is an added advantage.
- Ability to enhance, develop and resolve defects in ETL processes using cloud services.
- Experience handling large volumes (multiple terabytes) of incoming data from clients and third-party sources in various formats such as text, CSV, EDI X12 files and Access databases.
- Experience with software development methodologies (Agile, Waterfall) and version control tools.
- Highly motivated, strong problem solver, self-starter, and fast learner with demonstrated analytic and quantitative skills.
- Good communication skills.

What Would Be Nice To Have
- AWS ETL platform - Glue, S3.
- One or more programming languages such as Java, .NET.
- Experience in the US health care domain and insurance claim processing.

What We Offer
Guidehouse offers a comprehensive, total rewards package that includes competitive compensation and a flexible benefits package that reflects our commitment to creating a diverse and supportive workplace.

About Guidehouse
Guidehouse is an Equal Opportunity Employer - Protected Veterans, Individuals with Disabilities or any other basis protected by law, ordinance, or regulation. Guidehouse will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of applicable law or ordinance, including the Fair Chance Ordinance of Los Angeles and San Francisco.
If you have visited our website for information about employment opportunities, or to apply for a position, and you require an accommodation, please contact Guidehouse Recruiting at 1-571-633-1711 or via email at RecruitingAccommodation@guidehouse.com. All information you provide will be kept confidential and will be used only to the extent required to provide needed reasonable accommodation.

All communication regarding recruitment for a Guidehouse position will be sent from Guidehouse email domains including @guidehouse.com or guidehouse@myworkday.com. Correspondence received by an applicant from any other domain should be considered unauthorized and will not be honored by Guidehouse. Note that Guidehouse will never charge a fee or require a money transfer at any stage of the recruitment process and does not collect fees from educational institutions for participation in a recruitment event. Never provide your banking information to a third party purporting to need that information to proceed in the hiring process.

If any person or organization demands money related to a job opportunity with Guidehouse, please report the matter to Guidehouse's Ethics Hotline. If you want to check the validity of correspondence you have received, please contact recruiting@guidehouse.com. Guidehouse is not responsible for losses incurred (monetary or otherwise) from an applicant's dealings with unauthorized third parties. Guidehouse does not accept unsolicited resumes through or from search firms or staffing agencies. All unsolicited resumes will be considered the property of Guidehouse and Guidehouse will not be obligated to pay a placement fee.
Posted 1 week ago
3.0 years
0 Lacs
Gurugram, Haryana, India
On-site
Primary Roles And Responsibilities
- Develop Modern Data Warehouse solutions using Snowflake, Databricks and ADF.
- Provide forward-thinking solutions in the data engineering and analytics space.
- Collaborate with DW/BI leads to understand new ETL pipeline development requirements.
- Triage issues to find gaps in existing pipelines and fix them.
- Work with the business to understand reporting-layer needs and develop a data model to fulfill reporting requirements.
- Help junior team members resolve issues and technical challenges.
- Drive technical discussions with client architects and team members.
- Orchestrate the data pipelines in the scheduler via Airflow.

Skills And Qualifications
- Skills: SQL, PL/SQL, Spark, star and snowflake dimensional modeling, Databricks, Snowsight, Terraform, Git, Unix shell scripting, SnowSQL, Cassandra, CircleCI, Azure, PySpark, Snowpipe, MongoDB, Neo4j, Azure Data Factory, Snowflake, Python.
- Bachelor's and/or master's degree in computer science or equivalent experience.
- Must have 6+ years of total IT experience and 3+ years of experience in data warehouse/ETL projects.
- Expertise in Snowflake security, Snowflake SQL and designing/implementing other Snowflake objects.
- Hands-on experience with Snowflake utilities: SnowSQL, Snowpipe, Snowsight and Snowflake connectors.
- Deep understanding of star and snowflake dimensional modeling.
- Strong knowledge of data management principles.
- Good understanding of the Databricks Data & AI platform and Databricks Delta Lake architecture.
- Hands-on experience in SQL and Spark (PySpark).
- Experience in building ETL/data warehouse transformation processes.
- Experience with open-source non-relational/NoSQL data repositories (incl. MongoDB, Cassandra, Neo4j).
- Experience working with structured and unstructured data, including imaging and geospatial data.
- Experience working in a DevOps environment with tools such as Terraform, CircleCI and Git.
- Proficiency in RDBMS, complex SQL, PL/SQL, Unix shell scripting, performance tuning, troubleshooting and query optimization.
- Databricks Certified Data Engineer Associate/Professional certification (desirable).
- Comfortable working in a dynamic, fast-paced, innovative environment with several concurrent projects.
- Experience working in Agile methodology.
- Strong verbal and written communication skills.
- Strong analytical and problem-solving skills with high attention to detail.

Mandatory Skills: Snowflake / Azure Data Factory / PySpark / Databricks
Posted 1 week ago
3.0 - 8.0 years
0 - 1 Lacs
Noida, Chennai
Hybrid
Role Overview:
We are looking for an experienced Python Programming Mentor with strong hands-on expertise in core and advanced Python, including OOP, data structures, file handling, Python libraries, and basics of MySQL. The ideal candidate should be capable of delivering live mentoring sessions, reviewing code, guiding real-world projects, and supporting learners through hands-on problem-solving.

Role & Responsibilities
- Mentor learners on Python fundamentals, advanced topics, and practical applications.
- Conduct live sessions, code walkthroughs, and 1:1 doubt-clearing interactions.
- Guide learners through mini-projects and real-world capstone projects.
- Review code submissions and provide constructive feedback.
- Promote coding best practices, testing (PyTest/unittest), and Git workflows.
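A hypothetical snippet of the kind of exercise a mentor in this role might walk learners through: a small class combining OOP, file handling and a pytest-style test. The names are illustrative, not taken from the posting.

```python
# A small illustrative exercise: a class, basic file handling, and a pytest test.
import json
from pathlib import Path


class GradeBook:
    """Stores student scores and computes simple statistics."""

    def __init__(self) -> None:
        self.scores: dict[str, list[float]] = {}

    def add_score(self, student: str, score: float) -> None:
        self.scores.setdefault(student, []).append(score)

    def average(self, student: str) -> float:
        marks = self.scores[student]
        return sum(marks) / len(marks)

    def save(self, path: Path) -> None:
        # Persist the scores as JSON, touching on file handling.
        path.write_text(json.dumps(self.scores))


def test_average():
    book = GradeBook()
    book.add_score("asha", 80)
    book.add_score("asha", 90)
    assert book.average("asha") == 85
```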
Posted 1 week ago
2.0 - 5.0 years
0 Lacs
Pune, Maharashtra, India
On-site
Location: Hinjewadi, Pune - Maharashtra, India

Pacesetting. Passionate. Together. HELLA, one of the leading automotive suppliers worldwide, has shaped the industry with innovative lighting systems and vehicle electronics. In addition, the company is one of the most important partners of the aftermarket and independent workshops. What motivates us: shaping the mobility of tomorrow and fostering the central market trends such as autonomous driving, efficiency and electrification, connectivity and digitization as well as individualization. Every day, 36,000 employees worldwide are committed to this with passion, know-how and innovative strength.

YOUR TASKS
We are looking for a talented Data Engineer and AI Deployment Engineer to join our innovative team. The ideal candidate will have a strong foundation in data engineering and AI deployment, with hands-on experience in creating and deploying data pipelines, working with data lakes, and performing ETL processes. You will be responsible for designing, implementing, and maintaining scalable data and AI solutions to support our business needs. Join our team in designing and deploying AI solutions to enhance the cost, quality, and efficiency of manufacturing and development processes.

Key Responsibilities
- Design, develop, and maintain data pipelines using Python, PySpark, and SQL.
- Implement ETL processes to extract, transform, and load data from various sources.
- Deploy and manage containerized applications using Docker.
- Develop and maintain CI/CD pipelines to automate deployment processes.
- Collaborate with cross-functional teams to understand data and AI requirements and deliver solutions.
- Deploy and manage AI/ML models in production environments.
- Create and manage data lakes to store and process large volumes of data.

Required Skills
- Proficiency in Python, PySpark, RDBMS, and SQL.
- Hands-on experience writing optimized PySpark and SQL scripts using best practices.
- Experience in creating and deploying data pipelines.
- Knowledge of data lake architecture and management.
- Strong understanding of ETL processes.
- Hands-on experience with Docker and Git.
- Experience in CI/CD pipeline development and maintenance.
- Strict adherence to software development best practices.
- Excellent communication skills in English with an independent and team-focused working style.
- Knowledge of Palantir.
- Familiarity with data streaming services like Apache Kafka, RabbitMQ, etc.
- Experience with Azure DevOps pipelines.
- Experience with Apache Airflow.
- Exposure to AI/ML and MLOps.

Your Qualifications
- BE in Computer Science or Information Technology (an engineering qualification is a must).
- 2 to 5 years of industry experience in software development.
- Minimum 2 years of relevant experience as a Data Engineer.
- Work location: Hinjewadi Phase 1, with hybrid working.
- Immediate joiners preferred.

Take the opportunity to reveal your potential within a global, family-run company that offers you the best possible conditions for progressing in your career. Please send us your application through our careers portal, citing reference number req16080.

HELLA India Automotive Pvt Ltd.
Rimsha Shaikh
Posted 1 week ago
3.0 years
0 Lacs
Pune, Maharashtra, India
On-site
Primary Roles And Responsibilities
- Develop Modern Data Warehouse solutions using Snowflake, Databricks and ADF.
- Provide forward-thinking solutions in the data engineering and analytics space.
- Collaborate with DW/BI leads to understand new ETL pipeline development requirements.
- Triage issues to find gaps in existing pipelines and fix them.
- Work with the business to understand reporting-layer needs and develop a data model to fulfill reporting requirements.
- Help junior team members resolve issues and technical challenges.
- Drive technical discussions with client architects and team members.
- Orchestrate the data pipelines in the scheduler via Airflow.

Skills And Qualifications
- Skills: SQL, PL/SQL, Spark, star and snowflake dimensional modeling, Databricks, Snowsight, Terraform, Git, Unix shell scripting, SnowSQL, Cassandra, CircleCI, Azure, PySpark, Snowpipe, MongoDB, Neo4j, Azure Data Factory, Snowflake, Python.
- Bachelor's and/or master's degree in computer science or equivalent experience.
- Must have 6+ years of total IT experience and 3+ years of experience in data warehouse/ETL projects.
- Expertise in Snowflake security, Snowflake SQL and designing/implementing other Snowflake objects.
- Hands-on experience with Snowflake utilities: SnowSQL, Snowpipe, Snowsight and Snowflake connectors.
- Deep understanding of star and snowflake dimensional modeling.
- Strong knowledge of data management principles.
- Good understanding of the Databricks Data & AI platform and Databricks Delta Lake architecture.
- Hands-on experience in SQL and Spark (PySpark).
- Experience in building ETL/data warehouse transformation processes.
- Experience with open-source non-relational/NoSQL data repositories (incl. MongoDB, Cassandra, Neo4j).
- Experience working with structured and unstructured data, including imaging and geospatial data.
- Experience working in a DevOps environment with tools such as Terraform, CircleCI and Git.
- Proficiency in RDBMS, complex SQL, PL/SQL, Unix shell scripting, performance tuning, troubleshooting and query optimization.
- Databricks Certified Data Engineer Associate/Professional certification (desirable).
- Comfortable working in a dynamic, fast-paced, innovative environment with several concurrent projects.
- Experience working in Agile methodology.
- Strong verbal and written communication skills.
- Strong analytical and problem-solving skills with high attention to detail.

Mandatory Skills: Snowflake / Azure Data Factory / PySpark / Databricks
Posted 1 week ago
0 Lacs
Gurugram, Haryana, India
On-site
At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career. Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express.

How will you make an impact in this role?
The Digital Data Strategy Team within the broader EDEA (Enterprise Digital Experimentation & Analytics) in EDDS supports all other EDEA VP teams and product and marketing partner teams with data strategy, automation and insights, and creates and manages automated insight packs and multiple derived data layers. The team partners with Technology to enable end-to-end MIS automation and ODL (Organized Data Layer) creation, and drives process automation, optimization, and data and MIS quality in an efficient manner. The team also supports strategic data and platform initiatives. This role will report to the Manager - Digital Data Strategy, EDEA and will be based in Gurgaon. The candidate will be responsible for delivering high-impact data and automated insight products to enable other analytics partners, marketing partners and product owners to optimize across our platform, demand generation, acquisition and membership experience domains.

Your responsibilities include:
- Elevate Data Intelligence: Set the vision for intuitive, integrated and intelligent frameworks to enable smart insights. Discover new sources of information for strong enrichment of business applications.
- Modernization: Keep up with the latest industry research and emerging technologies to ensure we are appropriately leveraging new techniques and capabilities, and drive strategic change in tools and capabilities. Develop a roadmap to transition our analytical and production use cases to the cloud platform, and develop next-generation MIS products through modern full-stack BI tools, enabling self-serve analytics.
- Define the digital data strategy vision as the business owner of digital analytics data, and partner to achieve the vision of Data as a Service to enable unified, scalable and secure data assets for business applications.
- Build a strong understanding of the key drivers and dynamics of digital data, data architecture and design, and data linkage and usage, with in-depth knowledge of platforms like Big Data/Cornerstone, Lumi/Google Cloud Platform, data ingestion and organized data layers.
- Stay abreast of the latest industry and enterprise-wide data governance, data quality practices and privacy policies, engrain the same in all data products and capabilities, and be a guiding light for the broader team.
- Partner and collaborate with multiple partners, agencies and colleagues to develop capabilities that will help maximize demand generation program ROI.

Minimum Qualifications
- 1-3 years of relevant experience in automation, data product management/data strategy with adequate data quality, economies of scale and process governance.
- Proven thought leadership, solid project management skills, and strong communication, collaboration, relationship and conflict management skills.
- Bachelor's or master's degree in Engineering/Management.
- Knowledge of big data oriented tools (e.g. BigQuery, Hive, SQL, Python/R, PySpark); advanced Excel/VBA and PowerPoint; experience of managing complex processes and integration with upstream and downstream systems/processes.
- Hands-on experience with visualization tools like Tableau, Power BI, Sisense, etc.

Preferred Qualifications
- Strong analytical/conceptual thinking competence to solve unstructured and complex business problems and articulate key findings to leaders/partners in a succinct and concise manner.
- Strong understanding of internal platforms like Big Data/Cornerstone, Lumi/Google Cloud Platform.
- Knowledge of Agile tools and methodologies.

Enterprise Leadership Behaviors
- Set the Agenda: Define What Winning Looks Like, Put Enterprise Thinking First, Lead with an External Perspective
- Bring Others with You: Build the Best Team, Seek & Provide Coaching Feedback, Make Collaboration Essential
- Do It the Right Way: Communicate Frequently, Candidly & Clearly, Make Decisions Quickly & Effectively, Live the Blue Box Values, Great Leadership Demands Courage

We back you with benefits that support your holistic well-being so you can be and deliver your best. This means caring for you and your loved ones' physical, financial, and mental health, as well as providing the flexibility you need to thrive personally and professionally:
- Competitive base salaries
- Bonus incentives
- Support for financial well-being and retirement
- Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location)
- Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
- Generous paid parental leave policies (depending on your location)
- Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
- Free and confidential counseling support through our Healthy Minds program
- Career development and training opportunities

American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, disability status, age, or any other status protected by law. Offer of employment with American Express is conditioned upon the successful completion of a background verification check, subject to applicable laws and regulations.
Posted 1 week ago
4.0 - 8.0 years
10 - 18 Lacs
Bengaluru
Work from Office
If interested, please fill in the application link below:
https://forms.office.com/r/Zc8wDfEGEH

Responsibilities:
- Deliver projects integrating data flows within and across technology systems.
- Lead data modeling sessions with end-user groups, project stakeholders, and technology teams to produce logical and physical data models.
- Design end-to-end job flows that span across systems, including quality checks and controls.
- Create technology delivery plans to implement system changes.
- Perform data analysis, data profiling, and data sourcing in relational and Big Data environments.
- Convert functional requirements into logical and physical data models.
- Assist in ETL development, testing, and troubleshooting ETL issues.
- Troubleshoot data issues and work with data providers for resolution; provide L3 support when needed.
- Design and develop ETL workflows using modern coding and testing standards.
- Participate in agile ceremonies and actively drive towards team goals.
- Collaborate with a global team of technologists.
- Lead with ideas and innovation.
- Manage communication and partner with end users to design solutions.

Required Skills:

Must have:
- Total experience required: 4-10 years (minimum 5 years of relevant experience).
- 5 years of project experience in Python/shell scripting in data engineering (experience in building and optimizing data pipelines, architectures, and data sets with large data volumes).
- 3+ years of experience in PySpark scripting, including the architecture framework of Spark.
- 3-5 years of strong experience in database development (Snowflake/SQL Server/Oracle/Sybase/DB2): designing schemas, complex procedures, complex data scripts, query authoring (SQL), and performance optimization.
- Strong understanding of the Unix environment and batch scripting languages (Shell/Python).
- Strong knowledge of the Big Data/Hadoop platform.
- Strong engineering skills with the ability to understand existing system designs and enhance or migrate them.
- Strong logical data modeling skills within the Financial Services domain.
- Experience in data integration and data conversions.
- Strong collaboration and communication skills.
- Strong organizational and planning skills.
- Strong analytical, profiling, and troubleshooting skills.

Good to Have:
- Experience with ETL tools (e.g., Informatica, Azure Data Factory) and pipelines across disparate sources.
- Experience working with Databricks.
- Familiarity with standard Agile and DevOps methodology and tools (Jenkins, Sonar, Jira).
- Good understanding of developing ETL processes using Informatica or other ETL tools.
- Experience working with source code management solutions (e.g., Git).
- Knowledge of the Investment Management business.
- Experience with job scheduling tools (e.g., Autosys).
- Experience with data visualization software (e.g., Tableau).
- Experience with data modeling tools (e.g., Power Designer).
- Basic familiarity with using metadata stores to maintain a repository of Critical Data Elements (e.g., Collibra).
- Familiarity with XML or other markup languages.

Mandatory skill sets: ETL, Python/shell scripting, building pipelines, PySpark, database, SQL
Preferred skill sets: Informatica, Hadoop, Databricks, Collibra
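A minimal PySpark sketch of the database-to-pipeline work described above: reading a table over JDBC, applying a simple transformation, and persisting a curated extract. The JDBC URL, credentials, table and output path are placeholder assumptions, not details from the posting.

```python
# Minimal PySpark sketch: pull a table over JDBC, apply a simple transformation,
# and persist a curated extract. Connection details below are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("jdbc_extract_demo").getOrCreate()

trades = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://dbhost:1433;databaseName=markets")  # placeholder
    .option("dbtable", "dbo.trades")                                     # placeholder
    .option("user", "etl_user")                                          # placeholder
    .option("password", "*****")                                         # use a secret store in practice
    .load()
)

# Basic quality control: drop records without a trade id, standardise the date.
curated = (
    trades.filter(F.col("trade_id").isNotNull())
    .withColumn("trade_date", F.to_date("trade_timestamp"))
)

curated.write.mode("overwrite").partitionBy("trade_date").parquet("/data/curated/trades")
```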
Posted 1 week ago
5.0 - 10.0 years
15 - 25 Lacs
Hyderabad
Work from Office
We are seeking a Senior Data Engineer to support a high-impact data platform engagement for a leading utility company in the USA. This 5-day on-site contract role offers an excellent growth opportunity for professionals experienced in developing scalable data pipelines and distributed data systems. The ideal candidate brings deep proficiency in Python, SQL, and PySpark, along with proven experience in ETL development, data lakes, and big data frameworks.
Key Responsibilities: Design, build, and maintain robust and scalable data pipelines for ingesting and transforming large volumes of structured and semi-structured data. Enhance platform automation, orchestration, monitoring, and alerting to improve system reliability and developer efficiency. Implement end-to-end ETL workflows and support data lake architecture and big data processing. Partner with engineering, product, and business teams to deliver high-quality, production-ready datasets and services. Continuously identify and implement improvements for pipeline efficiency, automation, and usability. Support platform users by providing technical assistance and clear documentation. Define and maintain best practices for data quality, observability, and operational excellence.
Required Qualifications: 5+ years of professional experience in data engineering. Expert-level knowledge of Python, SQL, and PySpark. Solid experience in building data pipelines, ETL development, and working with data lakes and big data frameworks. Demonstrated ability to collaborate with cross-functional teams in a distributed engineering environment. Hands-on experience with version control and CI/CD workflows (Git, GitLab, Bitbucket, etc.). Familiarity with platform monitoring and alerting best practices.
Nice to Have: Experience with Palantir Foundry or similar enterprise data platforms. Knowledge of cloud-based data architectures and infrastructure-as-code.
Why Join: Contribute to a critical data transformation effort for a major US utility provider. Work in a fast-paced environment with opportunities for career growth and development. Engage in meaningful, hands-on engineering with modern data infrastructure and tooling.
Posted 1 week ago
4.0 - 9.0 years
15 - 25 Lacs
Hyderabad
Work from Office
We are seeking a Senior Data Engineer to support a high-impact data platform engagement for a leading utility company in the USA. This 5-day on-site contract role offers an excellent growth opportunity for professionals experienced in developing scalable data pipelines and distributed data systems. The ideal candidate brings deep proficiency in Python, SQL, and PySpark, along with proven experience in ETL development, data lakes, and big data frameworks.
Key Responsibilities: Design, build, and maintain robust and scalable data pipelines for ingesting and transforming large volumes of structured and semi-structured data. Enhance platform automation, orchestration, monitoring, and alerting to improve system reliability and developer efficiency. Implement end-to-end ETL workflows and support data lake architecture and big data processing. Partner with engineering, product, and business teams to deliver high-quality, production-ready datasets and services. Continuously identify and implement improvements for pipeline efficiency, automation, and usability. Support platform users by providing technical assistance and clear documentation. Define and maintain best practices for data quality, observability, and operational excellence.
Required Qualifications: 4+ years of professional experience in data engineering. Expert-level knowledge of Python, SQL, and PySpark. Solid experience in building data pipelines, ETL development, and working with data lakes and big data frameworks. Demonstrated ability to collaborate with cross-functional teams in a distributed engineering environment. Hands-on experience with version control and CI/CD workflows (Git, GitLab, Bitbucket, etc.). Familiarity with platform monitoring and alerting best practices.
Nice to Have: Experience with Palantir Foundry or similar enterprise data platforms. Knowledge of cloud-based data architectures and infrastructure-as-code.
Why Join: Contribute to a critical data transformation effort for a major US utility provider. Work in a fast-paced environment with opportunities for career growth and development. Engage in meaningful, hands-on engineering with modern data infrastructure and tooling.
Posted 1 week ago
2.0 - 7.0 years
4 - 9 Lacs
Hyderabad
Work from Office
Overview
PepsiCo operates in an environment undergoing immense and rapid change. Big-data and digital technologies are driving business transformation that is unlocking new capabilities and business innovations in areas like eCommerce, mobile experiences and IoT. The key to winning in these areas is being able to leverage enterprise data foundations built on PepsiCo's global business scale to enable business insights, advanced analytics, and new product development. PepsiCo's Data Management and Operations team is tasked with the responsibility of developing quality data collection processes, maintaining the integrity of our data foundations, and enabling business leaders and data scientists across the company to have rapid access to the data they need for decision-making and innovation. Maintain a predictable, transparent, global operating rhythm that ensures always-on access to high-quality data for stakeholders across the company. Responsible for day-to-day data collection, transportation, maintenance/curation, and access to the PepsiCo corporate data asset. Work cross-functionally across the enterprise to centralize data and standardize it for use by business, data science or other stakeholders. Increase awareness about available data and democratize access to it across the company.
As a data engineer, you will be the key technical expert building PepsiCo's data products to drive a strong vision. You'll be empowered to create data pipelines into various source systems, rest data on the PepsiCo Data Lake, and enable exploration and access for analytics, visualization, machine learning, and product development efforts across the company. As a member of the data engineering team, you will help develop very large and complex data applications in public cloud environments, directly impacting the design, architecture, and implementation of PepsiCo's flagship data products around topics like revenue management, supply chain, manufacturing, and logistics. You will work closely with process owners, product owners and business users. You'll be working in a hybrid environment with in-house, on-premises data sources as well as cloud and remote systems.
Responsibilities
Act as a subject matter expert across different digital projects. Oversee work with internal clients and external partners to structure and store data into unified taxonomies and link them together with standard identifiers. Manage and scale data pipelines from internal and external data sources to support new product launches and drive data quality across data products. Build and own the automation and monitoring frameworks that capture metrics and operational KPIs for data pipeline quality and performance. Responsible for implementing best practices around systems integration, security, performance, and data management. Empower the business by creating value through the increased adoption of data, data science and business intelligence landscape. Collaborate with internal clients (data science and product teams) to drive solutioning and POC discussions. Evolve the architectural capabilities and maturity of the data platform by engaging with enterprise architects and strategic internal and external partners. Develop and optimize procedures to productionalize data science models. Define and manage SLAs for data products and processes running in production. Support large-scale experimentation done by data scientists. Prototype new approaches and build solutions at scale. Research state-of-the-art methodologies.
Create documentation for learnings and knowledge transfer. Create and audit reusable packages or libraries.
Qualifications
4+ years of overall technology experience that includes at least 3+ years of hands-on software development, data engineering, and systems architecture. 3+ years of experience with Data Lake Infrastructure, Data Warehousing, and Data Analytics tools. 3+ years of experience in SQL optimization and performance tuning, and development experience in programming languages like Python, PySpark, Scala, etc. 2+ years of cloud data engineering experience in Azure; fluent with Azure cloud services (Azure Certification is a plus). Experience in Azure Log Analytics. Experience with integration of multi-cloud services with on-premises technologies. Experience with data modelling, data warehousing, and building high-volume ETL/ELT pipelines. Experience with data profiling and data quality tools like Apache Griffin, Deequ, and Great Expectations. Experience building/operating highly available, distributed systems for data extraction, ingestion, and processing of large data sets. Experience with at least one MPP database technology such as Redshift, Synapse or Snowflake. Experience with Azure Data Factory, Azure Databricks and Azure Machine Learning tools. Experience with statistical/ML techniques is a plus. Experience with building solutions in the retail or supply chain space is a plus. Experience with version control systems like GitHub and deployment & CI tools. Working knowledge of agile development, including DevOps and DataOps concepts. B Tech/BA/BS in Computer Science, Math, Physics, or other technical fields.
Skills, Abilities, Knowledge: Excellent communication skills, both verbal and written, along with the ability to influence and demonstrate confidence in communications with senior level management. Strong change manager; comfortable with change, especially that which arises through company growth. Ability to understand and translate business requirements into data and technical requirements. High degree of organization and ability to manage multiple, competing projects and priorities simultaneously. Positive and flexible attitude to enable adjusting to different needs in an ever-changing environment. Strong organizational and interpersonal skills; comfortable managing trade-offs. Foster a team culture of accountability, communication, and self-management. Proactively drives impact and engagement while bringing others along. Consistently attain/exceed individual and team goals.
Posted 1 week ago
8.0 years
0 Lacs
Hyderabad, Telangana, India
On-site
To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.
Job Category: Data
Job Details
About Salesforce
We’re Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And, we empower you to be a Trailblazer, too — driving your performance and career growth, charting new paths, and improving the state of the world. If you believe in business as the greatest platform for change and in companies doing well and doing good – you’ve come to the right place.
We’re looking for an experienced Data Scientist who will help us build marketing attribution, causal inference, and uplift models to improve the effectiveness and efficiency of our marketing efforts. This person will also design experiments and help us drive a consistent approach to experimentation and campaign measurement to support a range of marketing, customer engagement, and digital use cases. This Lead Data Scientist brings significant experience in designing, developing, and delivering statistical models and AI/ML algorithms for marketing and digital optimization use cases on large-scale data sets in a cloud environment. They show rigor in how they prototype, test, and evaluate algorithm performance both in the testing phase of algorithm development and in managing production algorithms. They demonstrate advanced knowledge of statistical and machine learning techniques along with ensuring the ethical use of data in the algorithm design process. At Salesforce, Trust is our number one value and we expect all applications of statistical and machine learning models to adhere to our values and policies to ensure we balance business needs with responsible uses of technology.
Responsibilities
As part of the Marketing Effectiveness Data Science team within the Salesforce Marketing Data Science organization, develop statistical and machine learning models to improve marketing effectiveness - e.g., marketing attribution models, causal inference models, uplift models, etc. Develop optimization and simulation algorithms to provide marketing investment and allocation recommendations to improve ROI by optimizing spend across marketing channels. Own the full lifecycle of model development from ideation and data exploration, algorithm design and testing, algorithm development and deployment, to algorithm monitoring and tuning in production. Design experiments to support marketing, customer experience, and digital campaigns and develop statistically sound models to measure impact. Collaborate with other data scientists to develop and operationalize consistent approaches to experimentation and campaign measurement. Be a master in cross-functional collaboration by developing deep relationships with key partners across the company and coordinating with working teams.
Constantly learn; have a clear pulse on innovation across the enterprise SaaS, AdTech, paid media, data science, customer data, and analytics communities.
Required Skills
Master’s or Ph.D. in a quantitative field such as statistics, economics, industrial engineering and operations research, applied math, or other relevant quantitative field. 8+ years of experience designing models for marketing optimization such as multi-channel attribution models, customer lifetime value models, propensity models, uplift models, etc. using statistical and machine learning techniques. 8+ years of experience using advanced statistical techniques for experiment design (A/B and multi-cell testing) and causal inference methods for understanding business impact. Must have multiple, robust examples of using these techniques to measure effectiveness of marketing efforts and to solve business problems on large-scale data sets. 8+ years of experience with one or more programming languages such as Python, R, PySpark, Java. Expert-level knowledge of SQL with strong data exploration and manipulation skills. Experience using cloud platforms such as GCP and AWS for model development and operationalization is preferred. Must have superb quantitative reasoning and interpretation skills with strong ability to provide analysis-driven business insight and recommendations. Excellent written and verbal communication skills; ability to work well with peers and leaders across data science, marketing, and engineering organizations. Creative problem-solver who simplifies problems to their core elements. B2B customer data experience a big plus. Advanced Salesforce product knowledge is also a plus.
Accommodations
If you require assistance due to a disability applying for open positions, please submit a request via this Accommodations Request Form.
Posting Statement
Salesforce is an equal opportunity employer and maintains a policy of non-discrimination with all employees and applicants for employment. What does that mean exactly? It means that at Salesforce, we believe in equality for all. And we believe we can lead the path to equality in part by creating a workplace that’s inclusive, and free from discrimination. Know your rights: workplace discrimination is illegal. Any employee or potential employee will be assessed on the basis of merit, competence and qualifications – without regard to race, religion, color, national origin, sex, sexual orientation, gender expression or identity, transgender status, age, disability, veteran or marital status, political viewpoint, or other classifications protected by law. This policy applies to current and prospective employees, no matter where they are in their Salesforce employment journey. It also applies to recruiting, hiring, job assignment, compensation, promotion, benefits, training, assessment of job performance, discipline, termination, and everything in between. Recruiting, hiring, and promotion decisions at Salesforce are fair and based on merit. The same goes for compensation, benefits, promotions, transfers, reduction in workforce, recall, training, and education.
Posted 1 week ago
4.0 years
0 Lacs
India
On-site
Job Title: Data Analyst (Python + PySpark)
About Us
Capco, a Wipro company, is a global technology and management consulting firm. Awarded Consultancy of the Year at the British Bank Awards and ranked among the Top 100 Best Companies for Women in India 2022 by Avtar & Seramount. With a presence in 32 cities across the globe, we support 100+ clients across the banking, financial services and energy sectors. We are recognized for our deep transformation execution and delivery.
WHY JOIN CAPCO?
You will work on engaging projects with the largest international and local banks, insurance companies, payment service providers and other key players in the industry - projects that will transform the financial services industry.
MAKE AN IMPACT
Innovative thinking, delivery excellence and thought leadership to help our clients transform their business. Together with our clients and industry partners, we deliver disruptive work that is changing energy and financial services.
#BEYOURSELFATWORK
Capco has a tolerant, open culture that values diversity, inclusivity, and creativity.
CAREER ADVANCEMENT
With no forced hierarchy at Capco, everyone has the opportunity to grow as we grow, taking their career into their own hands.
DIVERSITY & INCLUSION
We believe that diversity of people and perspective gives us a competitive advantage.
Job Description
Role: Data Analyst / Senior Data Analyst
Location: Bangalore/Pune
Responsibilities
Define and obtain source data required to successfully deliver insights and use cases. Determine the data mapping required to join multiple data sets together across multiple sources. Create methods to highlight and report data inconsistencies, allowing users to review and provide feedback. Propose suitable data migration sets to the relevant stakeholders. Assist teams with processing the data migration sets as required. Assist with the planning, tracking and coordination of the data migration team, and with the migration run-book and the scope for each customer.
Role Requirements
Strong Data Analyst with Financial Services experience. Knowledge of and experience using data models and data dictionaries in a Banking and Financial Markets context. Knowledge of one or more of the following domains (including market data vendors): Party/Client, Trade, Settlements, Payments, Instrument and pricing, Market and/or Credit Risk. Demonstrate a continual desire to implement “strategic” or “optimal” solutions and, where possible, avoid workarounds or short-term tactical solutions. Work with stakeholders to ensure that negative customer and business impacts are avoided. Manage stakeholder expectations and ensure that robust communication and escalation mechanisms are in place across the project portfolio. Good understanding of the control requirements surrounding data handling.
Experience/Skillset
Must have: Excellent analytical skills and commercial acumen. Minimum 4+ years of experience with Python and PySpark.
Good understanding of the control requirements surrounding data handling. Experience of big data programmes preferable. Strong verbal and written communication skills. Strong self-starter with strong change delivery skills who enjoys the challenge of delivering change within tight deadlines. Ability to manage multiple priorities. Business analysis skills, defining and understanding requirements. Knowledge of and experience using data models and data dictionaries in a Banking and Financial Markets context. Can write SQL queries and navigate databases and related tools, especially Hive, CMD, PuTTY and Notepad++. Enthusiastic and energetic problem solver to join an ambitious team. Good knowledge of SDLC and formal Agile processes, a bias towards TDD and a willingness to test products as part of the delivery cycle. Ability to communicate effectively in a multi-programme environment across a range of stakeholders. Attention to detail.
Good to have: Knowledge and experience in Data Quality & Governance preferable. For Spark with Scala: working experience using Scala (preferable) or Java for Spark. For Senior DAs: proven track record of managing small, delivery-focused data teams.
Posted 1 week ago
7.0 years
0 Lacs
Hyderabad, Telangana, India
On-site
Job Description
Position/Role: Data Engineer
Experience: 9-12 yrs
Location: Bangalore/Hyderabad
Notice Period: Immediate to 30 days
Rate: 9-12 yrs
Job Overview
Total Years of Experience: 7 to 8 years
Primary Skills: Snowflake (4+ years of experience), Python, PySpark, SQL Stored Procedures
Secondary Skills: AWS Services (e.g., S3, IAM, Glue, Lambda)
Posted 1 week ago
PySpark, a powerful data processing framework built on top of Apache Spark and Python, is in high demand in the job market in India. With the increasing need for big data processing and analysis, companies are actively seeking professionals with PySpark skills to join their teams. If you are a job seeker looking to excel in the field of big data and analytics, exploring PySpark jobs in India could be a great career move.
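To give a concrete feel for the kind of work these roles involve, here is a minimal, hypothetical PySpark sketch: reading a file, filtering, and aggregating with the DataFrame API. The file name and column names (sales.csv, city, amount) are invented purely for illustration.

```python
# Minimal, hypothetical PySpark sketch: read a CSV, filter, aggregate.
# The file name and columns (sales.csv, city, amount) are assumptions for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-intro").getOrCreate()

# Load a CSV into a DataFrame and let Spark infer column types
df = spark.read.csv("sales.csv", header=True, inferSchema=True)

# Typical ETL-style transformation: filter bad rows, group, and aggregate
summary = (
    df.filter(F.col("amount") > 0)
      .groupBy("city")
      .agg(
          F.sum("amount").alias("total_amount"),
          F.count("*").alias("order_count"),
      )
      .orderBy(F.desc("total_amount"))
)

summary.show()
spark.stop()
```

In real pipelines the same pattern scales from a single CSV to partitioned data lakes, which is why employers emphasize hands-on DataFrame experience over tool familiarity alone.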
Here are 5 major cities in India where companies are actively hiring for PySpark roles:
1. Bangalore
2. Pune
3. Hyderabad
4. Mumbai
5. Delhi
The estimated salary range for PySpark professionals in India varies based on experience levels. Entry-level positions can expect to earn around INR 6-8 lakhs per annum, while experienced professionals can earn upwards of INR 15 lakhs per annum.
In the field of PySpark, a typical career progression may look like this:
1. Junior Developer
2. Data Engineer
3. Senior Developer
4. Tech Lead
5. Data Architect
In addition to PySpark, professionals in this field are often expected to have or develop skills in:
- Python programming
- Apache Spark
- Big data technologies (Hadoop, Hive, etc.)
- SQL
- Data visualization tools (Tableau, Power BI)
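Because SQL appears alongside PySpark in nearly every posting above, a common exercise is switching between the DataFrame API and Spark SQL on the same data. The sketch below is a hypothetical illustration of that; the sample rows, view name, and columns are invented.

```python
# Hypothetical sketch of mixing the DataFrame API with Spark SQL.
# The employee rows, view name, and columns are invented for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pyspark-sql-intro").getOrCreate()

# Build a small in-memory DataFrame
employees = spark.createDataFrame(
    [("Asha", "Bangalore", 1200000),
     ("Ravi", "Pune", 900000),
     ("Meera", "Hyderabad", 1500000)],
    ["name", "city", "salary"],
)

# Register it as a temporary view so it can be queried like a Hive table
employees.createOrReplaceTempView("employees")

avg_by_city = spark.sql("""
    SELECT city, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY city
    ORDER BY avg_salary DESC
""")

avg_by_city.show()
spark.stop()
```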
Here are 25 interview questions you may encounter when applying for PySpark roles:
As you explore PySpark jobs in India, remember to prepare thoroughly for interviews and showcase your expertise confidently. With the right skills and knowledge, you can excel in this field and advance your career in the world of big data and analytics. Good luck!