2.0 - 6.0 years
0 Lacs
Maharashtra
On-site
As a Data Engineer II at Media.net, you will be responsible for designing, executing, and managing large and complex distributed data systems. Your role will involve monitoring performance, optimizing existing projects, and researching and integrating Big Data tools and frameworks as required to meet business and data requirements. You will play a key part in implementing scalable solutions, creating reusable components and data tools, and collaborating with teams across the company to integrate with the data platform efficiently.

The team you will join ensures that every web page view is seamlessly processed through high-scale services, handling a large volume of requests across 5 million unique topics. Leveraging cutting-edge Machine Learning and AI technologies on a large Hadoop cluster, you will work with a tech stack that includes Java, Elasticsearch/Solr, Kafka, Spark, Machine Learning, NLP, Deep Learning, Redis, and Big Data technologies such as Hadoop, HBase, and YARN.

To excel in this role, you should have 2 to 4 years of experience in big data technologies like Apache Hadoop and relational databases (MS SQL Server/Oracle/MySQL/Postgres). Proficiency in programming languages such as Java, Python, or Scala is required, along with expertise in SQL (T-SQL/PL-SQL/SPARK-SQL/HIVE-QL) and Apache Spark. Hands-on knowledge of working with DataFrames, Datasets, RDDs, and the Spark SQL/PySpark/Scala APIs, along with a deep understanding of performance optimization, will be essential. Additionally, you should have a good grasp of distributed storage (HDFS/S3), strong analytical and quantitative skills, and experience with data integration across multiple sources.

Experience with message queues like Apache Kafka, MPP systems such as Redshift/Snowflake, and NoSQL storage like MongoDB would be considered advantageous for this role.

If you are passionate about working with cutting-edge technologies, collaborating with global teams, and contributing to the growth of a leading ad tech company, we encourage you to apply for this challenging and rewarding opportunity.
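To make the stack concrete, here is a minimal, illustrative PySpark sketch of the DataFrame and Spark SQL work the posting describes; the input path and the `topic_id`/`view_count` names are assumptions for illustration, not Media.net's actual schema:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("topic-page-views").getOrCreate()

# Hypothetical input: page-view events stored as Parquet on HDFS (assumed path).
views = spark.read.parquet("hdfs:///data/page_views/")

# DataFrame API: aggregate views per topic.
per_topic = (views
             .filter(F.col("topic_id").isNotNull())
             .groupBy("topic_id")
             .agg(F.count("*").alias("view_count")))

# The same aggregation via Spark SQL, after registering a temp view.
views.createOrReplaceTempView("page_views")
per_topic_sql = spark.sql("""
    SELECT topic_id, COUNT(*) AS view_count
    FROM page_views
    WHERE topic_id IS NOT NULL
    GROUP BY topic_id
""")

# A simple performance lever: cache a hot aggregate reused downstream.
per_topic.cache()
per_topic.show(10)
```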
Posted 4 days ago
3.0 - 7.0 years
0 Lacs
Pune, Maharashtra
On-site
We are searching for a proficient Python Developer to join our product development team. Your primary focus will be automating data ingestion, processing, and validation workflows using PySpark to create robust, scalable, and efficient data pipelines. You will collaborate closely with data engineers, analysts, and stakeholders to deliver impactful data solutions in a dynamic work environment.

Key Responsibilities:
- Collaborate with data engineers and stakeholders to define requirements and deliver automated solutions.
- Design, develop, and maintain scalable and efficient PySpark-based automation for data ingestion, processing, and calculation.
- Automate the reading and integration of data from multiple sources, such as databases, APIs, flat files, and cloud storage.
- Implement and optimize ETL workflows for high-performance data pipelines.
- Ensure data quality by incorporating validation checks and handling exceptions in data processing pipelines (see the sketch after this listing).
- Troubleshoot and resolve issues in data pipelines to maintain operational efficiency.

Required Skills and Qualifications:
- 3+ years of experience and strong proficiency in Python for data handling and validation.
- Strong experience with Python libraries such as pandas, NumPy, and duckdb.
- Familiarity with cloud platforms such as AWS, Azure, or GCP (e.g., S3, Databricks, or BigQuery).
- Experience with data pipeline orchestration tools such as Apache Airflow or similar.
- Proficiency in SQL for querying and manipulating data.
- Experience handling structured, semi-structured, and unstructured data.
- Familiarity with CI/CD processes and version control tools such as Git.
- Knowledge of performance tuning and optimization techniques in PySpark.
- Strong analytical and problem-solving skills.

Nice to Have:
- Knowledge of Spark architecture, including RDDs, DataFrames, and Spark SQL.
- Knowledge of keyword-driven automation frameworks.
- Quality engineering background with prior experience building automation solutions for data-heavy applications.
- Familiarity with REST APIs and data integration techniques.
- Understanding of data governance, compliance, and security principles.

This position is open to immediate joiners ONLY.
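As a rough illustration of the validation-and-exception-handling pattern this role calls for, here is a hedged PySpark sketch; the file paths, the `id` column, and the quarantine location are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("ingest-with-validation").getOrCreate()

def load_and_validate(path: str, required_cols: list[str]):
    """Read a source file and apply basic validation checks."""
    try:
        df = spark.read.option("header", "true").csv(path)
    except Exception as exc:  # e.g., missing path or unreadable file
        raise RuntimeError(f"Ingestion failed for {path}") from exc

    # Schema check: fail fast if an expected column is absent.
    missing = [c for c in required_cols if c not in df.columns]
    if missing:
        raise ValueError(f"{path} is missing columns: {missing}")

    # Data-quality check: route rows with null keys to a quarantine location.
    bad_rows = df.filter(F.col("id").isNull())
    good_rows = df.filter(F.col("id").isNotNull())
    bad_rows.write.mode("append").parquet("/quarantine/")  # assumed location
    return good_rows

clean = load_and_validate("/landing/orders.csv", ["id", "amount"])
clean.write.mode("overwrite").parquet("/curated/orders/")
```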
Posted 5 days ago
8.0 - 12.0 years
0 Lacs
Hyderabad, Telangana
On-site
You have an exciting opportunity to join a dynamic and high-impact firm as a Python, PySpark, and SQL Developer with 8-12 years of relevant experience. In this role, you will be responsible for working on development activities, collaborating with cross-functional teams, and designing scalable data pipelines using Python and PySpark. You will also be involved in implementing ETL processes, developing Power BI reports and dashboards, and optimizing data pipelines for performance and reliability.

The ideal candidate should have 8+ years of experience in Spark, Scala, and PySpark for big data processing. Proficiency in Python programming for data manipulation and analysis is essential, along with knowledge of Python libraries such as Pandas and NumPy. Strong knowledge of SQL for querying databases and experience with database systems like Lakehouse, PostgreSQL, Teradata, and SQL Server are also required. Additionally, candidates should have strong analytical and problem-solving skills, effective communication skills, and the ability to troubleshoot and resolve data-related issues.

Key Responsibilities:
- Work on development activities and lead activities
- Coordinate with the Product Manager and Development Architect
- Collaborate with other teams to understand data requirements and deliver solutions
- Design, develop, and maintain scalable data pipelines using Python and PySpark
- Utilize PySpark and Spark scripting for data processing and analysis
- Implement ETL processes to ensure accurate data processing and storage
- Develop and maintain Power BI reports and dashboards
- Optimize data pipelines for performance and reliability
- Integrate data from various sources into centralized data repositories
- Ensure data quality and consistency across different data sets
- Analyze large data sets to identify trends, patterns, and insights
- Optimize PySpark applications for better performance and scalability
- Continuously improve data processing workflows and infrastructure

If you meet the qualifications and are interested in this incredible opportunity, please share your updated resume along with total experience, relevant experience in Python, PySpark, and SQL, current location, current CTC, expected CTC, and notice period. We assure you that your profile will be handled with strict confidentiality. Apply now and be a part of this amazing journey!

Thank you,
Syed Mohammad
syed.m@anlage.co.in
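For illustration, here is a minimal PySpark-plus-SQL sketch of the kind of pipeline this listing describes, pulling from multiple sources into a centralized repository; the JDBC connection details, table names, and lake paths are assumptions:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("centralize-and-analyze").getOrCreate()

# Hypothetical sources: a JDBC table (driver jar must be on the classpath)
# and a Parquet extract in the data lake.
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://db-host:5432/sales")  # assumed DSN
          .option("dbtable", "public.orders")
          .option("user", "etl_user")
          .option("password", "***")
          .load())
customers = spark.read.parquet("/lake/customers/")  # assumed path

orders.createOrReplaceTempView("orders")
customers.createOrReplaceTempView("customers")

# SQL for the kind of trend analysis the posting mentions.
monthly = spark.sql("""
    SELECT c.region,
           date_trunc('month', o.order_ts) AS month,
           SUM(o.amount)                   AS revenue
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.region, date_trunc('month', o.order_ts)
""")

# Land the result in a centralized, partitioned repository.
monthly.write.mode("overwrite").partitionBy("region").parquet(
    "/lake/marts/monthly_revenue/")
```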
Posted 1 week ago
3.0 - 6.0 years
5 - 8 Lacs
Hyderabad, Bengaluru, Delhi / NCR
Work from Office
As a Senior Azure Data Engineer, your responsibilities will include:
- Building scalable data pipelines using Databricks and PySpark
- Transforming raw data into usable business insights
- Integrating Azure services like Blob Storage, Data Lake, and Synapse Analytics
- Deploying and maintaining machine learning models using MLlib or TensorFlow
- Executing large-scale Spark jobs with performance tuning on Spark Pools
- Leveraging Databricks Notebooks and managing workflows with MLflow

Qualifications:
- Bachelors/Masters in Computer Science, Data Science, or equivalent
- 7+ years in Data Engineering, with 3+ years in Azure Databricks
- Strong hands-on in: PySpark, Spark SQL, RDDs, Pandas, NumPy, Delta Lake
- Azure ecosystem: Data Lake, Blob Storage, Synapse Analytics

Location: Remote - Bengaluru, Hyderabad, Delhi / NCR, Chennai, Pune, Kolkata, Ahmedabad, Mumbai
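A hedged sketch of one such pipeline step on Azure Databricks, landing raw JSON from Blob/Data Lake storage into a partitioned Delta table; the abfss:// paths, storage account, and column names are placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Hypothetical ADLS Gen2 location; on Databricks the storage credentials
# are typically supplied via cluster or Unity Catalog configuration.
raw = spark.read.json("abfss://raw@exampleaccount.dfs.core.windows.net/events/")

cleaned = (raw
           .withColumn("event_date", F.to_date("event_ts"))
           .dropDuplicates(["event_id"]))

# Delta Lake write; the Delta format is available by default on Databricks.
(cleaned.write
 .format("delta")
 .mode("append")
 .partitionBy("event_date")
 .save("abfss://silver@exampleaccount.dfs.core.windows.net/events/"))
```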
Posted 1 week ago
5.0 - 10.0 years
0 Lacs
Karnataka
On-site
As a software developer, you will be working in a constantly evolving environment driven by technological advances and the strategic direction of the organization you are employed by. Your primary responsibilities will include creating, maintaining, auditing, and enhancing systems to meet specific needs, often based on recommendations from systems analysts or architects. You will be tasked with testing both hardware and software systems to identify and resolve system faults. Additionally, you will be involved in writing diagnostic programs and designing and developing code for operating systems and software to ensure optimal efficiency. Where necessary, you will also provide recommendations for future developments.

Joining us offers numerous benefits, including the opportunity to work on challenging projects and solve complex technical problems. You can expect rapid career growth and the chance to assume leadership roles. Our mentorship program allows you to learn from experienced mentors and industry experts, while our global opportunities enable you to collaborate with clients from around the world and gain international experience. We offer competitive compensation packages and benefits to our employees. If you are passionate about technology and interested in working on innovative projects with a skilled team, pursuing a career as an Infosys Power Programmer could be an excellent choice for you.

To be considered for this role, you must possess the following mandatory skills:
- Proficiency in AWS Glue, AWS Redshift/Spectrum, S3, API Gateway, Athena, Step Functions, and Lambda.
- Experience with Extract Transform Load (ETL) and Extract Load & Transform (ELT) data integration patterns.
- Expertise in designing and constructing data pipelines.
- Development experience in one or more object-oriented programming languages, preferably Python.

In terms of job specifications, we are looking for candidates who meet the following criteria:
- At least 5 years of hands-on experience in developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform.
- Profound knowledge of Spark Core and working with RDDs and Spark SQL.
- Familiarity with Spark optimization techniques and best practices.
- Strong understanding of Scala functional programming concepts like Try, Option, Future, and Collections.
- Proficiency in Scala object-oriented programming covering Classes, Traits, Objects (Singleton and Companion), and Case Classes.
- Sound knowledge of Scala language features, including the Type System and Implicits/Givens.
- Hands-on experience working in the Hadoop environment (HDFS/Hive), AWS S3, and EMR.
- Proficiency in Python programming.
- Working experience with workflow orchestration tools such as Airflow and Oozie.
- Experience with API calls in Scala.
- Familiarity and exposure to file formats like Apache Avro, Parquet, and JSON.
- Desirable: knowledge of Protocol Buffers and geospatial data analytics.
- Ability to write test cases using frameworks like scalatest.
- Good understanding of build tools such as Gradle and SBT.
- Experience using Git, resolving conflicts, and working with branches.
- Preferred: experience with workflow systems like Airflow.
- Strong programming skills focusing on data structures and algorithms.
- Excellent analytical and communication skills.

Candidates applying for this position should have:
- 7-10 years of industry experience.
- A BE/B.Tech in Computer Science or an equivalent qualification.
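As a non-authoritative illustration of the AWS Glue skill set listed above, here is a minimal Glue ETL job skeleton in Python; the catalog database, table name, and S3 bucket are hypothetical:

```python
import sys

from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job bootstrap: resolve arguments and initialize the job.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (database and table names are assumptions).
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders")

# Convert to a Spark DataFrame for deduplication.
df = dyf.toDF().dropDuplicates(["order_id"])

# Write curated Parquet to S3, queryable via Athena or Redshift Spectrum.
df.write.mode("overwrite").parquet("s3://example-bucket/curated/orders/")

job.commit()
```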
Posted 3 weeks ago
4.0 - 8.0 years
0 Lacs
Hyderabad, Telangana
On-site
About the Company
At Tide, we are dedicated to creating a business management platform that streamlines operations for small businesses, enabling them to save valuable time and resources. Our services include business accounts and banking solutions, as well as a range of integrated administrative tools spanning from invoicing to accounting. Established in 2017, Tide has garnered a user base of over 1 million small businesses globally, catering to SMEs in the UK, India, and Germany. Headquartered in central London, we also have offices in Sofia, Hyderabad, Delhi, Berlin, and Belgrade, with a team of more than 2,000 employees. Tide is on a trajectory of rapid growth, continuously venturing into new markets and products, and we are always seeking enthusiastic, motivated individuals to join us in our mission to empower small businesses by helping them save time and resources.

About the Role
We are in search of an experienced Senior Data Engineer with exceptional PySpark skills to join our ML/Data engineering team. This team's responsibilities encompass feature development, data quality assessments, deployment and integration of ML models with backend services, and enhancement of the overall Tide platform. As a Senior Data Engineer, you will play a crucial role in designing, developing, and optimizing our upcoming data pipelines and platforms. Your tasks will involve working with extensive datasets, addressing intricate data challenges, and contributing to the creation of robust, scalable, and efficient data solutions that drive business value. This position presents an exciting opportunity for individuals who are passionate about big data technologies, performance optimization, and constructing resilient data infrastructure.

As a Data Engineer, You Will:
- Focus on Performance Optimization: Identify and resolve complex performance bottlenecks in PySpark jobs and Spark clusters, utilizing the Spark UI, query plans, and advanced optimization techniques.
- Lead Design & Development: Spearhead the design and implementation of scalable, fault-tolerant ETL/ELT pipelines using PySpark for batch and real-time data processing.
- Collaborate on Data Modeling: Work alongside data scientists, analysts, and product teams to design efficient data models for analytical and operational use cases.
- Ensure Data Quality & Governance: Implement strong data quality checks, monitoring, and alerting mechanisms to maintain data accuracy, consistency, and reliability.
- Contribute to Architectural Decisions: Help shape the data architecture strategy, assess new technologies, and implement best practices to enhance the data platform's capabilities.
- Uphold Best Practices: Promote engineering best practices, participate in code reviews, and mentor junior data engineers.
- Foster Collaboration: Work closely with cross-functional teams to deliver impactful data solutions.

Qualifications:
- 8+ years of professional experience in data engineering, with a minimum of 4 years focusing on PySpark development in a production environment.
- Expert-level proficiency in PySpark, including Spark SQL, DataFrames, RDDs, and an understanding of Spark's architecture.
- Hands-on experience optimizing PySpark performance, debugging slow jobs, and handling common issues in large datasets.
- Strong programming skills in Python, proficiency in SQL, and familiarity with data warehousing concepts.
- Prior experience with distributed data storage solutions and version control systems.
- Strong problem-solving abilities, attention to detail, and excellent communication skills.
- A Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

What We Offer:
- Competitive salary
- Health and life insurance for self and family
- OPD benefits
- Mental well-being support
- Learning and development budget
- WFH setup allowance
- Generous leave policy
- Stock options

Tide Ways of Working:
At Tide, we embrace a flexible workplace model that accommodates both in-person and remote work to cater to the diverse needs of our teams. While we support remote work, we believe in the importance of face-to-face interactions to foster collaboration and team spirit, making our offices hubs for innovation and community building.

Tide is a Place for Everyone:
We promote a transparent and inclusive environment where every voice is valued and heard. Your personal data will be handled by Tide for recruitment purposes in accordance with our Recruitment Privacy Notice.
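To illustrate the performance-optimization focus of this role, a small hedged PySpark sketch: broadcasting a small dimension table to avoid shuffling a large fact table, then checking the physical plan. The table paths, the `merchant_id` key, and the partition count are assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("join-tuning").getOrCreate()

facts = spark.read.parquet("/lake/transactions/")  # large fact table (assumed)
dims = spark.read.parquet("/lake/merchants/")      # small dimension (assumed)

# Hint Spark to broadcast the small side, avoiding a shuffle of the fact table.
joined = facts.join(F.broadcast(dims), "merchant_id")

# Inspect the physical plan: look for BroadcastHashJoin rather than
# SortMergeJoin; the Spark UI's SQL tab shows the same plan with metrics.
joined.explain()

# Repartition before a wide write to control file counts and mitigate skew.
joined.repartition(200, "merchant_id").write.mode("overwrite").parquet(
    "/lake/enriched/")
```

Broadcast joins trade executor memory for shuffle elimination, so they fit only when the dimension side is genuinely small.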
Posted 3 weeks ago
6.0 - 11.0 years
5 - 15 Lacs
Chennai, Bengaluru, Mumbai (All Areas)
Hybrid
Mandatory Skill: Spark and Scala Data Engineering. Secondary Skill: Python

- 5+ years of in-depth, hands-on experience developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform
- In-depth knowledge of Spark Core, working with RDDs, and Spark SQL
- In-depth knowledge of Spark optimization techniques and best practices
- Good knowledge of Scala functional programming: Try, Option, Future, Collections
- Good knowledge of Scala OOP: Classes, Traits, and Objects (Singleton and Companion), Case Classes
- Good understanding of Scala language features: the Type System, Implicits/Givens
- Hands-on experience working in a Hadoop environment (HDFS/Hive), AWS S3, EMR
- Working experience with workflow orchestration tools like Airflow and Oozie
- Working with API calls in Scala
- Understanding of and exposure to file formats such as Apache Avro, Parquet, and JSON
- Good to have: knowledge of Protocol Buffers and geospatial data analytics
- Writing test cases using frameworks such as scalatest
- Good in-depth knowledge of build tools such as Gradle and SBT
- Experience using Git, resolving conflicts, and working with branches
- Good to have: Python programming skills
- Good to have: experience with workflow systems such as Airflow
- Strong programming skills using data structures and algorithms
- Excellent analytical skills
- Good communication skills
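The listing is Scala-centric, but purely for illustration, the Avro/Parquet/JSON file-format handling it mentions looks like this in PySpark; the paths are placeholders, and the spark-avro package coordinates are an assumption that must match your Spark and Scala versions:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("file-formats")
         # Avro support ships as a separate package; pin it to your Spark build.
         .config("spark.jars.packages",
                 "org.apache.spark:spark-avro_2.12:3.5.0")
         .getOrCreate())

# Read JSON events, persist as Avro, then convert to columnar Parquet.
events_json = spark.read.json("/in/events.json")           # assumed input
events_json.write.mode("overwrite").format("avro").save("/out/events_avro/")

avro_back = spark.read.format("avro").load("/out/events_avro/")
avro_back.write.mode("overwrite").parquet("/out/events_parquet/")
```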
Posted 1 month ago
6.0 - 11.0 years
5 - 15 Lacs
Hyderabad, Chennai, Bengaluru
Hybrid
Mandatory Skill: Spark and Scala Data Engineering. Secondary Skill: Python

- 5+ years of in-depth, hands-on experience developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform
- In-depth knowledge of Spark Core, working with RDDs, and Spark SQL
- In-depth knowledge of Spark optimization techniques and best practices
- Good knowledge of Scala functional programming: Try, Option, Future, Collections
- Good knowledge of Scala OOP: Classes, Traits, and Objects (Singleton and Companion), Case Classes
- Good understanding of Scala language features: the Type System, Implicits/Givens
- Hands-on experience working in a Hadoop environment (HDFS/Hive), AWS S3, EMR
- Working experience with workflow orchestration tools like Airflow and Oozie
- Working with API calls in Scala
- Understanding of and exposure to file formats such as Apache Avro, Parquet, and JSON
- Good to have: knowledge of Protocol Buffers and geospatial data analytics
- Writing test cases using frameworks such as scalatest
- Good in-depth knowledge of build tools such as Gradle and SBT
- Experience using Git, resolving conflicts, and working with branches
- Good to have: Python programming skills
- Good to have: experience with workflow systems such as Airflow
- Strong programming skills using data structures and algorithms
- Excellent analytical skills
- Good communication skills
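This listing also asks for workflow orchestration with Airflow; a minimal hedged DAG sketch follows. The DAG id, jar path, and Scala entry class are invented for illustration, and the `schedule` argument assumes Airflow 2.4+ (earlier versions use `schedule_interval`):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="spark_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # spark-submit of a Scala job like the one the posting describes;
    # the jar path and main class are hypothetical.
    run_spark_job = BashOperator(
        task_id="run_spark_job",
        bash_command=(
            "spark-submit --class com.example.DailyEtl "
            "--master yarn /opt/jobs/daily-etl.jar {{ ds }}"
        ),
    )
    # Sanity check: fail the run if the expected _SUCCESS marker is missing.
    validate_output = BashOperator(
        task_id="validate_output",
        bash_command="hdfs dfs -test -e /data/output/{{ ds }}/_SUCCESS",
    )
    run_spark_job >> validate_output
```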
Posted 1 month ago
3.0 - 6.0 years
5 - 8 Lacs
Hyderabad, Bengaluru, Delhi / NCR
Work from Office
As a Senior Azure Data Engineer, your responsibilities will include:
- Building scalable data pipelines using Databricks and PySpark
- Transforming raw data into usable business insights
- Integrating Azure services like Blob Storage, Data Lake, and Synapse Analytics
- Deploying and maintaining machine learning models using MLlib or TensorFlow
- Executing large-scale Spark jobs with performance tuning on Spark Pools
- Leveraging Databricks Notebooks and managing workflows with MLflow

Qualifications:
- Bachelors/Masters in Computer Science, Data Science, or equivalent
- 7+ years in Data Engineering, with 3+ years in Azure Databricks
- Strong hands-on in: PySpark, Spark SQL, RDDs, Pandas, NumPy, Delta Lake
- Azure ecosystem: Data Lake, Blob Storage, Synapse Analytics

Location: Remote - Bengaluru, Hyderabad, Delhi / NCR, Chennai, Pune, Kolkata, Ahmedabad, Mumbai
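Since this listing highlights MLlib models managed with MLflow, here is a hedged sketch of tracking a simple MLlib run; the feature-table paths and the presence of default `features`/`label` columns are assumptions, and `mlflow.spark.log_model` requires the mlflow Spark extras:

```python
import mlflow
import mlflow.spark
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mlflow-tracking").getOrCreate()

# Assumed feature tables with MLlib's default 'features' and 'label' columns.
train = spark.read.parquet("/lake/features/train/")
test = spark.read.parquet("/lake/features/test/")

with mlflow.start_run(run_name="lr-baseline"):
    # Train a baseline MLlib model and evaluate it on the held-out set.
    lr = LogisticRegression(maxIter=20, regParam=0.1)
    model = lr.fit(train)
    auc = BinaryClassificationEvaluator().evaluate(model.transform(test))

    # Record the run so parameters, metrics, and the model are reproducible.
    mlflow.log_param("regParam", 0.1)
    mlflow.log_metric("auc", auc)
    mlflow.spark.log_model(model, "model")
```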
Posted 2 months ago