
344 HDFS Jobs

Set up a Job Alert
JobPe aggregates job listings for easy access, but you apply directly on the original job portal.

5.0 - 10.0 years

9 - 19 Lacs

Chennai

Work from Office

Job Summary: We are seeking a Big Data Administrator with strong expertise in Linux systems, AWS infrastructure, and Big Data technologies. This role is ideal for someone experienced in managing large-scale Hadoop ecosystems in production, with a deep understanding of observability, performance tuning, and automation using tools like Terraform or Ansible.

Key Responsibilities:
- Manage and maintain large-scale Big Data clusters (Cloudera, Hortonworks, or AWS EMR)
- Develop and support infrastructure as code using Terraform or Ansible
- Administer Hadoop ecosystem components: HDFS, YARN, Hive (Tez, LLAP), Presto, Spark
- Implement and monitor observability tools such as Prometheus, InfluxDB, Dynatrace, Grafana, and Splunk
- Optimize SQL performance on Hive/Spark and understand query plans
- Automate cluster operations using Python (PySpark) or shell scripting
- Support data analysts and scientists with tools like JupyterHub, RStudio, H2O, and SAS
- Handle data in various formats: ORC, Parquet, Avro
- Integrate with and support Kubernetes-based environments (if applicable)
- Collaborate across teams for deployments, monitoring, and troubleshooting

Must-Have Skills:
- 5+ years in Linux system administration and AWS cloud infrastructure
- Experience with Cloudera, Hortonworks, or EMR in production
- Strong in Terraform/Ansible for automation
- Solid hands-on experience with HDFS, YARN, Hive, Spark, Presto
- Proficient in Python and shell scripting
- Familiar with observability tools: Grafana, Prometheus, InfluxDB, Splunk, Dynatrace
- Familiarity with Active Directory and Windows VDI platforms (Citrix, AWS Workspaces)

Nice-to-Have Skills:
- Experience with Airflow, Oozie
- Familiar with Pandas, NumPy, SciPy, PyTorch
- Prior use of Jenkins, Chef, Packer
- Comfortable reading code in Java, Scala, Python, R

Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
- Strong communication, collaboration, and troubleshooting skills
- Ability to thrive in remote or hybrid work environments
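For illustration, a minimal sketch of the kind of cluster-health automation this role describes: polling the NameNode with the standard `hdfs dfsadmin -report` command and alerting on the live-datanode count. The alert threshold and the exact report line format are assumptions; the report layout varies slightly across Hadoop versions.

```python
import shutil
import subprocess
import sys


def hdfs_report() -> str:
    """Run `hdfs dfsadmin -report` and return its stdout."""
    if shutil.which("hdfs") is None:
        sys.exit("hdfs CLI not found on PATH")
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def live_datanode_count(report: str) -> int:
    """Extract the live-datanode count from a report line like 'Live datanodes (42):'."""
    for line in report.splitlines():
        if line.strip().startswith("Live datanodes"):
            return int(line.split("(")[1].split(")")[0])
    raise ValueError("Live datanodes line not found in report")


if __name__ == "__main__":
    EXPECTED_MIN_NODES = 3  # hypothetical threshold; tune per cluster
    count = live_datanode_count(hdfs_report())
    status = "OK" if count >= EXPECTED_MIN_NODES else "ALERT"
    print(f"{status}: {count} live datanodes")
```

A script like this would typically be wired into the posting's observability stack (e.g., pushed to Prometheus or Grafana) rather than printed to stdout.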

Posted Just now

Apply

5.0 - 10.0 years

9 - 19 Lacs

Hyderabad

Work from Office

Job Summary: We are seeking a Big Data Administrator with strong expertise in Linux systems, AWS infrastructure, and Big Data technologies. This role is ideal for someone experienced in managing large-scale Hadoop ecosystems in production, with a deep understanding of observability, performance tuning, and automation using tools like Terraform or Ansible.

Key Responsibilities:
- Manage and maintain large-scale Big Data clusters (Cloudera, Hortonworks, or AWS EMR)
- Develop and support infrastructure as code using Terraform or Ansible
- Administer Hadoop ecosystem components: HDFS, YARN, Hive (Tez, LLAP), Presto, Spark
- Implement and monitor observability tools such as Prometheus, InfluxDB, Dynatrace, Grafana, and Splunk
- Optimize SQL performance on Hive/Spark and understand query plans
- Automate cluster operations using Python (PySpark) or shell scripting
- Support data analysts and scientists with tools like JupyterHub, RStudio, H2O, and SAS
- Handle data in various formats: ORC, Parquet, Avro
- Integrate with and support Kubernetes-based environments (if applicable)
- Collaborate across teams for deployments, monitoring, and troubleshooting

Must-Have Skills:
- 5+ years in Linux system administration and AWS cloud infrastructure
- Experience with Cloudera, Hortonworks, or EMR in production
- Strong in Terraform/Ansible for automation
- Solid hands-on experience with HDFS, YARN, Hive, Spark, Presto
- Proficient in Python and shell scripting
- Familiar with observability tools: Grafana, Prometheus, InfluxDB, Splunk, Dynatrace
- Familiarity with Active Directory and Windows VDI platforms (Citrix, AWS Workspaces)

Nice-to-Have Skills:
- Experience with Airflow, Oozie
- Familiar with Pandas, NumPy, SciPy, PyTorch
- Prior use of Jenkins, Chef, Packer
- Comfortable reading code in Java, Scala, Python, R

Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
- Strong communication, collaboration, and troubleshooting skills
- Ability to thrive in remote or hybrid work environments

Posted Just now

Apply

5.0 - 10.0 years

9 - 19 Lacs

Bengaluru

Work from Office

Job Summary: We are seeking a Big Data Administrator with strong expertise in Linux systems, AWS infrastructure, and Big Data technologies. This role is ideal for someone experienced in managing large-scale Hadoop ecosystems in production, with a deep understanding of observability, performance tuning, and automation using tools like Terraform or Ansible.

Key Responsibilities:
- Manage and maintain large-scale Big Data clusters (Cloudera, Hortonworks, or AWS EMR)
- Develop and support infrastructure as code using Terraform or Ansible
- Administer Hadoop ecosystem components: HDFS, YARN, Hive (Tez, LLAP), Presto, Spark
- Implement and monitor observability tools such as Prometheus, InfluxDB, Dynatrace, Grafana, and Splunk
- Optimize SQL performance on Hive/Spark and understand query plans
- Automate cluster operations using Python (PySpark) or shell scripting
- Support data analysts and scientists with tools like JupyterHub, RStudio, H2O, and SAS
- Handle data in various formats: ORC, Parquet, Avro
- Integrate with and support Kubernetes-based environments (if applicable)
- Collaborate across teams for deployments, monitoring, and troubleshooting

Must-Have Skills:
- 5+ years in Linux system administration and AWS cloud infrastructure
- Experience with Cloudera, Hortonworks, or EMR in production
- Strong in Terraform/Ansible for automation
- Solid hands-on experience with HDFS, YARN, Hive, Spark, Presto
- Proficient in Python and shell scripting
- Familiar with observability tools: Grafana, Prometheus, InfluxDB, Splunk, Dynatrace
- Familiarity with Active Directory and Windows VDI platforms (Citrix, AWS Workspaces)

Nice-to-Have Skills:
- Experience with Airflow, Oozie
- Familiar with Pandas, NumPy, SciPy, PyTorch
- Prior use of Jenkins, Chef, Packer
- Comfortable reading code in Java, Scala, Python, R

Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
- Strong communication, collaboration, and troubleshooting skills
- Ability to thrive in remote or hybrid work environments

Posted Just now

Apply

3.0 - 10.0 years

0 Lacs

Pune, Maharashtra

On-site

Role Overview: You are a seasoned IT Quality Sr Analyst who will apply your in-depth disciplinary knowledge to contribute to the development of new techniques and the improvement of processes and workflows within the Treasury and FP&A Technology team. You will integrate subject matter and industry expertise to evaluate complex issues and provide leadership within teams. Your role will have a significant impact on project size and geography by influencing decisions and ensuring the performance of all teams in the area.

Key Responsibilities:
- Plan, lead, and execute the testing automation strategy for CitiFTP, continuously monitoring automation coverage and enhancing the existing framework.
- Design, develop, and implement scalable automation frameworks for UI, API, and data validation testing on the Big Data/Hadoop platform.
- Collaborate with other testing areas, development teams, and business partners to integrate automation into the agile SDLC.
- Enhance regression and end-to-end testing efficiency using automation and develop robust test scripts to support rapid software releases.
- Improve test coverage, defect detection, and release quality through automation and establish key QA metrics.
- Advocate for best practices in test automation, drive the adoption of AI/ML-based testing tools, and mentor a team of test engineers.
- Foster a culture of continuous learning and innovation within the testing community and analyze industry trends to improve processes.
- Assess risk in business decisions, drive compliance with applicable laws, and escalate control issues with transparency.

Qualifications:
- 10+ years of experience in functional and non-functional software testing.
- 3+ years of experience as a Test Automation Lead and expertise in automation frameworks/tools such as Jenkins, Selenium, Cucumber, TestNG, JUnit, and Cypress.
- Strong programming skills in Java, Python, or any scripting language; expertise in SQL, API testing tools, and performance testing tools.
- Knowledge of Agile, Scrum, and DevOps practices, functional test tools, build tools, continuous integration tools, source management tools, and cloud-based test execution.
- Familiarity with big data testing, database testing automation, and certifications such as ISTQB Advanced, Certified Agile Tester, or Selenium WebDriver certification.
- Exposure to banking/financial domains, strong communication skills, and a passion for automation in quality engineering.

Note: This job description does not include any additional details about the company.
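To give a flavor of the data-validation automation this role calls for, here is a hedged pytest/PySpark sketch. The checks and column names are hypothetical placeholders for illustration, not the actual CitiFTP test suite.

```python
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    # Local Spark session so the checks run anywhere; a real suite
    # would point at the cluster instead.
    return (SparkSession.builder.master("local[2]")
            .appName("regression-data-checks").getOrCreate())


def test_no_null_business_keys(spark):
    # "txn_id" is a hypothetical business key used for illustration.
    df = spark.createDataFrame([(1, 100.0), (2, 250.5)], ["txn_id", "amount"])
    assert df.filter(df.txn_id.isNull()).count() == 0


def test_source_target_row_counts_match(spark):
    # Reconciliation check: target must not drop or duplicate rows.
    source = spark.createDataFrame([(1,), (2,), (3,)], ["id"])
    target = spark.createDataFrame([(1,), (2,), (3,)], ["id"])
    assert source.count() == target.count()
```

Checks like these would run in CI (e.g., via Jenkins, which the posting lists) against staging copies of the pipeline's inputs and outputs.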

Posted 1 day ago

Apply

2.0 - 6.0 years

0 Lacs

Karnataka

On-site

Role Overview: As a Data Scientist at mPokket, you will be responsible for collaborating with the data science team to plan projects and build analytics models. Your strong problem-solving skills and proficiency in statistical analysis will be key in aligning our data products with our business goals. Your primary objective will be to enhance our products and business decisions through the effective use of data.

Key Responsibilities:
- Oversee the team of data scientists and data specialists, providing guidance and support.
- Educate, lead, and advise colleagues on innovative techniques and solutions.
- Work closely with data and software engineers to implement scalable data science methods and technologies company-wide.
- Conceptualize, plan, and prioritize data projects in alignment with organizational objectives.
- Develop and deploy analytic systems and predictive models, and explore new techniques.
- Ensure that all data projects are in sync with the company's goals.

Qualifications Required:
- Master's degree in Computer Science, Operations Research, Econometrics, Statistics, or a related technical field.
- Minimum of 2 years of experience in solving analytical problems using quantitative approaches.
- Proficiency in communicating quantitative analysis results effectively.
- Knowledge of relational databases and SQL, and experience in at least one scripting language (PHP, Python, Perl, etc.).
- Familiarity with statistical concepts such as hypothesis testing and regression, and experience manipulating data sets using statistical software (e.g., R, SAS) or other methods.

Additional Details: mPokket is a company that values innovation and collaboration. The team culture encourages learning and growth, providing opportunities to work on cutting-edge technologies and projects that have a real impact on the business. Thank you for considering a career at mPokket.
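To make the hypothesis-testing requirement concrete, a small Python example using SciPy's Welch's t-test on synthetic samples. The scenario and all numbers are invented for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical A/B samples, e.g. a metric under two product variants.
control = rng.normal(loc=0.52, scale=0.10, size=500)
variant = rng.normal(loc=0.55, scale=0.10, size=500)

# Welch's t-test does not assume equal variances between groups.
t_stat, p_value = stats.ttest_ind(control, variant, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the variant's mean differs from control.")
else:
    print("Fail to reject H0 at the 5% level.")
```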

Posted 2 days ago

Apply

0.0 - 3.0 years

4 - 8 Lacs

Bengaluru

Work from Office

The Data Engineer 1 works on different data engineering projects to support use cases and data ingestion pipelines and to identify potential process or data quality issues. The team also supports marketing analytics teams with analytical tools that enable our analytics and business communities to do their jobs easier, faster, and smarter. The team brings together data from different internal and external partners and builds a curated, marketing-analytics-focused data and tools ecosystem. The Data Engineer plays a crucial role in building this ecosystem based on the Marketing analytics community's needs.

Essential Job Functions:
- Collaboration: Collaborates with internal/external stakeholders to manage data logistics, including data specifications, transfers, structures, and rules. Collaborates with business users, business analysts, and technical architects in transforming business requirements into analytical workbenches, tools, and dashboards reflecting usability best practices and current design trends. Demonstrates analytical, interpersonal, and professional communication skills. Learns quickly and works effectively individually and as part of a team.
- Process Improvement: Accesses, extracts, and transforms Credit and Retail data from a variety of sources of all sizes (including client marketing databases and 2nd- and 3rd-party data) using Hadoop, Spark, SQL, and other big data technologies. Provides automation help to analytical teams around data-centric needs using orchestration tools, SQL, and possibly other big data/cloud solutions for efficiency improvement.
- Project Support: Supports the Sr. Specialist and Specialist in new analytical proof-of-concept and tool exploration projects. Effectively manages time and resources in order to deliver correctly and on time on concurrent projects. Creates POCs to ingest and process streaming data using Spark and HDFS.
- Data and Analytics: Answers and troubleshoots questions about data sets and analytical tools; develops, maintains, and enhances new and existing analytics tools to support internal customers. Ingests data from files, streams, and databases, then processes the data with Python and PySpark in order to store it in Hive or a NoSQL database. Manages data coming from different sources and is involved in HDFS maintenance and the loading of structured and unstructured data. Applies knowledge of Agile Scrum methodology, leverages the Client Big Data platform, and uses the version control tool Git. Imports and exports data between HDFS and RDBMS using Sqoop. Demonstrates an understanding of Hadoop architecture and the underlying Hadoop framework, including storage management. Works on the back end using Scala, Python, and Spark to perform several aggregation logics.
- Technical Skills: Expert in writing complicated SQL queries and database analysis for good performance. Experience working with Microsoft Azure services like ADLS/Blob Storage solutions, Azure Data Factory, Azure Functions, and Databricks. Utilizes basic knowledge of REST APIs for designing networked applications.

Reports to: Data Engineer, Lead or higher

Working Conditions/Physical Requirements: Normal office environment

Minimum Qualifications:
- Bachelor's Degree in Computer Science or Engineering
- 0 to 3 years in Data & Analytics
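A minimal sketch of the ingestion pattern this posting describes — reading files, processing them with PySpark, and storing results in Hive. The paths and table names are hypothetical placeholders, and the sketch assumes a Spark installation with Hive metastore support configured.

```python
from pyspark.sql import SparkSession, functions as F

# enableHiveSupport() requires a configured Hive metastore.
spark = (SparkSession.builder.appName("marketing-ingest")
         .enableHiveSupport().getOrCreate())

# Placeholder landing-zone path; in practice this could be CSV drops
# from internal or external marketing partners.
raw = (spark.read.option("header", True)
       .csv("hdfs:///landing/marketing/daily/"))

# Basic cleansing: deduplicate on a business key and stamp the load date.
cleaned = (raw.dropDuplicates(["record_id"])
           .withColumn("load_date", F.current_date()))

# Append into a date-partitioned Hive table (placeholder name).
(cleaned.write.mode("append")
 .partitionBy("load_date")
 .saveAsTable("analytics.marketing_events"))
```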

Posted 3 days ago

Apply

8.0 - 13.0 years

2 - 2 Lacs

Hyderabad

Work from Office

SUMMARY

Key Responsibilities:
- Work closely with clients to understand their business requirements and design data solutions that meet their needs.
- Develop and implement end-to-end data solutions that include data ingestion, data storage, data processing, and data visualization components.
- Design and implement data architectures that are scalable, secure, and compliant with industry standards.
- Work with data engineers, data analysts, and other stakeholders to ensure the successful delivery of data solutions.
- Participate in presales activities, including solution design, proposal creation, and client presentations.
- Act as a technical liaison between the client and our internal teams, providing technical guidance and expertise throughout the project lifecycle.
- Stay up to date with industry trends and emerging technologies related to data architecture and engineering.
- Develop and maintain relationships with clients to ensure their ongoing satisfaction and identify opportunities for additional business.
- Understand the entire end-to-end AI lifecycle, from ingestion to inferencing, along with operations.
- Exposure to Gen AI and emerging technologies.
- Exposure to the Kubernetes platform, with hands-on experience deploying and containerizing applications.
- Good knowledge of data governance, data warehousing, and data modelling.

Requirements:
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- 10+ years of experience as a Data Solution Architect, with a proven track record of designing and implementing end-to-end data solutions.
- Strong technical background in data architecture, data engineering, and data management.
- Extensive experience working with any of the Hadoop flavours, preferably Data Fabric.
- Experience with presales activities such as solution design, proposal creation, and client presentations.
- Familiarity with cloud-based data platforms (e.g., AWS, Azure, Google Cloud) and related technologies such as data warehousing, data lakes, and data streaming.
- Experience with Kubernetes and the Gen AI tools and tech stack.
- Excellent communication and interpersonal skills, with the ability to effectively communicate technical concepts to both technical and non-technical audiences.
- Strong problem-solving skills, with the ability to analyze complex data systems and identify areas for improvement.
- Strong project management skills, with the ability to manage multiple projects simultaneously and prioritize tasks effectively.

Tools and Tech Stack:
- Hadoop Ecosystem — Preferred: Cloudera Data Platform (CDP) or Data Fabric. Tools: HDFS, Hive, Spark, HBase, Oozie.
- Data Warehousing — Cloud-based: Azure Synapse, Amazon Redshift, Google BigQuery, Snowflake, Azure Databricks. On-premises: Teradata, Vertica.
- Data Integration and ETL Tools — Apache NiFi, Talend, Informatica, Azure Data Factory, AWS Glue.
- Cloud Platforms — Azure (preferred for its Data Services and Synapse integration), AWS, or GCP.
- Cloud-native Components — Data lakes: Azure Data Lake Storage, AWS S3, or Google Cloud Storage. Data streaming: Apache Kafka, Azure Event Hubs, AWS Kinesis.
- HPE Platforms — Data Fabric, AI Essentials or Unified Analytics, HPE MLDM and HPE MLDE.
- AI and Gen AI Technologies — MLOps: MLflow, Kubeflow, Azure ML, SageMaker, Ray. Inference tools: TensorFlow Serving, KServe, Seldon. Generative AI frameworks: Hugging Face Transformers, LangChain. Tools: OpenAI API (e.g., GPT-4).
- Orchestration and Deployment — Kubernetes platforms: Azure Kubernetes Service (AKS), Amazon EKS, Google Kubernetes Engine (GKE), or open-source Kubernetes. Tools: Helm. CI/CD for data pipelines and applications: Jenkins, GitHub Actions, GitLab CI, or Azure DevOps.

Posted 3 days ago

Apply

8.0 - 12.0 years

0 Lacs

Pune, Maharashtra

On-site

As an Applications Development Senior Programmer Analyst at our company, your role involves participating in the establishment and implementation of new or revised application systems and programs in coordination with the Technology team. Your primary objective is to contribute to applications systems analysis and programming activities.

**Responsibilities:**
- Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, and model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
- Monitor and control all phases of the development process, including analysis, design, construction, testing, and implementation, and provide user and operational support on applications to business users
- Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business processes, system processes, and industry standards, and make evaluative judgements
- Recommend and develop security measures in post-implementation analysis of business usage to ensure successful system design and functionality
- Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
- Ensure essential procedures are followed and help define operating standards and processes
- Serve as advisor or coach to new or lower-level analysts
- Operate with a limited level of direct supervision, exercising independence of judgement and autonomy
- Act as SME to senior stakeholders and/or other team members
- Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency

**Qualifications:**
- 8+ years of relevant experience, including at least 4+ years in Python, SQL, and PySpark, with 8+ years overall as a Data Engineer
- Proficiency in distributed data processing and big data tools and technologies: Hadoop, HDFS, YARN, Spark, Hive
- Design, develop, and maintain big data pipelines using PySpark
- Experience with integration of data from multiple data sources
- Experience with DevTools such as OpenShift, TeamCity, uDeploy, Bitbucket, and GitHub
- Proactively contribute to the stability of overall production systems and troubleshoot key components and processes
- Keep track of the latest technology trends and proactively learn new technologies, driving practical and reference implementations

**Nice to have:**
- Experience with NoSQL databases, such as Elasticsearch
- Experience in systems analysis and programming of software applications
- Experience in managing and implementing successful projects
- Working knowledge of consulting/project management techniques/methods
- Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements

**Education:**
- Bachelor's degree/University degree or equivalent experience

This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
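As an illustration of the multi-source integration this posting mentions, a hedged PySpark sketch joining a Hive table with a JDBC source. The connection details, credentials, and table names are placeholders, not the actual systems.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder.appName("multi-source-join")
         .enableHiveSupport().getOrCreate())

# Hypothetical Hive fact table.
trades = spark.table("risk.trades")

# Hypothetical JDBC reference source; credentials would come from a
# secrets manager in practice, never hardcoded.
accounts = (spark.read.format("jdbc")
            .option("url", "jdbc:oracle:thin:@//dbhost:1521/REF")
            .option("dbtable", "REF.ACCOUNTS")
            .option("user", "svc_reader")
            .option("password", "***")
            .load())

# Enrich trades with account reference data and persist to the lake.
enriched = trades.join(
    accounts, trades.account_id == accounts.ACCOUNT_ID, "left")
enriched.write.mode("overwrite").parquet("hdfs:///curated/trades_enriched/")
```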

Posted 3 days ago

Apply

1.0 - 6.0 years

1 - 2 Lacs

Hyderabad

Work from Office

SUMMARY

Job Summary: We are looking for a Senior PySpark Developer with 3 to 6 years of experience in building and optimizing data pipelines using PySpark on Databricks within AWS cloud environments. This role focuses on the modernization of legacy domains, involving integration with systems like Kafka and collaboration across cross-functional teams.

Key Responsibilities:
- Develop and optimize scalable PySpark applications on Databricks.
- Work with AWS services (S3, EMR, Lambda, Glue) for cloud-native data processing.
- Integrate streaming and batch data sources, especially using Kafka.
- Tune Spark jobs for performance, memory, and compute efficiency.
- Collaborate with DevOps, product, and analytics teams on delivery and deployment.
- Ensure data governance, lineage, and quality compliance across all pipelines.

Required Skills:
- 3-6 years of hands-on development in PySpark.
- Experience with Databricks and performance tuning using the Spark UI.
- Strong understanding of AWS services, Kafka, and distributed data processing.
- Proficient in partitioning, caching, join optimization, and resource configuration.
- Familiarity with data formats like Parquet, Avro, and ORC.
- Exposure to orchestration tools (Airflow, Databricks Workflows).
- Scala experience is a strong plus.
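A hedged sketch of the Kafka-to-Spark integration this role centers on, using Spark Structured Streaming. The broker address, topic, schema, and S3 paths are assumed placeholders.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

# Hypothetical event schema for the JSON messages on the topic.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read from Kafka (placeholder broker/topic) and parse the JSON payload.
events = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "payments")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Land micro-batches as Parquet; the checkpoint enables exactly-once recovery.
query = (events.writeStream.format("parquet")
         .option("path", "s3://bucket/payments/")
         .option("checkpointLocation", "s3://bucket/checkpoints/payments/")
         .trigger(processingTime="1 minute")
         .start())
query.awaitTermination()
```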

Posted 4 days ago

Apply

8.0 - 13.0 years

2 - 2 Lacs

Hyderabad

Work from Office

SUMMARY

About the Role: We are seeking a highly skilled and experienced hands-on Palantir Architect to join our dynamic team. In this pivotal role, you will be responsible for the end-to-end design, development, and implementation of complex data solutions on the Palantir platform. The ideal candidate is not only a seasoned architect but also a proficient developer who can dive deep into code, build robust data pipelines, and create powerful applications that deliver critical insights.

Key Responsibilities:
- Architect and Design: Lead the technical design and architecture of solutions on the Palantir platform, ensuring scalability, performance, and security.
- Hands-on Development: Actively participate in the development of data pipelines using PySpark, Spark SQL, TypeScript, and Python to transform and integrate data from various sources.
- Application Building: Construct and configure interactive applications and dashboards using Palantir's tools, such as Workshop, Quiver, and Slate.
- Technical Leadership: Serve as a technical expert and mentor for junior team members, fostering best practices and knowledge sharing within the team.
- Stakeholder Collaboration: Work closely with business stakeholders, data scientists, and engineers to translate business requirements into technical solutions.
- Solution Optimization: Identify and address performance bottlenecks, ensuring the efficiency and reliability of data pipelines and applications.

Required Qualifications:
- Experience: 4+ years of experience as a Palantir Architect or Palantir Solution Developer.
- Platform Expertise: Deep, hands-on experience with Palantir Foundry or Palantir Gotham, including data integration, ontology modeling, and application development.
- Programming Skills: Strong proficiency in TypeScript, Python, and PySpark is essential. Experience with Scala and SQL is a plus.
- Data Fundamentals: Solid understanding of data warehousing, ETL/ELT processes, and data modeling concepts.
- Communication: Excellent verbal and written communication skills with the ability to articulate complex technical concepts to both technical and non-technical audiences.

Preferred Qualifications:
- Certifications: Palantir Certified Architect or other relevant certifications.
- Problem-Solving: Proven track record of solving complex technical challenges in a fast-paced environment.
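For context, a minimal sketch of a Foundry-style pipeline step using Palantir's Python transforms API (`transforms.api`). The dataset paths and column names are hypothetical, and exact decorator usage should be checked against Foundry's documentation for your environment.

```python
from transforms.api import transform_df, Input, Output
from pyspark.sql import functions as F


# Dataset paths below are hypothetical placeholders.
@transform_df(
    Output("/Company/curated/orders_enriched"),
    orders=Input("/Company/raw/orders"),
    customers=Input("/Company/raw/customers"),
)
def compute(orders, customers):
    # Join raw orders to customer reference data and stamp processing time;
    # Foundry materializes the returned DataFrame as the output dataset.
    return (orders.join(customers, "customer_id", "left")
                  .withColumn("processed_at", F.current_timestamp()))
```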

Posted 4 days ago

Apply

2.0 - 5.0 years

8 - 9 Lacs

Chennai

Work from Office

Job Title: Hadoop Administrator
Location: Chennai, India
Experience: 5 years of experience in IT, with at least 2+ years in cloud and system administration and at least 3 years of experience with, and a strong understanding of, big data technologies in the Hadoop ecosystem: Hive, HDFS, MapReduce, Flume, Pig, Cloudera, HBase, Sqoop, Spark, etc.

Company: Smartavya Analytica Private Limited is a niche Data and AI company. Based in Pune, we are pioneers in data-driven innovation, transforming enterprise data into strategic insights. Established in 2017, our team has experience in handling large datasets up to 20 PB in a single implementation, delivering many successful data and AI projects across major industries, including retail, finance, telecom, manufacturing, insurance, and capital markets. We are leaders in Big Data, Cloud and Analytics projects with super-specialization in very large data platforms: Big Data, Data Warehouse and Data Lake solutions, data migration services, and Machine Learning/Data Science projects on all possible flavours, namely on-prem, cloud, and migration both ways across platforms such as traditional DWH/DL platforms, Big Data solutions on Hadoop, Public Cloud, and Private Cloud. https://smart-analytica.com

Job Overview: Smartavya Analytica Private Limited is seeking an experienced Hadoop Administrator to manage and support our Hadoop ecosystem. The ideal candidate will have strong expertise in Hadoop cluster administration, excellent troubleshooting skills, and a proven track record of maintaining and optimizing Hadoop environments.

Key Responsibilities:
- Install, configure, and manage Hadoop clusters, including HDFS, YARN, Hive, HBase, and other ecosystem components.
- Monitor and manage Hadoop cluster performance, capacity, and security.
- Perform routine maintenance tasks such as upgrades, patching, and backups.
- Implement and maintain data ingestion processes using tools like Sqoop, Flume, and Kafka.
- Ensure high availability and disaster recovery of Hadoop clusters.
- Collaborate with development teams to understand requirements and provide appropriate Hadoop solutions.
- Troubleshoot and resolve issues related to the Hadoop ecosystem.
- Maintain documentation of Hadoop environment configurations, processes, and procedures.

Requirements:
- Experience in installing, configuring, and tuning Hadoop distributions; hands-on experience with Cloudera.
- Understanding of Hadoop design principles and the factors that affect distributed system performance, including hardware and network considerations.
- Provide infrastructure recommendations, capacity planning, and workload management.
- Develop utilities to better monitor the cluster (Ganglia, Nagios, etc.).
- Manage large clusters with huge volumes of data.
- Perform cluster maintenance tasks, including creation and removal of nodes, cluster monitoring, and troubleshooting (see the sketch after this listing).
- Manage and review Hadoop log files.
- Install and implement security for Hadoop clusters.
- Install Hadoop updates, patches, and version upgrades, and automate these through scripts.
- Act as point of contact for vendor escalation; work with Hortonworks in resolving issues.
- Conceptual/working knowledge of basic data management concepts such as ETL, reference/master data, data quality, and RDBMS.
- Working knowledge of a scripting language such as Shell, Python, or Perl.
- Experience with orchestration and deployment tools.

Academic Qualification: BE/B.Tech in Computer Science or equivalent, along with hands-on experience in dealing with large data sets and distributed computing in data warehousing and business intelligence systems using Hadoop.
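As one example of the scripted cluster-maintenance work described above (node removal), a hedged Python sketch of HDFS/YARN node decommissioning via the standard `-refreshNodes` admin commands. The excludes-file path and hostname are assumptions that vary by distribution.

```python
import subprocess

# Path is distribution-specific (Cloudera/Hortonworks layouts differ).
EXCLUDES_FILE = "/etc/hadoop/conf/dfs.exclude"


def decommission(hostnames: list[str]) -> None:
    """Append hosts to the HDFS excludes file and refresh the daemons."""
    with open(EXCLUDES_FILE, "a") as fh:
        fh.write("\n".join(hostnames) + "\n")
    # Tell the NameNode and ResourceManager to re-read include/exclude lists;
    # the DataNodes then drain their blocks before going offline.
    subprocess.run(["hdfs", "dfsadmin", "-refreshNodes"], check=True)
    subprocess.run(["yarn", "rmadmin", "-refreshNodes"], check=True)


if __name__ == "__main__":
    decommission(["datanode07.example.internal"])  # hypothetical host
```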

Posted 4 days ago

Apply

10.0 - 14.0 years

0 Lacs

Maharashtra

On-site

As a Data Ops Capability Deployment Analyst at Citi, you will be a seasoned professional contributing to the development of new solutions and techniques for the Enterprise Data function. Your role involves performing data analytics and analysis across various asset classes, as well as building data science capabilities within the team. You will collaborate closely with the wider Enterprise Data team to deliver on business priorities. Working within the B & I Data Capabilities team, you will be responsible for managing the Data Quality/Metrics/Controls program and implementing improved data governance and management practices. This program focuses on enhancing Citi's approach to data risk and meeting regulatory commitments in this area.

Key Responsibilities:
- Hands-on experience with data engineering and a strong understanding of distributed data platforms and cloud services.
- Knowledge of data architecture and integration with enterprise applications.
- Research and assess new data technologies and self-service data platforms.
- Collaborate with the Enterprise Architecture Team on refining the overall data strategy.
- Address performance bottlenecks, design batch orchestrations, and deliver reporting capabilities.
- Perform complex data analytics on large datasets, including data cleansing, transformation, joins, and aggregation.
- Build analytics dashboards and data science capabilities for Enterprise Data platforms.
- Communicate findings and propose solutions to stakeholders.
- Translate business requirements into technical design documents.
- Collaborate with cross-functional teams for testing and implementation.
- Understanding of banking industry requirements.
- Other duties and functions as assigned.

Skills & Qualifications:
- 10+ years of development experience in Financial Services or Finance IT.
- Experience with Data Quality/Data Tracing/Data Lineage/Metadata Management tools.
- Hands-on experience with ETL using PySpark, data ingestion, Spark optimization, and batch orchestration.
- Proficiency in Hive, HDFS, Airflow, and job scheduling.
- Strong programming skills in Python with data manipulation and analysis libraries.
- Proficient in writing complex SQL/stored procedures.
- Experience with DevOps tools: Jenkins/Lightspeed, Git, CoPilot.
- Knowledge of BI visualization tools such as Tableau and Power BI.
- Implementation experience with Data Lake/Data Warehouse solutions for enterprise use cases.
- Exposure to analytical tools and AI/ML is desired.

Education:
- Bachelor's/University degree or master's degree in Information Systems, Business Analysis, or Computer Science.

In this role, you will be part of the Data Governance job family, focusing on Data Governance Foundation. This is a full-time position at Citi, where you will utilize skills like Data Management, Internal Controls, and Risk Management to drive compliance and achieve business objectives. If you require a reasonable accommodation due to a disability to utilize search tools or apply for a career opportunity at Citi, please review the Accessibility at Citi guidelines. Additionally, you can refer to Citi's EEO Policy Statement and the Know Your Rights poster for more information.
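To make the data-cleansing and aggregation responsibilities concrete, a small hedged PySpark sketch computing per-column null counts as simple data-quality metrics in one pass. The table name is a placeholder.

```python
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder.appName("dq-metrics")
         .enableHiveSupport().getOrCreate())

# Hypothetical table under measurement.
df = spark.table("finance.positions")

# One aggregation pass: total rows plus a null count for every column.
metrics = df.agg(
    F.count(F.lit(1)).alias("row_count"),
    *[F.sum(F.col(c).isNull().cast("int")).alias(f"{c}_nulls")
      for c in df.columns],
)
metrics.show(truncate=False)
```

Metrics like these typically feed the kind of Data Quality/Metrics/Controls dashboards the posting describes.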

Posted 5 days ago

Apply

10.0 - 14.0 years

0 Lacs

Pune, Maharashtra

On-site

As a Data Ops Capability Deployment Analyst at Citigroup, you will be a seasoned professional contributing to the development of new solutions, frameworks, and techniques for the Enterprise Data function. Your role will involve performing data analytics and analysis across different asset classes, as well as building data science and tooling capabilities within the team. You will work closely with the Enterprise Data team to deliver business priorities. The B & I Data Capabilities team manages the Data Quality/Metrics/Controls program and implements improved data governance and data management practices. The Data Quality program focuses on enhancing Citigroup's approach to data risk and meeting regulatory commitments.

Key Responsibilities:
- Hands-on experience with data engineering and distributed data platforms.
- Understanding of data architecture and integration with enterprise applications.
- Research and evaluate new data technologies and self-service data platforms.
- Collaborate with the Enterprise Architecture Team on defining data strategy.
- Perform complex data analytics on large datasets.
- Build analytics dashboards and data science capabilities.
- Communicate findings and propose solutions to stakeholders.
- Convert business requirements into technical design documents.
- Work with cross-functional teams for implementation and support.
- Demonstrate a good understanding of the banking industry.
- Perform other assigned duties.

Skills & Qualifications:
- 10+ years of development experience in Financial Services or Finance IT.
- Experience with Data Quality/Data Tracing/Metadata Management tools.
- ETL experience using PySpark on distributed platforms.
- Proficiency in Python, SQL, and BI visualization tools.
- Strong knowledge of Hive, HDFS, Airflow, and job scheduling.
- Experience in Data Lake/Data Warehouse implementation.
- Exposure to analytical tools and AI/ML is desired.

Education:
- Bachelor's/University degree or master's degree in Information Systems, Business Analysis, or Computer Science.

If you are a person with a disability and require accommodation to use search tools or apply for a career opportunity, review Accessibility at Citi.

Posted 5 days ago

Apply

4.0 - 8.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

You have a great opportunity to join us as a Software Engineer / Senior Software Engineer / System Analyst in Chennai with 4-7 years of experience. We are looking for a candidate with expertise in database testing, ETL testing, and Agile methodology. As part of our team, your responsibilities will include test planning and execution and working with Agile methodology, backed by a minimum of 4-6 years of testing experience. Experience in the auditing domain would be a plus. You should have strong application analysis, troubleshooting, and behavioral skills, along with extensive experience in manual testing. Experience in automation scripting would be an added advantage. You will also be responsible for leading discussions with business, development, and vendor teams on testing activities, such as acting as Defect Coordinator and running test scenario reviews. Strong communication skills, both verbal and written, are essential for this role. You should be able to work effectively both independently and with onshore and offshore teams.

In addition, we are seeking an experienced ETL developer with expertise in Big Data technologies like Hadoop. The required skills include Hadoop (Hortonworks), HDFS, Hive, Pig, Knox, Ambari, Ranger, Oozie, Talend, SSIS, MySQL, MS SQL Server, Oracle, Windows, and Linux. This role may require you to work in 2nd shifts (1pm - 10pm), and excellent English communication skills are a must. If you are interested, please share your profile on mytestingcareer.com, mentioning your current CTC, expected CTC, notice period, current location, and contact number in your response. Don't miss this opportunity to be a part of our dynamic team!

Posted 5 days ago

Apply

5.0 - 10.0 years

18 - 33 Lacs

Japan, Chennai

Work from Office

C1X AdTech Pvt Ltd is a fast-growing product and engineering-driven AdTech company building next-generation advertising and marketing technology platforms. Our mission is to empower enterprise clients with the smartest marketing solutions, enabling seamless integration with personalization engines and delivering cross-channel marketing capabilities. We are dedicated to enhancing customer engagement and experiences while focusing on increasing Lifetime Value (LTV) through consistent messaging across all channels. Our engineering team spans front end (UI), back end (Java/Node.js APIs), Big Data, and DevOps, working together to deliver scalable, high-performance products for the digital advertising ecosystem.

Role Overview: As a Data Engineer, you will be a key member of our data engineering team, responsible for building and maintaining large-scale data products and infrastructure. You'll shape the next generation of our data analytics tech stack by leveraging modern big data technologies. This role involves working closely with business stakeholders, product managers, and engineering teams to meet diverse data requirements that drive business insights and product innovation.

Objectives:
- Design, build, and maintain scalable data infrastructure for collection, storage, and processing.
- Enable easy access to reliable data for data scientists, analysts, and business users.
- Support data-driven decision-making and improve organizational efficiency through high-quality data products.

Responsibilities:
- Build large-scale batch and real-time data pipelines using frameworks like Apache Spark on AWS or GCP.
- Design, manage, and automate data flows between multiple data sources.
- Implement best practices for continuous integration, testing, and data quality assurance.
- Maintain data documentation, definitions, and governance practices.
- Optimize performance, scalability, and cost-effectiveness of data systems.
- Collaborate with stakeholders to translate business needs into data-driven solutions.

Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field (exceptional coding performance on platforms like LeetCode/HackerRank may substitute).
- 2+ years of experience working on full-lifecycle Big Data projects.
- Strong foundation in data structures, algorithms, and software design principles.
- Proficiency in at least two programming languages; Python or Scala preferred.
- Experience with AWS services such as EMR, Lambda, S3, and DynamoDB (GCP equivalents also relevant).
- Hands-on experience with Databricks Notebooks and the Jobs API.
- Strong expertise in big data frameworks: Spark, MapReduce, Hadoop, Sqoop, Hive, HDFS, Airflow, ZooKeeper.
- Familiarity with containerization (Docker) and workflow management tools (Apache Airflow).
- Intermediate to advanced knowledge of SQL (relational and NoSQL databases such as Postgres, MySQL, Redshift, Redis).
- Experience with SQL tuning, schema design, and analytical programming.
- Proficient in Git (version control) and collaborative workflows.
- Comfortable working across diverse technologies in a fast-paced, results-oriented environment.
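A minimal sketch of the batch-pipeline orchestration this posting implies, written as an Airflow DAG chaining two spark-submit steps. The job paths are placeholders, and the `schedule` argument assumes Airflow 2.4+ (earlier versions use `schedule_interval`).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A daily batch skeleton: ingest raw events, then aggregate them.
with DAG(
    dag_id="daily_events_batch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="spark_ingest",
        bash_command="spark-submit /opt/jobs/ingest_events.py",  # placeholder
    )
    aggregate = BashOperator(
        task_id="spark_aggregate",
        bash_command="spark-submit /opt/jobs/aggregate_events.py",  # placeholder
    )
    # Aggregation only runs after ingestion succeeds.
    ingest >> aggregate
```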

Posted 5 days ago

Apply

2.0 - 4.0 years

0 Lacs

Mumbai, Maharashtra, India

On-site

Data Engineer II

Location: Bangalore/Mumbai

About Media.net: Media.net is a leading global ad tech company that focuses on creating the most transparent and efficient path for advertiser budgets to become publisher revenue. Our proprietary contextual technology is at the forefront of enhancing programmatic buying, the latest industry standard in ad buying for digital platforms. The Media.net platform powers major global publishers and ad-tech businesses at scale across ad formats like display, video, mobile, native, and search. Media.net's U.S. HQ is based in New York, and the Global HQ is in Dubai. With office locations and consultant partners across the world, Media.net takes pride in the value-add it offers to its 50+ demand and 21K+ publisher partners, in terms of both products and services.

What does the team do: Every single web page view hits one or more services, and the team builds and manages the high-scale services that handle this large volume of requests across 5 million unique topics. To achieve this we use cutting-edge Machine Learning and AI technologies on a large Hadoop cluster.

Tech Stack: Java, Elasticsearch/Solr, Kafka, Spark, Machine Learning, NLP, Deep Learning, Redis, and Big Data technologies like Hadoop, HBase, and YARN.

Roles and Responsibilities:
- Design, execution, and management of large and complex distributed data systems.
- Monitoring of performance and optimizing existing projects.
- Researching and integrating any Big Data tools and frameworks required to provide requested capabilities.
- Understanding business/data requirements and implementing scalable solutions.
- Creating reusable components and data tools which help all the teams in the company to integrate with our data platform.

Who should apply for this role:
- 2 to 4 years of experience in big data technologies (Apache Hadoop) and relational databases (MS SQL Server/Oracle/MySQL/Postgres).
- Proficiency in at least one of the following programming languages: Java, Python, or Scala.
- Expertise in SQL (T-SQL/PL-SQL/Spark SQL/HiveQL).
- Proficiency in Apache Spark: hands-on knowledge of working with DataFrames, Datasets, RDDs, and the Spark SQL/PySpark/Scala APIs, with a deep understanding of performance optimizations.
- Good understanding of distributed storage (HDFS/S3).
- Strong analytical/quantitative skills and comfort working with very large sets of data.
- Experience with integration of data across multiple data sources.
- Good understanding of distributed computing principles.

Good-to-have skills:
- Experience with message queues (e.g., Apache Kafka).
- Experience with MPP systems (e.g., Redshift/Snowflake).
- Experience with NoSQL storage (e.g., MongoDB).
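As an example of the Spark performance-optimization knowledge this posting asks for, a hedged broadcast-join sketch; the input paths are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join-opt").getOrCreate()

# Hypothetical inputs: a very large fact table and a small dimension.
impressions = spark.read.parquet("hdfs:///data/impressions/")
topics = spark.read.parquet("hdfs:///data/topics/")

# Broadcasting the small side ships it to every executor, avoiding a
# full shuffle of the large table.
joined = impressions.join(broadcast(topics), "topic_id")

# The physical plan should show BroadcastHashJoin instead of SortMergeJoin.
joined.explain()
```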

Posted 5 days ago

Apply

4.0 - 6.0 years

0 Lacs

Gurugram, Haryana, India

On-site

Role Title: Site Reliability Engineer
Location: Gurgaon (Hybrid)

Bravura's Commitment and Mission: At Bravura Solutions, collaboration, diversity, and excellence matter. We value your ideas, giving you room to be curious and innovate in an exciting, fast-paced, and flexible environment. We look for many different skills and abilities, as well as how you can add value to Bravura and our culture. As a global FinTech market leader and ASX-listed company, Bravura is a trusted partner to over 350 leading financial services clients, delivering wealth management technology and products. We invest significantly in our technology hubs and innovation labs, which inspire and drive our creative, future-focused mindset. We take pride in developing cutting-edge, digital-first technology solutions that support our clients to achieve financial security and prosperity for their customers.

Position Purpose: Join our dedicated Service Operations Team and contribute to the successful delivery of exciting projects within Bravura's application portfolio. The Observability Team ensures the health, performance, and reliability of systems and applications by providing crucial insights. Site Reliability Engineers (SREs) are skilled engineers who blend technical expertise with a passion for improvement. They creatively solve complex challenges, ensuring the availability and reliability of critical services. SREs collaborate with business leaders to build and maintain sustainable systems that adapt to a dynamic global environment. At Bravura, we're dedicated to building software that solves real-world problems. Our SREs play a vital role in empowering our users with a robust and high-performance platform. As we expand, we seek an experienced SRE who can bring fresh perspectives and innovative solutions. This individual will collaborate with cross-functional teams to deliver exceptional user experiences.

Main Activities: Based in Gurgaon, you will join us in ensuring our applications deliver high availability, optimal performance, and reliable uptime that meet our clients' needs and service level agreements. We're looking for proactive, curious individuals with a focus on continuous improvement and automation. Your day-to-day responsibilities will be:
- Proactively monitor and observe business services and processes to ensure uninterrupted service delivery.
- Continuously optimize system performance, anticipating client needs by proactively improving the reliability of services throughout their lifecycle.
- Support deployment, availability, reliability, performance, and customer escalation targets for these environments.
- Create traceability of workflow transactions, alerting strategies, and corresponding triggers.
- Maintain and monitor applications and infrastructure across multiple production and non-production environments.
- Provide application support to resolve issues by troubleshooting application and infrastructure problems while coordinating with multiple stakeholders.
- Actively work with development teams to diagnose application performance issues and identify areas for improvement.
- Take responsibility for a piece of work and see it through from specification into production (in collaboration with others).
- Work closely with other teams to improve knowledge sharing and platform understanding.
- Document and provide feedback on application documentation and tickets.
- Incident management and response within a 24/7 environment, ensuring service level targets are met.

Key skills:
- Experience in supporting a cloud platform (AWS/Azure), along with previous comprehensive experience in application support for non-cloud-based applications.
- Sound understanding of Site Reliability Engineering principles to manage a complex suite of environments, and experience leveraging SRE technology and tools to further automate current platforms and environment management activities.
- Demonstrated automation skills, including scripting knowledge of Shell/Bash.
- Experience with monitoring tools: AppDynamics, Grafana, and Prometheus.
- Experience troubleshooting applications built on Java / REST APIs / JSON.
- Excellent communication skills, with the ability to communicate ideas, concepts, and facts to clients, peers, and senior members of staff.
- Friendly, professional, and business-like approach to both external and internal clients.
- Systematic, logical thinker with excellent attention to detail.
- Good client focus with the ability to build positive, effective relationships.
- The aptitude to be flexible and assertive in demanding circumstances.
- Self-control and resilience, including the ability to work effectively under pressure.
- Proven use of problem-solving skills, with the initiative to proactively resolve issues.
- Excellent team and interpersonal skills.
- Empathy and the ability to understand customer needs.
- Effective organization and time management skills.
- Able to work unaided and as part of a collaborative team.

Qualifications and Experience:
- Bachelor's degree in computer science or another highly technical, scientific discipline, or MCA.
- 4-6 years of relevant industry experience.
- Any experience with regular expressions is a bonus.
- Ability to program (structured and OO) in one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript.
- Experience with distributed storage technologies like NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
- Proven knowledge of databases and SQL, preferably on Oracle Database or SQL Server.
- A basic understanding of service delivery processes, i.e., incident management, code promotion and release process, change control, problem management, availability management, contingency planning/business continuity, and configuration management.
- Proven experience gained in an IT-related role within the Financial Services Industry is advantageous.
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.

Characteristics:
- Consultative and an effective influencer.
- Ability to apply analytical skill and conceptual thinking to operations and system planning.
- Ability to collaborate with clients.
- Commercial awareness.
- Capable of working on-site at client offices.
- Troubleshooting and debugging capabilities/techniques.
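To illustrate the monitoring/automation side of the role, a hedged Python sketch exposing a custom metric with the `prometheus_client` library for a Prometheus/Grafana stack to scrape. The metric name and polling logic are hypothetical.

```python
import random
import time

from prometheus_client import Gauge, start_http_server

# Hypothetical gauge for a business workflow's end-to-end latency.
LATENCY = Gauge(
    "workflow_txn_latency_seconds",
    "End-to-end latency of the monitored workflow",
)


def poll_latency() -> float:
    # Placeholder: in practice, query the application, its logs, or an APM
    # tool such as AppDynamics, rather than returning random values.
    return random.uniform(0.1, 2.0)


if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://host:8000/metrics
    while True:
        LATENCY.set(poll_latency())
        time.sleep(15)
```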

Posted 5 days ago

Apply

10.0 - 12.0 years

10 - 20 Lacs

Chennai

Work from Office

Responsibilities for Data Engineer (AM Role):
- Develop data pipelines in Python and SQL; build reusable Python components.
- Good understanding of individual insurance markets.
- Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions.
- Mine and analyze data from company databases to drive optimization and improvement of product development, marketing techniques, and business strategies.
- Assess the effectiveness and accuracy of new data sources and data-gathering techniques.
- Develop processes and tools to monitor and analyze model performance and data accuracy.
- Team management: exhibit a strong understanding of core business processes and the purpose of the team.
- Build strong relationships with US teams to identify and deliver enhancements.
- Provide regular updates to stakeholders and discuss solutions to potential problem areas.
- Drive opportunities to leverage and improve quality and efficiency.
- Guide the team on implementing solutions using appropriate analytical tools and techniques.
- Project management skills with an emphasis on identifying and solving complex business issues, and the ability to plan, organize, and set priorities.

Qualifications for Data Engineer (AM Role):
- Strong problem-solving skills with an emphasis on product development.
- Experience using statistical computer languages (Python, SQL, etc.) to manipulate data and draw insights from large data sets.
- Experience working with and creating data architectures.
- Excellent written and verbal communication skills for coordinating across teams.
- A drive to learn and master new technologies and techniques.
- 10+ years of experience manipulating data sets and familiarity with the following software/tools:
  - Coding knowledge and experience with Python, SQL, etc.
  - Experience querying databases and using statistical computer languages: Python, SQL, etc.
  - Experience analyzing data from third-party providers: Google Analytics, Salesforce, Neustar, Facebook Insights, etc.
  - Experience with distributed data/computing tools: MapReduce, Hadoop, Hive, Spark, MySQL, etc.

Location: This position can be based in any of the following locations: Chennai, Gurgaon

For internal use only: R000107713

Posted 6 days ago

Apply

10.0 - 14.0 years

0 Lacs

Maharashtra

On-site

As a Data Ops Capability Deployment Analyst at Citigroup, you will be a seasoned professional applying your in-depth disciplinary knowledge to contribute to the development of new solutions, frameworks, and techniques for the Enterprise Data function. Your role will involve integrating subject matter expertise and industry knowledge within a defined area, requiring a thorough understanding of how different areas collectively integrate within the sub-function to contribute to the overall business objectives.

Your primary responsibilities will include performing data analytics and data analysis across various asset classes, as well as building data science and tooling capabilities within the team. You will collaborate closely with the wider Enterprise Data team, particularly the front-to-back leads, to deliver on business priorities. Working within the B & I Data Capabilities team in the Enterprise Data function, you will manage the Data Quality/Metrics/Controls program and implement improved data governance and data management practices across the region. The Data Quality program focuses on enhancing Citigroup's approach to data risk and meeting regulatory commitments in this area.

Key Responsibilities:
- Utilize a data engineering background to work hands-on with distributed data platforms and cloud services.
- Demonstrate a sound understanding of data architecture and data integration with enterprise applications.
- Research and evaluate new data technologies, data mesh architecture, and self-service data platforms.
- Collaborate with the Enterprise Architecture Team to define and refine the overall data strategy.
- Address performance bottlenecks, design batch orchestrations, and deliver reporting capabilities.
- Perform complex data analytics on large datasets, including data cleansing, transformation, joins, and aggregation.
- Build analytics dashboards and data science capabilities for Enterprise Data platforms.
- Communicate findings and propose solutions to various stakeholders.
- Translate business and functional requirements into technical design documents.
- Work closely with cross-functional teams to prepare handover documents and manage testing and implementation processes.
- Demonstrate an understanding of how the development function integrates within the overall business/technology landscape.

Skills & Qualifications:
- 10+ years of active development background in Financial Services or Finance IT.
- Experience with Data Quality, Data Tracing, Data Lineage, and Metadata Management tools.
- Hands-on experience with ETL using PySpark on distributed platforms, data ingestion, Spark optimization, resource utilization, and batch orchestration.
- Proficiency in programming languages such as Python, with experience in data manipulation and analysis libraries.
- Strong SQL skills and experience with DevOps tools like Jenkins/Lightspeed, Git, and CoPilot.
- Knowledge of BI visualization tools like Tableau and Power BI.
- Experience in implementing Data Lake/Data Warehouse solutions for enterprise use cases.
- Exposure to analytical tools and AI/ML is desired.

Education:
- Bachelor's/University degree or master's degree in Information Systems, Business Analysis, or Computer Science.

In this role, you will play a crucial part in driving compliance with applicable laws, rules, and regulations while safeguarding Citigroup, its clients, and assets. Your ability to assess risks and make informed business decisions will be essential in maintaining the firm's reputation. Please refer to the full Job Description for more details on the skills, qualifications, and responsibilities associated with this position.

Posted 6 days ago

Apply

2.0 - 6.0 years

0 Lacs

Maharashtra

On-site

Unlock your potential with Dassault Systèmes, a global leader in scientific software engineering, as a Big Data Engineer in Pune, Maharashtra!

Role Description & Responsibilities:
- Data Pipeline Development: Design, develop, and maintain robust ETL pipelines for batch and real-time data ingestion, processing, and transformation using Spark, Kafka, and Python.
- Data Architecture: Build and optimize scalable data architectures, including data lakes, data marts, and data warehouses, to support business intelligence, reporting, and machine learning.
- Data Governance: Ensure data reliability, integrity, and governance by enabling accurate, consistent, and trustworthy data for decision-making.
- Collaboration: Work closely with data analysts, data scientists, and business stakeholders to gather requirements, identify inefficiencies, and deliver scalable and impactful data solutions.
- Optimization: Develop efficient workflows to handle large-scale datasets, improving performance and minimizing downtime.
- Documentation: Create detailed documentation for data processes, pipelines, and architecture to support seamless collaboration and knowledge sharing.
- Innovation: Contribute to a thriving data engineering culture by introducing new tools, frameworks, and best practices to improve data processes across the organization.

Qualifications:
- Educational Background: Bachelor's degree in Computer Science, Engineering, or a related field.
- Professional Experience: 2-3 years of experience in data engineering, with expertise in designing and managing complex ETL pipelines.

Technical Skills:
- Proficiency in Python, PySpark, and Spark SQL for distributed and real-time data processing.
- Deep understanding of real-time streaming systems using Kafka.
- Experience with data lake and data warehousing technologies (Hadoop, HDFS, Hive, Iceberg, Apache Spark).
- Strong knowledge of relational and non-relational databases (SQL, NoSQL).
- Experience in cloud and on-premises environments for building and managing data pipelines.
- Experience with ETL tools like SAP BODS or similar platforms.
- Knowledge of reporting tools like SAP BO for designing dashboards and reports.
- Hands-on experience building end-to-end data frameworks and working with data lakes.

Analytical and Problem-Solving Skills: Ability to translate complex business requirements into scalable and efficient technical solutions.

Collaboration and Communication: Strong communication skills and the ability to work with cross-functional teams, including analysts, scientists, and stakeholders.

Location: Willingness to work from Pune (on-site).

What is in it for you:
- Work for one of the biggest software companies.
- Work in a culture of collaboration and innovation.
- Opportunities for personal development and career progression.
- Chance to collaborate with various internal users of Dassault Systèmes as well as stakeholders of various internal and partner projects.

Inclusion Statement: As a game-changer in sustainable technology and innovation, Dassault Systèmes is striving to build more inclusive and diverse teams across the globe. We believe that our people are our number one asset and we want all employees to feel empowered to bring their whole selves to work every day. It is our goal that our people feel a sense of pride and a passion for belonging. As a company leading change, it's our responsibility to foster opportunities for all people to participate in a harmonized Workforce of the Future.

Posted 6 days ago

Apply

5.0 - 9.0 years

0 Lacs

pune, maharashtra

On-site

This position falls under the ICG TTS Operations Technology (OpsTech) Group, focusing on assisting in the implementation of a next-generation Digital Automation Platform and Imaging Workflow Technologies. The ideal candidate should have relevant experience managing development teams within the distributed-systems ecosystem, must exhibit strong teamwork skills, and is expected to possess superior technical knowledge of current programming languages, technologies, and leading-edge development tools. The primary objective of this role is to contribute to applications, systems analysis, and programming activities.

As a Lead Spark Scala Engineer, the candidate should have hands-on knowledge of Spark, PySpark, Scala, Java, and RDBMS like MS-SQL/Oracle. Familiarity with CI/CD tools such as LightSpeed and uDeploy is also required.

Key Responsibilities:
- Development & Optimization: Develop, test, and deploy production-grade Spark applications in Scala, ensuring optimal performance, scalability, and resource utilization.
- Technical Leadership: Provide guidance to a team of data engineers, promoting a culture of technical excellence and collaboration.
- Code Review & Best Practices: Conduct thorough code reviews, establish coding standards, and enforce best practices for Spark Scala development, data governance, and data quality.
- Performance Tuning: Identify and resolve performance bottlenecks in Spark applications through advanced tuning techniques (one common technique is sketched after this posting).

Required Skills:
- Deep Spark Expertise: Profound understanding of Spark's architecture, execution model, and optimization techniques.
- Scala Proficiency: Expert-level proficiency in Scala programming, including functional programming paradigms and object-oriented design.
- Big Data Ecosystem: Strong hands-on experience with the broader Hadoop ecosystem and related big data technologies.
- Database Knowledge: Solid understanding of relational and NoSQL databases.
- Communication: Excellent communication, interpersonal, and leadership skills to convey complex technical concepts effectively.
- Problem-Solving: Exceptional analytical and problem-solving abilities with meticulous attention to detail.

Education Requirement:
- Bachelor's degree/University degree or equivalent experience

This is a full-time role in the Technology Job Family Group and the Applications Development Job Family. The most relevant skills are those listed in the requirements above; for complementary skills, contact the recruiter.
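Illustrative note (not part of the posting): one of the "advanced tuning techniques" commonly meant in roles like this is replacing a shuffle-heavy sort-merge join with a broadcast join. A minimal sketch follows, shown in PySpark for consistency with the other examples on this page; the Scala API offers the same broadcast hint. Table paths and the join key are hypothetical.

# Spark join-tuning sketch: broadcast a small dimension table so the large
# fact table is not shuffled. Paths and the join key are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join-tuning").getOrCreate()

facts = spark.read.parquet("/data/facts")  # large table (hypothetical path)
dims = spark.read.parquet("/data/dims")    # small lookup table (hypothetical)

# Without the hint Spark may choose a sort-merge join and shuffle both sides;
# broadcasting ships the small table to every executor instead.
joined = facts.join(broadcast(dims), on="dim_id", how="left")
joined.explain()  # the physical plan should show BroadcastHashJoin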

Posted 6 days ago

Apply

4.0 - 8.0 years

0 Lacs

karnataka

On-site

As a Data Engineer Trainer/Big Data Trainer, you will be responsible for imparting knowledge and delivering training on various technical aspects of data engineering and big data. Key requirements:
- Expertise in data mining and ETL operations/tools.
- Deep understanding of HDFS, the Hadoop system, MapReduce, RDDs, Spark DataFrames, and PySpark, along with related concepts (a short RDD-vs-DataFrame illustration follows this posting).
- Experience with business intelligence tools such as Tableau and Power BI, and big data frameworks like Hadoop and Spark.
- Proficiency in Pig, Hive, Sqoop, and Kafka.
- Knowledge of AWS and/or Azure, especially their big data stacks, is an added advantage.
- High proficiency in standard database skills: SQL, NoSQL databases, and data preparation, cleaning, and wrangling/munging.
- Strong foundation and advanced understanding of statistics, R programming, Python, and machine learning.
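Illustrative note (not part of the posting): since the role covers both RDD and DataFrame concepts, here is a toy word count in both APIs — the kind of contrast a trainer would demo. The HDFS input path is hypothetical.

# Toy word count in both Spark APIs. The HDFS path is hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode, split

spark = SparkSession.builder.appName("rdd-vs-df").getOrCreate()

# RDD API: explicit functional transformations, no schema, no optimizer.
counts_rdd = (spark.sparkContext.textFile("hdfs:///tmp/input.txt")
              .flatMap(lambda line: line.split())
              .map(lambda word: (word, 1))
              .reduceByKey(lambda a, b: a + b))

# DataFrame API: declarative, planned through the Catalyst optimizer.
counts_df = (spark.read.text("hdfs:///tmp/input.txt")
             .select(explode(split(col("value"), r"\s+")).alias("word"))
             .groupBy("word")
             .count())

print(counts_rdd.take(5))
counts_df.show(5)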

Posted 6 days ago

Apply

4.0 - 9.0 years

6 - 16 Lacs

hyderabad, chennai, bengaluru

Hybrid

Big Data Development & Administration:
- AWS EMR cluster setup, scaling, and tuning
- Spark (PySpark/Scala), Hive, HDFS, YARN
- SQL on Redshift/Snowflake
- AWS services: S3, IAM, Lambda, Glue, CloudWatch, Step Functions
- Linux/Unix
- Workflow schedulers (Airflow, Oozie, Control-M, Autosys)
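Illustrative note (not part of the posting): driving EMR from schedulers like Airflow or Control-M usually reduces to submitting steps against a running cluster. A minimal boto3 sketch follows; the cluster id, region, and S3 script path are hypothetical placeholders.

# boto3 sketch: submit a spark-submit step to a running EMR cluster.
# Cluster id, region, and S3 script path are hypothetical placeholders.
import boto3

emr = boto3.client("emr", region_name="us-east-1")  # hypothetical region

response = emr.add_job_flow_steps(
    JobFlowId="j-XXXXXXXXXXXXX",  # hypothetical cluster id
    Steps=[{
        "Name": "nightly-aggregation",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit", "--deploy-mode", "cluster",
                "s3://my-bucket/jobs/aggregate.py",  # hypothetical script
            ],
        },
    }],
)
print(response["StepIds"])  # poll these ids (or use a scheduler sensor) for completion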

Posted 1 week ago

Apply

7.0 - 12.0 years

10 - 20 Lacs

hyderabad

Work from Office

Notice Period: Immediate to 30 days

Mandatory Skills:
- Scala and Python
- Apache Spark (batch & streaming) - a must!
- Deep knowledge of HDFS internals and migration strategies
- Experience with Apache Iceberg (or similar table formats like Delta Lake / Apache Hudi) for schema evolution, ACID transactions, and time travel (see the short sketch after this posting)
- Running Spark and/or Flink jobs on Kubernetes (e.g., Spark-on-K8s operator, Flink-on-K8s)
- Experience with distributed blob storage like Ceph, AWS S3, or similar
- Building ingestion, transformation, and enrichment pipelines for large-scale datasets
- Infrastructure-as-Code (Terraform, Helm) for provisioning data infrastructure
- Ability to work independently while guiding juniors

Nice to Have:
- Experience with Apache Flink
- Prior experience in migration projects or large-scale data platform modernization
- Apple experience preferred, to help the candidate get up to speed on the tooling set quickly and independently
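Illustrative note (not part of the posting): the Iceberg requirements above (ACID commits, schema evolution, time travel) look roughly like this in practice. A minimal PySpark sketch using a local Hadoop catalog; it assumes the Iceberg Spark runtime jar is on the classpath, and the catalog name, table name, and warehouse path are hypothetical.

# PySpark + Iceberg sketch: ACID insert, schema evolution, time travel.
# Assumes the Iceberg Spark runtime jar is on the classpath; names and
# paths below are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("iceberg-demo")
         .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
         .config("spark.sql.catalog.local.type", "hadoop")
         .config("spark.sql.catalog.local.warehouse", "/tmp/iceberg-warehouse")
         .getOrCreate())

spark.sql("CREATE TABLE IF NOT EXISTS local.db.events "
          "(id BIGINT, payload STRING) USING iceberg")
spark.sql("INSERT INTO local.db.events VALUES (1, 'first')")  # atomic commit

# Schema evolution is a metadata-only change; no data files are rewritten.
spark.sql("ALTER TABLE local.db.events ADD COLUMN source STRING")
spark.sql("INSERT INTO local.db.events VALUES (2, 'second', 'kafka')")

# Time travel: read the table as of its first snapshot.
first = spark.sql("SELECT snapshot_id FROM local.db.events.snapshots "
                  "ORDER BY committed_at").first()[0]
(spark.read.option("snapshot-id", first)
 .format("iceberg").load("local.db.events").show())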

Posted 1 week ago

Apply

8.0 - 12.0 years

0 Lacs

bengaluru, karnataka, india

On-site

As a DataOps Lead, you will be responsible for managing and designing highly scalable, highly available data pipelines that provide the foundation for collecting, storing, modeling, and analyzing massive data sets from multiple channels. This position reports to the DevOps Architect.

Responsibilities:
- Align Sigmoid with key client initiatives
- Interface daily with customers across leading Fortune 500 companies to understand strategic requirements
- Connect with VP- and Director-level clients on a regular basis
- Travel to client locations
- Understand business requirements and tie them to technology solutions
- Strategically support technical initiatives
- Design, manage, and deploy highly scalable, fault-tolerant distributed components using big data technologies
- Evaluate and choose technology stacks that best fit client data strategy and constraints
- Drive automation and massive deployments
- Drive good engineering practices from the bottom up
- Develop industry-leading CI/CD, monitoring, and support practices inside the team
- Develop scripts to automate DevOps processes and reduce team effort (a small example follows this posting)
- Work with the team to develop automation and resolve issues
- Support TB-scale pipelines
- Perform root cause analysis for production errors
- Support developers in day-to-day DevOps operations
- Excellent experience in application support, integration development, and data operations
- Design the roster and escalation matrix for the team
- Provide technical leadership and manage the team on a day-to-day basis
- Guide DevOps engineers in day-to-day design, automation, and support tasks
- Play a key role in hiring technical talent
- Conduct technology-stack training for developers in-house and outside

Culture:
- Must be a strategic thinker with the ability to think unconventionally / out-of-the-box
- Analytical and data-driven
- Raw intellect, talent, and energy
- Entrepreneurial and agile: understands the demands of a private, high-growth company
- Ability to be both a leader and a hands-on "doer"

Qualifications:
- 8-12 years of relevant work experience; a Computer Science or related technical degree is required
- Proven track record of building and shipping large-scale engineering products; knowledge of cloud infrastructure such as GCP/AWS preferred
- Experience in Shell, Python, or any scripting language
- Experience managing Linux systems and build/release tools like Jenkins
- Effective communication skills (both written and verbal)
- Ability to collaborate with a diverse set of engineers, data scientists, and product managers
- Comfort in a fast-paced start-up environment

Preferred Qualifications:
- Support experience in the big data domain
- Architecting, implementing, and maintaining big data solutions
- Experience with the Hadoop ecosystem (HDFS, MapReduce, Oozie, Hive, Impala, Spark, Kerberos, Kafka, etc.)
- Experience with container technologies like Docker and Kubernetes, and configuration management systems
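Illustrative note (not part of the posting): the "scripts to automate DevOps processes" responsibility often starts with small health checks against cluster APIs. A minimal sketch polling the YARN ResourceManager REST API; the ResourceManager host is a hypothetical placeholder.

# Health-check sketch against the YARN ResourceManager REST API.
# The ResourceManager host is a hypothetical placeholder.
import requests

RM = "http://resourcemanager:8088"  # hypothetical host

metrics = requests.get(f"{RM}/ws/v1/cluster/metrics", timeout=10).json()["clusterMetrics"]

print(f"apps running:    {metrics['appsRunning']}")
print(f"containers:      {metrics['containersAllocated']}")
print(f"unhealthy nodes: {metrics['unhealthyNodes']}")

if metrics["unhealthyNodes"] > 0:
    # In a real setup this would page through the team's escalation matrix.
    raise SystemExit("unhealthy NodeManagers detected")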

Posted 1 week ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.


Featured Companies