Remote
Full Time
Note: This is a remote role with occasional office visits. Candidates from Mumbai or Pune will be preferred.

About The Company
A fast-growing enterprise technology consultancy operating at the intersection of cloud computing, big-data engineering and advanced analytics. The team builds high-throughput, real-time data platforms that power AI, BI and digital products for Fortune 500 clients across finance, retail and healthcare. By combining Databricks Lakehouse architecture with modern DevOps practices, they unlock insight at petabyte scale while meeting stringent security and performance SLAs.

Role & Responsibilities
- Architect end-to-end data pipelines (ingestion → transformation → consumption) using Databricks, Spark and cloud object storage.
- Design scalable data warehouses/marts that enable self-service analytics and ML workloads.
- Translate logical data models into physical schemas; own database design, partitioning and lifecycle management for cost-efficient performance.
- Implement, automate and monitor ETL/ELT workflows, ensuring reliability, observability and robust error handling.
- Tune Spark jobs and SQL queries, optimizing cluster configurations and indexing strategies to achieve sub-second response times.
- Provide production support and continuous improvement for existing data assets, championing best practices and mentoring peers.

Skills & Qualifications

Must-Have
- 6–10 years building production-grade data platforms, including 3+ years of hands-on Apache Spark/Databricks experience.
- Expert proficiency in PySpark, Python and advanced SQL, with a track record of performance-tuning distributed jobs.
- Demonstrated ability to model data warehouses/marts and orchestrate ETL/ELT pipelines with tools such as Airflow or dbt.
- Hands-on experience with at least one major cloud platform (AWS or Azure) and modern lakehouse/data-lake patterns.
- Strong problem-solving skills, a DevOps mindset and a commitment to code quality; comfortable mentoring fellow engineers.

Preferred
- Deep familiarity with the AWS analytics stack (Redshift, Glue, S3) or the broader Hadoop ecosystem.
- Bachelor’s or Master’s degree in Computer Science, Engineering or a related field.
- Experience building streaming pipelines (Kafka, Kinesis, Delta Live Tables) and real-time analytics solutions.
- Exposure to ML feature stores, MLOps workflows and data-governance/compliance frameworks.
- Relevant professional certifications (Databricks, AWS, Azure) or notable open-source contributions.

Benefits & Culture Highlights
- Remote-first and flexible hours with 25+ PTO days and comprehensive health cover.
- Annual training budget and certification sponsorship (Databricks, AWS, Azure) to fuel continuous learning.
- Inclusive, impact-focused culture where engineers shape the technical roadmap and mentor a vibrant data community.

Skills: data modeling, big data technologies, team leadership, agile methodologies, performance tuning, data, aws, airflow