Company Description
Hybrowlabs Technologies is dedicated to building software better and faster. We evaluate every tool that hits the market to find the best stack for software development. Our magical formula and curated toolset will accelerate your software development process. Contact us to learn more.
Role Description
- Design and architect scalable, robust big data solutions using PySpark and related technologies
- Lead the technical vision for data processing pipelines and analytics platforms
- Create comprehensive solution architectures aligned with business requirements and technical constraints
- Design integration patterns for connecting various data sources, APIs, and downstream systems
- Establish and enforce coding standards, best practices, and design patterns across the team
- Conduct architecture reviews and provide technical guidance on complex implementation challenges
Hands-On Development
- Develop high-performance PySpark applications for large-scale data processing and transformation (a brief sketch follows this list)
- Optimize existing PySpark jobs for performance, cost-efficiency, and scalability
- Write efficient, maintainable, and well-documented code that serves as a reference for the team
- Troubleshoot and resolve complex technical issues in production environments
- Implement data quality frameworks and validation mechanisms
- Build reusable components and libraries to accelerate development
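To give a concrete flavor of this hands-on work, here is a minimal sketch of a PySpark job with a simple data-quality gate and partitioned output. The paths, column names, and validation rule are hypothetical and purely illustrative, not a prescribed implementation.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical paths and column names -- illustrative only.
INPUT_PATH = "s3a://example-bucket/raw/orders/"
OUTPUT_PATH = "s3a://example-bucket/curated/orders/"

spark = SparkSession.builder.appName("orders-curation").getOrCreate()

# Read raw data; production jobs would normally declare an explicit schema.
orders = spark.read.parquet(INPUT_PATH)

# Basic data-quality gate: drop rows missing the key and flag negative amounts.
clean = (
    orders
    .dropna(subset=["order_id"])
    .withColumn("is_valid_amount", F.col("amount") >= 0)
)

invalid_count = clean.filter(~F.col("is_valid_amount")).count()
if invalid_count > 0:
    print(f"Warning: {invalid_count} rows failed the amount check")

# Write partitioned output so downstream scans stay selective.
(
    clean.filter(F.col("is_valid_amount"))
    .drop("is_valid_amount")
    .write.mode("overwrite")
    .partitionBy("order_date")
    .parquet(OUTPUT_PATH)
)
```

In practice, a job like this would also route invalid rows to a quarantine location rather than just counting them, which is the kind of data-quality framework decision this role owns.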
Team Leadership & Mentorship
- Provide technical mentorship and guidance to junior and mid-level developers
- Conduct code reviews ensuring quality, performance, and adherence to standards
- Foster a collaborative environment that encourages knowledge sharing and innovation
- Lead technical discussions and facilitate problem-solving sessions
- Guide the team in adopting new technologies and methodologies
Solution Design & Delivery
- Collaborate with business analysts and stakeholders to translate requirements into technical solutions
- Create detailed technical specifications and design documents
- Estimate effort, identify risks, and plan technical deliverables
- Drive proof-of-concepts (POCs) for evaluating new technologies and approaches
- Ensure timely delivery of high-quality solutions meeting functional and non-functional requirements
Integration & Collaboration
- Design and implement integration solutions with various data platforms (databases, data lakes, cloud storage)
- Work closely with DevOps teams to establish CI/CD pipelines for data applications
- Collaborate with data engineers, data scientists, and analytics teams to build end-to-end solutions
- Interface with enterprise architects to ensure alignment with organizational standards
🔧 Required Technical Skills
Core Expertise
- PySpark: 2–3+ years of hands-on experience building production-grade applications
- Python: Strong programming skills with a deep understanding of the Python ecosystem and its libraries
- Apache Spark: Comprehensive knowledge of Spark architecture, internals, and optimization techniques
- Big Data Technologies: Experience with the Hadoop ecosystem, HDFS, Hive, or similar platforms
Data Processing & Engineering
- Expertise in designing and implementing ETL/ELT pipelines at scale
- Strong SQL skills and experience with both relational and NoSQL databases
- Proficiency in data modeling, schema design, and data warehouse concepts
- Experience with data partitioning, bucketing, and optimization strategies (see the sketch after this list)
- Knowledge of data quality frameworks and testing methodologies
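As an illustration of the partitioning and bucketing strategies listed above, here is a hedged sketch; the table name, bucket count, and columns are assumptions chosen for the example, not a required design.

```python
from pyspark.sql import SparkSession

# Hive support is needed so the bucketed table lands in a metastore.
spark = (
    SparkSession.builder
    .appName("partitioning-and-bucketing-demo")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical source path -- illustrative only.
events = spark.read.parquet("s3a://example-bucket/raw/events/")

# Partition by a low-cardinality date column so queries can prune files,
# and bucket by the join key so repeated downstream joins can skip the shuffle.
(
    events.write
    .mode("overwrite")
    .partitionBy("event_date")
    .bucketBy(64, "user_id")
    .sortBy("user_id")
    .saveAsTable("analytics.events_bucketed")
)
```

Bucketing only pays off when the same key is joined or aggregated repeatedly; for one-off scans, partition pruning alone is usually sufficient.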
Cloud & Infrastructure
- Experience with cloud platforms (AWS, Azure, or GCP) and their big data services
- Familiarity with distributed computing concepts and cluster management
- Understanding of containerization (Docker) and orchestration (Kubernetes) is a plus
- Knowledge of cloud-native data services (S3, Azure Data Lake, BigQuery, etc.)
Architecture & Design
- Proven track record in designing scalable, resilient data architectures
- Experience with microservices architecture and API design
- Understanding of data governance, security, and compliance requirements
- Familiarity with streaming technologies (Kafka, Spark Streaming) is advantageous
Tools & Frameworks
- Version control systems (Git, Bitbucket, GitHub)
- CI/CD tools (Jenkins, GitLab CI, Azure DevOps)
- Workflow orchestration tools (Airflow, Databricks workflows)
- Monitoring and logging tools (ELK stack, Splunk, CloudWatch)
🎓 Required Qualifications
Experience
- Total IT Experience: 6+ years in software development and data engineering roles
- PySpark Experience: Minimum 2–3 years of dedicated PySpark development
- Leadership Experience: Demonstrated experience leading technical teams or projects
- Solution Design: Proven experience in end-to-end solution design and architecture
Education
- Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or related field
- Relevant certifications (Databricks, AWS/Azure/GCP, or Spark certifications) are highly desirable
✨ Desired Skills & Attributes
Technical
- Experience with real-time/streaming data processing
- Knowledge of machine learning pipelines and MLOps
- Familiarity with modern data platforms (Databricks, Snowflake, Delta Lake)
- Understanding of data mesh or data fabric architectures
- Experience with infrastructure as code (Terraform, CloudFormation)
Soft Skills
- Leadership: Ability to inspire and guide technical teams toward excellence
- Communication: Excellent verbal and written communication skills for technical and non-technical audiences
- Problem-Solving: Strong analytical thinking and creative problem-solving abilities
- Collaboration: Proven ability to work effectively across multiple teams and stakeholders
- Adaptability: Comfortable working in fast-paced, evolving environments
- Ownership: Takes accountability for technical decisions and project outcomes
🌟 What You'll Work On
- Designing next-generation data platforms and analytics solutions
- Building scalable data pipelines processing terabytes of data daily
- Architecting integrations across diverse enterprise systems
- Optimizing existing systems for performance and cost-efficiency
- Implementing best practices for data quality, governance, and security
- Mentoring team members and elevating overall technical capabilities
- Driving innovation through POCs and adoption of emerging technologies
📍 Work Arrangement
- Primary: Remote work with flexibility
- Office Visits: Periodic visits to the Mumbai office for team collaboration, planning sessions, and stakeholder meetings (frequency to be determined based on project needs)
- Flexibility: Results-oriented culture with a focus on delivery and collaboration