About The Organization
Join a fast-scaling B2B data intelligence platform that powers revenue growth through precision contact data and advanced data infrastructure. The organization combines automation with human verification to deliver over 95% accuracy across its database of more than 5 million human-verified contacts and 70 million+ machine-processed records. Known for having the highest number of direct-dial contacts in the industry, the platform re-verifies every contact every 90 days through a dedicated research team.

With a global footprint and a commitment to data excellence, the company supports cross-functional teams with scalable, secure, and high-performance data systems. The work culture is dynamic, inclusive, and growth-oriented, offering paid training programs, performance incentives, wellness benefits, and a collaborative environment that values innovation and continuous improvement.

Lead Data Engineer
- Experience: 8+ years
- Key Skills: Scala, PySpark, Apache Spark, Delta Lake, Databricks, SQL
- Responsibilities:
- Architect and lead the development of scalable data pipelines and platforms
- Mentor data engineering teams and enforce best practices across coding, CI/CD, and governance
- Modernize legacy systems to support ML workloads and advanced analytics
- Drive initiatives for cost optimization, automation, and infrastructure scalability
- Collaborate with product, engineering, and business teams to deliver robust technical solutions
Senior Data Engineer
- Experience: 4+ years
- Key Skills: Scala, PySpark, Apache Spark, Delta Lake, Databricks, SQL
- Responsibilities:
- Build and extend data pipeline architecture for diverse data sources
- Automate manual processes and optimize data delivery
- Support data infrastructure needs across multiple teams and products
- Collaborate with ML, analytics, and engineering teams to enhance data functionality
- Maintain and improve data architecture for next-gen product initiatives
What You'll Need
- Bachelor's degree in Engineering, Computer Science, or related field
- Strong coding skills in Scala and/or Python
- Hands-on experience with Apache Spark and distributed systems
- Familiarity with cloud data platforms (preferably Databricks)
- Advanced SQL and experience with data formats like Parquet, JSON, CSV
- Comfort working in Agile environments and Linux shell
- Machine Learning knowledge is a plus
- Excellent communication and independent problem-solving skills
Skills: ml workloads,csv,pyspark,agile,json,sql,scala,data engineering,databricks,apache spark,big data,machine learning,linux shell,automation,data pipelines,ci/cd,parquet,python,delta lake,lakehouse,aws,azure,opensearch