Sourcing and Screening: Identifying, attracting, and engaging with potential candidates through various channels, including online job boards, social media, and professional networks.
Technical Interviewing: Conducting interviews and assessments to evaluate candidates' skills and experience, often requiring technical expertise in the IT field.
Candidate Pipeline Management: Maintaining a strong pipeline of qualified candidates, ensuring a readily available pool for future openings.
Offer Negotiation and Compliance: Assisting with salary negotiations, extending offers, and ensuring all recruitment practices comply with US labor laws and visa requirements.
Collaboration and Communication: Working closely with hiring managers, HR teams, and other stakeholders to understand requirements and facilitate the hiring process.
Strategic Recruitment: Developing and executing recruitment strategies to meet specific hiring needs and goals.
Staying Updated: Keeping abreast of industry trends and best practices in IT recruitment and US labor laws.
Roles and Responsibilities

The Databricks Platform Administrator plays a crucial role, responsible for the effective design, implementation, maintenance, and optimization of the Databricks Lakehouse Platform within an organization. This individual ensures the platform is scalable, performant, secure, and aligned with business objectives, providing essential support to data engineers, data scientists, and analysts.

Key Responsibilities:
1. Provision and configure Databricks workspaces, clusters, pools, and jobs across environments.
2. Create catalogs, schemas, access controls, and lineage configurations.
3. Implement identity and access management using account groups, workspace-level permissions, and data-level governance.
4. Monitor platform health, cluster utilization, job performance, and cost using Databricks admin tools and observability dashboards.
5. Automate workspace onboarding, schema creation, user/group assignments, and external location setup using Terraform, APIs, or the CLI (see the automation sketch below).
6. Integrate with Azure services such as ADLS Gen2, Azure Key Vault, Azure Data Factory, and Azure Synapse.
7. Support model serving, the feature store, and MLflow lifecycle management for Data Science/ML teams.
8. Manage secrets, tokens, and credentials securely using Databricks Secrets and integration with Azure Key Vault.
9. Define and enforce tagging policies, data masking, and row-level access control using Unity Catalog and attribute-based access control (ABAC) (see the governance sketch below).
10. Ensure compliance with enterprise policies, security standards, and audit requirements.
11. Coordinate with Ops Architects and Cloud DevOps teams on network, authentication (e.g., SSO), and VNet setup.
12. Troubleshoot workspace, job, cluster, or permission issues for end users and data teams.

Preferred Qualifications:
Databricks Certified Associate Platform Administrator or other relevant Databricks certifications.
Experience with Apache Spark and data engineering concepts.
Knowledge of monitoring tools (e.g., Splunk, Grafana, cloud-native monitoring).
Familiarity with data warehousing and data lake concepts.
Experience with other big data technologies (e.g., Hadoop, Kafka).
Previous experience leading or mentoring junior administrators.
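As an illustration of the automation called out in responsibility 5, here is a minimal sketch using the Databricks SDK for Python (databricks-sdk) to create a workspace group and a small auto-terminating shared cluster. The group name, cluster name, node type, and runtime version are placeholder assumptions, and authentication is assumed to come from environment variables or a configured profile; this is a sketch, not the organization's actual onboarding code.

```python
# Sketch: automating group and cluster provisioning with the Databricks SDK for Python.
# Assumes `pip install databricks-sdk`; all names and sizes are illustrative placeholders.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up DATABRICKS_HOST / DATABRICKS_TOKEN or a config profile

# Create a workspace group for the data engineering team.
group = w.groups.create(display_name="data-engineers")
print(f"Created group {group.display_name} ({group.id})")

# Provision a small, auto-terminating shared cluster for that team.
cluster = w.clusters.create(
    cluster_name="shared-etl",
    spark_version="15.4.x-scala2.12",   # placeholder runtime; pick per your standards
    node_type_id="Standard_DS3_v2",     # placeholder Azure node type; adjust for sizing
    num_workers=2,
    autotermination_minutes=30,
).result()                              # .result() blocks until the cluster is running
print(f"Cluster {cluster.cluster_id} is {cluster.state}")
```

In practice the same provisioning is often expressed declaratively with the Databricks Terraform provider, so environments stay reproducible and reviewable.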
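For the governance duties in responsibility 9, the sketch below shows Unity Catalog row filters, a column mask, and a data-level grant, issued from a Databricks notebook where the `spark` session is already available. The catalog, schema, table, and group names (main.sales.orders, data-analysts, pii_readers) are hypothetical placeholders.

```python
# Sketch: row-level security and column masking with Unity Catalog, run via spark.sql.
# All object and group names below are illustrative placeholders.

# Row filter: admins see every row, everyone else only US rows.
spark.sql("""
  CREATE OR REPLACE FUNCTION main.governance.us_rows_only(region STRING)
  RETURN is_account_group_member('admins') OR region = 'US'
""")
spark.sql("ALTER TABLE main.sales.orders SET ROW FILTER main.governance.us_rows_only ON (region)")

# Column mask: only members of pii_readers see raw email addresses.
spark.sql("""
  CREATE OR REPLACE FUNCTION main.governance.mask_email(email STRING)
  RETURN CASE WHEN is_account_group_member('pii_readers') THEN email ELSE '***REDACTED***' END
""")
spark.sql("ALTER TABLE main.sales.orders ALTER COLUMN email SET MASK main.governance.mask_email")

# Data-level grant for the analyst group.
spark.sql("GRANT SELECT ON TABLE main.sales.orders TO `data-analysts`")
```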
Role
Design, Build & Optimize Scalable Data Solutions and End-to-End Pipelines

Responsibilities
Develop and deploy end-to-end data pipelines and solutions on Databricks, integrating with various data sources and systems.
Collaborate with cross-functional teams to understand the data and deliver effective BI solutions.
Implement data ingestion, transformation, and processing workflows using Spark (PySpark/Scala), SQL, and Databricks notebooks (see the pipeline sketch below).
Develop and maintain data models and ETL/ELT processes, ensuring high performance, reliability, scalability, and data quality.
Build and maintain APIs and data services to support analytics, reporting, and application integration.
Ensure data quality, integrity, and security across all stages of the data lifecycle.
Monitor, troubleshoot, and optimize pipeline performance in a cloud-based environment.
Write clean, modular, and well-documented Python/Scala/SQL/PySpark code.
Integrate data from various sources, including APIs, relational and non-relational databases, IoT devices, and external data providers.
Ensure adherence to data governance, security, and compliance policies.

Required Skills and Experience:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
5-6 years of hands-on experience in data engineering, with a strong focus on Databricks and Apache Spark.
Strong programming skills in Python/PySpark and/or Scala, with a deep understanding of Apache Spark.
Experience with Azure Databricks.
Strong SQL skills for data manipulation, analysis, and performance tuning.
Strong understanding of data structures and algorithms, with the ability to apply them to optimize code and implement efficient solutions.
Strong understanding of data architecture, data modeling, ETL/ELT processes, and data warehousing concepts.
Experience building and maintaining ETL/ELT pipelines in production environments.
Familiarity with Delta Lake, Unity Catalog, or similar technologies.
Experience working with structured and unstructured data, including JSON, Parquet, Avro, and time-series data.
Familiarity with CI/CD pipelines and tools such as Azure DevOps, version control (Git), and DevOps practices for data engineering.
Excellent problem-solving skills, attention to detail, and ability to work independently or as part of a team.
Strong communication skills to interact with technical and non-technical stakeholders.

Preferred Qualifications:
Experience with Delta Lake and Databricks Workflows.
Exposure to real-time data processing and streaming technologies (Kafka, Spark Streaming); a streaming sketch follows below.
Exposure to Databricks Genie for data analysis and reporting.
Knowledge of data governance, security, and compliance best practices.
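A minimal sketch of the ingestion-and-merge workflow described in the responsibilities above: read raw JSON, apply light transformations, and upsert into a Delta table. The ADLS Gen2 path, target table, and merge key (order_id) are assumed placeholders, and the code targets a Databricks notebook where `spark` is already defined.

```python
# Sketch: batch ingestion of JSON files into a Delta table with an idempotent upsert.
# Source path, target table, and key column are illustrative placeholders.
from pyspark.sql import functions as F
from delta.tables import DeltaTable

raw_path = "abfss://landing@examplelake.dfs.core.windows.net/orders/"  # hypothetical ADLS Gen2 path
target_table = "main.sales.orders"                                     # hypothetical Unity Catalog table

# Ingest and lightly transform the raw feed.
orders = (
    spark.read.json(raw_path)
        .withColumn("order_ts", F.to_timestamp("order_ts"))
        .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
        .dropDuplicates(["order_id"])
)

# Upsert into the curated Delta table so reruns stay idempotent.
if spark.catalog.tableExists(target_table):
    (DeltaTable.forName(spark, target_table)
        .alias("t")
        .merge(orders.alias("s"), "t.order_id = s.order_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())
else:
    orders.write.format("delta").saveAsTable(target_table)
```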
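For the streaming exposure listed under Preferred Qualifications, here is a sketch that lands Kafka events in a Delta table with Spark Structured Streaming. The broker address, topic, event schema, checkpoint location, and target table are all assumptions for illustration.

```python
# Sketch: streaming ingestion from Kafka into a Delta table with Structured Streaming.
# Broker, topic, schema, table, and checkpoint path are illustrative placeholders.
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

event_schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_ts", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker-1:9092")   # hypothetical broker
        .option("subscribe", "iot-readings")                  # hypothetical topic
        .option("startingOffsets", "latest")
        .load()
        .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
        .select("e.*")
)

(events.writeStream
    .format("delta")
    .option("checkpointLocation", "abfss://checkpoints@examplelake.dfs.core.windows.net/iot-readings/")
    .outputMode("append")
    .toTable("main.iot.readings"))                            # hypothetical Unity Catalog table
```

The checkpoint location is what lets the stream recover exactly where it left off after a restart, which is the main operational concern when monitoring and troubleshooting these pipelines.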