Posted: 3 months ago | Work from Office | Full Time
Requirements:
- Experience processing large workloads and complex code on Spark clusters.
- Experience setting up monitoring for Spark clusters and driving optimization based on the resulting insights and findings.
- Understanding of designing and implementing scalable data warehouse solutions to support analytical and reporting needs.
- Strong analytical skills for working with unstructured datasets.
- Understanding of building processes that support data transformation, data structures, metadata, dependency, and workload management.
- Knowledge of message queuing, stream processing, and highly scalable big data stores.
- Knowledge of Python and Jupyter Notebooks.
- Knowledge of big data tools such as Spark and Kafka.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with data pipeline and workflow management tools such as Azkaban, Luigi, and Airflow.
- Experience with AWS cloud services (EC2, EMR, RDS, and Redshift).
- Willingness to work from an office at least 2 times per week.

Nice to have:
- Knowledge of stream-processing systems (Storm, Spark Streaming).

Responsibilities:
- Optimize Spark clusters for cost, efficiency, and performance by implementing robust monitoring systems to identify bottlenecks using data and metrics, and provide actionable recommendations for continuous improvement (a minimal sketch follows this list).
- Optimize the infrastructure for extracting, transforming, and loading data from various data sources using SQL and AWS big data technologies.
- Work with data and analytics experts to drive greater cost efficiency in the data systems.
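To illustrate the tuning-and-monitoring loop the responsibilities describe, here is a minimal PySpark sketch. The app name, S3 paths, and column name are hypothetical; the config keys shown (metrics namespace, shuffle parallelism, dynamic allocation) are standard Spark settings commonly adjusted in the kind of cost/performance work listed above.

```python
# A minimal sketch (hypothetical paths and app name) of the Spark
# tuning and monitoring work described above.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("etl-optimization-example")  # hypothetical app name
    # Expose driver/executor metrics under a stable namespace so a
    # monitoring stack can scrape and alert on them.
    .config("spark.metrics.namespace", "etl")
    # Two common cost/performance levers: right-size shuffle parallelism
    # and let the cluster scale executors with the workload.
    .config("spark.sql.shuffle.partitions", "200")
    .config("spark.dynamicAllocation.enabled", "true")
    .getOrCreate()
)

# Hypothetical extract-transform-load step over an S3 dataset.
events = spark.read.parquet("s3://example-bucket/events/")
daily_counts = events.groupBy("event_date").count()
daily_counts.write.mode("overwrite").parquet("s3://example-bucket/daily_counts/")
```

In practice, the settings above would be driven by the metrics the monitoring system collects (shuffle spill, executor utilization, task skew) rather than fixed values.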
Svitla Systems
Salary: 25.0 - 30.0 Lacs P.A.