As a Data Engineer specializing in geospatial data, your primary responsibility is to design, build, and maintain the data infrastructure and systems that handle geospatial information effectively. You will work closely with cross-functional teams, including data scientists, geospatial analysts, and software engineers, to ensure that geospatial data is collected, processed, stored, and analyzed efficiently and accurately.

Key Responsibilities:

Data Pipeline Development: Design and implement robust data pipelines to acquire, ingest, clean, transform, and process geospatial data from sources such as satellites, aerial imagery, drones, and geolocation services.

Data Ingestion, Storage, and Extraction: Develop data models and schemas tailored to geospatial data structures, ensuring optimal performance and scalability for storage and retrieval operations.

Spatial Database Management: Manage geospatial databases, including traditional relational databases (e.g., PostgreSQL with the PostGIS extension) and NoSQL databases (e.g., MongoDB, Cassandra), to store and query spatial data efficiently.

Geospatial Analysis Tools Integration: Integrate geospatial analysis tools and libraries (e.g., GDAL, GeoPandas, Fiona) into data processing pipelines and analytics workflows to perform spatial analysis, visualization, and geoprocessing tasks.

Geospatial Data Visualization: Collaborate with data visualization specialists to create interactive maps, dashboards, and visualizations that effectively communicate geospatial insights and patterns to stakeholders (frontend-facing work).

Performance Optimization: Identify and address performance bottlenecks in data processing and storage systems, using techniques such as indexing, partitioning, and parallelization to optimize geospatial data workflows.

Data Quality Assurance: Implement data quality checks and validation procedures to ensure the accuracy, completeness, and consistency of geospatial data throughout its lifecycle.
Geospatial Data Governance: Establish data governance policies and standards specific to geospatial data, including metadata management, data privacy, and compliance with geospatial regulations and standards (e.g., INSPIRE, OGC).

Collaboration and Communication: Work with cross-functional teams to understand geospatial data requirements and provide technical expertise and support. Communicate findings, insights, and technical solutions effectively to both technical and non-technical stakeholders.

Requirements:

Must-have:

Bachelor's or Master's degree in Computer Science or a related field.

3-5 years of experience in the field, including deploying pipelines to production.

Strong programming skills in languages such as Python, Java, or Scala, with experience in geospatial libraries and frameworks (e.g., Rasterio, Shapely).

Experience with distributed computing frameworks (e.g., Apache Spark), workflow orchestrators (e.g., Apache Airflow), and cloud data platforms (e.g., AWS, Azure, Google Cloud Platform).

Familiarity with geospatial data formats and standards (e.g., GeoJSON, Shapefile, KML) and geospatial data visualization tools (e.g., Mapbox, Leaflet, Tableau).

Strong analytical and problem-solving skills, with the ability to work with large and complex geospatial datasets.

Good-to-have:

Proficiency in SQL and experience with geospatial extensions for relational databases (e.g., PostGIS).

Excellent communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.

Experience with geospatial libraries such as Rasterio, Xarray, GeoPandas, and GDAL.

Knowledge of distributed computing frameworks such as Dask.

Familiarity with STAC, GeoParquet, and other cloud-native geospatial tools.

Experience productionising data science code.
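To give candidates a concrete flavour of the day-to-day geoprocessing this role involves, here is a minimal, illustrative sketch using Shapely (one of the libraries listed above). The area of interest and point records are hypothetical, and a real pipeline would of course do this at scale with GeoPandas or PostGIS rather than a plain list comprehension:

```python
from shapely.geometry import Point, Polygon

# Hypothetical quality check: flag records whose coordinates fall outside
# a rectangular area of interest (AOI) before they enter the pipeline.
aoi = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])  # toy AOI, arbitrary units

records = [Point(2, 3), Point(15, 5), Point(7, 7)]   # toy point geometries
inside = [p for p in records if aoi.contains(p)]     # valid records
outside = [p for p in records if not aoi.contains(p)]  # flagged for review
```

The same containment predicate is what a PostGIS `ST_Contains` query or a GeoPandas spatial join expresses over whole tables.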
The role of a Data Engineer for Geospatial Data is crucial in enabling organizations to leverage the power of geospatial information for applications including urban planning, environmental monitoring, transportation, agriculture, and emergency response.

Benefits:

Medical health cover for you and your family, including unlimited online doctor consultations

Access to mental health experts for you and your family

Dedicated allowances for learning and skill development

Comprehensive leave policy with casual leaves, paid leaves, marriage leaves, and bereavement leaves

Twice-a-year appraisal

Job Type: Full-time
Work Location: In person
We are looking for a Machine Learning Operations Engineer to join our team to design, build, and integrate MLOps for large-scale, distributed machine learning systems, with a focus on cutting-edge tools, distributed GPU training, and faster research experimentation.

Roles & Responsibilities:

Architect, build, and integrate the end-to-end lifecycle of large-scale, distributed machine learning systems (MLOps) using cutting-edge tools and frameworks.

Develop tools and services for the explainability of ML solutions.

Implement distributed cloud GPU training approaches for deep learning models.

Build software and tools that improve the research team's rate of experimentation and extract insights from it.

Identify and evaluate new patterns and technologies to improve the performance, maintainability, and elegance of our machine learning systems.

Lead and execute technical projects to completion.

Communicate with peers to gather requirements and track progress.

Mentor fellow engineers in your areas of expertise, and contribute to a team culture that values effective collaboration, technical excellence, and innovation.

Collaborate with engineers across various functions to solve complex data problems at scale.

Qualifications:

5-8 years of professional experience implementing MLOps frameworks to scale up ML in production.

Master's degree or PhD in Computer Science or a Machine Learning / Deep Learning domain.

Must-have:

Hands-on experience with Kubernetes, Kubeflow, MLflow, SageMaker, and other ML experiment management tools, covering training, inference, and evaluation.

Experience in ML model serving (TorchServe, TensorFlow Serving, NVIDIA Triton Inference Server, etc.).

Proficiency with ML model training frameworks (PyTorch, PyTorch Lightning, TensorFlow, etc.).

Experience with GPU computing for data and model training parallelism.

Solid software engineering skills in developing production systems.

Strong expertise in Python.
Experience building end-to-end data systems as an ML Engineer, Platform Engineer, or equivalent.

Experience with cloud data processing technologies (AWS services such as S3, ECR, and Lambda, plus Spark, Dask, Elasticsearch, Presto, SQL, etc.).

Geospatial / remote sensing experience is a plus.

Competencies:

Excellent debugging and critical thinking skills.

Excellent analytical and problem-solving skills.

Ability to work in a fast-paced, team-based environment.

Benefits:

Medical health cover for you and your family, including unlimited online doctor consultations

Access to mental health experts for you and your family

Dedicated allowances for learning and skill development

Comprehensive leave policy with casual leaves, paid leaves, marriage leaves, and bereavement leaves

Twice-a-year appraisal

Job Type: Full-time
Work Location: In person
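As a caricature of the model-serving lifecycle named in the must-haves (TorchServe, TensorFlow Serving, and Triton each formalize some version of load → preprocess → predict), here is a toy, framework-free sketch. The `ToyHandler` class and its input-doubling "model" are entirely hypothetical stand-ins, not any real server's API:

```python
class ToyHandler:
    """Toy stand-in for a serving handler: lazy model load + batched predict."""

    def __init__(self):
        self.model = None  # loaded on first request, as many servers do

    def load(self):
        # A real handler would deserialize trained weights here; this stub
        # "model" just doubles each value in the batch.
        self.model = lambda batch: [2 * x for x in batch]

    def predict(self, batch):
        if self.model is None:
            self.load()  # lazy initialization on the first request
        return self.model(batch)

handler = ToyHandler()
result = handler.predict([1, 2, 3])  # lazy-loads, then runs inference
```

Real handlers add batching, timeouts, and health checks around this same skeleton; the interview conversation tends to be about those production concerns rather than the skeleton itself.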
We are looking for a Data Science Intern to join our Data Science team. As an intern, you will contribute to advancing machine learning for geospatial applications, with a focus on self-supervised and weakly supervised learning. The role combines research and hands-on development: you'll explore large-scale Earth observation datasets, design and train models across multiple data modalities, and benchmark them against state-of-the-art approaches from the research community. At the same time, you'll gain experience building robust, reproducible systems and collaborating with a fast-moving team to turn ideas into practical solutions that push the boundaries of geospatial AI.

Key responsibilities:

Conduct data discovery and exploratory analysis on open-source Earth observation datasets.

Collaborate with researchers and engineers to design experiments, share insights, and iterate on approaches while writing clean, maintainable code.

Build and train machine learning models across multiple data modalities (RGB, multispectral, radar, LiDAR, text, etc.).

Benchmark models against state-of-the-art baselines from peer-reviewed research.

Maintain experiment logs, ensure reproducibility, and contribute to shared code repositories.

Document methodologies and share results through technical reports, internal presentations, or research publications.

What we are looking for:

Pursuing (or recently completed) a B.Tech, M.Tech, MS (Research), or PhD in a technical field relevant to the role (e.g., CS, EE, EC, AI).

Proficiency in Python, with experience in ML/DL frameworks (PyTorch).

Prior hands-on experience with ML/DL projects (academic, research, or personal) in topics related to computer vision.

Skilled at translating research papers into working prototypes and practical implementations.
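One of the responsibilities above is keeping experiments logged and reproducible. A minimal illustration of the idea follows; the config keys, seed derivation, and "metric" are all hypothetical stand-ins for a real training run, where you would seed PyTorch and NumPy the same way:

```python
import hashlib
import json
import random

def run_experiment(config):
    """Toy reproducible 'experiment': the RNG seed is derived from the
    config itself, so the same config always yields the same logged result."""
    blob = json.dumps(config, sort_keys=True).encode()
    seed = int(hashlib.sha256(blob).hexdigest(), 16) % 2**32
    rng = random.Random(seed)
    score = rng.random()  # stand-in for a real validation metric
    return {"config": config, "seed": seed, "score": score}

# Re-running with an identical config reproduces the identical log entry.
log_a = run_experiment({"lr": 1e-3, "epochs": 5})
log_b = run_experiment({"lr": 1e-3, "epochs": 5})
```

Storing the config and seed alongside every result, as the returned record does, is the habit that makes experiment logs auditable later.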
Good to have:

Prior experience working with geospatial data and familiarity with geospatial processing libraries (GDAL, rasterio, geopandas, xarray, rioxarray)

Publications in remote sensing or ML/vision conferences

Internship Details:

Mode: Hybrid
Duration: 3 months
Stipend: ₹15,000 - ₹35,000/month (based on experience and skillset)
Openings: 3 positions
Full-Time Opportunity: High-performing interns will be considered for a full-time role upon successful completion of the internship.

Perks:

Letter of recommendation

Certificate

Free snacks and beverages

Informal dress code

5 days a week

If you are a motivated and talented individual with a passion for data science, this is the perfect opportunity to expand your skills and make a real impact. Apply now and be part of a team that is shaping the future of geospatial analytics.

Job Types: Full-time, Internship
Pay: ₹15,000.00 - ₹35,000.00 per month
Work Location: In person