About Us
CLOUDSUFI, a Google Cloud Premier Partner, is a leading global provider of data-driven digital transformation for cloud-based enterprises. With a global presence and a focus on Software & Platforms, Life Sciences and Healthcare, Retail, CPG, Financial Services, and Supply Chain, CLOUDSUFI is positioned to meet customers where they are in their data monetization journey.
Our Values
We are a passionate and empathetic team that prioritizes human values. Our purpose is to elevate the quality of life for our families, customers, partners, and the community.
Equal Opportunity Statement
CLOUDSUFI is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified candidates receive consideration for employment without regard to race, colour, religion, gender, gender identity or expression, sexual orientation, or national origin. We provide equal opportunities in employment, advancement, and all other areas of our workplace. Learn more at https://www.cloudsufi.com/
Role Overview
We are seeking an experienced Technical Architect to lead the design and implementation of scalable, high-performance data architectures. This role will focus on building a unified and intelligent data ecosystem, enabling seamless integration of diverse public datasets into our Knowledge Graph using Google Cloud Platform (GCP).
The ideal candidate will have a deep understanding of modern data architectures, cloud-native services, and best practices in data integration, knowledge graph modelling, and automation frameworks.
Key Responsibilities
Architectural Leadership
- Define and own the end-to-end architecture for data ingestion, transformation, and knowledge graph integration pipelines.
- Establish best practices for schema modelling, metadata management, and data lineage across GCP environments.
- Provide technical leadership and mentorship to Data Engineers and Automation teams.
Solution Design & Implementation
- Architect large-scale, automated ETL/ELT frameworks leveraging GCP services such as Dataflow, BigQuery, Pub/Sub, Cloud Run, and Cloud Storage (a pipeline sketch follows this list).
- Design schema mapping and entity resolution frameworks using LLM-based tools for auto-schematization and intelligent data classification.
- Define standards for integrating datasets into the Knowledge Graph, including schema.org compliance, MCF/TMCF file generation, and SPARQL endpoints (a JSON-LD sketch follows this list).
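For illustration, a minimal Apache Beam (Python) sketch of the kind of streaming ingestion pipeline this role would own, assuming JSON records arriving on a Pub/Sub subscription and landing in a BigQuery staging table. All project, subscription, table, and field names below are placeholders, not actual CLOUDSUFI resources:

```python
# Minimal streaming ingestion sketch: Pub/Sub -> parse -> BigQuery.
# Every resource name (project, subscription, table) is a placeholder.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions


def parse_record(message: bytes) -> dict:
    """Decode a Pub/Sub message into a BigQuery row."""
    record = json.loads(message.decode("utf-8"))
    return {
        "entity_id": record["entity_id"],
        "value": float(record["value"]),
        "observation_date": record["observation_date"],
    }


def run() -> None:
    options = PipelineOptions(
        project="my-project",                # placeholder
        region="us-central1",
        runner="DataflowRunner",
        temp_location="gs://my-bucket/tmp",  # placeholder
    )
    options.view_as(StandardOptions).streaming = True

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                subscription="projects/my-project/subscriptions/raw-data")
            | "Parse" >> beam.Map(parse_record)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table="my-project:staging.observations",
                schema="entity_id:STRING,value:FLOAT,observation_date:DATE",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```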
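Similarly, for the schema.org-compliance bullet, a minimal sketch of rendering one ingested record as schema.org JSON-LD before graph loading. The dataset values and license URL are invented for illustration; MCF/TMCF generation would follow the target Knowledge Graph's own templates:

```python
# Sketch: render one ingested record as schema.org JSON-LD.
# The record contents below are illustrative, not a real dataset.
import json

record = {
    "name": "Example City Population 2023",
    "description": "Annual population estimate for an example city.",
    "url": "https://example.org/datasets/city-population",
}

jsonld = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": record["name"],
    "description": record["description"],
    "url": record["url"],
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

print(json.dumps(jsonld, indent=2))
```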
Data Governance & Quality
- Establish data quality, validation, and observability frameworks covering statistical validation, anomaly detection, and freshness tracking (a validation sketch follows this list).
- Implement governance controls to ensure scalability, performance, and security in data integration workflows.
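As a toy illustration of the first bullet, a sketch of two such checks in Python: freshness tracking and a z-score outlier scan. The staleness window, z threshold, and field semantics are assumptions, not a production framework:

```python
# Sketch: two simple data-quality checks, freshness and a z-score outlier scan.
# The threshold values below are illustrative assumptions.
from datetime import date, timedelta
from statistics import mean, stdev

MAX_STALENESS = timedelta(days=7)  # assumed freshness SLA
Z_THRESHOLD = 3.0                  # assumed outlier cutoff


def is_fresh(latest_observation: date, today: date) -> bool:
    """True if the newest observation is within the staleness window."""
    return (today - latest_observation) <= MAX_STALENESS


def find_outliers(values: list[float]) -> list[float]:
    """Return values more than Z_THRESHOLD standard deviations from the mean."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > Z_THRESHOLD]


if __name__ == "__main__":
    print(is_fresh(date(2024, 1, 1), today=date(2024, 1, 5)))  # True
    print(find_outliers([10.0] * 20 + [250.0]))                # [250.0]
```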
Automation & Innovation
- Partner with the Automation POD to integrate AI/LLM-based accelerators for data profiling, mapping, and validation (a sketch follows this list).
- Drive innovation through reusable data frameworks and automation-first architectural principles.
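One way such an accelerator might look: a sketch of LLM-assisted column mapping, where llm_complete is a hypothetical stand-in for whatever model client the team actually uses (for example, a Vertex AI SDK call). The prompt and target schema are likewise assumptions:

```python
# Sketch: LLM-assisted mapping of raw column names to a target schema.
# llm_complete() is a hypothetical stand-in for a real model client;
# swap in the actual SDK call in practice.
import json

TARGET_FIELDS = ["entity_id", "observation_date", "value"]  # assumed schema


def llm_complete(prompt: str) -> str:
    """Hypothetical model call; replace with a real LLM client."""
    raise NotImplementedError


def map_columns(raw_columns: list[str]) -> dict[str, str]:
    """Ask the model to map raw column names onto the target schema."""
    prompt = (
        "Map each raw column to one of these target fields "
        f"{TARGET_FIELDS}, returning JSON of raw->target.\n"
        f"Raw columns: {raw_columns}"
    )
    return json.loads(llm_complete(prompt))
```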
Collaboration
- Collaborate closely with cross-functional teams (Engineering, Automation, Managed Services, and Product) to ensure architectural alignment and delivery excellence.
- Influence and guide architectural decisions across multiple concurrent projects.
Qualifications and Experience
Education:
- Bachelor’s or Master’s in Computer Science, Data Engineering, or a related field.
Experience:
- 8+ years of experience in Data Engineering or Architecture roles, including 3+ years in GCP-based data solutions.
- Proven track record of designing and implementing large-scale, production-grade data architectures.
Technical Expertise:
- Must Have: GCP (BigQuery, Dataflow/Apache Beam, Cloud Run, Cloud Storage, Pub/Sub), Python, SQL, Data Modelling, CI/CD (Cloud Build).
- Good to Have: SPARQL, schema.org, RDF/JSON-LD, Apigee, Cloud Data Fusion, and knowledge graph concepts.
- Strong knowledge of data governance, metadata management, and data lineage frameworks.
Preferred Qualifications
- Experience working with LLM-based or AI-assisted data processing tools (e.g., auto-schematization, semantic mapping).
- Familiarity with open data ecosystems and large-scale data integration initiatives.
- Exposure to multilingual or multi-domain data integration.
- Strong analytical and problem-solving ability.
- Excellent communication and stakeholder management skills.
- Passion for mentoring teams and driving best practices.
Behavioural Competencies Required
- Must have worked with US/Europe-based clients in an onsite/offshore delivery model.
- Should have very good verbal and written communication, technical articulation, listening, and presentation skills.
- Should have proven analytical and problem-solving skills.
- Should have demonstrated effective task prioritization, time management, and internal/external stakeholder management skills.
- Should be a quick learner and a team player.
- Should have experience working under stringent deadlines in a matrix organization structure.
- Should have demonstrated appreciable Organizational Citizenship Behavior (OCB) in past organizations.