Job Description Summary
We are seeking an experienced Data Architect who specializes in modernizing enterprise data platforms for the AI era. This role requires someone who deeply understands both traditional data architectures and the emerging requirements of AI systems, with expertise in bridging existing data lakes to support modern AI capabilities like RAG (Retrieval-Augmented Generation), vector search, and multi-modal AI applications. You'll be the architect who transforms our wealth of structured and unstructured data assets into AI-ready infrastructure.

The ideal candidate will have 10+ years of experience with enterprise data platforms and proven expertise in handling both structured and unstructured data at scale. You understand the complexities of existing data lake architectures and can architect the evolution path to support AI workloads without disrupting current operations.

As a GE Vernova accelerator, GE Vernova Advanced Research is driving strategy and leading research & development efforts to execute on the business's mission to help power the energy transition. We forge the collaborations and help invent the technologies required to electrify and decarbonize for a zero-carbon future. Representing virtually every major scientific and engineering discipline, our researchers are collaborating with GE Vernova's businesses, the U.S. government, and more than 420 entities at the forefront of technology to execute on 150+ energy-focused projects. Collectively, these research programs and initiatives aim to solve near-term technical challenges, deliver next-generation product advances, and drive long-term breakthrough innovation to enable more affordable, reliable, sustainable, and secure energy.

Job Description

Key job responsibilities

Unstructured Data & AI Enablement

Design scalable architectures for processing and indexing unstructured data (PDFs, documents, emails, logs, images) for AI consumption
Architect document processing pipelines that leverage multi-modal LLMs (GPT-4V, Claude, Gemini) for direct document understanding without traditional OCR preprocessing
Implement intelligent document extraction using LLMs' native vision and context capabilities to handle complex layouts, tables, and mixed media
Design metadata extraction and enrichment pipelines that enhance discoverability of unstructured assets
Build architectures for multi-modal AI applications that combine structured and unstructured data sources

RAG & Knowledge Platform Architecture

Design end-to-end RAG architectures that leverage existing data lakes and enterprise knowledge bases
Architect hybrid search systems combining traditional keyword search with semantic/vector search capabilities
Implement chunking strategies and embedding pipelines for diverse document types and data sources
Build architectures for continuous learning where RAG systems are updated with new data in near real-time
Design security and access control models that work across legacy systems and modern AI platforms
Create data governance frameworks that ensure compliance while enabling AI innovation

Platform Optimization & Scale

Optimize storage strategies for cost-effective management of structured and unstructured data
Design tiered storage architectures that balance performance needs with storage costs
Implement caching layers for frequently accessed embeddings and AI model inputs

QUALIFICATIONS

Bachelor's degree in Computer Science, Information Systems, or related field
10+ years of experience as a Data Architect, Data Platform Engineer, or similar role with enterprise data systems
5+ years of experience working with both structured data (SQL databases, data warehouses) and unstructured data (documents, logs, multimedia)
Understanding of modern document processing using multi-modal LLMs and traditional extraction methods
Proficiency in Python and SQL, with experience in data processing libraries
Must be willing to work out of an office located at the Bangalore JFWTC Campus
You must submit your application for employment on the careers page at careers.gevernova.com to be considered.

PREFERRED QUALIFICATIONS

12+ years of experience modernizing legacy data architectures for cloud and AI workloads
Deep expertise in unstructured data processing using both multi-modal LLMs and traditional methods
Experience with multi-modal LLMs for document understanding and their cost/performance trade-offs
Background in information retrieval, search engineering, or content management systems
Experience with multi-modal AI architectures combining text, image, and structured data
Master's degree in Computer Science, Information Systems, or related field

Technical Stack

Document Processing: Multi-modal LLMs (GPT-4V, Claude Vision, Gemini), LlamaParse, Unstructured.io, Azure Document Intelligence, AWS Textract (for legacy/high-volume), direct PDF-to-context pipelines

Vector/Search: Pinecone, Weaviate, pgvector

Lake Technologies: AWS S3, Azure ADLS

Languages: Python, SQL, Scala, Java

APIs: OpenAI, Anthropic, Google Vertex AI, AWS Bedrock, Azure OpenAI

Senior Data Architect - AI-Powered Data Platforms