AI Fiesta
We are seeking a Senior Full-Stack Engineer to design and build an AI-driven conversational platform. The role involves architecting scalable systems that integrate multiple LLM providers, support real-time interactions, handle complex state management, and ensure performance, observability, and security across the stack.
This position requires a deep understanding of modern web architectures, cloud-native deployments, and AI/LLM integration patterns.
Key Responsibilities
- Implement advanced chat session management, including context persistence, token optimization, and retrieval-augmented generation (RAG).
- Design and optimize high-throughput APIs (REST, GraphQL, and WebSockets/SSE) with rate limiting and fault tolerance.
- Integrate token metering, analytics, and usage-based billing systems.
- Develop a secure, multi-tenant user management system with authentication and fine-grained authorization.
- Leverage event-driven architectures (Kafka, Pub/Sub, or equivalent) for real-time processing and monitoring.
- Optimize database schemas and queries (PostgreSQL, pgvector, Supabase) for low-latency chat history retrieval.
- Implement vector search and RAG pipelines using Pinecone, Weaviate, or pgvector for knowledge grounding.
- Ensure cloud-native scalability with Docker/Kubernetes, CI/CD pipelines, and IaC (Terraform, Pulumi).
- Set up observability (distributed tracing, structured logging, metrics, error tracking) for debugging and performance monitoring.
- Apply AI safety and guardrails (moderation APIs, prompt filtering, structured outputs).
- Track developments in the AI ecosystem and propose new integrations.
Must-Have Skills
- Backend Engineering: Node.js/Deno + TypeScript, event-driven design, API performance optimization.
- Frontend Development: React/Next.js (SSR, streaming responses, optimistic UI updates).
- Databases: PostgreSQL + vector extensions (pgvector), Redis/Valkey for caching and pub/sub.
- Cloud & Infra: AWS/GCP/Azure, Kubernetes, serverless compute (Lambda/Cloud Functions), load balancing.
- Real-time Systems: WebSockets, Server-Sent Events, or WebRTC for interactive chat.
- Security: OAuth2, JWT, token expiration/refresh strategies, encryption at rest and in transit.
- Testing & Quality: Unit/integration/e2e testing frameworks, contract testing for APIs.
Nice-to-Have Skills
- Experience with LangChain, LlamaIndex, or custom orchestration engines.
- Knowledge of embeddings, vector databases, and hybrid search techniques.
- Familiarity with streaming LLM APIs and fine-tuning workflows.
- Background in distributed systems, CAP theorem trade-offs, and scaling stateful apps.
- Experience with observability stacks (OpenTelemetry, Grafana, Prometheus).
- Understanding of payment and billing systems (Stripe, usage-based pricing).
- Prior work on multi-tenant SaaS platforms.
Qualifications
- 5–7 years of professional experience in full-stack or platform engineering.
- Proven experience delivering production-grade distributed systems.
- Strong understanding of LLM APIs and AI/ML system integrations.
- Bachelor's/Master's in Computer Science from a reputable institution, or equivalent practical experience.