
Staff Software Engineer
Hi, I'm Neeraj Gupta
Full Stack Developer & Generative AI Engineer building intelligent systems that scale. 11+ years turning complex problems into elegant, production-grade solutions.
Currently at Visa | Previously at IRIS Software, BLOCK8, Infosys
About
Engineering at the intersection of AI & Scale
I'm a Staff Software Engineer at Visa with 11+ years of experience architecting production systems across payments, fintech, edtech, recruitment, and government contracting. I specialize in bridging the gap between cutting-edge AI capabilities and real-world engineering constraints — building systems that are not just intelligent, but reliable, scalable, and cost-effective. From designing low-latency APIs processing millions of transactions to orchestrating multi-agent AI pipelines, I deliver solutions that move the needle at scale.
AI & Agents
Designing multi-agent orchestration systems with LangChain, Bedrock Agents, and RAG architectures for enterprise-grade AI workflows
Distributed Systems
Architecting microservices handling millions of requests with sub-100ms latency and 99.99% uptime SLAs
Full Stack
End-to-end product delivery — React/Next.js frontends, FastAPI/Spring Boot backends, serverless AWS infrastructure
Performance at Scale
Optimizing latency, throughput, and cost across cloud-native architectures serving global traffic
11+
Years Experience
5+
Products Shipped
6
AI Agents Designed
99.99%
Uptime Delivered
Projects
What I've been building
From AI-powered products I designed end-to-end, to enterprise systems serving millions at companies like Visa, Morgan Stanley, and IRIS Software.
Products I designed, architected, and shipped end-to-end — from zero to production
AI Proposal Automation Platform
Multi-Agent AI System Replacing Weeks of Manual RFP Work with Hours
Designed and built an end-to-end AI platform that automates the entire government RFP response lifecycle. Orchestrates 6 specialized AI agents that ingest solicitation documents, extract compliance requirements, generate proposal sections, and run multi-tier quality reviews — reducing proposal turnaround from weeks to hours while maintaining compliance accuracy.
- Architected a 6-agent orchestration system handling document parsing, scoring, generation, review, compliance, and communications
- Built RAG pipelines with vector search over 10K+ institutional documents for context-aware proposal generation
- Engineered an automated P-Win scoring engine that evaluates win probability across 15+ weighted factors
- Designed Red/Gold/White Glove multi-tier review workflows with automated quality gates and feedback loops
- Implemented real-time compliance matrix generation with bi-directional requirement traceability
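For readers curious how a weighted-factor score of this kind can be computed, here is a minimal Python sketch. It is an illustration under stated assumptions, not the platform's actual model: the function name `p_win_score`, the factor names, and the weights are all hypothetical, and the real engine evaluates 15+ factors. The sketch takes a normalized weighted average of factor ratings, each between 0 and 1.

```python
from typing import Mapping


def p_win_score(ratings: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Return a weighted average of factor ratings as a score in [0, 1].

    Hypothetical helper: each rating is expected in [0, 1]; weights are
    arbitrary positive numbers and are normalized by their sum.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    weighted_sum = 0.0
    for factor, weight in weights.items():
        # Assumption: a factor with no rating contributes 0 to the score.
        rating = ratings.get(factor, 0.0)
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"rating for {factor!r} must be in [0, 1]")
        weighted_sum += weight * rating

    return weighted_sum / total_weight


# Hypothetical factors and weights, for illustration only.
example_weights = {
    "past_performance": 3.0,
    "price_competitiveness": 2.0,
    "incumbent_advantage": 1.0,
}
example_ratings = {
    "past_performance": 0.8,
    "price_competitiveness": 0.5,
    "incumbent_advantage": 0.0,
}
example_score = p_win_score(example_ratings, example_weights)
```

Normalizing by the total weight keeps the score comparable when factors are added or reweighted.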
AI Recruitment & Interview Platform
Production-Grade Job Matching Engine Aggregating 5+ Sources at Scale
Built and deployed a full-stack recruitment platform that aggregates jobs from 5+ sources in real-time, matches candidates using AI-powered scoring, and conducts automated video interviews with D-ID avatars — replacing manual screening with intelligent, scalable candidate assessment.
- Designed a multi-source ETL pipeline (Lambda + Step Functions) ingesting and deduplicating jobs from 5+ aggregators
- Integrated D-ID AI avatar technology for automated video interviews with real-time Q&A assessment
- Built full-text search with OpenSearch, implementing smart ranking algorithms across 100K+ job listings
- Engineered real-time WebSocket communication layer for live interview sessions with sub-200ms latency
- Deployed on ECS Fargate with Aurora PostgreSQL and Redis, handling concurrent user sessions at scale
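The deduplication step in the ETL pipeline above can be pictured with a short Python sketch. This is a simplified illustration, not the production logic: the `dedupe_jobs` function, the record fields (`title`, `company`, `location`, `source`), and the choice of a normalized title/company/location key are assumptions made for the example.

```python
import re
from typing import Dict, Iterable, List, Set, Tuple


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial variations compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def dedupe_jobs(jobs: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first listing seen for each (title, company, location) key.

    Assumption: two listings with the same normalized title, company, and
    location are the same job, regardless of which aggregator supplied them.
    """
    seen: Set[Tuple[str, str, str]] = set()
    unique: List[Dict[str, str]] = []
    for job in jobs:
        key = (
            _normalize(job["title"]),
            _normalize(job["company"]),
            _normalize(job["location"]),
        )
        if key not in seen:
            seen.add(key)
            unique.append(job)
    return unique


# Hypothetical listings from two aggregators.
sample_jobs = [
    {"title": "Backend Engineer", "company": "Acme", "location": "Pune", "source": "a"},
    {"title": "backend  engineer ", "company": "ACME", "location": "pune", "source": "b"},
    {"title": "Data Engineer", "company": "Acme", "location": "Pune", "source": "b"},
]
deduped = dedupe_jobs(sample_jobs)
```

An exact-match key like this is cheap but misses near-duplicates such as reworded titles; fuzzy matching would be a separate design choice.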
Real-Time AI Voice Coach
Sub-Second Voice AI Pipeline with 6 Distinct Personas
Architected a cross-platform AI voice conversation coach that achieves sub-second end-to-end latency through a streaming pipeline of Deepgram STT, Claude LLM, and ElevenLabs TTS. Features 6 AI personas for interview coaching and daily conversation practice — targeting the $65B English learning and interview prep market.
- Engineered a real-time voice pipeline (STT → LLM → TTS) achieving <1 second end-to-end response latency
- Designed 6 AI personas: 3 interview coaches (HR, Tech, Executive) + 3 daily life characters with distinct personalities
- Built cross-platform app (iOS, Android, Web) with React Native Expo and WebSocket-based voice streaming
- Implemented session persistence and progress tracking in DynamoDB with personalized feedback analytics
- Optimized infrastructure costs to ~$100/month using ECS on t4g.small with smart API batching
AI Transition Planning Assistant
RAG-Powered Education AI with Domain-Specific Knowledge Base
Built an AI assistant for special education professionals that provides personalized transition planning guidance through conversational AI backed by a continuously updated, domain-specific vector knowledge base — ensuring every recommendation is grounded in current best practices and regulations.
- Designed Bedrock Agent with vector knowledge base (S3 + OpenSearch) for domain-specific expertise retrieval
- Built session-aware conversations with DynamoDB persistence, maintaining context across multi-turn interactions
- Implemented document ingestion pipeline for continuous knowledge base updates from education resources
- Architected fully serverless infrastructure (Lambda + API Gateway) with <$20/month operating cost
- Set up multi-environment CI/CD (GitHub Actions) with automated deployment to dev and prod stages
Experience
A decade of building at scale
Staff Software Engineer
Visa · Bengaluru, India
Aug 2025 - Present
- Architecting and shipping high-throughput REST APIs and data pipelines powering Visa's next-generation payment intelligence platform, handling millions of daily transactions with strict latency SLAs
- Leading the design and implementation of Generative AI agent systems using LangChain, AWS Bedrock, and agentic workflows to automate complex decision-making across enterprise operations
- Building production RAG systems with vector databases and embedding pipelines, enabling intelligent knowledge retrieval across large-scale enterprise document repositories
- Driving adoption of AI-first engineering practices across the organization, from fine-tuned transformer models to knowledge-base chatbots serving internal and external stakeholders
Generative AI Developer & Full Stack Developer
IRIS Software · Pune, India
Jan 2020 - Aug 2025
- Led the architecture of distributed systems serving millions of concurrent users with sub-100ms P99 latency, designing for horizontal scalability, fault tolerance, and zero-downtime deployments
- Spearheaded monolith-to-microservices migration using GraphQL federation, reducing inter-service latency by 25% and eliminating single points of failure across the platform
- Engineered event-driven architectures with Kafka and AWS SQS for decoupled, exactly-once processing, achieving 99.99% uptime across EC2, Lambda, and RDS production clusters
- Delivered AI-powered workflow automation tools using LangFlow, accelerating team productivity and reducing manual review cycles by 40%
- Mentored cross-functional engineering teams on system design patterns, performance optimization, and production readiness practices
Open Source Contributor
LangFlow · Open Source
Jan 2025
- Contributed to LangFlow (langflow.org), improving LangChain integration, the build process, and real-time orchestration features
- Focused on streamlining the no-code/low-code AI builder experience by improving workflow management, component integration, and performance optimization
Full Stack Developer
BLOCK8 · Pune, India
Jan 2018 - Jan 2020
- Built real-time event-driven systems using AWS Lambda, API Gateway, and DynamoDB, processing thousands of concurrent requests with auto-scaling and cost-optimized serverless architecture
- Reduced API response times by 50% through GraphQL optimization, multi-layer caching strategies (Redis + CDN), and query batching patterns
- Designed and implemented CI/CD pipelines that cut deployment cycles by 20%, enabling rapid iteration and continuous delivery across multiple environments
Full Stack Developer
Infosys Limited · Pune, India
Jan 2015 - Jan 2018
- Built real-time enterprise systems with AWS Lambda, API Gateway, and DynamoDB for Fortune 500 clients, delivering scalable cloud-native solutions from day one
- Optimized critical API pathways with GraphQL and advanced caching, reducing latency by 50% and improving end-user experience for high-traffic applications
- Designed scalable CI/CD workflows that reduced deployment times by 20%, establishing engineering best practices adopted across multiple project teams
Skills
Technical toolkit
Languages & Frameworks
AI & Machine Learning
Cloud & Infrastructure
Architecture & Tools
Achievements
Recognition & contributions
Global Rank 6 - AWS CodeJam
Competed globally in Amazon Web Services' premier coding competition and secured 6th place worldwide.
AWS Certified Solutions Architect
Certified expertise in designing distributed systems and scalable architectures on AWS.
CodeChef - Ranked 22nd & 34th
Consistently ranked among top competitive programmers in national-level programming contests.
IRIS Hackathon & CodeChef League Winner
Won multiple competitive events demonstrating rapid prototyping and problem-solving skills.
Open Source - LangFlow Contributor
Enhanced LangChain integration, build processes, and real-time orchestration features for the LangFlow project.
YouTube - NeerajTechTalks
Creating technical content on AI, system design, and full-stack development to empower the developer community.
Recommendations
What people say about me
Verified recommendations from colleagues, managers, and collaborators I've worked with.
Contact
Let's connect
Open to discussing engineering leadership roles, AI product collaborations, advisory opportunities, or technical partnerships.