Back to Projects

RAG Vault

A privacy-first RAG assistant that grounds Bedrock models (Claude/Titan) in your own documents with per-user isolation.

Duration: 1 week
Team Size: 1

Project Overview

RAG Vault is a secure, full-stack Retrieval-Augmented Generation application. Users authenticate, upload documents (PDF, DOCX, TXT), and run multi-turn chats grounded in their private knowledge. Retrieval can be toggled on/off per question. The system emphasizes privacy, scalability, and a clean UX.

Problem

Traditional e-commerce platforms struggle with:

  • Teams need to query private documents with foundation models—without leaking data.
  • RAG systems often lack per-user isolation and reliable session/history management.
  • Switching models (Claude/Titan) should be seamless while keeping prompts well-formed.

Architecture & Solution

Auth & Sessions

JWT login/signup with access & refresh tokens, bcrypt password hashing, refresh revocation, per-user chat sessions.

Document Intake

Uploads to S3 with user-level isolation. Extract text from PDF/DOCX/TXT; index metadata in PostgreSQL.

Chunk & Embed

LangChain RecursiveCharacterTextSplitter; Titan Embeddings v2; vectors stored in Pinecone per user namespace.

Retrieve & Ground

Toggleable RAG: top-k chunks retrieved and injected into prompts; prompts formatted for Claude/Titan via Bedrock.

Multi-turn Chat

Session-scoped history persisted in PostgreSQL; load/resume any session.

Stack

Frontend: Vite + React + Tailwind · Backend: FastAPI + PostgreSQL · Infra: S3, Pinecone, Amazon Bedrock.

Technology Stack

Frontend

  • Vite + React + TypeScript · Tailwind

Backend

  • FastAPI · PostgreSQL · JWT (access/refresh) · bcrypt

AI/Models

  • Amazon Bedrock (Claude, Titan) · Titan Embeddings v2

Storage & Retrieval

  • S3 (documents) · Pinecone (per-user namespaces) · LangChain chunking

Testing

  • pytest suite (>80% coverage), in-memory SQLite for fast tests

Key Features

  • JWT auth with refresh revocation; current user endpoint
  • Upload & parse PDF/DOCX/TXT; store files in S3 and indices in Postgres
  • Embed with Titan v2; index in Pinecone under user namespace
  • Toggle RAG per query; supports Claude/Titan with dynamic prompt formatting
  • Multi-session chat with role-tagged messages and retrievable history

Testing & Quality

pytest suite (>80% coverage) covering auth flows, uploads, embedding/indexing, Bedrock prompts, and retrieval. Includes edge cases for token refresh expiry and upload errors.