DEV Community

Taki089.Dang

AI Project Development Step By Step to learn in 2025

Here's a detailed roadmap for building a Generative AI app using Next.js, NestJS, OpenAI, and MongoDB Vector Search. This guide covers embedding company documents, storing vectors, retrieving data, and using prompt engineering to produce intelligent AI responses.


🚀 AI Project Roadmap

A structured end-to-end guide to building a production-ready Generative AI application.


📌 Phase 1: Define the Problem & System Design

🎯 Goal:

  • Define the problem statement (e.g., AI-powered document search, chatbot, summarization).
  • Choose the AI task (e.g., text generation, question-answering, summarization).
  • Define user inputs (e.g., text queries, document uploads).
  • Determine output format (e.g., plain text, structured responses).
  • Identify the retrieval approach (e.g., MongoDB Vector Search, RAG).

🛠 Tech to Learn:

  • AI Basics: Generative AI, RAG (Retrieval-Augmented Generation).
  • Vector Search: MongoDB Atlas Vector Search.
  • Next.js & NestJS: Full-stack development.

📌 Phase 2: Data Collection & Storage

🎯 Goal:

  • Collect company documents (e.g., PDFs, Word, CSVs, JSON).
  • Store documents efficiently in MongoDB.
  • Generate and store vector embeddings.

🔥 Steps:

  1. Set Up MongoDB Atlas:

    • Create a database for documents & embeddings.
    • Enable Vector Search.
  2. Install Dependencies:

   npm install openai mongoose @nestjs/mongoose
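  3. Store Documents in MongoDB:
   import mongoose from 'mongoose';

   const documentSchema = new mongoose.Schema({
     content: String,
     embedding: { type: [Number], default: [] }, // one vector per document
   });
   // Model used by the later steps.
   const DocumentModel = mongoose.model('Document', documentSchema);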
  4. Convert Documents to Embeddings:
   import OpenAI from 'openai';

   const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

   async function generateEmbedding(text: string) {
     const response = await openai.embeddings.create({
       input: text,
       model: 'text-embedding-ada-002',
     });

     return response.data[0].embedding;
   }
  5. Store Vectors in MongoDB:
   async function storeDocument(content: string) {
     const embedding = await generateEmbedding(content);
     await DocumentModel.create({ content, embedding });
   }
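The "Enable Vector Search" step needs an index on the embedding field before any query can run. Here is a minimal sketch of the index definition, assuming 1536-dimensional vectors (the output size of text-embedding-ada-002) and cosine similarity. The helper name buildVectorIndexDefinition is hypothetical; the object it returns follows the Atlas Vector Search index format and can be pasted into the Atlas JSON editor.

```typescript
// Shape of an Atlas Vector Search index definition.
type Similarity = 'cosine' | 'euclidean' | 'dotProduct';

interface VectorIndexField {
  type: 'vector';
  path: string;
  numDimensions: number;
  similarity: Similarity;
}

interface VectorIndexDefinition {
  fields: VectorIndexField[];
}

// Hypothetical helper: builds the definition for a single vector field.
function buildVectorIndexDefinition(
  path: string,
  numDimensions: number,
  similarity: Similarity = 'cosine',
): VectorIndexDefinition {
  if (!Number.isInteger(numDimensions) || numDimensions <= 0) {
    throw new Error('numDimensions must be a positive integer');
  }
  return { fields: [{ type: 'vector', path, numDimensions, similarity }] };
}

// text-embedding-ada-002 returns 1536-dimensional vectors.
const embeddingIndex = buildVectorIndexDefinition('embedding', 1536);
console.log(JSON.stringify(embeddingIndex, null, 2));
```

If you switch to an embedding model with a different output size, numDimensions has to change with it and the stored vectors have to be regenerated.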

🛠 Tech to Learn:

  • MongoDB Atlas Vector Search.
  • OpenAI Embeddings API.
  • NestJS with MongoDB.

📌 Phase 3: Data Preprocessing

🎯 Goal:

  • Clean and process unstructured data.
  • Implement document chunking for better embeddings.
  • Store structured metadata for retrieval.

🔥 Steps:

  1. Remove Unnecessary Sections:
   function cleanText(text: string): string {
     // `.` stops at a newline, so each match removes from the keyword to the end of that line only.
     return text.replace(/(Disclaimer|Footer|Further Support).*/g, '');
   }
  2. Chunk Large Documents:
   function chunkText(text: string, chunkSize = 500): string[] {
     // [\s\S] also matches newlines, which `.` would drop; ?? [] covers empty input.
     return text.match(new RegExp(`[\\s\\S]{1,${chunkSize}}`, 'g')) ?? [];
   }
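Fixed-size character chunks can cut a word or sentence in half, and the goals for this phase also call for storing metadata with each chunk. Here is a minimal sketch of word-boundary chunking with overlap. The DocChunk shape, the source field, and the default sizes are illustrative assumptions, not part of the original pipeline.

```typescript
interface DocChunk {
  source: string;  // e.g. file name of the original document
  index: number;   // position of the chunk within the document
  content: string;
}

// Splits text into chunks of at most `maxWords` words, repeating the last
// `overlapWords` words of each chunk at the start of the next one.
function chunkWithOverlap(
  text: string,
  source: string,
  maxWords = 100,
  overlapWords = 20,
): DocChunk[] {
  if (maxWords < 1 || overlapWords < 0 || overlapWords >= maxWords) {
    throw new Error('Require maxWords >= 1 and 0 <= overlapWords < maxWords');
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: DocChunk[] = [];
  const step = maxWords - overlapWords;
  for (let start = 0; start < words.length; start += step) {
    const slice = words.slice(start, start + maxWords);
    chunks.push({ source, index: chunks.length, content: slice.join(' ') });
    if (start + maxWords >= words.length) break; // this chunk reached the end
  }
  return chunks;
}

const sampleChunks = chunkWithOverlap('a b c d e f g h i j', 'handbook.pdf', 4, 1);
console.log(sampleChunks.map((c) => c.content));
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.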

🛠 Tech to Learn:

  • Text Preprocessing (Regex, Chunking, Stopwords Removal).
  • LangChain for Text Splitting.

📌 Phase 4: Model Training & Fine-Tuning (Optional)

🎯 Goal:

  • Fine-tune a custom OpenAI model if needed.
  • Adapt responses to industry-specific terminology.

🔥 Steps:

  1. Prepare Training Data (JSONL Format):
   {"messages": [{"role": "system", "content": "You are a financial assistant."}, {"role": "user", "content": "What is an ETF?"}, {"role": "assistant", "content": "An ETF is an exchange-traded fund."}]}
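  2. Upload and Train in OpenAI (the legacy fine_tunes endpoint has been retired; upload the file, then start a fine_tuning job with the returned file ID and a fine-tunable model):
   openai api files.create -f dataset.jsonl -p fine-tune
   openai api fine_tuning.jobs.create -F <FILE_ID> -m gpt-4o-mini-2024-07-18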
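Each line of the training file must be a standalone JSON object in the chat format shown in step 1. Here is a minimal sketch of a helper that sanity-checks examples and serializes them to JSONL. The toJsonl name and the specific checks are assumptions for illustration, not OpenAI's own validation rules, and the second example is made-up sample data.

```typescript
type ChatRole = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: ChatRole;
  content: string;
}

interface TrainingExample {
  messages: ChatMessage[];
}

// Serializes examples to JSONL (one JSON object per line).
// The checks are a local sanity pass only.
function toJsonl(examples: TrainingExample[]): string {
  return examples
    .map((example, i) => {
      const last = example.messages[example.messages.length - 1];
      if (!last || last.role !== 'assistant') {
        throw new Error(`Example ${i} must end with an assistant message`);
      }
      if (example.messages.some((m) => m.content.trim() === '')) {
        throw new Error(`Example ${i} contains an empty message`);
      }
      return JSON.stringify(example);
    })
    .join('\n');
}

const trainingJsonl = toJsonl([
  {
    messages: [
      { role: 'system', content: 'You are a financial assistant.' },
      { role: 'user', content: 'What is an ETF?' },
      { role: 'assistant', content: 'An ETF is an exchange-traded fund.' },
    ],
  },
  {
    // Illustrative second example, not from the original dataset.
    messages: [
      { role: 'system', content: 'You are a financial assistant.' },
      { role: 'user', content: 'What is a bond?' },
      { role: 'assistant', content: 'A bond is a loan made by an investor to a borrower.' },
    ],
  },
]);
console.log(trainingJsonl);
```

Writing trainingJsonl to dataset.jsonl gives a file in the shape the upload step expects.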

🛠 Tech to Learn:

  • Fine-tuning OpenAI models.
  • Prompt Engineering for better responses.

📌 Phase 5: Retrieval System Implementation

🎯 Goal:

  • Implement vector search to find relevant data fast.
  • Use MongoDB Vector Search to retrieve the best-matching documents.

🔥 Steps:

  1. Perform Vector Search Query:
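   async function searchDocuments(query: string) {
     const queryEmbedding = await generateEmbedding(query);

     // $vectorSearch is an aggregation stage, so it runs via aggregate(), not find().
     return await DocumentModel.aggregate([
       {
         $vectorSearch: {
           index: 'vector_index', // assumed name of your Atlas Vector Search index
           queryVector: queryEmbedding,
           path: 'embedding',
           numCandidates: 10, // must be >= limit; larger values trade speed for recall
           limit: 5,
         },
       },
       { $project: { content: 1, score: { $meta: 'vectorSearchScore' } } },
     ]);
   }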
  2. Retrieve Data & Send to OpenAI for Response:
   async function generateAIResponse(query: string) {
     const relevantDocs = await searchDocuments(query);

     const response = await openai.chat.completions.create({
       model: 'gpt-4',
       messages: [
         { role: 'system', content: 'Use the following documents to answer the user query:' },
         ...relevantDocs.map(doc => ({ role: 'user', content: doc.content })),
         { role: 'user', content: query },
       ],
     });

     return response.choices[0].message.content;
   }
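To make the retrieval step less of a black box, the sketch below does in memory what a vector index does at scale: it scores every stored embedding against the query embedding by cosine similarity and keeps the top matches. A real vector index typically uses approximate nearest-neighbor search rather than this exhaustive scan. The type names and the tiny 3-dimensional vectors are made up for illustration.

```typescript
interface EmbeddedDoc {
  content: string;
  embedding: number[];
}

interface ScoredDoc {
  content: string;
  score: number;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exhaustive top-k search: score every document, sort descending, keep k.
function topK(queryEmbedding: number[], docs: EmbeddedDoc[], k: number): ScoredDoc[] {
  return docs
    .map((d) => ({ content: d.content, score: cosineSimilarity(queryEmbedding, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const corpus: EmbeddedDoc[] = [
  { content: 'Refund policy', embedding: [1, 0, 0] },
  { content: 'Vacation policy', embedding: [0, 1, 0] },
  { content: 'Refund exceptions', embedding: [0.9, 0.1, 0] },
];
const matches = topK([1, 0, 0], corpus, 2);
console.log(matches.map((m) => m.content));
```

The two refund documents rank first because their vectors point in nearly the same direction as the query vector, which is the property embeddings are meant to capture for semantically similar text.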

🛠 Tech to Learn:

  • MongoDB Vector Search Queries.
  • Retrieval-Augmented Generation (RAG).

📌 Phase 6: API Development (NestJS)

🎯 Goal:

  • Build a REST API to handle AI queries.
  • Expose endpoints for querying AI-generated responses.

🔥 Steps:

  1. Create AI Service:
   import { Injectable } from '@nestjs/common';

   @Injectable()
   export class AIService {
     // Delegates to the retrieval and generation helper from Phase 5.
     async processQuery(query: string) {
       return await generateAIResponse(query);
     }
   }
  2. Create Controller:
   import { Body, Controller, Post } from '@nestjs/common';

   @Controller('ai')
   export class AIController {
     constructor(private readonly aiService: AIService) {}

     @Post('query')
     async handleQuery(@Body() data: { query: string }) {
       return this.aiService.processQuery(data.query);
     }
   }
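The controller trusts that the request body contains a string query. Here is a minimal sketch of a framework-independent guard that could be called at the top of handleQuery. The function name and the 1,000-character limit are arbitrary assumptions; in a NestJS project, a ValidationPipe with DTO classes is a common alternative.

```typescript
const MAX_QUERY_LENGTH = 1000; // arbitrary limit for illustration

// Validates an untrusted request body and returns a normalized query.
function parseQueryBody(body: unknown): { query: string } {
  if (typeof body !== 'object' || body === null) {
    throw new Error('Body must be a JSON object');
  }
  const query = (body as Record<string, unknown>)['query'];
  if (typeof query !== 'string') {
    throw new Error('"query" must be a string');
  }
  const trimmed = query.trim();
  if (trimmed.length === 0) {
    throw new Error('"query" must not be empty');
  }
  if (trimmed.length > MAX_QUERY_LENGTH) {
    throw new Error(`"query" must be at most ${MAX_QUERY_LENGTH} characters`);
  }
  return { query: trimmed };
}

const parsedBody = parseQueryBody({ query: '  What is our refund policy?  ' });
console.log(parsedBody.query);
```

Rejecting empty or oversized input here avoids paying for an embedding and a chat completion on requests that cannot produce a useful answer.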

🛠 Tech to Learn:

  • NestJS Controllers & Services.
  • API Development Best Practices.

📌 Phase 7: Frontend Integration (Next.js)

🎯 Goal:

  • Create a Next.js UI to interact with AI.
  • Send queries to the backend and display results.

🔥 Steps:

  1. API Call from Next.js:
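   async function fetchAIResponse(query: string) {
     // Assumes a Next.js API route at /api/ai-query that forwards to the NestJS POST /ai/query endpoint.
     const res = await fetch('/api/ai-query', {
       method: 'POST',
       // Without this header the server will not parse the body as JSON.
       headers: { 'Content-Type': 'application/json' },
       body: JSON.stringify({ query }),
     });
     if (!res.ok) throw new Error(`AI request failed: ${res.status}`);
     return res.json();
   }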
  2. Create UI Component:
   import { useState } from 'react';

   export default function AIChat() {
     const [query, setQuery] = useState('');
     const [response, setResponse] = useState('');

     async function handleSubmit() {
       const data = await fetchAIResponse(query);
       setResponse(data);
     }

     return (
       <div>
         <input type="text" value={query} onChange={e => setQuery(e.target.value)} />
         <button onClick={handleSubmit}>Ask AI</button>
         <p>{response}</p>
       </div>
     );
   }
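The component shows nothing while the request is in flight and has no error path. One way to model this is a small reducer that can be passed to React's useReducer hook; because it is plain TypeScript, it can be tested without rendering anything. The state and action names are illustrative assumptions.

```typescript
type ChatState =
  | { phase: 'idle' }
  | { phase: 'loading'; query: string }
  | { phase: 'success'; query: string; answer: string }
  | { phase: 'error'; query: string; message: string };

type ChatAction =
  | { type: 'submit'; query: string }
  | { type: 'resolve'; answer: string }
  | { type: 'reject'; message: string };

function chatReducer(state: ChatState, action: ChatAction): ChatState {
  switch (action.type) {
    case 'submit':
      return { phase: 'loading', query: action.query };
    case 'resolve':
      // Ignore responses that arrive when no request is pending.
      return state.phase === 'loading'
        ? { phase: 'success', query: state.query, answer: action.answer }
        : state;
    case 'reject':
      return state.phase === 'loading'
        ? { phase: 'error', query: state.query, message: action.message }
        : state;
  }
}

const idleState: ChatState = { phase: 'idle' };
const loadingState = chatReducer(idleState, { type: 'submit', query: 'What is RAG?' });
const doneState = chatReducer(loadingState, {
  type: 'resolve',
  answer: 'Retrieval-Augmented Generation.',
});
console.log(loadingState.phase, doneState.phase);
```

In the component, the 'loading' phase can disable the button and the 'error' phase can render the message instead of leaving the paragraph empty.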

🛠 Tech to Learn:

  • Next.js API Routes.
  • React Hooks & State Management.

📌 Phase 8: Deployment

  • Deploy Next.js on Vercel.
  • Deploy NestJS on AWS/GCP.
  • Use MongoDB Atlas for storage.
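Each deployment target needs the same secrets supplied through environment variables. Here is a minimal sketch of a startup check that fails fast when one is missing. OPENAI_API_KEY appears earlier in this guide, while MONGODB_URI is an assumed variable name and loadConfig is a hypothetical helper.

```typescript
interface AppConfig {
  openaiApiKey: string;
  mongodbUri: string;
}

// Reads required settings from an environment-like object and
// throws one error that lists every missing variable.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const required = ['OPENAI_API_KEY', 'MONGODB_URI']; // MONGODB_URI is an assumed name
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    openaiApiKey: env['OPENAI_API_KEY'] as string,
    mongodbUri: env['MONGODB_URI'] as string,
  };
}

// In the NestJS entry point this would be loadConfig(process.env).
const appConfig = loadConfig({
  OPENAI_API_KEY: 'sk-placeholder',
  MONGODB_URI: 'mongodb+srv://placeholder.example.net/app',
});
console.log(Object.keys(appConfig));
```

Running the check before the server starts listening turns a misconfigured deployment into an immediate, readable failure instead of an error on the first user query.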

📌 User Feedback & Iteration (Final Phase)

The User Feedback & Iteration phase is essential for improving your AI-powered app based on real-world usage. It ensures your system remains accurate, user-friendly, and efficient over time.


🎯 Goal:

  • Collect user feedback on AI responses.
  • Analyze incorrect AI answers and improve accuracy.
  • Optimize query handling, retrieval, and model responses.
  • Implement continuous improvement using feedback loops.

🔥 Steps to Improve Your AI System

Step 1: Collect User Feedback

Encourage users to rate AI responses or report inaccuracies.

Frontend Implementation (Next.js UI)

  • Add a feedback button to AI responses.
  • Collect user ratings (e.g., 👍/👎, 1-5 stars).

📌 Example (Next.js Feedback UI):

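import { useState } from 'react';

export default function AIResponse({ response }) {
  const [feedback, setFeedback] = useState(null);

  async function sendFeedback(value) {
    await fetch('/api/feedback', {
      method: 'POST',
      // Without this header the server will not parse the body as JSON.
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ response, feedback: value }),
    });
    setFeedback(value);
  }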

  return (
    <div>
      <p>{response}</p>
      <button onClick={() => sendFeedback('👍')}>👍</button>
      <button onClick={() => sendFeedback('👎')}>👎</button>
    </div>
  );
}

Step 2: Store Feedback in MongoDB

Create a feedback collection in MongoDB.

📌 NestJS API for Storing Feedback:

import { Body, Controller, Post } from '@nestjs/common';

@Controller('feedback')
export class FeedbackController {
  @Post()
  async saveFeedback(@Body() data: { response: string; feedback: string }) {
    // FeedbackModel is assumed to be a Mongoose model for the feedback collection.
    await FeedbackModel.create(data);
  }
}

Step 3: Analyze Incorrect AI Responses

Regularly review negative feedback and categorize errors:

Types of Errors:

  1. Incorrect Facts → Improve retrieval model.
  2. Irrelevant Results → Enhance vector embeddings.
  3. Unclear Answers → Optimize prompt engineering.

📌 Query AI Performance Data in MongoDB:

async function getNegativeFeedback() {
  return await FeedbackModel.find({ feedback: '👎' }).limit(100);
}
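Once a reviewer has labeled each piece of negative feedback with one of the three error types, a tally shows where to focus first. Here is a minimal sketch, assuming a hypothetical category label added during review (the feedback documents stored in Step 2 only contain response and feedback).

```typescript
type ErrorCategory = 'incorrect_facts' | 'irrelevant_results' | 'unclear_answers';

// Suggested fix per category, taken from the error list in this step.
const suggestedFix: Record<ErrorCategory, string> = {
  incorrect_facts: 'Improve retrieval model',
  irrelevant_results: 'Enhance vector embeddings',
  unclear_answers: 'Optimize prompt engineering',
};

interface LabeledFeedback {
  response: string;
  category: ErrorCategory; // hypothetical label assigned by a reviewer
}

interface CategorySummary {
  category: ErrorCategory;
  count: number;
  fix: string;
}

// Counts labeled feedback per category and sorts the most frequent first.
function rankErrorCategories(items: LabeledFeedback[]): CategorySummary[] {
  const counts: Record<ErrorCategory, number> = {
    incorrect_facts: 0,
    irrelevant_results: 0,
    unclear_answers: 0,
  };
  for (const item of items) {
    counts[item.category] += 1;
  }
  return (Object.keys(counts) as ErrorCategory[])
    .map((category) => ({ category, count: counts[category], fix: suggestedFix[category] }))
    .sort((a, b) => b.count - a.count);
}

const ranking = rankErrorCategories([
  { response: 'Answer A', category: 'irrelevant_results' },
  { response: 'Answer B', category: 'irrelevant_results' },
  { response: 'Answer C', category: 'irrelevant_results' },
  { response: 'Answer D', category: 'unclear_answers' },
  { response: 'Answer E', category: 'unclear_answers' },
  { response: 'Answer F', category: 'incorrect_facts' },
]);
console.log(ranking);
```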

Step 4: Fine-Tune the System

Based on the analysis, improve the system using:

  1. Better Prompt Engineering
    • Example: Add context to AI prompts.
   const prompt = `
     You are an AI assistant for a legal firm.
     Use the following company documents to answer accurately:
     ${retrievedDocuments}
   `;
  2. Improve Document Chunking
    • If retrieval is inaccurate, adjust chunk size.
   function chunkText(text: string, chunkSize = 300): string[] {
     // [\s\S] also matches newlines, which `.` would drop; ?? [] covers empty input.
     return text.match(new RegExp(`[\\s\\S]{1,${chunkSize}}`, 'g')) ?? [];
   }
  3. Re-train or Fine-Tune the Model
    • If OpenAI responses are low-quality, train a custom fine-tuned model.
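Interpolating retrievedDocuments directly into the prompt works until the documents outgrow the model's context window. Here is a minimal sketch of a prompt builder that numbers each document and stops adding them once a character budget is reached. The budget value, the label format, and the function name are assumptions, and characters are only a rough proxy for tokens.

```typescript
// Builds a system prompt from a role description and ranked documents.
// Documents are added in order until the character budget would be exceeded.
function buildSystemPrompt(role: string, documents: string[], maxChars = 6000): string {
  const header = `${role}\nUse the following company documents to answer accurately:\n`;
  const parts: string[] = [];
  let used = header.length;
  for (let i = 0; i < documents.length; i++) {
    const doc = documents[i] ?? '';
    const block = `\n[Document ${i + 1}]\n${doc.trim()}\n`;
    if (used + block.length > maxChars) break; // keep the earlier, higher-ranked documents
    parts.push(block);
    used += block.length;
  }
  return header + parts.join('');
}

const legalPrompt = buildSystemPrompt('You are an AI assistant for a legal firm.', [
  'Clause 4.2 covers termination notice periods.',
  'Clause 7.1 covers confidentiality obligations.',
]);
console.log(legalPrompt);
```

Numbered labels also make it easier to ask the model to cite which document an answer came from, which helps when reviewing incorrect responses.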

Step 5: Continuous Monitoring & Automated Improvements

  1. Track AI Performance with Logging (e.g., log incorrect answers).
  2. Use Metrics (e.g., track how often users give a 👍 vs. 👎).
  3. A/B Test Prompts (e.g., compare different AI prompt formats).
  4. Schedule Model Updates (e.g., re-generate embeddings monthly).

📌 Log AI Performance in NestJS:

async function logAIResponse(query: string, response: string) {
  await PerformanceModel.create({ query, response, timestamp: new Date() });
}
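The metrics and A/B testing ideas can be combined: tag each logged response with the prompt variant that produced it, then compare approval rates. Here is a minimal sketch, assuming a hypothetical variant field stored alongside each feedback record.

```typescript
interface VariantFeedback {
  variant: string; // hypothetical label for the prompt version used
  feedback: '👍' | '👎';
}

interface VariantStats {
  up: number;
  down: number;
  approvalRate: number; // up / (up + down)
}

// Groups feedback by prompt variant and computes the share of 👍 ratings.
function approvalByVariant(records: VariantFeedback[]): Map<string, VariantStats> {
  const stats = new Map<string, VariantStats>();
  for (const record of records) {
    const entry = stats.get(record.variant) ?? { up: 0, down: 0, approvalRate: 0 };
    if (record.feedback === '👍') {
      entry.up += 1;
    } else {
      entry.down += 1;
    }
    entry.approvalRate = entry.up / (entry.up + entry.down);
    stats.set(record.variant, entry);
  }
  return stats;
}

const variantStats = approvalByVariant([
  { variant: 'prompt-a', feedback: '👍' },
  { variant: 'prompt-a', feedback: '👍' },
  { variant: 'prompt-a', feedback: '👍' },
  { variant: 'prompt-a', feedback: '👎' },
  { variant: 'prompt-b', feedback: '👍' },
  { variant: 'prompt-b', feedback: '👎' },
]);
console.log(Object.fromEntries(variantStats));
```

With only a handful of ratings per variant the difference can be noise, so treat the numbers as a prompt to investigate rather than a verdict.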
