Web Search Plugin Development Scope for AI/Braindrive Chat

mhowlett · July 31, 2025, 5:27pm

1. Project Overview

The goal is to develop a web search feature for the BrainDriveBasicAIChat plugin, enabling real-time web search through a SearXNG instance. The feature will let the chat fetch and process relevant web content, improving response accuracy and context awareness within the existing AI/BrainDrive chat system.

2. Objectives

Enable seamless integration of web search into the BrainDriveBasicAIChat plugin.
Use SearXNG for privacy-focused and robust search results.
Ensure low latency and efficient scraping to avoid rate-limiting issues.
Add a web search toggle box to the AI Chat V2 component.
Optionally allow the AI to decide when to trigger a search based on query context.

3. Functional Requirements

3.1 Search Engine Integration

SearXNG:

Host a local SearXNG instance via Docker for privacy and control.
Configure to return JSON results for API compatibility.
Map the container port so it does not conflict with other local services (SearXNG listens on 8080 by default).
Support 5-10 results per query with title, link, and snippet.
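
As a rough illustration of the JSON requirement above, here is a minimal Python sketch of building a query URL and reducing the response to title, link, and snippet. The function names and the SearchResult type are hypothetical and not part of BrainDrive; the `q` and `format` query parameters and the `results` / `title` / `url` / `content` fields are what SearXNG's JSON output uses, and the base URL is an assumption about the local setup.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlencode


@dataclass
class SearchResult:
    title: str
    link: str
    snippet: str


def build_search_url(base_url: str, query: str) -> str:
    """Build a SearXNG search URL that requests JSON output."""
    params = urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def parse_results(payload: dict, limit: int = 5) -> List[SearchResult]:
    """Keep up to `limit` results that have a URL, as title/link/snippet."""
    results: List[SearchResult] = []
    for item in payload.get("results", []):
        if len(results) >= limit:
            break
        if not item.get("url"):
            continue  # skip malformed entries rather than failing the search
        results.append(
            SearchResult(
                title=item.get("title", ""),
                link=item["url"],
                snippet=item.get("content", ""),
            )
        )
    return results
```

The `limit` argument would be where the 5-10 results per query setting plugs in; the response body would be fetched with whatever HTTP client the backend already uses and passed in as a decoded dict.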

3.2 User & AI Search Trigger

Enable the user to toggle the web search on or off.
Enable the AI to determine when a search is needed (e.g., for factual or recent data queries).
Support multi-turn tool use for iterative searches if initial results are insufficient.

3.3 Data Processing

Parse search results (title, URL, snippet) and feed them into the AI’s context.
Clean HTML tags and irrelevant content from scraped pages.
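
One possible shape for the cleaning step, sketched with Python's standard-library HTMLParser so that it runs without extra dependencies. This is an assumption about how cleaning could work, not the plugin's code; `clean_html` and `_TextExtractor` are illustrative names, and the set of skipped tags is a guess at what counts as irrelevant content.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style/noscript content."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        # Join chunks with spaces, then collapse the whitespace left
        # behind by removed markup.
        return re.sub(r"\s+", " ", " ".join(self._chunks)).strip()


def clean_html(html: str) -> str:
    """Return the visible text of an HTML document as one line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Joining chunks with a space keeps adjacent block elements from running together, at the cost of occasionally splitting a word that has inline markup in the middle of it.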

3.4 User Interface

Display search results within the chat UI, with clickable links and source attribution.
Optionally provide a collapsible “thinking” section to show search duration and logic.

4. Non-Functional Requirements

Performance: Search latency < 10 seconds for 5-10 results.
Scalability: Handle up to 100 concurrent searches without crashing.
Security: Use secure API keys or bearer tokens for SearXNG API access.
Privacy: Prioritize SearXNG to minimize tracking.
Compatibility: Support Python 3.8+ and Docker for deployment; integrate with existing BrainDriveBasicAIChat TypeScript/React setup.

5. Technical Architecture

5.1 Components

Backend:

A Python-based FastAPI service to handle search requests, integrated as a new endpoint within the BrainDrive backend.
A TypeScript-based service module in the BrainDriveBasicAIChat plugin to facilitate API calls to the FastAPI backend.

Search Module:

A service to interface with SearXNG for result retrieval, implemented within the BrainDrive backend.
Configurable result counts and query parameters, managed via the existing configuration system or a new configuration module.

AI Integration:

Extend the AI message processing logic in the BrainDriveBasicAIChat plugin to trigger searches contextually.
Modify the backend AI provider logic to incorporate search results into the AI’s context, leveraging existing conversation and message models.
Update chat components to handle search triggers and display results.

Frontend:

Add a web search toggle box to the chat interface, styled with existing Tailwind CSS.
Render search results in the chat history component or a new dedicated component.
Reuse existing dropdown components for the toggle UI or create a lightweight alternative.
Utilize existing portal utilities for rendering search results in the chat UI.

5.2 Data Flow

The user submits a query through the chat interface in the BrainDriveBasicAIChat plugin.
The query is sent to the backend, where the system checks the web search toggle state stored in the configuration or user settings.
If web search is enabled, the backend search service sends a request to the local SearXNG instance and retrieves results.
Search results are parsed, cleaned, and passed to the local AI model running in the backend to augment its context.
The AI model processes the query with search results, generates a response, and returns it to the frontend for display with attributed links in the chat UI.

6. Development Phases

Phase 1: Setup and Prototyping (~0.5-1 week)

Set up SearXNG Docker instance and configure JSON output.
Build a FastAPI endpoint, integrated with BrainDriveBasicAIChat backend.
Test SearXNG integration with sample queries using the TypeScript service module.

Phase 2: AI Integration (~0.5-1 week)

Develop logic in the AI service for contextual search triggering.
Hook search results into the AI response pipeline via the backend AI provider.
Implement result parsing and cleaning in the backend search service.

Phase 3: UI and Optimization (~0.5-1 week)

Add toggle box to the chat interface using Tailwind CSS.
Include search results with links and attribution in the chat UI.
Optimize search latency and handle rate-limiting gracefully.
Test multi-turn search scenarios.

Phase 4: Testing and Deployment (~0.5-1 week)

Conduct unit and integration tests for search reliability, leveraging existing test setups.
Commit and push updated BrainDriveBasicAIChat plugin to GitHub.
Document setup and configuration steps.

7. Deliverables

Web search feature source code (Python FastAPI, TypeScript/React) within BrainDriveBasicAIChat plugin.
Docker configuration for SearXNG and plugin deployment.
Documentation for setup, configuration, and usage.
Test cases and performance benchmarks.

davewaring · July 31, 2025, 10:12pm

Hi Guys,

Matt, Dave J. and I went through this document on a call today; the recording is below.

Excited to get search added to BrainDrive!

Questions, comments, ideas welcome as always. Just hit the reply button.

Thanks,
Dave W.

Video Recording:

AI Powered Summary:

Technology and Implementation

The team decided to use SearXNG instead of DuckDuckGo, citing privacy benefits and the API limitations of other services. The plan is to add a “web search” toggle box to the user interface. When enabled, the system will query a local SearXNG instance and pass the search results to the AI model to generate a sourced response. This feature will be built into the more advanced, unified chat plugin that already includes features like personas.

Development Strategy and Timeline

The group agreed on an MVP-first approach to get a working version out quickly for feedback, even if it’s “janky” at first. The fastest path, suggested by David Jones, is to have the front-end query SearXNG directly. With this focus, Matt believes he can have something to demo within a week or a week and a half and expects to show progress by the next meeting. David Jones also directed Matt to the correct repository to ensure the work is done on the right component.

mhowlett · August 7, 2025, 5:59pm

Dev update - Web Search Tool Data Pipeline

Overview

The BrainDriveChat plugin’s web search feature integrates SearXNG with web scraping capabilities to provide real-time information to AI conversations. Here’s the complete event/data pipeline:

1. SearXNG

First things first, I installed SearXNG by pulling the latest Docker image - documentation on that here.

docker pull docker.io/searxng/searxng:latest

Then, I updated the settings.yml to enable JSON responses and GET HTTP queries.

settings.yml lines 77-79:

  formats:
    - html
    - json

line 105 updated to:

method: "GET"

Start the container:

# Start
docker container start searxng

Confirm the container is running by checking the process status:

docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED        STATUS        PORTS                                         NAMES
d9629e724821   searxng/searxng:latest   "/usr/local/searxng/…"   X hours ago   Up X hours   0.0.0.0:8888->8080/tcp, [::]:8888->8080/tcp   searxng

Then navigate a browser to localhost:8888.

2. User Interaction & Initialization

User Action: Clicks the web search toggle button in the BrainDriveChat UI
Frontend Response:
- Tests SearXNG connectivity via GET /api/v1/search/health
- Displays confirmation message: “Web search enabled - I can now search the web to help answer your questions”
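
For anyone curious what sits behind a health endpoint like this, here is a minimal sketch of a connectivity probe in Python. It is an assumption about the approach, not the code in the plugin: `check_searxng_health` and `_http_status` are hypothetical names, and the default base URL simply mirrors the port mapping shown in the docker ps output above.

```python
from typing import Callable, Dict
from urllib.request import urlopen


def _http_status(url: str, timeout: float) -> int:
    """Return the HTTP status code for a GET request to `url`."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


def check_searxng_health(
    base_url: str = "http://localhost:8888",
    get_status: Callable[[str, float], int] = _http_status,
    timeout: float = 3.0,
) -> Dict[str, object]:
    """Report whether the SearXNG instance answers, without raising."""
    try:
        status = get_status(base_url, timeout)
    except OSError as exc:
        # URLError, HTTPError, refused connections, and timeouts are all
        # OSError subclasses, so an unreachable instance ends up here.
        return {"healthy": False, "detail": str(exc)}
    return {"healthy": status == 200, "detail": f"HTTP {status}"}
```

Passing the status function in as a parameter keeps the probe easy to test without a running container.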

3. Search Request Flow

When the user enters a question with web search enabled:

Step 3a: Initial Search

Frontend calls searchService.searchWithScraping(userPrompt)
This triggers GET /api/v1/search/web
Backend proxies request to SearXNG at http://localhost:8888/search
SearXNG returns JSON formatted search results

Step 3b: URL Scraping

Frontend automatically extracts top URLs from search results
Calls POST /api/v1/search/scrape with URLs array
Backend uses httpx + BeautifulSoup to fetch and clean content from each URL
Returns scraped text content (max 3000 chars per page)
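
The scraping step can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual endpoint code: `scrape_pages` is a hypothetical name, and `fetch_text` stands in for the httpx + BeautifulSoup fetch-and-clean logic so the example stays self-contained. The 3000-character limit is the one quoted above.

```python
from typing import Callable, Dict, List

MAX_CHARS_PER_PAGE = 3000  # per-page limit quoted in the post


def scrape_pages(
    urls: List[str],
    fetch_text: Callable[[str], str],
    max_chars: int = MAX_CHARS_PER_PAGE,
) -> Dict[str, List[dict]]:
    """Fetch each URL, truncate its text, and record failures separately."""
    scraped: List[dict] = []
    failed: List[dict] = []
    for url in urls:
        try:
            text = fetch_text(url)
        except Exception as exc:
            # One failing page (for example a 403) should not abort the batch.
            failed.append({"url": url, "error": str(exc)})
            continue
        scraped.append({"url": url, "content": text[:max_chars]})
    return {"scraped": scraped, "failed": failed}
```

Returning failures alongside successes lets the caller log them and still hand whatever was scraped to the AI.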

4. Context Injection & AI Processing

Single Combined Request: Frontend creates one comprehensive prompt that includes:
- Original user question
- [WEB SEARCH CONTEXT] section with both search results AND scraped content
Direct AI Call: This combined prompt goes directly to /api/v1/ai/providers/chat
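
To make the combined prompt concrete, here is one way it could be assembled, written in Python for illustration even though this step runs in the frontend. Only the [WEB SEARCH CONTEXT] marker comes from the description above; the function name, the dict keys, and the layout inside the context section are assumptions.

```python
from typing import Dict, List


def build_search_prompt(
    question: str,
    results: List[Dict[str, str]],
    pages: List[Dict[str, str]],
) -> str:
    """Combine the user question, search results, and scraped text."""
    lines = [question.strip(), "", "[WEB SEARCH CONTEXT]", "Search results:"]
    for index, result in enumerate(results, start=1):
        lines.append(
            f"{index}. {result['title']} ({result['url']}): {result['snippet']}"
        )
    if pages:
        lines.append("")
        lines.append("Scraped page content:")
        for page in pages:
            # Label each excerpt with its source so the model can attribute it.
            lines.append(f"--- {page['url']} ---")
            lines.append(page["content"])
    return "\n".join(lines)
```

Keeping the URL next to each snippet and excerpt is what makes source attribution in the answer possible.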

5. Response Flow

AI model processes the enriched prompt with web context
Backend streams response back to frontend
Frontend displays AI response in chat

6. User Experience

From the logs:

Message 1 (Search Results Summary):

Found 46 web search results for "What is the weather today?":
[5 top results with URLs and summaries]
Additionally scraped detailed content from 2 web page(s) for deeper analysis.
Analyzing this information to provide a comprehensive answer...

Message 2 (AI Response):

The weather in Portland is currently cloudy with a high near 76°F and low around 55°F. 
There's a slight chance of rain shower later today. Winds are from the northwest at 5 to 10 mph. 
Humidity is at 60%.

7. Key Technical Components

Authentication: All requests use BrainDrive’s ApiService with JWT tokens for user authentication.

Error Handling:

Failed scrapes are logged but don’t break the pipeline (AccuWeather returned 403, but Weather.com succeeded)
Search can work even if some URLs fail to scrape

Content Processing:

Search results limited to top 5 for display
Scraped content truncated to 3000 chars per page
BeautifulSoup extracts clean text from HTML

davewaring · August 8, 2025, 9:28pm

Thanks Matt!

Here’s the recording of the call Matt, Dave J. and I did yesterday, where Matt demos BrainDrive web search and we discuss next steps.

I’ve also included an AI powered summary of what was discussed below the video:

AI Powered Summary:

Matt’s Web Search Plugin Demo

Matt presents a new BrainDriveChat plugin update.
Added a web search toggle button in the chat UI.
When enabled, it pings an API health endpoint and performs a search.
Current output is plain text; markdown formatting with hyperlinks is planned.
The plugin scrapes top URLs for text content and injects it into the AI prompt.
Location detection issue noted (thought he was in Portland).
Posted updated data flow and install instructions to the forum.
Progress is good, but some cleanup needed.

Dave’s Feedback on Architecture

Current design directly uses SearXNG search via backend proxy.
Dave recommends abstracting the search provider so it’s easy to add others (Google, DuckDuckGo).
Suggests using a settings system to store provider choice, API keys, host/port, etc.
Points out non-technical users will need a UI for these settings.
Emphasizes future-proofing the API structure for multiple providers.
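
A provider abstraction along these lines could look like the sketch below. It is a hypothetical illustration of the recommendation, not a design the team has settled on: the class names, the registry, and the settings keys ("provider", "base_url") are all assumptions, and the SearXNG provider returns a stub instead of calling a real instance.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type


class SearchProvider(ABC):
    """Common interface each search backend would implement."""

    def __init__(self, settings: Dict[str, str]) -> None:
        self.settings = settings

    @abstractmethod
    def search(self, query: str, limit: int = 5) -> List[dict]:
        """Return up to `limit` results as dicts with title/url/snippet."""


PROVIDERS: Dict[str, Type[SearchProvider]] = {}


def register(name: str) -> Callable[[Type[SearchProvider]], Type[SearchProvider]]:
    """Class decorator that adds a provider to the registry under `name`."""
    def decorator(cls: Type[SearchProvider]) -> Type[SearchProvider]:
        PROVIDERS[name] = cls
        return cls
    return decorator


@register("searxng")
class SearxngProvider(SearchProvider):
    def search(self, query: str, limit: int = 5) -> List[dict]:
        # Stub: a real implementation would query the SearXNG JSON API
        # at settings["base_url"] and map the response fields.
        base_url = self.settings.get("base_url", "http://localhost:8888")
        stub = [{"title": f"stub for {query!r}", "url": base_url, "snippet": ""}]
        return stub[:limit]


def get_provider(settings: Dict[str, str]) -> SearchProvider:
    """Instantiate the provider named in the stored settings."""
    name = settings.get("provider", "searxng")
    try:
        provider_cls = PROVIDERS[name]
    except KeyError:
        raise ValueError(f"Unknown search provider: {name}") from None
    return provider_cls(settings)
```

With this shape, adding Google or DuckDuckGo would mean registering one more class, and the settings UI would only need to store the provider name plus that provider's own keys.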

Search Performance Test

Group runs a search for “activities in Raleigh, NC, for kids.”
Search speed is fast; AI response is reasonably quick.
Using a Gemma 2B model locally; tested with Orca before but got odd formatting.
Output came in Markdown; Matt plans to handle accordion display and better formatting later.
Consensus: focus on speed and quality first, then polish formatting.

Next Steps for Plugin

Add settings plugin so users can configure search provider, API key, port, etc.
Rename API endpoints for clarity (e.g., /search/searxng/scrape).
Make code modular so other search engines can be added easily.
Collapse web search results in UI for cleaner display.
Keep MVP simple (only scrape top results at first).
Use plugin template and settings example repos as guides.
Matt estimates about a week to complete changes.

Project Integration & GitHub Repos

Discussion of where plugins will live:
- Core plugins = installed with BrainDrive by default.
- Community plugins = separate repos.
Dave’s personal GitHub has plugins, service bridge examples, etc.
Documentation index exists with links to all repos and docs.
Dave has been helping organize repo and documentation structure.

MVP Criteria for Search

Must: perform search, retrieve top results, use them effectively in AI responses.
After MVP: add other engines, deeper scraping, advanced settings.
Goal: avoid re-coding later by leaving hooks for expansion now.

mhowlett · August 14, 2025, 7:31pm

Sharing screenshots of progress made to show web search sources with accordion styling within BrainDriveChat:

davewaring · August 22, 2025, 4:15pm

Just a heads up here that we have switched Matt over to polishing the AI chat plugin for this past week and the coming week, but we’ll be back to search functionality shortly.

Thanks!
Dave W.