ragion.ai for civil engineers
Design Document
Tooling
Redis & Celery
- Redis as Celery broker and result backend (with SSL):
- Redis backend is used by Celery for task queuing and result storage.
- Redis for progress tracking:
- Tracks progress of indexed files uploaded by users.
- Progress is reliably shared and visible to the frontend in real time.
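The broker/backend wiring above can be sketched as a Celery configuration fragment. Host, port, database numbers, and password are placeholders, not the real deployment values; the rediss:// scheme enables SSL, and ssl_cert_reqs is the standard Celery/redis-py option for certificate verification:

```python
# Assumed Celery settings (celeryconfig.py style). Host, credentials, and
# DB numbers are placeholders; rediss:// enables SSL for both the broker
# and the result backend.
broker_url = "rediss://:change-me@redis-host:6380/0?ssl_cert_reqs=required"
result_backend = "rediss://:change-me@redis-host:6380/1?ssl_cert_reqs=required"
```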
Explicit Redis usage for progress tracking:
- Both the Flask app and Celery tasks connect to Redis (using the same connection string).
- Progress updates are stored and retrieved using Redis commands (hset, hgetall, etc.).
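A minimal sketch of the shared progress hash, assuming a key scheme like progress:&lt;upload_id&gt;; the key scheme and field names are illustrative, not taken from the actual codebase:

```python
# Sketch of progress tracking shared between Flask and Celery through a
# Redis hash. Key scheme ("progress:<upload_id>") and field names are
# illustrative assumptions.
def progress_key(upload_id):
    """Key under which both the Flask app and Celery tasks read/write."""
    return f"progress:{upload_id}"

def progress_fields(done, total, status="indexing"):
    """Mapping stored with HSET; Redis stores hash values as strings."""
    percent = round(100 * done / total) if total else 0
    return {"done": str(done), "total": str(total),
            "percent": str(percent), "status": status}

def update_progress(client, upload_id, done, total):
    # Celery task side: client is a redis.Redis built from the same
    # connection string the Flask app uses.
    client.hset(progress_key(upload_id), mapping=progress_fields(done, total))

def read_progress(client, upload_id):
    # Flask side: polled by the frontend for real-time updates.
    return client.hgetall(progress_key(upload_id))
```

Because both sides derive the key from the same upload ID, the frontend can poll a Flask endpoint that simply calls read_progress and returns the hash as JSON.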
Problems Faced
County-Specific Query Issues
- After adding a clarifying prompt that asks the user to specify a county, the app was unable to answer questions that included the county name, even though the answer existed for that county.
Problem Analysis
- Missing Metadata Context:
- Document chunks contain technical specs but don’t always explicitly link to “Manatee County” in every chunk.
- The retriever matches on text similarity only, so queries with “Manatee County” won’t find chunks that mention the regulation but not the county.
- Embedding Limitations:
- Without metadata or explicit mentions, embeddings don’t capture jurisdictional context, even if the full document is Manatee-specific.
Solution
- Add Jurisdiction Metadata:
Add jurisdiction metadata to every chunk before embedding and indexing the document.
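A minimal sketch of the tagging step. Chunks are assumed to be LangChain Document objects (anything with a dict-valued metadata attribute works), and the function name is invented for illustration:

```python
# Hedged sketch: attach jurisdiction metadata to every chunk before
# embedding/indexing. Chunks are assumed to be LangChain Document objects;
# the function name is invented.
def tag_jurisdiction(chunks, jurisdiction):
    for chunk in chunks:
        chunk.metadata["jurisdiction"] = jurisdiction
    return chunks
```

The same jurisdiction string used here must match the value passed later in the retriever's metadata filter, or the filter will silently exclude everything.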
Metadata Filtering with Pinecone & LangChain
- Even after adding jurisdiction metadata:
- Using LangChain, metadata is added to each chunk before embedding and indexing.
- A chunk is an individual section that the document is split into for retrieval.
- When using Pinecone as a vector store, similarity search ignores metadata by default unless a filter is passed via the search_kwargs parameter:
retriever = docsearch.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3, "filter": {"jurisdiction": "Manatee County, Florida"}},
)
Solution
- Clean the Query Before Retrieval:
When the user specifies a county (e.g., “Manatee County”), remove the county reference from the query before sending it to the retriever. This ensures the embedding focuses on the technical question, not the jurisdiction.
- The mistake was combining the county and question in the query sent to the LLM, which reduced similarity scores due to the focus on jurisdiction instead of the actual question.
- The retriever should search only within the filtered set of chunks (those matching the county) and find the most relevant text based on the technical part of the question.
- The county (or any jurisdiction) is used to tell the retriever which subset of documents or chunks to search.
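The query-cleaning step can be sketched as follows. The county table and regex are hypothetical stand-ins for however the app actually detects jurisdictions:

```python
import re

# Hypothetical helper: split a user query into (technical question,
# metadata filter). The county table and regex are illustrative stand-ins.
KNOWN_COUNTIES = {"Manatee": "Manatee County, Florida"}

def clean_query(query):
    """Strip the jurisdiction from the query; return (question, filter)."""
    for name, jurisdiction in KNOWN_COUNTIES.items():
        pattern = rf"\b(in\s+)?{name}(\s+County)?\b,?\s*"
        if re.search(pattern, query, flags=re.IGNORECASE):
            question = re.sub(pattern, "", query, flags=re.IGNORECASE)
            question = re.sub(r"\s{2,}", " ", question).strip()
            return question, {"jurisdiction": jurisdiction}
    return query, None
```

The returned filter is what goes into the retriever's search_kwargs, while only the cleaned question gets embedded for similarity search.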
Solution Choice
- Adding the jurisdiction to each chunk’s text strengthens search, especially if users include county names in their queries.
- For maximum accuracy and flexibility, include jurisdiction in both chunk text and metadata.
Sources:
- https://www.auxiliobits.com/blog/rag-architecture-for-domain-specific-knowledge-retrieval-in-financial-compliance/
- https://www.5x.co/blogs/prepare-your-data-for-rag-and-llm-apps?57336c35_page=5&363c93cb_page=8
- https://www.linkedin.com/pulse/metadata-tagging-llms-improve-rag-document-search-retrieval-maher-eurge
Retrieval Stage Issue
- Example user query:
“What design storm is required for a local road in Manatee if it is not within a floodplain?”
- The chatbot response:
“I do not know the answer based on the provided context. The context discusses design storms for new streets and tailwater elevation for the twenty-five year design storm, but does not specify the design storm requirements for local roads outside of a floodplain.”
- Expected answer:
“Data Source: MANATEE COUNTY PUBLIC WORKS STANDARDS PART 2 - STORMWATER MANAGEMENT DESIGN MANUAL [05/15] Page SW-4
Code: C. For local streets, bridges and culverts not in the published one hundred (100) year floodplain, the design storm shall be twenty-five (25) year frequency.”
Problem Analysis
- Retrieval Stage Issue:
The system is not retrieving the correct chunk(s) that contain the answer, so the LLM doesn’t see the relevant context and responds with “I do not know the answer based on the provided context.” This is a well-documented failure mode in RAG systems.
- Why does this happen?
- Chunking: If documents are chunked in a way that splits the relevant sentence from its context, or the chunk size is too small/large, retrieval may miss the answer.
- Query/Chunk Mismatch: If queries use different terminology than the chunk (e.g., “local road” vs. “local street”), the retriever may not score the right chunk highly enough.
- Metadata Filtering: If the retriever is filtering too strictly (e.g., only “Manatee County, Florida”) and chunks are not consistently tagged, relevant chunks may be excluded.
- Retriever Configuration: If k (the number of chunks retrieved) is too low, or the embedding model isn’t well-matched to the domain, the right chunk may not be in the context window at all.
Source:
https://www.chitika.com/common-reasons-rag-underperforming/
Solution Choice
- Increase Chunk Overlap and Adjust Chunk Size
If using a text splitter (e.g., LangChain’s RecursiveCharacterTextSplitter), increase the overlap and adjust the chunk size:
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,     # Try 800–1200 for technical docs
    chunk_overlap=200,  # Increase overlap to ensure context is preserved
)
text_chunks = text_splitter.split_documents(extracted_data)
What this does:
Ensures important sentences are less likely to be split across chunks.
- Increase Number of Retrieved Chunks (k)
When configuring the retriever, increase k:
retriever = docsearch.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 5, "filter": {"jurisdiction": "Manatee County, Florida"}},
)
(Changed k from 3 to 5)
What this does:
Gives the LLM more context to work with, increasing the chance the answer is present.
After these changes, results improved:
The question:
“What design storm is required for a local road in Manatee if it is not within a floodplain?”
now gives the following response:
“For local streets, bridges, and culverts not in the published one hundred (100) year floodplain, the design storm shall be twenty-five (25) year frequency.
Code: MANATEE COUNTY PUBLIC WORKS STANDARDS PART 2 - STORMWATER MANAGEMENT DESIGN MANUAL
Manual: STORMWATER MANAGEMENT DESIGN MANUAL
Data Source: MANATEE COUNTY PUBLIC WORKS STANDARDS
Page: SW-4
Link: Not provided in context”
I would still like to keep improving the responses: if the answer is in the document, the chatbot should respond with the answer. For now, this will suffice for the MVP.
Design Decision: LLMs vs. Traditional NLP for Document Querying and Retrieval
Background and Motivation
During development, I encountered the need to frequently update and manage large language models (LLMs) in production. This maintenance overhead led me to pause and question whether an LLM was actually necessary for the core functionality of the application—specifically, quoting and sourcing exact language from Florida LDCs, ordinances, and Manatee County-specific manuals and forms, with strict requirements for jurisdictional accuracy and no text generation or summarization.
Initial Assumptions and Rethinking
My initial assumption was that LLMs would be required for any natural language query processing. However, after reviewing the actual user needs and the nature of the queries, I realized that traditional NLP and information retrieval techniques might be sufficient—at least for strict retrieval tasks.
What the Application Needs to Do
- Accept user queries about regulations, ordinances, or manuals.
- Return only exact, quoted text from the provided documents.
- Always cite the source and confirm jurisdiction.
- Never generate or paraphrase content; no summaries required.
Traditional NLP and Retrieval: When Is It Enough?
For use cases where users are expected to search for passages using keywords or near-exact phrases, full-text search engines (like Elasticsearch or Solr) are more than adequate. These tools are robust, scalable, and require minimal maintenance compared to LLMs. They also allow for strict control over what is returned—ensuring that only direct quotes are surfaced.
Example:
If an engineer searches for “setback requirements for residential lots,” a full-text search can return:
“Minimum front setback for R-1 residential lots is 25 feet.”
— Manatee County Land Development Code, Section 402.4
This approach works as long as the query terms closely match the document language.
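As a concrete illustration, SQLite’s FTS5 module (bundled with standard CPython builds) can stand in for Elasticsearch/Solr to show verbatim-quote retrieval; the table layout and passages are invented for the sketch:

```python
import sqlite3

# Minimal full-text index over quoted passages. SQLite FTS5 stands in for
# Elasticsearch/Solr here; table and column names are illustrative.
def build_index(passages):
    con = sqlite3.connect(":memory:")
    con.execute("CREATE VIRTUAL TABLE passages USING fts5(body, source)")
    con.executemany("INSERT INTO passages VALUES (?, ?)", passages)
    return con

def search(con, terms):
    # MATCH with several terms requires all of them; hits come back as the
    # stored verbatim text plus its citation, never a paraphrase.
    cur = con.execute(
        "SELECT body, source FROM passages WHERE passages MATCH ? ORDER BY rank",
        (terms,))
    return cur.fetchall()
```

This keeps the strict-retrieval guarantee: the system can only surface text that is literally in the index, alongside its source.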
Why That’s Not Always Enough: The Real-World Query Problem
However, in practice, engineers and users rarely phrase queries in the exact language used in the documents. For example:
- A user might search for “distance from road to house” instead of “setback.”
- They might use synonyms, paraphrase, or ask about concepts described in different terms.
- Legal and regulatory documents often use pronouns or references (“it,” “these requirements”) that aren’t clear out of context.
This means strict keyword search can miss relevant passages or return ambiguous results.
Where NLP Becomes Necessary
1. Semantic Search
Semantic search allows the system to match user intent and meaning, not just keywords. It uses NLP techniques (often embeddings or transformer models) to find passages that are contextually relevant, even if the wording is different.
Example:
A query for “distance from road to house” would still surface the correct “setback” regulation, even if the word “setback” isn’t used in the query.
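The idea can be demonstrated with a deliberately tiny stand-in: a hand-made synonym table replaces the embedding model so the matching behavior runs without a model download. A real system would use transformer embeddings (e.g., sentence-transformers) instead of this table:

```python
from collections import Counter
from math import sqrt

# Toy stand-in for semantics: map surface terms to a canonical concept.
# A real deployment would use dense embeddings, not a hand-made table.
CANONICAL = {"distance": "setback", "road": "setback", "street": "setback"}

def vectorize(text):
    return Counter(CANONICAL.get(tok, tok) for tok in text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def best_match(query, passages):
    # Rank passages by similarity to the (concept-normalized) query.
    return max(passages, key=lambda p: cosine(vectorize(query), vectorize(p)))
```

Even though the query never says “setback,” the normalization step lets the setback regulation outrank unrelated passages, which is exactly the behavior embeddings provide at scale.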
2. Coreference Resolution
Coreference resolution is the process of determining what pronouns or references refer to in a document. This is crucial for legal and regulatory texts, where requirements are often described using ambiguous references.
Example:
A passage might say:
“It shall be the responsibility of the applicant to ensure compliance with these requirements.”
Coreference resolution helps the system resolve such references, clarifying, for example, that “these requirements” points back to previously mentioned regulations. This ensures that quoted results are accurate and understandable, even when context is required.
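Purely as a toy illustration (real coreference resolution uses trained models, e.g., a spaCy pipeline with a coreference component), a hand-written antecedent table shows the intended effect on a quoted passage:

```python
# Toy illustration only: a hand-written antecedent table stands in for a
# trained coreference model to show how resolved references could be
# surfaced alongside a verbatim quote.
def expand_references(passage, antecedents):
    """Annotate each ambiguous reference with its resolved antecedent."""
    for ref, meaning in antecedents.items():
        passage = passage.replace(ref, f"{ref} [{meaning}]")
    return passage
```

The quote itself stays verbatim; the resolution is shown as an annotation, so the no-paraphrasing requirement is preserved.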
Conclusion and Design Decision
- For strict, verbatim retrieval:
Traditional search and retrieval is sufficient and preferable (for simplicity, cost, and control).
- For real-world usability:
NLP techniques—specifically semantic search and coreference resolution—are necessary to ensure users can find relevant, quoted passages regardless of how they phrase their queries, and to make sure references in the documents are clear and accurate.
This means the project does become an NLP/ML project, but not in the generative sense (no LLM text generation or summarization). Instead, the focus is on retrieval, semantic matching, and reference disambiguation.
Final Thoughts
This pause in development, prompted by the need to constantly update LLMs, led to a deeper understanding of the actual requirements. It clarified that while traditional NLP is often enough for simple retrieval, real-world usage demands more robust NLP features like semantic search and coreference resolution. By focusing on these, the application can deliver accurate, reliable, and user-friendly results—without the unpredictability or overhead of full LLM generation.