Text chunks used for retrieval risk losing the context of the surrounding document. The article below introduces contextualizing each chunk with an LLM before indexing. The added cost and latency of contextualizing are partially addressed through a lightweight LLM (Haiku) and prompt caching. Combined with other established techniques (BM25, reranking, etc.), a significant reduction in retrieval failure rate is achieved. Overall, a good article to read. https://lnkd.in/g_5fgAN8
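The core idea can be sketched in a few lines. Here `call_llm` is a hypothetical stand-in for the lightweight model call, and the prompt wording is illustrative, not the article's exact prompt:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a call to a lightweight model (the article
    # uses Haiku; with prompt caching, the shared full-document prefix can
    # be cached across the per-chunk requests). Returns a placeholder here.
    return "This chunk is part of the source document."

def contextualize_chunks(document: str, chunks: list[str]) -> list[str]:
    """Prepend a short LLM-written situating context to each chunk
    before it is embedded and indexed."""
    contextualized = []
    for chunk in chunks:
        prompt = (
            f"<document>\n{document}\n</document>\n\n"
            f"Here is a chunk from that document:\n<chunk>\n{chunk}\n</chunk>\n\n"
            "Give a short context situating this chunk within the document."
        )
        context = call_llm(prompt)
        contextualized.append(f"{context}\n\n{chunk}")
    return contextualized
```

The contextualized strings, not the raw chunks, are then embedded (and BM25-indexed), so each chunk carries its document context into retrieval.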
Sanket Bhat’s Post
More Relevant Posts
-
I wrote up a blog post about using LlamaIndex to construct hierarchical indexes using Semantic Chunking and RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) -- https://lnkd.in/eMvyhX78 Hope you find it useful!
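The RAPTOR idea, independent of any particular framework, can be sketched as recursive clustering and summarization into a tree of increasingly abstract nodes. `summarize` below is a hypothetical stand-in for an LLM summarizer, and real RAPTOR clusters by embedding similarity (e.g. Gaussian mixtures) rather than by adjacency:

```python
def summarize(texts: list[str]) -> str:
    # Hypothetical stand-in for an LLM summarization call.
    # Here it just joins the inputs so the sketch is runnable.
    return " / ".join(texts)

def build_raptor_tree(chunks: list[str], fanout: int = 2) -> list[list[str]]:
    """Build RAPTOR-style levels: leaf chunks at the bottom, then
    recursively summarized clusters until a single root remains.
    Adjacent chunks are grouped here purely for illustration."""
    levels = [chunks]
    current = chunks
    while len(current) > 1:
        groups = [current[i:i + fanout] for i in range(0, len(current), fanout)]
        current = [summarize(g) for g in groups]
        levels.append(current)
    return levels
```

At query time, retrieval can then match against nodes from every level, so broad questions hit high-level summaries while detailed questions hit leaves.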
Hierarchical (and other) Indexes using LlamaIndex for RAG Content Enrichment
sujitpal.blogspot.com
-
Effective retrieval becomes crucial at query time, when you need to fetch the most relevant information for a given query. In our previous lesson, we covered the fundamentals of semantic search and noted its effectiveness across various use cases, but we also encountered some nuanced scenarios where challenges arose. In this article, we conduct a thorough exploration of retrieval, turning to more advanced techniques that address these edge cases. While our previous discussion touched on semantic similarity search, we now examine several more sophisticated methods. Our journey begins with Maximum Marginal Relevance (MMR), a technique designed to retrieve more diverse results. Following that, we explore LLM-aided retrieval, which allows self-querying and the application of filters to enhance query precision. Finally, we investigate retrieval by comparison, aiming to extract only the most pertinent information from the retrieved passages. https://lnkd.in/dtcBVVaK
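MMR itself is easy to state: at each step, select the candidate that maximizes λ·(relevance to the query) minus (1−λ)·(similarity to documents already selected). A minimal sketch using raw dot products as the similarity (a real system would use normalized embeddings and a vector store):

```python
def mmr(query_vec, doc_vecs, k=2, lam=0.5):
    """Maximum Marginal Relevance: greedily trade off query relevance
    against redundancy with already-selected documents."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    selected, remaining = [], list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = dot(query_vec, doc_vecs[i])
            redundancy = max((dot(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a low λ and two near-duplicate top documents, MMR skips the duplicate and picks a more diverse result, which plain top-k similarity would not.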
Hands-On LangChain for LLM Applications Development: Information Retrieval
pub.towardsai.net
-
AI-Powered Search: Embedding-Based Retrieval and Retrieval-Augmented Generation (RAG)
dtunkelang.medium.com
-
I really enjoy seeing our alumni out there sharing their knowledge in various ways. Large Language Models (LLMs) were just being introduced when our data science program started, and now they are the hottest topic in the field. Check out what Josh has to say. #GIDS #LLM #DataScience
Image retrieval has remained a challenge in Retrieval-Augmented Generation (RAG). To address this, I used LangChain to implement an LLM-in-the-loop image retrieval system in a chatbot. This approach retrieves more relevant images that align with the text answers, eliminating the need for a multimodal LLM for answer synthesis and resulting in significant cost savings. In addition, I implemented features such as agentic routing, web search, and multi-index RAG. You can ask the chatbot about my skills, my PhD dissertation, or my cat 🐈 here https://lnkd.in/gN2YGEvv For more details, check out the article here https://lnkd.in/giX9RAkF
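One way to sketch the LLM-in-the-loop idea (a generic illustration, not the author's exact implementation): a text-only LLM sees the generated answer together with the candidate images' captions and picks the best match, so no multimodal model is needed for synthesis. The `llm` callable and the word-overlap fallback are both hypothetical:

```python
def pick_image(answer: str, captions: list[str], llm=None) -> int:
    """LLM-in-the-loop image selection: a text-only LLM matches the
    generated answer against candidate image captions and returns
    the index of the best-matching image."""
    if llm is not None:
        prompt = (
            f"Answer: {answer}\nCandidate image captions:\n"
            + "\n".join(f"{i}: {c}" for i, c in enumerate(captions))
            + "\nReply with only the index of the best-matching caption."
        )
        return int(llm(prompt))
    # Fallback heuristic for illustration only: word overlap with the answer.
    answer_words = set(answer.lower().split())
    return max(range(len(captions)),
               key=lambda i: len(answer_words & set(captions[i].lower().split())))
```

The selected image is then attached to the text answer, keeping the expensive multimodal model out of the loop entirely.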
Building an Advanced LangChain RAG Chatbot with Image Retrieval and Agentic Routing
medium.com
-
Recursive retrieval is still an enormously powerful way of enhancing traditional, vector-based RAG. In plain RAG, if the relevant information is spread across multiple chunks, a single retrieval may not capture all of it. Tables are a good example: oftentimes the 'semantic meaning' of a table is captured not by the table itself, but by the text surrounding it. Recursive retrieval solves this problem by looking not only at the semantically most similar chunks, but also at chunks related to them. This way, we capture all relevant information even when it is spread across multiple chunks. Recursive retrieval therefore consists of two main components:
- a way to identify relationships between document chunks
- a way to recursively retrieve related document chunks
In the article below we show how to implement recursive retrieval using LlamaIndex: https://lnkd.in/d5DpASJQ
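Once the relationships exist, the second component reduces to graph expansion over the chunk links. A minimal sketch, assuming chunk relationships have already been identified and stored as an adjacency mapping:

```python
def recursive_retrieve(seed_ids, links, max_hops=2):
    """Expand an initial semantic-search hit set by following
    chunk-to-chunk relationships, e.g. a text chunk linked to the
    table it describes, up to max_hops steps away."""
    retrieved = set(seed_ids)
    frontier = set(seed_ids)
    for _ in range(max_hops):
        # Follow outgoing links from the current frontier, skipping
        # anything already retrieved.
        frontier = {n for c in frontier for n in links.get(c, [])} - retrieved
        if not frontier:
            break
        retrieved |= frontier
    return retrieved
```

So a semantic hit on the surrounding text also pulls in the table chunk (and, one hop further, its caption) even though neither would match the query on its own.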
Advanced RAG: Recursive Retrieval with llamaindex
pondhousedata.com
-
Data Science Manager | Certified MLOps Engineer | Google Cloud Certified Data Engineer | IIIT Guwahati & IIIT Bangalore Alumnus
Rethinking the Role of Token Retrieval in Multi-Vector Retrieval
Presented at NeurIPS 2023
Authors: Jinhyuk Lee, Zhuyun Dai, Sai Meher Karthik Duddu, Tao Lei, Iftekhar Naim, Ming-Wei Chang, Vincent Zhao

ColBERT compares every word in a search query against every word in a document. For example, if you search "What is the capital of France?" and the document says "Paris is the capital city of France," ColBERT compares each query word with each document word, which can be slow and resource-intensive.

XTR takes a more efficient approach. It first retrieves only the most important document tokens for each query token, then scores documents using just those retrieved tokens, skipping the expensive gathering of all token similarities. For instance, it might focus on words like "Paris", "capital", and "France" from the document. This reduces the number of comparisons needed, making the search faster while still finding relevant results. Combining these methods makes search systems both accurate and efficient.

For more details, check out the paper "Rethinking the Role of Token Retrieval in Multi-Vector Retrieval" (https://lnkd.in/dxUiaZe3). Implementation: https://lnkd.in/dndKHJ9E
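ColBERT's per-word comparison is the "late interaction" MaxSim score: for each query token, take the maximum similarity against all document tokens, then sum over query tokens. A toy sketch with plain dot products over hand-made token vectors:

```python
def maxsim_score(query_tokens, doc_tokens):
    """ColBERT-style late interaction: for each query token embedding,
    take its best similarity over all document token embeddings,
    and sum those maxima."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)
```

XTR keeps this scoring form but lets only the document tokens actually retrieved in the first stage contribute (imputing the rest), which is what removes the need to gather every token similarity.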
2304.01982
arxiv.org
-
RAG performs best in a POC and worst in production. If you've been thinking about this, then this blog is for you. The technique demonstrated here will greatly enhance your RAG! With this blog, you can take that extra step you're always looking for, which may help you achieve the desired accuracy in overall RAG performance. #LLM #RAG #GenAI #NLP #BERT https://lnkd.in/gcA2GFXF
Improve Your RAG Context Recall by 95% with an Adapted Embedding Model.
medium.com
-
Project Manager at ObjectBox | First on-device vector database + Data Sync | Sustainable | Open Source | On-device AI | Edge AI | Local AI
❔ How to make LLMs' responses more relevant? Many of us use LLMs like GPT or Gemini on a regular basis. However, their responses are often too general (and sometimes wrong), especially if you ask for domain-specific information. To make the results more relevant, you can provide context to the LLM, e.g. up-to-date information or a set of internal data from the company. This technique is called RAG: retrieval-augmented generation. In a recent ObjectBox article, we discuss:
❎ What is RAG and how is it connected to vector databases?
❎ Why should you use it?
❎ How does it work?
We also compare RAG with long context windows (another way to enhance an LLM's response quality) and discuss future perspectives. 🎀 Bonus: RAG explained in a simple diagram https://lnkd.in/ewYgFpPz
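The mechanism is a small loop: embed the query, fetch the nearest chunks from the vector database, and prepend them to the prompt before generation. A minimal sketch with toy vectors; the `llm` callable and the prompt format are illustrative placeholders:

```python
def retrieve(query_vec, index, k=2):
    """Nearest-neighbour lookup over (vector, text) pairs by dot product,
    standing in for a vector-database query."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    ranked = sorted(index, key=lambda item: dot(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def answer(query, query_vec, index, llm):
    """RAG: retrieved context is injected into the prompt so the LLM
    grounds its response in the supplied data."""
    context = "\n".join(retrieve(query_vec, index))
    return llm(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```

The long-context-window alternative mentioned above would instead put the entire corpus into the prompt; retrieval keeps the prompt short by selecting only the relevant pieces.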
Retrieval Augmented Generation (RAG) with vector databases: Expanding AI Capabilities
https://objectbox.io
-
ListT5: Advancing AI Reranking with a Fusion-in-Decoder Approach

ListT5 introduces a novel listwise reranking method based on Fusion-in-Decoder (FiD) that handles multiple candidate passages during both training and inference. It incorporates an efficient inference framework based on m-ary tournament sort with output caching. Evaluated on the BEIR benchmark for zero-shot retrieval, ListT5 outperforms RankT5 by +1.3 NDCG@10, with efficiency comparable to pointwise models and better than previous listwise models. Additionally, ListT5 resolves the "lost in the middle" issue of previous listwise rerankers. The open-sourced code, model checkpoints, and evaluation framework are available for further exploration and implementation. This advancement demonstrates AI's capability to improve ranking accuracy and efficiency in information retrieval tasks. #ListT5 #AI #Reranking #FusionInDecoder #FiD #BEIRBenchmark #ZeroShotRetrieval #NDCG #Efficiency #InformationRetrieval #OpenSource #ModelCheckpoints #EvaluationFramework
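The m-ary tournament idea can be sketched generically. `rerank` below is a hypothetical stand-in for the listwise model (ListT5 scores small windows of candidates at once); each round reranks groups of m and promotes the winners until one candidate remains:

```python
def tournament_top1(candidates, rerank, m=3):
    """m-ary tournament sort for the top result: rerank groups of m
    candidates, promote each group's winner, and repeat. `rerank`
    returns its group sorted best-first (the listwise model's job)."""
    pool = list(candidates)
    while len(pool) > 1:
        groups = [pool[i:i + m] for i in range(0, len(pool), m)]
        pool = [rerank(g)[0] for g in groups]
    return pool[0]
```

Because each listwise call sees only m candidates, per-window outputs can be cached and reused across rounds, which is where ListT5's output caching saves compute.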
ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval
arxiv.org
-
In conclusion, RAG has emerged as a promising solution by incorporating knowledge from external databases. This enhances the accuracy and credibility of generation, particularly for knowledge-intensive tasks, and allows for continuous knowledge updates and the integration of domain-specific information. RAG enhances LLMs by retrieving relevant document chunks from an external knowledge base through semantic similarity calculation. The RAG research paradigm is continuously evolving and is commonly categorized into three stages: Naive RAG, Advanced RAG, and Modular RAG. Naive RAG has several limitations, including retrieval challenges and generation difficulties; Advanced RAG and Modular RAG were proposed to address these problems. Thanks to its adaptable architecture, Modular RAG has become a standard paradigm for building RAG applications.
Evolution of RAGs: Naive RAG, Advanced RAG, and Modular RAG Architectures
https://www.marktechpost.com