Text chunks used for retrieval risk losing the context of the surrounding document. The article below introduces contextualizing each chunk with an LLM before indexing. The added cost and latency of contextualizing are partially addressed through a lightweight LLM (Haiku) and prompt caching. Combined with other established techniques (BM25, reranking, etc.), a significant reduction in retrieval failure rate is achieved. Overall, a good article to read. https://lnkd.in/g_5fgAN8
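The core idea can be sketched in a few lines. Here `call_llm` is a hypothetical stand-in for the lightweight model call, and the prompt wording is illustrative, not the article's exact prompt:

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a call to a lightweight model (the article
    # uses Haiku; with prompt caching, the shared full-document prefix can
    # be cached across the per-chunk requests). Returns a placeholder here.
    return "This chunk is part of the source document."

def contextualize_chunks(document: str, chunks: list[str]) -> list[str]:
    """Prepend a short LLM-written situating context to each chunk
    before it is embedded and indexed."""
    contextualized = []
    for chunk in chunks:
        prompt = (
            f"<document>\n{document}\n</document>\n\n"
            f"Here is a chunk from that document:\n<chunk>\n{chunk}\n</chunk>\n\n"
            "Give a short context situating this chunk within the document."
        )
        context = call_llm(prompt)
        contextualized.append(f"{context}\n\n{chunk}")
    return contextualized
```

The contextualized strings, not the raw chunks, are then embedded (and BM25-indexed), so each chunk carries its document context into retrieval.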
Sanket Bhat’s Post
More Relevant Posts
-
I wrote up a blog post about using LlamaIndex to construct hierarchical indexes using Semantic Chunking and RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) -- https://lnkd.in/eMvyhX78 Hope you find it useful!
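The RAPTOR idea, independent of any particular framework, can be sketched as recursive clustering and summarization into a tree of increasingly abstract nodes. `summarize` below is a hypothetical stand-in for an LLM summarizer, and real RAPTOR clusters by embedding similarity (e.g. Gaussian mixtures) rather than by adjacency:

```python
def summarize(texts: list[str]) -> str:
    # Hypothetical stand-in for an LLM summarization call.
    # Here it just joins the inputs so the sketch is runnable.
    return " / ".join(texts)

def build_raptor_tree(chunks: list[str], fanout: int = 2) -> list[list[str]]:
    """Build RAPTOR-style levels: leaf chunks at the bottom, then
    recursively summarized clusters until a single root remains.
    Adjacent chunks are grouped here purely for illustration."""
    levels = [chunks]
    current = chunks
    while len(current) > 1:
        groups = [current[i:i + fanout] for i in range(0, len(current), fanout)]
        current = [summarize(g) for g in groups]
        levels.append(current)
    return levels
```

At query time, retrieval can then match against nodes from every level, so broad questions hit high-level summaries while detailed questions hit leaves.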
Hierarchical (and other) Indexes using LlamaIndex for RAG Content Enrichment
sujitpal.blogspot.com
-
Effective retrieval becomes crucial at query time, when you need to fetch the most relevant information for a given query. In our previous lesson, we covered the fundamentals of semantic search and noted its effectiveness across various use cases, but we also encountered some nuanced scenarios where challenges arose. In this article, we conduct a thorough exploration of retrieval, turning to more advanced techniques that address these edge cases. While our previous discussion touched on semantic similarity search, we now examine several more sophisticated methods. Our journey begins with Maximum Marginal Relevance (MMR), a technique designed to retrieve more diverse results. Following that, we explore LLM-aided retrieval, which allows self-querying and the application of filters to enhance query precision. Finally, we investigate retrieval by comparison, aiming to extract only the most pertinent information from the retrieved passages. https://lnkd.in/dtcBVVaK
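MMR itself is easy to state: at each step, select the candidate that maximizes λ·(relevance to the query) minus (1−λ)·(similarity to documents already selected). A minimal sketch using raw dot products as the similarity (a real system would use normalized embeddings and a vector store):

```python
def mmr(query_vec, doc_vecs, k=2, lam=0.5):
    """Maximum Marginal Relevance: greedily trade off query relevance
    against redundancy with already-selected documents."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    selected, remaining = [], list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = dot(query_vec, doc_vecs[i])
            redundancy = max((dot(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a low λ and two near-duplicate top documents, MMR skips the duplicate and picks a more diverse result, which plain top-k similarity would not.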
Hands-On LangChain for LLM Applications Development: Information Retrieval
pub.towardsai.net
-
AI-Powered Search: Embedding-Based Retrieval and Retrieval-Augmented Generation (RAG)
dtunkelang.medium.com
-
I really enjoy seeing our alumni out there sharing their knowledge in various ways. Large Language Models (LLMs) were just being introduced when our data science program started, and now they are the hottest topic in the field. Check out what Josh has to say. #GIDS #LLM #DataScience
Image retrieval has remained a challenge in Retrieval-Augmented Generation (RAG). To address this, I used LangChain to implement an LLM-in-the-loop image retrieval system in a chatbot. This approach retrieves more relevant images that align with the text answers, eliminating the need for a multimodal LLM for answer synthesis and resulting in significant cost savings. In addition, I implemented features such as agentic routing, web search, and multi-index RAG. You can ask the chatbot about my skills, my PhD dissertation, or my cat 🐈 here https://lnkd.in/gN2YGEvv For more details, check out the article here https://lnkd.in/giX9RAkF
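One way to sketch the LLM-in-the-loop idea (a generic illustration, not the author's exact implementation): a text-only LLM sees the generated answer together with the candidate images' captions and picks the best match, so no multimodal model is needed for synthesis. The `llm` callable and the word-overlap fallback are both hypothetical:

```python
def pick_image(answer: str, captions: list[str], llm=None) -> int:
    """LLM-in-the-loop image selection: a text-only LLM matches the
    generated answer against candidate image captions and returns
    the index of the best-matching image."""
    if llm is not None:
        prompt = (
            f"Answer: {answer}\nCandidate image captions:\n"
            + "\n".join(f"{i}: {c}" for i, c in enumerate(captions))
            + "\nReply with only the index of the best-matching caption."
        )
        return int(llm(prompt))
    # Fallback heuristic for illustration only: word overlap with the answer.
    answer_words = set(answer.lower().split())
    return max(range(len(captions)),
               key=lambda i: len(answer_words & set(captions[i].lower().split())))
```

The selected image is then attached to the text answer, keeping the expensive multimodal model out of the loop entirely.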
Building an Advanced LangChain RAG Chatbot with Image Retrieval and Agentic Routing
medium.com
-
Recursive retrieval is still an enormously powerful way of enhancing traditional, vector-based RAG. In plain RAG, if the relevant information is spread across multiple chunks, a single retrieval may not capture all of it. Tables are a good example: oftentimes the 'semantic meaning' of a table is captured not by the table itself, but by the text surrounding it. Recursive retrieval solves this problem by looking not only at the semantically most similar chunks, but also at chunks related to them. This way, we capture all relevant information even when it is spread across multiple chunks. Recursive retrieval therefore consists of two main components:
- a way to identify relationships between document chunks
- a way to recursively retrieve related document chunks
In the article below we show how to implement recursive retrieval using LlamaIndex: https://lnkd.in/d5DpASJQ
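Once the relationships exist, the second component reduces to graph expansion over the chunk links. A minimal sketch, assuming chunk relationships have already been identified and stored as an adjacency mapping:

```python
def recursive_retrieve(seed_ids, links, max_hops=2):
    """Expand an initial semantic-search hit set by following
    chunk-to-chunk relationships, e.g. a text chunk linked to the
    table it describes, up to max_hops steps away."""
    retrieved = set(seed_ids)
    frontier = set(seed_ids)
    for _ in range(max_hops):
        # Follow outgoing links from the current frontier, skipping
        # anything already retrieved.
        frontier = {n for c in frontier for n in links.get(c, [])} - retrieved
        if not frontier:
            break
        retrieved |= frontier
    return retrieved
```

So a semantic hit on the surrounding text also pulls in the table chunk (and, one hop further, its caption) even though neither would match the query on its own.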
Advanced RAG: Recursive Retrieval with llamaindex
pondhousedata.com
-
Data Science Manager | Certified MLOps Engineer | Google Cloud Certified Data Engineer | IIIT Guwahati & IIIT Bangalore Alumnus
Rethinking the Role of Token Retrieval in Multi-Vector Retrieval
Presented at NeurIPS 2023
Authors: Jinhyuk Lee, Zhuyun Dai, Sai Meher Karthik Duddu, Tao Lei, Iftekhar Naim, Ming-Wei Chang, Vincent Zhao

ColBERT compares every word in a search query against every word in a document. For example, if you search "What is the capital of France?" and the document says "Paris is the capital city of France," ColBERT compares each query word with each document word, which can be slow and resource-intensive.

XTR takes a more efficient approach. It first retrieves only the most important document tokens for each query token, then scores documents using just those retrieved tokens, skipping the expensive gathering of all token similarities. For instance, it might focus on words like "Paris", "capital", and "France" from the document. This reduces the number of comparisons needed, making the search faster while still finding relevant results. Combining these methods makes search systems both accurate and efficient.

For more details, check out the paper "Rethinking the Role of Token Retrieval in Multi-Vector Retrieval" (https://lnkd.in/dxUiaZe3). Implementation: https://lnkd.in/dndKHJ9E
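ColBERT's per-word comparison is the "late interaction" MaxSim score: for each query token, take the maximum similarity against all document tokens, then sum over query tokens. A toy sketch with plain dot products over hand-made token vectors:

```python
def maxsim_score(query_tokens, doc_tokens):
    """ColBERT-style late interaction: for each query token embedding,
    take its best similarity over all document token embeddings,
    and sum those maxima."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)
```

XTR keeps this scoring form but lets only the document tokens actually retrieved in the first stage contribute (imputing the rest), which is what removes the need to gather every token similarity.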
2304.01982
arxiv.org
-
RAG performs best in a POC and worst in production. If you've been thinking about this, then this blog is for you. The technique demonstrated here will greatly enhance your RAG! With this blog, you can take that extra step you're always looking for, which may help you achieve the desired accuracy in overall RAG performance. #LLM #RAG #GenAI #NLP #BERT https://lnkd.in/gcA2GFXF
Improve Your RAG Context Recall by 95% with an Adapted Embedding Model.
medium.com
-
Project Manager at ObjectBox | First on-device vector database + Data Sync | Sustainable | Open Source | On-device AI | Edge AI | Local AI
❔ How to make LLMs' responses more relevant? Many of us use LLMs like GPT or Gemini on a regular basis. However, their responses are often too general (and sometimes wrong), especially if you ask for domain-specific information. To make the results more relevant, you can provide context to the LLM, e.g. up-to-date information or a set of internal data from the company. This technique is called RAG: retrieval-augmented generation. In a recent ObjectBox article, we discuss:
❎ What is RAG and how is it connected to vector databases?
❎ Why should you use it?
❎ How does it work?
We also compare RAG with long context windows (another way to enhance an LLM's response quality) and discuss future perspectives. 🎀 Bonus: RAG explained in a simple diagram https://lnkd.in/ewYgFpPz
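The mechanism is a small loop: embed the query, fetch the nearest chunks from the vector database, and prepend them to the prompt before generation. A minimal sketch with toy vectors; the `llm` callable and the prompt format are illustrative placeholders:

```python
def retrieve(query_vec, index, k=2):
    """Nearest-neighbour lookup over (vector, text) pairs by dot product,
    standing in for a vector-database query."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    ranked = sorted(index, key=lambda item: dot(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def answer(query, query_vec, index, llm):
    """RAG: retrieved context is injected into the prompt so the LLM
    grounds its response in the supplied data."""
    context = "\n".join(retrieve(query_vec, index))
    return llm(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```

The long-context-window alternative mentioned above would instead put the entire corpus into the prompt; retrieval keeps the prompt short by selecting only the relevant pieces.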
Retrieval Augmented Generation (RAG) with vector databases: Expanding AI Capabilities
https://objectbox.io
-
ListT5: Advancing AI Reranking with a Fusion-in-Decoder Approach

ListT5 introduces a novel listwise reranking method based on Fusion-in-Decoder (FiD) that handles multiple candidate passages during both training and inference. It incorporates an efficient inference framework based on m-ary tournament sort with output caching. Evaluated on the BEIR benchmark for zero-shot retrieval, ListT5 outperforms RankT5 by +1.3 NDCG@10, with efficiency comparable to pointwise models and better than previous listwise models. Additionally, ListT5 resolves the "lost in the middle" issue of previous listwise rerankers. The open-sourced code, model checkpoints, and evaluation framework are available for further exploration and implementation. This advancement demonstrates AI's capability to improve ranking accuracy and efficiency in information retrieval tasks. #ListT5 #AI #Reranking #FusionInDecoder #FiD #BEIRBenchmark #ZeroShotRetrieval #NDCG #Efficiency #InformationRetrieval #OpenSource #ModelCheckpoints #EvaluationFramework
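The m-ary tournament idea can be sketched generically. `rerank` below is a hypothetical stand-in for the listwise model (ListT5 scores small windows of candidates at once); each round reranks groups of m and promotes the winners until one candidate remains:

```python
def tournament_top1(candidates, rerank, m=3):
    """m-ary tournament sort for the top result: rerank groups of m
    candidates, promote each group's winner, and repeat. `rerank`
    returns its group sorted best-first (the listwise model's job)."""
    pool = list(candidates)
    while len(pool) > 1:
        groups = [pool[i:i + m] for i in range(0, len(pool), m)]
        pool = [rerank(g)[0] for g in groups]
    return pool[0]
```

Because each listwise call sees only m candidates, per-window outputs can be cached and reused across rounds, which is where ListT5's output caching saves compute.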
ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval
arxiv.org
-
In conclusion, RAG has emerged as a promising solution by incorporating knowledge from external databases. This enhances the accuracy and credibility of generation, particularly for knowledge-intensive tasks, and allows for continuous knowledge updates and the integration of domain-specific information. RAG enhances LLMs by retrieving relevant document chunks from an external knowledge base through semantic similarity calculation. The RAG research paradigm is continuously evolving and is commonly categorized into three stages: Naive RAG, Advanced RAG, and Modular RAG. Naive RAG has several limitations, including retrieval challenges and generation difficulties; Advanced RAG and Modular RAG were proposed to address these problems. Thanks to its adaptable architecture, Modular RAG has become a standard paradigm for building RAG applications.
Evolution of RAGs: Naive RAG, Advanced RAG, and Modular RAG Architectures
https://www.marktechpost.com