Retrieval Augmented Generation (RAG)#

What is bodhilib?#

bodhilib is an Open Source (MIT License), Plugin Architecture based, Pythonic and Composable LLM Library.

Bodhi is a Sanskrit term for deep insight into reality. With bodhilib, we aspire to provide tools for a deeper understanding of the data-rich world around us.

Plugin Architecture?#

bodhilib itself defines only the models and interfaces. All implementations are provided by plugins, which are installed separately.

Pythonic?#

bodhilib prefers Pythonic (over Java-like) syntax and makes use of Python’s dynamic language features.

Composable?#

The interfaces take inspiration from functional languages (especially Haskell) to make components composable. They follow Postel’s Law: be conservative in what you send and liberal in what you accept.
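
One way to read Postel’s Law at the level of a single function: accept several input shapes, but always return the same shape. The sketch below is an illustration only and is not taken from bodhilib; the `Node` dataclass and `to_node_list` helper are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Node:
    text: str
    embedding: Optional[List[float]] = None


def to_node_list(value: Union[str, Node, List[Union[str, Node]]]) -> List[Node]:
    """Liberal input: a str, a Node, or a list mixing both.
    Conservative output: always a list of Node."""
    items = value if isinstance(value, list) else [value]
    return [item if isinstance(item, Node) else Node(text=str(item)) for item in items]


print(to_node_list("hello"))
print(to_node_list([Node("a"), "b"]))
```

Because the output is always a list, the next component in a chain never has to check what it was handed.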

LLM Library?#

bodhilib aspires to be the LLM library of choice: developer-friendly and feature-rich, built through distributed and democratic development on a plugin architecture.


What is Retrieval Augmented Generation (RAG)?#

RAG retrieves the information most relevant to a query from a large corpus of data and supplies it to an LLM as context for generating the response.

What are the components of RAG?#

  1. DataLoader - loads the data from sources as Documents

  2. Splitter - splits the Documents into smaller, processable units called Nodes

  3. Embedder - embeds the Nodes into a vector representation or Embedding

  4. VectorDB - stores the Embeddings along with their metadata (insert) and retrieves Records by similarity (query)

  5. LLM - a Large Language Model, takes a user input or Prompt to generate a response

How does ingestion work in RAG?#

  1. Data is converted to Documents using DataLoader

  2. Documents are converted to Nodes using Splitter

  3. Nodes are enriched with Embedding using Embedder

  4. Nodes and Embeddings are inserted as Records using VectorDB insert

RAG Ingestion Pipeline
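
The ingestion steps above can be sketched end to end with toy stand-ins built from the standard library. Everything here is an assumption made for illustration: the function names, the dictionary-based Documents and Nodes, and especially the hashed bag-of-words "embedding", which is not a real embedding model.

```python
import hashlib
from typing import Dict, List

DIM = 16  # toy embedding size, chosen arbitrarily for the sketch


def load(raw_texts: List[str]) -> List[Dict]:
    """DataLoader stand-in: wrap raw strings as Documents."""
    return [{"text": t} for t in raw_texts]


def split(docs: List[Dict], max_words: int = 8) -> List[Dict]:
    """Splitter stand-in: cut each Document into Nodes of at most max_words words."""
    nodes = []
    for doc in docs:
        words = doc["text"].split()
        for i in range(0, len(words), max_words):
            nodes.append({"text": " ".join(words[i : i + max_words])})
    return nodes


def embed(nodes: List[Dict]) -> List[Dict]:
    """Embedder stand-in: hashed bag-of-words counts, NOT a real language model."""
    for node in nodes:
        vec = [0.0] * DIM
        for word in node["text"].lower().split():
            bucket = int(hashlib.sha256(word.encode()).hexdigest(), 16) % DIM
            vec[bucket] += 1.0
        node["embedding"] = vec
    return nodes


store: List[Dict] = []  # VectorDB stand-in: an in-memory list of Records


def insert(nodes: List[Dict]) -> int:
    """Append the Nodes as Records and return the total number stored."""
    store.extend(nodes)
    return len(store)


count = insert(embed(split(load(["one two three four five six seven eight nine ten"]))))
print(count)
```

Each step consumes the output of the previous one, which is what makes the stages easy to chain later on.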

How does query work in RAG?#

  1. User provides Input Query

  2. Input Query is converted into Embedding using Embedder

  3. Embedding is used to fetch similar Records using VectorDB query

  4. Input Query and Records are used to create Prompt using PromptTemplate

  5. Prompt is used to generate Response using LLM

RAG Query Pipeline
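
The query steps above can be sketched in the same toy style. The helper names, the hand-made two-dimensional vectors, and the prompt layout are assumptions made for this illustration; a real setup would obtain the vectors from an Embedder and delegate the ranking to a VectorDB.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], records: List[Tuple[str, List[float]]], k: int = 2) -> List[str]:
    """VectorDB query stand-in: rank Records by similarity to the query Embedding."""
    ranked = sorted(records, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, texts: List[str]) -> str:
    """PromptTemplate stand-in: put the retrieved text before the question."""
    context = "\n".join(f"- {t}" for t in texts)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hand-made 2-d vectors stand in for real Embeddings.
toy_records = [
    ("about cats", [1.0, 0.0]),
    ("about dogs", [0.0, 1.0]),
    ("cats and dogs", [1.0, 1.0]),
]
toy_query_vec = [0.9, 0.1]  # pretend this came from the Embedder

toy_prompt = build_prompt("What about cats?", top_k(toy_query_vec, toy_records))
print(toy_prompt)
```

The resulting prompt string is what would be handed to the LLM in the final step.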

RAG using bodhilib#

1. Installation#

Install the required libraries:

  1. bodhilib - the core library that defines the models and interfaces

  2. bodhiext.openai - the plugin implementing LLM interface for OpenAI

  3. bodhiext.qdrant - the plugin implementing VectorDB interface for Qdrant

  4. bodhiext.sentence_transformers - the plugin implementing Embedder interface to use Sentence Transformers

  5. bodhiext.file - the plugin implementing DataLoader interface to load local files (packaged with bodhilib)

  6. bodhiext.text_splitter - the plugin implementing Splitter interface to split based on sentences (packaged with bodhilib)

  7. python-dotenv - utility library to load environment variables from a local .env file

  8. fn.py - an optional Python library providing functional programming constructs, used for the Composability demo

[1]:
!pip install -q bodhilib bodhiext.openai bodhiext.qdrant bodhiext.sentence_transformers python-dotenv fn.py

2. API Keys#

[2]:
import os
from getpass import getpass

from dotenv import load_dotenv

load_dotenv()
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")
[3]:
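# utility functions for printing long outputs
import textwrap
from reprlib import repr  # shadows the built-in repr with a version that abbreviates long values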

def wrap_text(text, width=100):
    wrapped_lines = []
    for line in text.splitlines():
        wrapped_lines.extend(textwrap.fill(line, width=width).splitlines())
    return '\n'.join(wrapped_lines)

def trim_text(text):
    return repr(text)

3. Initialize the Components#

[4]:
from bodhilib import (
    Distance,
    get_data_loader,
    get_embedder,
    get_llm,
    get_splitter,
    get_vector_db,
)

data_loader = get_data_loader("file")
splitter = get_splitter("text_splitter", max_len=300, overlap=30)
embedder = get_embedder("sentence_transformers")
vector_db = get_vector_db("qdrant", location=":memory:")
llm = get_llm("openai_chat", model="gpt-3.5-turbo")
# llm = get_llm("cohere", model="command")

# recreate the VectorDB collection
collection_name = "test_collection"

if collection_name in vector_db.get_collections():
    vector_db.delete_collection(collection_name)
vector_db.create_collection(
    collection_name=collection_name,
    dimension=embedder.dimension,
    distance=Distance.COSINE,
)
[4]:
True

4. RAG Ingestion#

4.1 Load the data as Documents#

[5]:
from pathlib import Path

# resolve the sample data directory relative to the notebook's working directory
data_dir = Path.cwd() / ".." / "data" / "data-loader"
data_loader.add_resource(dir=str(data_dir))
docs = data_loader.load()
len(docs)
[5]:
2

4.2 Split the Documents into Nodes#

[6]:
nodes = splitter.split(docs)
len(nodes)
[6]:
47
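
The splitter above was created with `max_len=300` and `overlap=30`. One common way to interpret such parameters is a sliding window in which consecutive chunks share a few units of text. The sketch below assumes word-based windows purely for illustration; the bundled text_splitter is described as sentence-based, and its exact algorithm is not shown in this guide.

```python
from typing import List


def sliding_window(words: List[str], max_len: int, overlap: int) -> List[List[str]]:
    """Return windows of at most max_len words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be >= 0 and smaller than max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start : start + max_len])
        if start + max_len >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


toy_words = [f"w{i}" for i in range(10)]
toy_chunks = sliding_window(toy_words, max_len=4, overlap=1)
print(toy_chunks)
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one neighbouring chunk, at the cost of storing some text twice.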

4.3 Enrich the Nodes with Embeddings#

[7]:
_ = embedder.embed(nodes)

print(trim_text(nodes[0].embedding))
[-0.04627109691500664, -0.09699960052967072, 0.09419205784797668, 0.014190209098160267, 0.02280910685658455, -0.015392833389341831, ...]

4.4 Insert the Nodes into VectorDB#

[8]:
_ = vector_db.upsert(collection_name, nodes)

print(vector_db.client.get_collection(collection_name).vectors_count)
47

5. RAG Query#

5.1 Input Query#

[9]:
input_query = "According to Paul Graham, how to tackle when you are in doubt?"

5.2 Embed the Query#

[10]:
query_embedding = embedder.embed(input_query)

print(trim_text(query_embedding[0].embedding))
[-0.04342738911509514, 0.03586525842547417, 0.0004517710767686367, -0.009470310062170029, -0.02134144864976406, 0.026086824014782906, ...]

5.3 Get Similar Records#

[11]:
records = vector_db.query(collection_name, query_embedding[0].embedding, limit=5)

print(wrap_text(records[0].text))
who sits back and offers sophisticated-sounding criticisms of them. "It's easy to criticize" is true
in the most literal sense, and the route to great work is never easy.
There may be some jobs where it's an advantage to be cynical and pessimistic, but if you want to do
great work it's an advantage to be optimistic, even though that means you'll risk looking like a
fool sometimes. There's an old tradition of doing the opposite. The Old Testament says it's better
to keep quiet lest you look like a fool. But that's advice for seeming smart. If you actually want
to discover new things, it's better to take the risk of telling people your ideas.
Some people are naturally earnest, and with others it takes a conscious effort. Either kind of
earnestness will suffice. But I doubt it would be possible to do great work without being earnest.
It's so hard to do even if you are. You don't have enough margin for error to accommodate the
distortions introduced by being affected, intellectually dishonest, orthodox, fashionable, or cool.
[14]
Great work is consistent not only with who did it, but with itself. It's usually all of a piece. So
if you face a decision in the middle of working on something, ask which choice is more consistent.
You may have to throw things away and redo them. You won't necessarily have to, but you have to be
willing to. And that can take some effort; when there's something you need to redo, status quo bias
and laziness will combine to keep you in denial about it. To beat this ask: If I'd already made the
change, would I want to revert to what I have now?
Have the confidence to cut.

5.4 Compose the Prompt#

[12]:
# prepare prompt template
from bodhilib import PromptTemplate

template = """Below are the text chunks from a blog/article.
1. Read and understand the text chunks
2. After the text chunks, there are list of questions starting with `Question:`
3. Answer the questions from the information given in the text chunks
4. If you don't find the answer in the provided text chunks, say 'I couldn't find the answer to this question in the given text'


{% for text in texts %}
### START
{{ text }}
### END
{% endfor %}

Question: {{ query }}
Answer:
"""

prompt_template = PromptTemplate(template=template, format='jinja2')
[13]:
# compose the prompt
texts = [r.text for r in records]
prompt = prompt_template.to_prompts(texts=texts, query=input_query)

print(wrap_text(prompt[0].text))
Below are the text chunks from a blog/article.
1. Read and understand the text chunks
2. After the text chunks, there are list of questions starting with `Question:`
3. Answer the questions from the information given in the text chunks
4. If you don't find the answer in the provided text chunks, say 'I couldn't find the answer to this
question in the given text'
### START
who sits back and offers sophisticated-sounding criticisms of them. "It's easy to criticize" is true
in the most literal sense, and the route to great work is never easy.
There may be some jobs where it's an advantage to be cynical and pessimistic, but if you want to do
great work it's an advantage to be optimistic, even though that means you'll risk looking like a
fool sometimes. There's an old tradition of doing the opposite. The Old Testament says it's better
to keep quiet lest you look like a fool. But that's advice for seeming smart. If you actually want
to discover new things, it's better to take the risk of telling people your ideas.
Some people are naturally earnest, and with others it takes a conscious effort. Either kind of
earnestness will suffice. But I doubt it would be possible to do great work without being earnest.
It's so hard to do even if you are. You don't have enough margin for error to accommodate the
distortions introduced by being affected, intellectually dishonest, orthodox, fashionable, or cool.
[14]
Great work is consistent not only with who did it, but with itself. It's usually all of a piece. So
if you face a decision in the middle of working on something, ask which choice is more consistent.
You may have to throw things away and redo them. You won't necessarily have to, but you have to be
willing to. And that can take some effort; when there's something you need to redo, status quo bias
and laziness will combine to keep you in denial about it. To beat this ask: If I'd already made the
change, would I want to revert to what I have now?
Have the confidence to cut.
### END
### START
gaps.
The next step is to notice them. This takes some skill, because your brain wants to ignore such gaps
in order to make a simpler model of the world. Many discoveries have come from asking questions
about things that everyone else took for granted. [2]
If the answers seem strange, so much the better. Great work often has a tincture of strangeness. You
see this from painting to math. It would be affected to try to manufacture it, but if it appears,
embrace it.
Boldly chase outlier ideas, even if other people aren't interested in them — in fact, especially if
they aren't. If you're excited about some possibility that everyone else ignores, and you have
enough expertise to say precisely what they're all overlooking, that's as good a bet as you'll find.
[3]
Four steps: choose a field, learn enough to get to the frontier, notice gaps, explore promising
ones. This is how practically everyone who's done great work has done it, from painters to
physicists.
Steps two and four will require hard work. It may not be possible to prove that you have to work
hard to do great things, but the empirical evidence is on the scale of the evidence for mortality.
That's why it's essential to work on something you're deeply interested in. Interest will drive you
to work harder than mere diligence ever could.
The three most powerful motives are curiosity, delight, and the desire to do something impressive.
Sometimes they converge, and that combination is the most powerful of all.
The big prize is to discover a new fractal bud. You notice a crack in the surface of knowledge, pry
it open, and there's a whole world inside.
### END
### START
around with you. But the more you're carrying, the greater the chance of noticing a solution — or
perhaps even more excitingly, noticing that two unanswered questions are the same.
Sometimes you carry a question for a long time. Great work often comes from returning to a question
you first noticed years before — in your childhood, even — and couldn't stop thinking about. People
talk a lot about the importance of keeping your youthful dreams alive, but it's just as important to
keep your youthful questions alive. [19]
This is one of the places where actual expertise differs most from the popular picture of it. In the
popular picture, experts are certain. But actually the more puzzled you are, the better, so long as
(a) the things you're puzzled about matter, and (b) no one else understands them either.
Think about what's happening at the moment just before a new idea is discovered. Often someone with
sufficient expertise is puzzled about something. Which means that originality consists partly of
puzzlement — of confusion! You have to be comfortable enough with the world being full of puzzles
that you're willing to see them, but not so comfortable that you don't want to solve them. [20]
It's a great thing to be rich in unanswered questions. And this is one of those situations where the
rich get richer, because the best way to acquire new questions is to try answering existing ones.
Questions don't just lead to answers, but also to more questions.
The best questions grow in the answering. You notice a thread protruding from the current paradigm
and try pulling on it, and it just gets longer and longer. So don't require a question to be
obviously big before you try answering it. You can rarely predict that.
### END
### START
assumption that you'll somehow magically guess as a teenager. They don't tell you, but I will: when
it comes to figuring out what to work on, you're on your own. Some people get lucky and do guess
correctly, but the rest will find themselves scrambling diagonally across tracks laid down on the
assumption that everyone does.
What should you do if you're young and ambitious but don't know what to work on? What you should not
do is drift along passively, assuming the problem will solve itself. You need to take action. But
there is no systematic procedure you can follow. When you read biographies of people who've done
great work, it's remarkable how much luck is involved. They discover what to work on as a result of
a chance meeting, or by reading a book they happen to pick up. So you need to make yourself a big
target for luck, and the way to do that is to be curious. Try lots of things, meet lots of people,
read lots of books, ask lots of questions. [5]
When in doubt, optimize for interestingness. Fields change as you learn more about them. What
mathematicians do, for example, is very different from what you do in high school math classes. So
you need to give different types of work a chance to show you what they're like. But a field should
become increasingly interesting as you learn more about it. If it doesn't, it's probably not for
you.
Don't worry if you find you're interested in different things than other people. The stranger your
tastes in interestingness, the better. Strange tastes are often strong ones, and a strong taste for
work means you'll be productive.
### END
### START
about, so we can ignore that. And we can assume effort, if you do in fact want to do great work. So
the problem boils down to ability and interest. Can you find a kind of work where your ability and
interest will combine to yield an explosion of new ideas?
Here there are grounds for optimism. There are so many different ways to do great work, and even
more that are still undiscovered. Out of all those different types of work, the one you're most
suited for is probably a pretty close match. Probably a comically close match. It's just a question
of finding it, and how far into it your ability and interest can take you. And you can only answer
that by trying.
Many more people could try to do great work than do. What holds them back is a combination of
modesty and fear. It seems presumptuous to try to be Newton or Shakespeare. It also seems hard;
surely if you tried something like that, you'd fail. Presumably the calculation is rarely explicit.
Few people consciously decide not to try to do great work. But that's what's going on
subconsciously; they shy away from the question.
So I'm going to pull a sneaky trick on you. Do you want to do great work, or not? Now you have to
decide consciously. Sorry about that. I wouldn't have done it to a general audience. But we already
know you're interested.
Don't worry about being presumptuous. You don't have to tell anyone. And if it's too hard and you
fail, so what? Lots of people have worse problems than that. In fact you'll be lucky if it's the
worst problem you have.
Yes, you'll have to work hard. But again, lots of people have to work hard.
### END
Question: According to Paul Graham, how to tackle when you are in doubt?
Answer:
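
For readers unfamiliar with Jinja2, the loop in the template amounts to wrapping each retrieved chunk in START/END markers and appending the question. The plain-Python sketch below mirrors only that structure: it omits the instruction header, its whitespace handling differs from Jinja2's, and `render_prompt` is a hypothetical helper rather than part of bodhilib.

```python
from typing import List


def render_prompt(texts: List[str], query: str) -> str:
    """Mirror the template's loop: wrap each chunk in START/END markers, then ask."""
    blocks = [f"### START\n{text}\n### END" for text in texts]
    return "\n".join(blocks) + f"\n\nQuestion: {query}\nAnswer:"


rendered = render_prompt(["chunk one", "chunk two"], "What is in the chunks?")
print(rendered)
```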

5.5 Generate Response from LLM#

[14]:
response = llm.generate(prompt)

print(wrap_text("Question: " + input_query))
print(wrap_text("Answer: " + response.text))
Question: According to Paul Graham, how to tackle when you are in doubt?
Answer: According to the given text chunks, when you are in doubt, you should optimize for
interestingness, try different types of work, and learn more about different fields to see if they
become increasingly interesting.

Benefits of Bodhilib#

1. Plugin Architecture#

  • Clean and concise core

  • Easy to understand

  • Extensible using plugins

  • Interchangeable components, uniform interface

  • Selective integration, install only what you need

  • Democratic and distributed development, with no single repo holding every implementation

  • No PR queue or open-issue backlog on a single repo gated by a few core committers

  • Stable core library with scheduled releases

  • Independent fixes and releases for each plugin library

  • Batteries included through the bodhiext.* packages, which implement the interfaces for popular components

  • No preferred-partner integrations; a common interface that any third party can implement

  • No paywall, "plus" offering, or walled-garden approach

  • No reinventing the wheel and no cornered, custom ecosystem

2. Composable Functional Interface#

See the Composability notebook for the full example.

[15]:
from fn import F # fn.py

if collection_name in vector_db.get_collections():
    vector_db.delete_collection(collection_name)
vector_db.create_collection(
    collection_name=collection_name,
    dimension=embedder.dimension,
    distance=Distance.COSINE,
)

# RAG Ingestion Pipeline
f = (
    F(data_loader.load)
    >> F(splitter.split)
    >> F(embedder.embed)
    >> F(lambda nodes: vector_db.upsert(collection_name=collection_name, nodes=nodes))
)()

# RAG Query Pipeline
response = (
    F(embedder.embed)
    >> F(
        lambda e: vector_db.query(
            collection_name=collection_name, embedding=e[0].embedding, limit=5
        )
    )
    >> F(lambda nodes: prompt_template.to_prompts(query=input_query, texts=[node.text for node in nodes]))
    >> F(llm.generate)
)(input_query)

print(wrap_text(response.text))
According to the text, when in doubt, one should optimize for interestingness and give different
types of work a chance to see if they become increasingly interesting as one learns more about them.
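
To make the `>>` chaining less magical, here is a minimal stand-in that reproduces the left-to-right behaviour used in the pipelines above: the function on the left runs first and its result feeds the function on the right. `Pipe` is a hypothetical class written for this sketch and is not fn.py's implementation of `F`.

```python
from typing import Any, Callable


class Pipe:
    """Tiny stand-in for a composable wrapper: `Pipe(f) >> g` runs f first, then g."""

    def __init__(self, func: Callable[..., Any]):
        self.func = func

    def __rshift__(self, other: Any) -> "Pipe":
        # Accept either another Pipe or a plain callable on the right-hand side.
        nxt = other.func if isinstance(other, Pipe) else other
        return Pipe(lambda *args, **kwargs: nxt(self.func(*args, **kwargs)))

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.func(*args, **kwargs)


toy_pipeline = Pipe(str.split) >> Pipe(len) >> Pipe(lambda n: n * 2)
toy_result = toy_pipeline("a b c")
print(toy_result)
```

Calling the composed object with no arguments, as the ingestion pipeline does, works the same way as long as the first function takes none.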

Roadmap and Contributing#

Refer to the Roadmap page for the draft roadmap and for details on contributing to bodhilib.

Contact#

GitHub: BodhiSearch/bodhilib

Guide: BodhiSearch/bodhilib-guide

Follow on Twitter [@BodhilibAI](https://twitter.com/BodhilibAI)


🙏🏽 Thanks 🙏🏽