Models#

The core models of bodhilib are:

  1. Role

  2. Source

  3. Prompt

  4. PromptStream

  5. PromptTemplate

  6. Document

  7. Node

Let’s see the composition for each of these models.


Role#

Role indicates the persona of the Prompt. The role has 3 possible values:

  1. system

  2. ai

  3. user

class Role#

class Role(str, Enum):
    SYSTEM = "system"
    AI = "ai"
    USER = "user"

Role = system#

The Role system indicates the inputs given to the LLM directly. These inputs can be used to control the output and ensure the LLM produces safe output, guarding against hallucination or prompt injection attacks.

Role = ai#

The Role ai indicates the output generated by the LLM. During a chat conversation, the previous chat history with the LLM is passed back as input for context; role=ai marks the output generated by the LLM in a previous turn.

Role = user#

The Role user is the input from the user to generate a response from the LLM. The LLM generates the output corresponding to the input given by the user.
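
Because Role is a string-backed Enum, its members compare equal to their plain string values. A minimal sketch (the import path from the top-level bodhilib package is an assumption):

from bodhilib import Role

# Role is a str-based Enum, so members compare equal to their string values
assert Role.SYSTEM == "system"
assert Role.AI == "ai"
assert Role.USER == "user"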


Source#

The source indicates whether a Prompt is provided as input to, or generated as output from, the LLM.

class Source#

class Source(str, Enum):
    INPUT = "input"
    OUTPUT = "output"

Prompt#

Prompt encapsulates the input to and output from the LLM.

As an input, the text contains the query to the LLM, role is one of system or user, and source is input.

As an output from the LLM, text contains the response from the LLM, role is ai, and source is output.

class Prompt#

class Prompt(BaseModel):
    text: str
    role: Role
    source: Source
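
As a quick illustration of the composition described above, here is a minimal sketch constructing an input Prompt and an output Prompt. The import path and the example texts are assumptions:

from bodhilib import Prompt, Role, Source

# input to the LLM: the user's query
user_prompt = Prompt(
    text="Summarize the plot of Hamlet in two sentences.",
    role=Role.USER,
    source=Source.INPUT,
)

# output from the LLM: text holds the generated response
llm_response = Prompt(
    text="Prince Hamlet feigns madness while plotting revenge ...",
    role=Role.AI,
    source=Source.OUTPUT,
)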

PromptStream#

The PromptStream allows streaming of the response from the LLM as it is generated.

class PromptStream#

class PromptStream(Iterator[Prompt]):
    def __iter__(self) -> Iterator[Prompt]: ...
    def __next__(self) -> Prompt: ...
    @property
    def text(self) -> str: ...

PromptStream uses the Pythonic generator-iterator interface to produce the response as it is generated. So you can use PromptStream as:

prompt_stream = llm.generate(...)
for prompt in prompt_stream:
    print(prompt.text, end="")
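
If you also need the complete response after streaming, the text property is meant to expose the text accumulated from the stream. A minimal sketch, assuming text returns the text streamed so far:

prompt_stream = llm.generate(...)
for prompt in prompt_stream:
    print(prompt.text, end="")       # each chunk as it arrives
full_response = prompt_stream.text   # assumed: the complete response once the stream is consumed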

PromptTemplate#

PromptTemplate allows you to generate a prompt for your use-case by injecting it with the right context. It re-uses the rich ecosystem of Python, and does not re-invent the wheel in the process.

PromptTemplate supports 4 formats:

  1. fstring

    For simple prompts involving variable injection, you can use the fstring format. It uses Python's native f-string formatting and interpolation to inject your variables. You can then pass your variables to the to_prompts method to build your prompt.

  2. jinja2

    For more complex prompts involving loops and if-else conditionals, you can use the jinja2 templating library, and pass the template as a jinja2 compatible template. You can then pass your variables to the to_prompts method to build your prompt.

  3. bodhilib-fstring

    bodhilib-fstring allows you to load simple prompts using the PromptSource component. The prompts are serialized in the bodhilib-prompt-template format, and use the f-string format for variable injection. Check out the PromptSource component for details.

  4. bodhilib-jinja2

    bodhilib-jinja2 allows you to load complex prompts using the PromptSource component. The prompts are serialized in the bodhilib-prompt-template format, and use jinja2 templates for variable injection. Check out the PromptSource component for details.

TemplateFormat#

TemplateFormat = Literal["fstring", "jinja2", "bodhilib-fstring", "bodhilib-jinja2"]

class PromptTemplate#

class PromptTemplate:
    def __init__(
        self,
        template: str,
        format: Optional[TemplateFormat] = "fstring",
        metadata: Dict[str, Any] = Field(default_factory=dict),
        vars: Dict[str, Any] = Field(default_factory=dict),
    ) -> None: ...

    def to_prompts(self, **kwargs: Dict[str, Any]) -> List[Prompt]: ...

PromptTemplate use-case#

# using prompt_template to generate dynamic prompt with variable injection
prompt_template = PromptTemplate(template=..., format="jinja2")
prompts = prompt_template.to_prompts(var1=..., var2=...)
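
As a more concrete sketch, below are an fstring template and a jinja2 template side by side. The template texts and variable names are illustrative; the placeholder syntax follows Python's f-string/format-style convention for fstring and jinja2 syntax for jinja2, as described above:

from bodhilib import PromptTemplate

# fstring: simple variable injection
translate_template = PromptTemplate(
    template="Translate the following text to {language}: {text}",
    format="fstring",
)
prompts = translate_template.to_prompts(language="French", text="Good morning!")

# jinja2: loops and conditionals for more complex prompts
qa_template = PromptTemplate(
    template=(
        "Answer the question using only the sources below:\n"
        "{% for doc in docs %}- {{ doc }}\n{% endfor %}"
        "Question: {{ question }}"
    ),
    format="jinja2",
)
prompts = qa_template.to_prompts(
    docs=["source A", "source B"],
    question="What changed between the two sources?",
)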

Document#

Document captures the resource loaded from a source. The main content is captured in the text field of the Document; any other metadata like file location, URL etc. is captured in the metadata field.

class Document#

class Document(BaseModel):
    text: str
    metadata: Dict[str, Any]
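
A minimal sketch of constructing a Document, with the raw content in text and everything else in metadata. The import path, example text and metadata keys are illustrative:

from bodhilib import Document

document = Document(
    text="bodhilib is a pluggable and extensible LLM library ...",
    metadata={"filename": "intro.md", "url": "https://example.com/docs/intro"},
)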

Node#

Node captures a processable chunk for an LLM operation. A Document can be very large and cannot be processed in its original form. You can split a Document into processable Node entities.

  • The main content of the Node is captured in the field text

  • The parent resource from which this Node was split is captured in field parent

  • Any metadata related to Node is captured in field metadata

  • The embedding related to this Node is captured in field embedding

  • If the Node is persisted in a Vector DB, the record identifier is captured in field id

class Node#

class Node(BaseModel):
    id: Optional[str]
    text: str
    parent: Optional[Document]
    metadata: Dict[str, Any]
    embedding: Optional[Embedding]
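
A minimal sketch of splitting a Document into Nodes using a naive character-based split, purely for illustration; id and embedding are assumed to stay unset until the Node is embedded and persisted in a Vector DB:

from bodhilib import Document, Node

document = Document(text="... a very long article ...", metadata={"source": "article.txt"})

# naive character-based split, for illustration only
chunks = [document.text[i : i + 1000] for i in range(0, len(document.text), 1000)]
nodes = [
    Node(text=chunk, parent=document, metadata=document.metadata)
    for chunk in chunks
]
# `embedding` and `id` are populated later, after embedding and inserting into a Vector DB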

🎉 We just got familiar with the models of bodhilib.

Next, let’s see different Components used in the library.