
Understand how your bot generates content from your knowledge base

When you connect your bot to your knowledge base and start to serve automatically generated content to your chatters, it might feel like magic. But it's not! This topic takes you through what happens behind the scenes when you start serving knowledge base content to chatters.

How Ada ingests your knowledge base

When you link your knowledge base to your bot, your bot copies all of your knowledge base content, so it can quickly search through it and serve relevant information from it. Here's how that happens:

(Diagram: knowledge base to LLM ingestion flow)
  1. When you link your Ada bot with your knowledge base, your bot imports all of your knowledge base content.

    Depending on the tools you use to create and host your knowledge base, your bot then syncs updates at different frequencies:

    • If your knowledge base is in Zendesk or Salesforce, your bot checks back for updates every 15 minutes.

      • If your bot hasn't had any conversations (either since you linked it with your knowledge base, or in the last 30 days), it pauses syncing. To trigger a sync with your knowledge base, have a test conversation with your bot.

    • If your knowledge base is hosted elsewhere, you or your Ada team have to build an integration to scrape it and upload content to Ada's Knowledge API. If this is the case, the frequency of updates depends on the integration.

  2. Your bot splits your articles into chunks, so it doesn't have to search through long articles each time it looks for information; it can just look at the shorter chunks instead.

    While each article can cover a variety of related concepts, each chunk should only cover one key concept. Additionally, your bot includes context with each chunk: each chunk contains the headings that preceded it in the article.

  3. Your bot sends each chunk to a Large Language Model (LLM), which assigns each chunk a numerical representation that corresponds to its meaning. These numerical values are called embeddings, and your bot saves them in a database.

    The database is then ready to provide information for GPT to put together into natural-sounding responses to chatter questions.
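The ingestion steps above can be sketched in a few lines of Python. Everything here is an illustrative assumption, not Ada's actual implementation: the heading-based chunking rule, the `fake_embedding` stand-in (a real pipeline would call an embedding model), and the in-memory list standing in for a vector database.

```python
def chunk_article(article_text):
    """Split an article into chunks, one per section, and prefix each
    chunk with the heading that preceded it for context."""
    chunks = []
    heading = ""
    body = []

    def flush():
        if body:
            prefix = heading + ": " if heading else ""
            chunks.append(prefix + " ".join(body))
            body.clear()

    for line in article_text.splitlines():
        line = line.strip()
        if line.startswith("#"):   # treat markdown-style headings as section breaks
            flush()
            heading = line.lstrip("# ")
        elif line:
            body.append(line)
    flush()
    return chunks


def fake_embedding(text):
    """Toy stand-in for an LLM embedding call; in production this would
    be a request to an embedding model returning a vector of floats."""
    return [float(len(text)), float(sum(ord(c) for c in text) % 101)]


# Ingest one short article into the "database": (chunk, embedding) pairs.
article = "# Shipping\nOrders ship in 2 days.\n# Returns\nReturns accepted within 30 days."
database = [(chunk, fake_embedding(chunk)) for chunk in chunk_article(article)]
```

Note how each chunk carries its heading ("Shipping: ...", "Returns: ..."), so a chunk still makes sense when it's retrieved on its own, away from the rest of the article.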

How Ada creates responses from knowledge base content

After saving your knowledge base content into a database, your bot is ready to provide content from it to answer your chatters' questions. Here's how it does that:

(Diagram: generative content flow)
  1. Your bot sends the chatter's query to the LLM, so it can get an embedding (a numerical value) that corresponds with the information the chatter was asking for.

    Before proceeding, your bot sends the chatter's question through a moderation check via the LLM to see if it's inappropriate or toxic. If it is, your bot rejects the query and doesn't continue with the answer generation process.

  2. Your bot then compares embeddings between the chatter's question and the chunks in its database, to see if it can find relevant chunks that match the meaning of the chatter's question. This process is called retrieval.

    Your bot looks for the chunks in the database whose meaning best matches what the chatter asked for, a measure called semantic similarity, and saves the top three most relevant chunks.

    If the chatter's question is a follow-up to a previous question, your bot might get the LLM to rewrite the chatter's question to include context to increase the chances of getting relevant chunks. For example, if a chatter asks your bot whether your store sells cookies, and your bot says yes, your chatter may respond with "how much are they?" That question doesn't have enough information on its own, but a question like "how much are your cookies?" provides enough context to get a meaningful chunk of information back.

    If your bot can't find any relevant matches to the chatter's question among the chunks in its database, it asks the chatter to rephrase their question, or escalates the query to a human agent, rather than attempting to generate a response and risking serving inaccurate information.

  3. Your bot sends the three chunks from the database that are the most relevant to the chatter's question to GPT to stitch together into a response. Then, your bot sends the generated response through three filters:

    1. The Safety filter checks to make sure that the generated response doesn't contain any harmful content.

    2. The Relevance filter checks to make sure that the generated response actually answers the chatter's question. Even if the information in the response is correct, it has to be the information the chatter was looking for in order to give the chatter a positive experience.

    3. The Accuracy filter checks to make sure that the generated response matches the content in your knowledge base, so it can verify that the bot's response is true.

  4. If the generated response passes these three filters, your bot serves it to the chatter.
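The retrieval-and-filtering flow above can be sketched in Python using cosine similarity over embeddings. The similarity threshold, the `generate` and filter stand-ins, and the fallback message are all illustrative assumptions; the real system calls an LLM for generation and runs its own safety, relevance, and accuracy checks.

```python
import math


def cosine_similarity(a, b):
    """Measure how close two embeddings are in meaning (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, database, top_k=3, threshold=0.5):
    """Return the top_k most semantically similar chunks, dropping any
    that don't clear the relevance threshold."""
    scored = sorted(
        ((cosine_similarity(query_embedding, emb), chunk) for chunk, emb in database),
        reverse=True,
    )
    return [chunk for score, chunk in scored[:top_k] if score >= threshold]


def answer(query_embedding, database, generate, filters):
    """Retrieve chunks, generate a response, and serve it only if every filter passes."""
    chunks = retrieve(query_embedding, database)
    if not chunks:
        # No relevant matches: ask to rephrase (or escalate to an agent).
        return "Sorry, could you rephrase your question?"
    response = generate(chunks)
    if all(check(response, chunks) for check in filters):  # safety, relevance, accuracy
        return response
    return "Sorry, could you rephrase your question?"


# Tiny hand-built database of (chunk, embedding) pairs for illustration.
database = [
    ("Shipping: Orders ship in 2 days.", [1.0, 0.0]),
    ("Returns: 30-day returns.", [0.0, 1.0]),
]
reply = answer(
    query_embedding=[0.9, 0.1],                 # question about shipping
    database=database,
    generate=lambda chunks: chunks[0],          # stand-in for the GPT call
    filters=[lambda resp, chunks: True] * 3,    # stand-in safety/relevance/accuracy checks
)
print(reply)  # -> "Shipping: Orders ship in 2 days."
```

Because the filters sit between generation and the chatter, a response that fails any one of them never gets served; the bot falls back to asking for a rephrase instead.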

Have any questions? Contact your Ada team—or email us at help@ada.support.