Creating "pragmatic AGI" (artificial general intelligence) with current technology is achievable without waiting for future advancements like GPT-5 or magically appearing uber-conscience from more layers in the matrix multiplication architecture of today’s ANNs (artificial neural networks).
This approach doesn’t claim to surpass human intelligence or achieve consciousness but aims to perform various tasks based on loosely phrased verbal prompts. The cornerstone of this capability within AI multi-agents is Retrieval Augmented Generation (RAG)
Do read the previous article for more details and basic definitions; today, let's focus on the KEY factor for creating any useful AI multi-agent: RAG, or retrieval augmented generation.
Picture of a typical LLM (large language model) architecture
RAG ensures that when a user interacts with an agent, such as a chatbot or large language model, the agent first retrieves relevant context before formulating a response. This prevents the model from generating inaccurate or imaginative answers, often referred to as "hallucinations." Similar to human memory, RAG allows the agent to recall pertinent information to provide accurate and contextually appropriate responses.
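To make the pattern concrete, here is a minimal sketch of the retrieve-then-generate loop in Python. The `retrieve_context` and `call_llm` helpers are hypothetical placeholders for whichever search backend and LLM API you actually use; this is an illustration of the flow, not a specific implementation.

```python
def retrieve_context(user_query: str) -> str:
    """Hypothetical retrieval step: look up material relevant to the query
    (web search, database, vector store, ...) and return it as plain text."""
    ...

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around whichever LLM API you use."""
    ...

def rag_answer(user_query: str) -> str:
    # 1. Retrieve relevant context BEFORE generating anything.
    context = retrieve_context(user_query)
    # 2. Ground the model in that context to curb hallucinations.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_query}"
    )
    # 3. Generate the final, context-aware response.
    return call_llm(prompt)
```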
For example, compare the answers of leading LLMs like GPT-4 and Claude Opus with those of a RAG-enabled multi-agent: the latter consistently delivers correct, detailed answers tailored to the specific context of the query. Implementing RAG typically relies on methods such as textual or web search. Integrail Studio simplifies this with a "poor man's RAG" multi-agent setup that combines three components: querying a designated website, converting the retrieved data into markdown for efficient processing, and having a large language model generate the response based on that context.
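For readers who like to see the moving parts, here is a rough Python sketch of that three-step pipeline, assuming the `requests` and `html2text` libraries for fetching and converting the page; `call_llm` again stands in for your LLM provider's API.

```python
import requests      # HTTP client used to fetch the designated page
import html2text     # converts HTML into markdown-like plain text

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around your LLM provider's chat API."""
    ...

def poor_mans_rag(question: str, site_url: str) -> str:
    # 1. Query the designated website for its raw HTML.
    html = requests.get(site_url, timeout=10).text
    # 2. Convert the page into markdown so the LLM gets compact, readable context.
    markdown = html2text.html2text(html)
    # 3. Let the LLM answer the question using the retrieved context.
    prompt = (
        "Using the page content below, answer the question.\n\n"
        f"{markdown}\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```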
Ootbi is the name of a product by ObjectFirst.com, a company of our co-founder and serial entrepreneur Ratmir Timashev. As you can see, both of the top models give a plausible-sounding "correct" answer, but clearly not the one we are looking for. Our RAG-enabled multi-agent, on the other hand, provides a 100% correct and detailed response.
So, how do you enable RAG? There are many ways, with two of the most popular being:
1. Simple textual or web search: query a website or search engine and feed the retrieved text to the model as context.
2. Vector search: embed your documents, store the embeddings, and retrieve the most semantically similar chunks at query time.
Today, we will look at the first one, as it is the simplest; in the next parts, we will look in detail into building a much more efficient vector-search RAG.
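As a small preview of the second approach, here is a toy sketch of vector-search retrieval: each document chunk is turned into an embedding, and at query time the most similar chunks are selected by cosine similarity. The `embed` function is a hypothetical stand-in for whatever embedding model you use, and in production the chunk embeddings would be precomputed and stored in a vector database.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical embedding call; in practice, an embedding-model API."""
    ...

def top_k_chunks(query: str, chunks: list[str], k: int = 3) -> list[str]:
    # Embed the query and the document chunks (normally chunks are pre-embedded).
    q = embed(query)
    vectors = [embed(c) for c in chunks]
    # Rank chunks by cosine similarity to the query vector.
    sims = [
        float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        for v in vectors
    ]
    ranked = sorted(zip(sims, chunks), key=lambda pair: pair[0], reverse=True)
    # Return the k most relevant chunks to feed to the LLM as context.
    return [chunk for _, chunk in ranked[:k]]
```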
Here is how easy it is to create what we call a “poor man’s RAG” multi-agent using Integrail Studio:
All you need to do is connect three "boxes" following very simple logic:
1. A web query box that retrieves content from the designated website.
2. A converter box that turns the retrieved data into markdown.
3. An LLM box that generates the answer from that markdown context.
That’s it, no need to write any code or come up with expensive infrastructure — you just created your first useful multi-agent!
For it to be useful in most real-world scenarios, we have to improve it further. For instance, not every user request translates well into a search query, so we will need to build an additional "converter" for it. Another improvement would be to analyze several of the found pages, not just one.
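To illustrate both ideas, here is a hedged sketch: an LLM-based "converter" rewrites the raw user request into a focused search query, and the agent then pulls context from several result pages instead of one. `call_llm`, `web_search`, and `fetch_as_markdown` are hypothetical helpers standing in for the corresponding boxes.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM API wrapper."""
    ...

def web_search(query: str, max_results: int = 3) -> list[str]:
    """Hypothetical search helper returning a list of result URLs."""
    ...

def fetch_as_markdown(url: str) -> str:
    """Hypothetical helper: fetch a page and convert it to markdown."""
    ...

def improved_rag(user_request: str) -> str:
    # Improvement 1: convert the loosely phrased request into a focused search query.
    search_query = call_llm(
        f"Rewrite the following request as a short web search query:\n{user_request}"
    )
    # Improvement 2: analyze several found pages, not just one.
    urls = web_search(search_query, max_results=3)
    context = "\n\n".join(fetch_as_markdown(u) for u in urls)
    # Generate the final answer grounded in the combined context.
    return call_llm(
        f"Using the context below, answer the request.\n\n{context}\n\nRequest: {user_request}"
    )
```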
We will look at how to create such a production-ready multi-agent next time. For now, we encourage you to follow this publication (Superstring Theory) or me (Anton Antich) as an author: everyone who starts following before the public launch of Integrail Studio in May 2024 will receive $10 worth of token credits, more than enough to run plenty of fun experiments of your own.
In the next installments, we'll cover increasingly complex agents that can not only hold a conversation with you but also act on your behalf: send an email, schedule a meeting, summarize the news articles you care about, and so on. The journey couldn't be more exciting. Join us!