Using llama_index to build a ChatGPT intelligent question-answering bot

Original link: https://reiner.host/posts/8f0289a0.html

Foreword

With OpenAI opening up its API, AI products from major companies have sprung up like mushrooms after rain. Much like the Internet boom a decade ago, the next big opportunity is bound to be in AI.

Of course, training or developing your own model has far too high a barrier for individuals or small and medium-sized companies, and even those who manage it will lag well behind OpenAI. So the only layer where ordinary developers can realistically compete is the application layer.

Against this background, I started studying GPT-based question-answering bots over custom data indexes, and discovered the two frameworks llama_index and langchain. This post records how to use them.

llama_index: GitHub – jerryjliu/llama_index: LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLM’s with external data.

langchain: GitHub – hwchase17/langchain: ⚡ Building applications with LLMs through composability ⚡

OpenAI model fine-tuning?

At first, I tried OpenAI’s model fine-tuning. I fed a few hundred KB of text data into it, but found that when chatting with the fine-tuned model, the AI’s replies were always just a few words, often not even a complete sentence.

After searching for information, I realized that with only a few hundred KB (or even a few MB) of text data, model fine-tuning cannot achieve the goal I had in mind.
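For context, OpenAI’s fine-tuning flow at the time expected training data as JSONL prompt/completion pairs. Here is a minimal sketch of preparing such a file (the file name and sample pair are hypothetical, for illustration only):

import json

# Each line of the JSONL file is one {"prompt": ..., "completion": ...} training pair.
samples = [
    {"prompt": "When is The Legend of Zelda: Tears of the Kingdom coming out? ->",
     "completion": " May 12, 2023."},
]
with open("finetune_data.jsonl", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")

Even with well-formed pairs, fine-tuning on a small dataset mainly adjusts the model’s style and format rather than reliably injecting retrievable facts, which is why an index-based approach suits this use case better.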

Finally, I found that llama_index + langchain achieves the desired effect.

llama_index + langchain: building an intelligent question-answering bot

Step 1. Set up the environment

  • Install Python 3.10 or above

  • Install the required libraries (a quick sanity check follows this list):

    1. pip install llama-index

    2. pip install openai

    3. pip install langchain

    4. pip install pandas

  • Prepare an OpenAI API key
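To confirm the dependencies are in place before going further, a minimal sketch using only the standard library:

from importlib.metadata import version

# Print the installed version of each dependency;
# a missing package raises PackageNotFoundError.
for pkg in ("llama-index", "openai", "langchain", "pandas"):
    print(pkg, version(pkg))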

Step 2. Prepare the data

Prepare the knowledge base the bot will answer from. Sources can include PDF, HTML, Word documents, SQL, API interfaces, or even network resources such as GitHub and wikis. In this post I will use a simple TXT file, with the following example content:

When is The Legend of Zelda: Tears of the Kingdom coming out? “The Legend of Zelda: Tears of the Kingdom” will be released on May 12, 2023, so stay tuned!
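To match the paths used in the script below, you can save this sample into D://data/data.txt; a small helper sketch (the directory simply mirrors the one the step-3 code reads from):

import os

data_dir = 'D://data'  # the directory the script in step 3 reads from
os.makedirs(data_dir, exist_ok=True)
with open(os.path.join(data_dir, 'data.txt'), 'w', encoding='utf-8') as f:
    f.write('When is The Legend of Zelda: Tears of the Kingdom coming out? '
            '"The Legend of Zelda: Tears of the Kingdom" will be released on May 12, 2023, so stay tuned!')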

Step 3. Write the Python code

from llama_index import SimpleDirectoryReader, ServiceContext, GPTVectorStoreIndex, PromptHelper, load_index_from_storage, StorageContext
from llama_index.llm_predictor.chatgpt import ChatGPTLLMPredictor
from langchain.chat_models import ChatOpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler  # used when streaming=True
import os

# Set your OpenAI API key
os.environ["OPENAI_API_KEY"] = 'sk-xxx'

def init_index(directory_path):
    # set maximum input size
    max_input_size = 4096
    # set number of output tokens
    num_outputs = 2000
    # set maximum chunk overlap
    max_chunk_overlap = 20
    # set chunk size limit
    chunk_size_limit = 600

    # Define the LLM; if streaming output is required, set streaming=True
    llm_predictor = ChatGPTLLMPredictor(llm=ChatOpenAI(temperature=0.5, model_name="gpt-3.5-turbo", streaming=False))
    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)

    # Read the data files and build a vector index over them
    documents = SimpleDirectoryReader(directory_path).load_data()
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
    index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)

    # Persist the index to disk (default ./storage)
    index.storage_context.persist('D://cache/storage')

    return service_context

def ask(service_context):
    # Load the previously persisted index from disk
    storage_context = StorageContext.from_defaults(persist_dir='D://cache/storage')
    index = load_index_from_storage(storage_context)
    query_engine = index.as_query_engine(
        response_mode="compact",
        streaming=False,
        similarity_top_k=1,
        service_context=service_context)

    while True:
        query = input("What do you want to ask? ")
        response = query_engine.query(query)
        print(response)
        # For streaming output, print with this method instead:
        # response.print_response_stream()

# The path is the knowledge-base directory, e.g. D://data containing data.txt
service_context = init_index('D://data')
ask(service_context)


Final step. Run the Python file

Run the Python code and check whether it correctly answers questions from the knowledge base.
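If everything is wired up, the interaction should look roughly like this (the exact wording of the answer will vary with the model):

What do you want to ask? When is The Legend of Zelda: Tears of the Kingdom coming out?
The Legend of Zelda: Tears of the Kingdom will be released on May 12, 2023.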

Points that could be optimized in the future:

  • Use WebSocket + streaming output to achieve a typewriter-like effect; streaming output responds faster and gives a better user experience (see the sketch after this list)

  • Record the user’s conversation history as dialogue context

  • Combine knowledge-base Q&A and ordinary chat in one bot that automatically detects which kind of query it is handling
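For the first point, here is a minimal streaming sketch based on the script above. It assumes the LLM in init_index was also created with streaming=True, as the comments there already hint:

from llama_index import load_index_from_storage, StorageContext

def ask_streaming(service_context):
    # Same as ask(), but with streaming enabled end to end.
    storage_context = StorageContext.from_defaults(persist_dir='D://cache/storage')
    index = load_index_from_storage(storage_context)
    query_engine = index.as_query_engine(
        response_mode="compact",
        streaming=True,  # return a streaming response object
        similarity_top_k=1,
        service_context=service_context)

    while True:
        query = input("What do you want to ask? ")
        response = query_engine.query(query)
        response.print_response_stream()  # prints tokens as they arrive, typewriter-style
        print()

To serve this over WebSocket, you would forward each token to the client as it arrives instead of printing it to stdout.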

Recommended GPT-related projects

AutoGPT: GitHub – Significant-Gravitas/Auto-GPT: An experimental open-source attempt to make GPT-4 fully autonomous.

GPT4-FREE: GitHub – xtekky/gpt4free: decentralizing the Ai Industry, just some language model api’s…

OPENAI-JAVA: GitHub – TheoKanning/openai-java: OpenAI GPT-3 Api Client in Java
