Original link: https://www.shuizilong.com/house/archives/introduction-to-gradio-client/
With the recent popularity of HuggingGPT and AutoGPT, using an LLM as a controller that calls various domain-expert models has become a very common requirement.
So Gradio, keeping up with the pulse of the community, launched the gradio_client library last week.
Let’s see what it can do~!
Cloud Model vs. Open Model
Before discussing gradio_client in detail, let's take stock of where developers stand today. With the popularity of ChatGPT and its extremely cheap, almost dumping-level pricing, the OpenAI API has become standard equipment for developers, and many even wrap open-source LLMs in an OpenAI-API-compatible calling format. HuggingFace Inference Endpoints already offer similar OpenAI-API-style functionality, and gradio_client lets a Gradio program keep a unified communication format even when it runs off the HuggingFace platform.
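For context, here is a minimal sketch of the gradio_client workflow; the input string below is illustrative, not from the original post:

```python
from gradio_client import Client

# Connect to any running Gradio app by its Space URL (or Space name).
client = Client('https://multimodalart-chatglm-6b.hf.space/')

# Inspect the endpoints the app exposes, then call one like a local function.
client.view_api()
result = client.predict("Hello!", fn_index=0)
```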
An example – ChatGLM
- https://github.com/Akegarasu/ChatGLM-webui
- https://huggingface.co/spaces/multimodalart/ChatGLM-6B
- https://github.com/lychees/ChatGLM-Gradio
The above are three ChatGLM front ends, all implemented with Gradio.
The first is the Hongye Dada (Akegarasu) version. It has the richest set of controls, but it is a fully local version: users must download the model and run it on their own machine.
The second is from HF's prolific multimodalart. It leverages HF's infrastructure, but requires paying for the hardware that hosts the model.
The third is my own modification of the second. The core change is that only the predict function needs to be rewritten: the local model call is skipped, and the call goes directly to the remote endpoint. Compare the two versions below.
The original predict, which runs the model locally:

```python
def predict(input, history=None):
    if history is None:
        history = []
    # model and tokenizer are loaded globally elsewhere in the webui
    response, history = model.chat(tokenizer, input, history)
    return history, history
```
And the modified predict, which calls the hosted Space through gradio_client instead:

```python
import json

from gradio_client import Client


def predict(input, history=None):
    if history is None:
        history = []
    client = Client('https://multimodalart-chatglm-6b.hf.space/')
    # client.predict returns the path of a temp file holding the Space's output
    with open(client.predict(input, fn_index=0)) as f:
        text = process_text(f.read())  # process_text: helper defined elsewhere in the repo
    output = json.loads(text)[0]
    history += [output]
    return history, history
```
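A couple of notes on this design: fn_index=0 selects the first endpoint the Space exposes, and the result comes back as a path to a temporary file containing the serialized output, which is why the snippet opens and JSON-decodes it rather than using the return value directly. The upshot is that the fork needs no GPU and no model weights at all, and the same pattern works against any public Gradio app.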
Further Discussion
gradio_client fits my ideal of open-source software: all for one, and one for all.
But if you think about it carefully, the biggest loser may be whoever deploys the model, since they are effectively paying out of pocket to provide a public good for everyone.
And if a model becomes very popular, the scalability of this architecture, compared with the Cloud Model, will quickly show its limits.