Running the API example for the stream mode I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'softprompts/What I would like to say is the following: .zip'
The non-stream example works fine. I think maybe a recent update changed the form format? There's no mention in the API example of it even being possible to upload a softprompt.
I'm running the webui with python server.py --load-in-4bit --model llama-13b --listen
Same bug. Anyone know which commit broke this? It was working maybe 2 or 3 days ago.
I'll look into this. I believe going forward we should probably switch to a versioned API (so that we don't break backwards compatibility) and avoid using an array for the input. Unless there's some reason to do so, it seems very confusing not to use a dict, where the labels would be obvious.
I'm unfamiliar with the code here, but I'll dig around a little to see if I can figure out what the issue is, and also what it would take to add a /v1/* endpoint that accepts a dictionary and provides default values for all unsubmitted fields. That would allow for simpler use, e.g. submitting only
{
"prompt": "Factual Q/A\nQuestion: What is the atomic number of lithium?\nAnswer: "
}
and having the rest of the configuration filled in by whatever default is loaded in at the time.
Sorry if any of this is already implemented or if I'm missing something obvious.
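For what it's worth, here's a rough sketch of what I mean, assuming a hypothetical /v1/generate route built with Flask purely for illustration; the route name, the default values, and the generate() stub are all made up:

from flask import Flask, jsonify, request

app = Flask(__name__)

# Default generation settings; the values here are just placeholders.
DEFAULTS = {
    "max_new_tokens": 200,
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.9,
    "typical_p": 1.0,
    "repetition_penalty": 1.1,
    "encoder_repetition_penalty": 1.0,
    "top_k": 40,
    "min_length": 0,
    "no_repeat_ngram_size": 0,
    "num_beams": 1,
    "penalty_alpha": 0,
    "length_penalty": 1,
    "early_stopping": False,
}

def generate(prompt, **params):
    # Stub: a real implementation would call into the webui's generation code.
    return f"(generated continuation of {prompt!r} using {params})"

@app.route("/v1/generate", methods=["POST"])
def v1_generate():
    body = request.get_json(force=True)
    prompt = body.pop("prompt", "")
    # Anything the client left out falls back to the server-side defaults.
    params = {**DEFAULTS, **body}
    return jsonify({"text": generate(prompt, **params)})

if __name__ == "__main__":
    app.run(port=5000)

A client would then only need to POST the prompt, and any omitted settings would fall back to whatever defaults are loaded server-side.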
So it looks like this API is something that's autogenerated by Gradio itself (sorry again for the naivete here, I really have no idea what's going on), and because of the recent changes in the UI, the shape of the API has changed as well. (I think we should try to figure out how to avoid this in the future.)
For the time being, here's an example of a working payload, so anyone blocked by this can hack together working code, and perhaps to give people an idea of where to look the next time this breaks.
Everything behaves the same as in the example, with the exception of the two lines with comments.
{
  "fn_index": 9, // this was '7' originally and now must be '9'
  "data": [
    "Common sense questions and answers\\n\\nQuestion: What is the atomic number of lithium?\\nFactual answer:",
    200,
    true,
    1.99,
    0.18,
    1,
    1.15,
    1, // this parameter is new and represents the "encoder_repetition_penalty"
    30,
    0,
    0,
    1,
    0,
    1,
    false
  ],
  "session_hash": "f420f69f"
}
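If it helps, here's a quick Python version of the same payload as a blocking HTTP request, assuming the webui is on localhost:7860 and that Gradio's /run/predict route is available (the stream example sends these same fields over the websocket queue instead); the comments map each value to a generate_reply argument:

import json
import requests

payload = {
    "fn_index": 9,  # was 7 before the UI change
    "data": [
        "Common sense questions and answers\n\nQuestion: What is the atomic number of lithium?\nFactual answer:",
        200,    # max_new_tokens
        True,   # do_sample
        1.99,   # temperature
        0.18,   # top_p
        1,      # typical_p
        1.15,   # repetition_penalty
        1,      # encoder_repetition_penalty (the new parameter)
        30,     # top_k
        0,      # min_length
        0,      # no_repeat_ngram_size
        1,      # num_beams
        0,      # penalty_alpha
        1,      # length_penalty
        False,  # early_stopping
    ],
    "session_hash": "f420f69f",
}

response = requests.post("http://localhost:7860/run/predict", json=payload)
print(json.dumps(response.json(), indent=2))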
P.S. It really seems not great to add a parameter in the middle of an array of nameless parameters; I sort of assume this API is meant purely as a machine-to-machine interface internal to Gradio?
@thot-experiment the parameters are named in api-example-stream.py
I am going to assume that simply hacking the fn_index to be 9 will fix it, given that otherwise the numbers match up with that working payload.
@thot-experiment I really, really would like to use **kwargs in the main functions, which would allow the API to pass parameters as a dictionary, as you mentioned. The problem is that Gradio does not seem to accept that.
Specifically, the offending function is

def generate_reply(question, max_new_tokens, do_sample, temperature, top_p, typical_p, repetition_penalty, encoder_repetition_penalty, top_k, min_length, no_repeat_ngram_size, num_beams, penalty_alpha, length_penalty, early_stopping, eos_token=None, stopping_string=None):

in modules/text_generation.py. This long function definition causes modules/chat.py to be very messy, with several other long function definitions.
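One possible workaround, purely as a sketch and not the project's actual code, would be to keep a single ordered list of parameter names and register a thin positional adapter with Gradio, so the internal functions can take a dict even though Gradio only passes positional values:

# Sketch only: names and structure are illustrative.
GENERATION_PARAMS = [
    "max_new_tokens", "do_sample", "temperature", "top_p", "typical_p",
    "repetition_penalty", "encoder_repetition_penalty", "top_k", "min_length",
    "no_repeat_ngram_size", "num_beams", "penalty_alpha", "length_penalty",
    "early_stopping",
]

def generate_reply(question, params, eos_token=None, stopping_string=None):
    # params is a plain dict keyed by the names in GENERATION_PARAMS
    ...

def generate_reply_wrapper(question, *args, eos_token=None, stopping_string=None):
    # Adapter registered with Gradio: converts the positional inputs Gradio
    # insists on into the dict that the internal function expects.
    params = dict(zip(GENERATION_PARAMS, args))
    return generate_reply(question, params,
                          eos_token=eos_token, stopping_string=stopping_string)

The same GENERATION_PARAMS list could then be reused by the API examples to build the "data" array in the right order, or to accept a labeled dict and flatten it.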
Also, if you are interested in APIs, make sure to check https://github.com/oobabooga/text-generation-webui/pull/342
It is possible to create custom APIs using extensions.
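For reference, the rough shape an extension-provided API could take is something like the sketch below; the handler, port, and generate() placeholder are illustrative assumptions, so see the PR above for the actual implementation and extension interface:

# Illustrative sketch of an extension-style custom API; not the code from the PR.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate(prompt):
    # Placeholder: a real extension would call into the webui's generation code here.
    return f"(reply to {prompt!r})"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        reply = generate(body.get("prompt", ""))
        data = json.dumps({"text": reply}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(data)

def start_api(port=5000):
    # Run the server in a background thread so it doesn't block the UI.
    server = HTTPServer(("0.0.0.0", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()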