Add cancel() method to interrupt a stream #733

Open
simonchatts wants to merge 1 commit into abetlen:main from simonchatts:main

Conversation

@simonchatts

Fixes #599.

Thanks for all your work on this project!

@tk-master

Contributor

Please accept this PR, @abetlen.

@tk-master

Contributor

Actually, I found an issue with this method: it only cancels after a token has been generated, so if the LLM is slow or gets stuck processing the prompt, this doesn't cancel it.

We need a better method.
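To make the limitation concrete, here is a minimal sketch of a flag-based cancel. It is not the code from this PR: `CancellableStream` and `slow_model` are hypothetical stand-ins, and the `time.sleep` call only simulates a long prompt-processing phase. The point is that the flag is checked between tokens, so a cancel requested before the first token has no effect until that token arrives.

```python
import time
from typing import Iterator, List


class CancellableStream:
    """Hypothetical wrapper: stops a token iterator once cancel() is called."""

    def __init__(self, tokens: Iterator[str]) -> None:
        self._tokens = tokens
        self._cancelled = False

    def cancel(self) -> None:
        # Only sets a flag; nothing is interrupted at this point.
        self._cancelled = True

    def __iter__(self) -> Iterator[str]:
        for token in self._tokens:  # blocks while the next token is computed
            if self._cancelled:     # the flag is seen only after that returns
                break
            yield token


def slow_model(prompt_delay: float, tokens: List[str]) -> Iterator[str]:
    """Stand-in for a model: a long prompt-processing phase, then tokens."""
    time.sleep(prompt_delay)  # simulated prompt evaluation
    yield from tokens


stream = CancellableStream(slow_model(0.2, ["Hello", ",", " world"]))
stream.cancel()  # cancel requested before any token exists

start = time.monotonic()
received = list(stream)  # still waits for the whole simulated prompt phase
elapsed = time.monotonic() - start

print(received, f"{elapsed:.2f}s")
```

In this sketch no tokens are delivered, yet the caller still waits for the full simulated delay, which mirrors the behaviour described above.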

@tk-master

Contributor

I'm coming back to this because I need to figure out a better way to interrupt the generation programmatically.

For a console-based scenario it's pretty easy in Python: all I have to do is wrap the code in try / except KeyboardInterrupt, and then I can press Ctrl+C at any point to gracefully interrupt the LLM.
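A minimal sketch of that console pattern, for illustration only. `fake_stream` and `consume` are hypothetical names, and the generator raises `KeyboardInterrupt` itself to simulate the user pressing Ctrl+C partway through; with a real model the exception would come from the interpreter's signal handling instead.

```python
from typing import Iterator, List


def fake_stream() -> Iterator[str]:
    """Stand-in for a streaming completion; simulates Ctrl+C after two tokens."""
    yield "Hello"
    yield ","
    raise KeyboardInterrupt  # what Python raises in the main thread on Ctrl+C


def consume(stream: Iterator[str]) -> List[str]:
    """Collect tokens until the stream ends or the user interrupts it."""
    collected: List[str] = []
    try:
        for token in stream:
            collected.append(token)
    except KeyboardInterrupt:
        # Keep whatever was generated so far instead of crashing.
        pass
    return collected


partial = consume(fake_stream())
print(partial)
```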

But if I'm using a front-end user interface, say with a "Stop generating" button that calls a Python function, I haven't managed to make it work properly, because of the issue I mentioned in my previous comment.

@abetlen, sorry to bother you again, but do you have any suggestions or ideas on how to accomplish this?
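One common way to wire up a "Stop generating" button is to run the stream in a worker thread and have the button handler set a `threading.Event`. The sketch below assumes exactly that; `GenerationJob` and `fake_stream` are hypothetical names, not part of llama-cpp-python. It has the same limitation discussed above: the event is only checked between tokens, so it cannot interrupt a token that is still being computed.

```python
import threading
import time
from typing import Iterator, List, Optional


def fake_stream(n: int, delay: float) -> Iterator[str]:
    """Stand-in for a model stream: one token every `delay` seconds."""
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i}"


class GenerationJob:
    """Hypothetical helper: consumes a stream in a background thread."""

    def __init__(self, stream: Iterator[str]) -> None:
        self._stream = stream
        self._stop = threading.Event()
        self.tokens: List[str] = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        # What a "Stop generating" button handler would call.
        self._stop.set()

    def join(self, timeout: Optional[float] = None) -> None:
        self._thread.join(timeout)

    @property
    def running(self) -> bool:
        return self._thread.is_alive()

    def _run(self) -> None:
        for token in self._stream:
            if self._stop.is_set():  # checked once per token, not during one
                break
            self.tokens.append(token)


job = GenerationJob(fake_stream(1000, 0.01))
job.start()
time.sleep(0.1)   # the UI stays responsive while tokens arrive
job.stop()        # user clicks the button
job.join(timeout=2.0)
print(len(job.tokens), job.running)
```

The tokens collected before the stop request remain available in `job.tokens`, so the partial output is not lost.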

@abetlen force-pushed the main branch 2 times, most recently from 8c93cf8 to cc0fe43 (November 14, 2023 20:24)

@woheller69

Why not add it now and improve it later if a better solution comes along? For now this would work in most cases.

@woheller69

Has anyone found a reasonable solution for this? Or am I the only one who isn't willing to wait until the model finishes, when the alternative is killing the job and losing the context?

@jewser

jewser commented May 11, 2024

Any chance this gets merged for now?

@madprops

It indeed blocks until the first token is produced, but cancelling it after that is trivial. The other similar issue is cancelling a model that is loading.
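As a rough illustration of why stopping after the first token is easy, the sketch below assumes the stream is an ordinary lazy Python generator, so the consumer can simply break out of the loop and close it. `fake_stream` is a hypothetical stand-in, and `cleanup_log` exists only to show that the generator's cleanup code runs.

```python
from typing import Iterator, List

cleanup_log: List[str] = []


def fake_stream() -> Iterator[str]:
    """Stand-in for a lazy token generator with cleanup on early exit."""
    try:
        for i in range(100):
            yield f"tok{i}"
    finally:
        # Runs when the consumer closes the generator early.
        cleanup_log.append("closed")


stream = fake_stream()
kept: List[str] = []
for token in stream:
    kept.append(token)
    if len(kept) == 3:  # e.g. the user asked to stop here
        break
stream.close()  # no further tokens are produced after this

print(kept, cleanup_log)
```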

@woheller69

The gpt4all Python bindings offer a similar mechanism, which allows stopping at the next token.

@ekcrisp

ekcrisp commented Nov 21, 2024

+1, can we merge this?

@kingbri1

Take a look at ggml-org/llama.cpp#10509, which should permanently solve this problem on llama.cpp's side.

Development

Successfully merging this pull request may close these issues.

Dynamically intterupt token generation

7 participants