Unsloth's quantizations are already SOTA. With the introduction of Unsloth Studio, I would love the ability to give my corporate team simple multi-user authentication. For now, I am fine if no history is saved; I just want a simple interface that multiple users can access concurrently. (Inference can be queued if concurrent requests are not yet possible in llama.cpp.)
Let me know if I can clarify the request. I see Unsloth Studio taking over as the de facto local UI engine for GGUFs and llama.cpp.
Thank you,
Aztec