OnsiteLLM
Inference Console
01 Runtime
API URL
Mode: Generate, Stream, WebSocket
Max output tokens
Parallel requests
Run single inference
Run parallel inference
Clear output
02 Prompt
Say hello in one short sentence.
Responses
Ready