Im using Ollama on my server with the WebUI. It has no GPU so its not quick to reply but not too slow either.

Im thinking about removing the VM as i just dont use it, are there any good uses or integrations into other apps that might convince me to keep it?

  • Bluesheep@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    ·
    5 months ago

    I don’t know how tech savvy you are, but I’m assuming since your on lemmy it’s pretty good :)

    The way we’ve solved this sort of problem in the office is by using the LLM’s JSON response, and a prompt that essentially keeps a set of JSON objects alongside the actual chat response.

    In the DND example, this would be a set character sheets that get returned every response but only changed when the narrative changes them. More expensive, and needing a larger context window, but reasonably effective.