[GH-ISSUE #14] Ollama Support #10
Originally created by @0xrsydn on GitHub (Jun 18, 2024).
Original GitHub issue: https://github.com/jehna/humanify/issues/14
Is it possible to use llama3 via Ollama rather than the Hugging Face one?
@jehna commented on GitHub (Jun 19, 2024):
Not possible at the moment, but should be straightforward to implement if you'd like to give it a shot.

You can check the LlamaCpp docs from Guidance and change (preferably parametrize) the config: https://github.com/jehna/humanify/blob/main/local-inference/guidance_config.py
@0xdevalias commented on GitHub (Jun 19, 2024):
I wonder if it's worth implementing a wrapper/abstraction layer like LiteLLM to make things more flexible?
This is what projects like aider use. Though I'm not currently sure if/how compatible that is with the `guidance` module you're currently using. See also:
@0xdevalias commented on GitHub (Jun 20, 2024):
@jehna Curious, what aspects of `guidance` does `humanify` currently rely on? Is it using much of the deeper 'controls' provided by it? Skimming the following prompt files:

It looks like `gen`, `stop` and `stop_regex` are used:

@jehna commented on GitHub (Aug 12, 2024):
There's now v2 that runs on top of llama.cpp, so adding llama3 support should be even more straightforward.
@0xrsydn which version of llama3 were you planning to run? I could add it to the new version.
@0xrsydn commented on GitHub (Aug 13, 2024):
I think the recent one (llama3.1 8b) is great. Thanks btw.
@jehna commented on GitHub (Aug 14, 2024):
I researched Ollama a bit. If I'm correct, you could run Ollama locally and Humanify could connect to its API to use any model that Ollama serves.
There seems to be an undocumented feature that allows passing GBNF grammars as an argument to the model:
https://github.com/ollama/ollama/issues/3616#issuecomment-2068195083
...but judging from other open issues about the topic I'm not really sure if it works or not. But I'll give it a try!
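For concreteness, a sketch of what that might look like against Ollama's `/api/generate` endpoint, in TypeScript. The `grammar` field below is the undocumented/unmerged feature from the linked comment, not part of the released Ollama API; the model name and grammar are purely illustrative:

```typescript
// Sketch: POST to Ollama's generate endpoint with a GBNF grammar.
// NOTE: `grammar` is the unmerged, undocumented field discussed above;
// released Ollama versions do not support it.
const grammar = `root ::= "yes" | "no"`; // a trivial GBNF grammar

const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.1:8b", // illustrative model name
    prompt: "Is `a0x3f` a minified identifier? Answer yes or no.",
    stream: false,
    grammar, // hypothetical: not in the released API
  }),
});
const data = await res.json();
console.log(data.response);
```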
@0xdevalias commented on GitHub (Aug 15, 2024):
This seems like a good overarching/summarising issue; it still doesn't provide full clarity, but it links to seemingly all the related issues, and points out that now that OpenAI supports it, this has sort of become a higher priority:
Based on my read of these:
It sounds like it's not currently possible to use the GBNF functionality on the current main/released version of Ollama.
According to this:
It sounds like Ollama currently supports JSON mode, which is built as a GBNF grammar (presumably on top of llama.cpp's support for it), but the ability to use a custom grammar isn't currently exposed to the end user.
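In other words, the only grammar-backed option exposed today is JSON mode. For contrast with the grammar sketch above, a minimal request using the documented `format: "json"` field (model name and prompt illustrative):

```typescript
// Documented JSON mode: Ollama constrains the output to valid JSON,
// internally via a fixed GBNF grammar; custom grammars are not exposed.
const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.1:8b", // illustrative
    prompt: "Reply with a JSON object mapping old identifiers to new names.",
    format: "json", // documented, unlike the custom `grammar` field above
    stream: false,
  }),
});
console.log((await res.json()).response);
```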
@Kinglord commented on GitHub (Aug 15, 2024):
Sadly, @0xdevalias is correct: what you want to do, @jehna, will not work unless you patch your version of Ollama with the PR linked above; the release version still has no support for GBNF outside of the built-in JSON mode. Ollama still refuses to even reply to this issue, for some really strange reason; I have no idea why they won't talk about it at all and simply let the PRs keep rolling in and sit there. At this point all we can do is keep pressuring them by raising issues and making noise, both here and on the Discord, until we can get someone to take 10 minutes and explain the decision to essentially block this feature from end users in Ollama.
@jehna commented on GitHub (Aug 15, 2024):
Thank you for looking into this. I just pushed an `ollama-support` branch that should start working if they start supporting the grammar flag.

@jehna commented on GitHub (Aug 15, 2024):
☝️ added llama3.1 8b model support
@dangelo352 commented on GitHub (Sep 29, 2024):
How do we use Ollama with this? Sorry if this is a dumb question.
@jehna commented on GitHub (Oct 18, 2024):
@dangelo352 unfortunately there's no Ollama support yet. You can run the model locally using `humanify local`.

@jehna commented on GitHub (Oct 18, 2024):
Since v2.2.0 there's now a configurable `--baseURL` parameter in the OpenAI mode. Unfortunately Ollama does not yet support structured outputs, although I'm sure it's on their roadmap as the official OpenAI API supports it now.

Please keep an eye out for ollama/ollama#6473; as soon as they add the support and close that issue, you should be able to use Ollama as a humanify backend by:
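A minimal sketch of that setup, assuming Ollama's OpenAI-compatible endpoint at `http://localhost:11434/v1` and the `openai` npm package; humanify's `--baseURL` would be pointed at the same URL, and the model name and prompt are illustrative:

```typescript
import OpenAI from "openai";

// Point an OpenAI-compatible client at a local Ollama server;
// this is the mechanism the --baseURL flag exposes.
const client = new OpenAI({
  baseURL: "http://localhost:11434/v1", // Ollama's OpenAI-compatible API
  apiKey: "ollama", // Ollama ignores the key, but the SDK requires one
});

const completion = await client.chat.completions.create({
  model: "llama3.1:8b",
  messages: [
    { role: "user", content: "Suggest a descriptive name for the variable `a`." },
  ],
});
console.log(completion.choices[0].message.content);
```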
@0xdevalias commented on GitHub (Feb 18, 2025):
That issue is closed now:
There is a more detailed note in this related issue:
Looking at the issue it links to:
I'm not sure if that is sufficient for `humanify` to work with Ollama; and if it is, whether it will work as-is or if something more needs to be done on this side to make it work. @jehna would probably be the best to know that off-hand.

While I haven't looked into it deeper, I noticed that there is an Ollama JS SDK, so if we wanted to build more specific support into `humanify`, perhaps that would be a good place to start:
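For reference, a minimal sketch using the `ollama` npm package (ollama-js); the JSON schema passed via `format` leans on the structured outputs support referenced above, and the schema and model name are illustrative:

```typescript
import ollama from "ollama";

// Ask for output constrained to a JSON schema, per Ollama's
// structured outputs support; schema and model name are illustrative.
const response = await ollama.chat({
  model: "llama3.1:8b",
  messages: [{ role: "user", content: "Rename `a` and `b` descriptively." }],
  format: {
    type: "object",
    properties: {
      a: { type: "string" },
      b: { type: "string" },
    },
    required: ["a", "b"],
  },
});
console.log(response.message.content);
```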
@0xdevalias commented on GitHub (Apr 10, 2025):

See also:
@0xdevalias commented on GitHub (Sep 20, 2025):