[GH-ISSUE #226] Coalescence #59

Closed
opened 2026-03-03 13:52:39 +03:00 by kerem · 3 comments

Originally created by @neoOpus on GitHub (Nov 26, 2024).
Original GitHub issue: https://github.com/jehna/humanify/issues/226

I found this and thought it might be interesting for this project.

https://blog.dottxt.co/coalescence.html#org25a7abb
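For context, the "coalescence" trick the linked post describes: structured generation compiles the output constraint into an automaton, and wherever the automaton has a stretch with exactly one legal continuation, that stretch can be emitted directly with no model calls at all. A minimal sketch of the deterministic-run part, using a toy character-level DFA (the DFA, the example prefix, and all names here are illustrative, not taken from the post's code):

```python
# Sketch of the "coalescence" idea: when every state along a path has
# exactly one outgoing edge, the whole run can be emitted in one step
# instead of sampling token by token.

def deterministic_run(dfa, state):
    """Follow the DFA while each state has exactly one outgoing edge."""
    out = []
    while len(dfa.get(state, {})) == 1:
        (ch, nxt), = dfa[state].items()
        out.append(ch)
        state = nxt
    return "".join(out), state

# Character-level DFA for the fixed JSON prefix '{"na', followed by a branch.
dfa = {
    0: {"{": 1},
    1: {'"': 2},
    2: {"n": 3},
    3: {"a": 4},
    4: {"m": 5, "g": 6},  # first branching point: sampling resumes here
}

emitted, state = deterministic_run(dfa, 0)
print(emitted, state)  # the run '{"na' is emitted without any model calls
```

The point of the blog post is that applying this to token-level generation can make constrained decoding dramatically faster, since deterministic spans skip inference entirely.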

kerem closed this issue 2026-03-03 13:52:39 +03:00

@jehna commented on GitHub (Nov 26, 2024):

Interesting! Unfortunately there's not much humanify can do, as it uses llama.cpp for inference under the hood. This would need to be implemented on the llama.cpp side, but unfortunately there doesn't seem to be much interest:

- https://github.com/ggerganov/llama.cpp/issues/5292

@neoOpus commented on GitHub (Nov 26, 2024):

I see, thank you for the explanation.


@0xdevalias commented on GitHub (Feb 18, 2025):

See also:

- https://github.com/ggml-org/llama.cpp/discussions/5455

  > It's not implemented - the main problem is that a deterministic string can be represented in many different ways in terms of tokens (e.g. "hello" -> ["h", "ello"], ["he", "llo"], ["hel", "l", "o"], ...) so it is not clear beforehand which representation the LLM would generate.
  >
  > The closest thing to this functionality that can be done even now is to use a fast draft model which you also constrain with the same grammar (see `speculative` example)
  >
  > _Originally posted by @ggerganov in https://github.com/ggml-org/llama.cpp/discussions/5455#discussioncomment-8438137_

- https://github.com/dottxt-ai/outlines - Structured Text Generation
- https://github.com/dottxt-ai/outlines-core - Structured generation in Rust

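The tokenization ambiguity discussed in the llama.cpp thread can be seen in a few lines: a fixed string such as "hello" admits many token segmentations, so a constraint expressed over strings does not correspond to a single token path. (This uses a hypothetical 6-token vocabulary for illustration; real tokenizers have tens of thousands of entries.)

```python
# Enumerate every way a fixed string can be segmented into vocabulary
# tokens, illustrating why string-level constraints are ambiguous at
# the token level.

def tokenizations(s, vocab):
    """Enumerate every way to segment s into tokens from vocab."""
    if not s:
        return [[]]
    results = []
    for tok in vocab:
        if s.startswith(tok):
            for rest in tokenizations(s[len(tok):], vocab):
                results.append([tok] + rest)
    return results

vocab = ["h", "e", "l", "o", "he", "llo"]
seqs = tokenizations("hello", vocab)
for seq in seqs:
    print(seq)
# Four distinct segmentations, all decoding to the same string "hello"
```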
Also this note in ollama:

> Hey everyone!
>
> With the merging of https://github.com/ollama/ollama/issues/7900, we're introducing structured output to be able to go from a json schema to structured generation! Really appreciate all the feedback and contributions. Extremely thankful for all of you being so involved in this 🙏🏽
>
> There are a few things we're still keeping in mind over the next few months. The first focus is going to be around performance - speed and accuracy. There has been a lot of research coming out around this, we're keeping a close eye and are going to see how we can integrate some of this into Ollama. We're also thinking about how to support structured generation in the long term and that'll play nicely with a lot of the work we're doing on our new engine.
>
> Stoked for the coming few months, hope to improve both performance and accuracy around sampling and constrained decoding.
>
> Thank you again for your patience, we're super excited to get this out in an upcoming release! Will spin out more issues around this as well - happy to keep you all posted as well!
>
> _Originally posted by @ParthSareen in https://github.com/ollama/ollama/issues/6237#issuecomment-2518836758_
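For reference, the structured-output feature the Ollama note describes is driven by passing a JSON schema to the chat endpoint. A hedged sketch of a request, assuming a local Ollama server and that `/api/chat` accepts a schema in the `format` field (the model name here is only an example; check the Ollama API docs for the exact contract):

```python
# Hypothetical request to Ollama's structured-output chat API:
# the "format" field carries a JSON schema that constrains decoding.
import json
import urllib.request

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

payload = {
    "model": "llama3.2",  # assumed model name; use any locally pulled model
    "messages": [{"role": "user", "content": "Describe a fictional person."}],
    "format": schema,     # constrains the output to match the schema
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment with a running Ollama server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["message"]["content"])  # schema-conforming JSON
```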

Sounds like there might be some scope for supporting something like this there. I asked more specifically:

> Curious, would that potentially include support for Coalesce/similar?
>
> - https://blog.dottxt.co/coalescence.html
> - https://github.com/ggml-org/llama.cpp/issues/5292
> - https://github.com/ggml-org/llama.cpp/discussions/5455
>
> _Originally posted by @0xdevalias in https://github.com/ollama/ollama/issues/6237#issuecomment-2665163558_
