mirror of
https://github.com/AJaySi/ALwrity.git
synced 2026-04-25 08:55:58 +03:00
[GH-ISSUE #297] [FEATURE] OpenRouter Support for Custom Models Evaluation #550
Originally created by @doncat99 on GitHub (Oct 13, 2025).
Original GitHub issue: https://github.com/AJaySi/ALwrity/issues/297
Originally assigned to: @AJaySi on GitHub.
🚀 Feature Description
Integration of OpenRouter to enable support for custom AI models, facilitating comprehensive model evaluation within the platform.
💡 Motivation
This feature is essential to enhance the platform's flexibility by allowing users to incorporate and assess a diverse array of AI models from multiple providers. It addresses the limitation of relying solely on predefined models, enabling users to evaluate performance metrics such as accuracy, response quality, and efficiency in content generation tasks, thereby optimizing outcomes for specific use cases.
📝 Detailed Description
The feature should integrate OpenRouter as an API gateway to route requests to custom AI models. Users would access a dedicated settings panel to input their OpenRouter API key and select from available models. An evaluation module would be implemented, allowing side-by-side comparisons of model outputs based on user-defined prompts, with metrics including generation speed, coherence, relevance, and creativity scores. Integration would involve backend handling of API calls via OpenRouter, ensuring secure authentication and error management. Frontend components would include dashboards for visualization of evaluation results, such as charts displaying comparative performance data.
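As a sketch of what the evaluation module's core loop might look like (a hypothetical illustration, not ALwrity code; in practice each model callable would wrap an OpenRouter request via its OpenAI-compatible endpoint at https://openrouter.ai/api/v1):

```python
import time

def evaluate_models(prompt: str, generators: dict) -> dict:
    """Run one prompt through several model callables and collect simple
    side-by-side metrics. `generators` maps a model name to a function
    prompt -> text; in practice each function would wrap an OpenRouter
    API call. Coherence/relevance/creativity scoring would need an LLM
    judge or human review and is out of scope for this sketch."""
    results = {}
    for name, generate in generators.items():
        start = time.perf_counter()
        output = generate(prompt)
        results[name] = {
            "output": output,
            "latency_s": round(time.perf_counter() - start, 3),
            "word_count": len(output.split()),
        }
    return results
```

The returned dictionary is what a frontend dashboard would render as the comparative charts and tables described above.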
🎯 Use Cases
Describe specific use cases for this feature:
A content creator evaluates multiple models for blog writing to identify the one producing the most engaging and SEO-optimized articles.
An SEO specialist compares model outputs for keyword integration and content planning, selecting the optimal model for dashboard analytics.
A social media manager tests custom models for generating LinkedIn or Facebook posts, assessing tone consistency and audience engagement potential.
🎨 Mockups/Designs
Not applicable at this stage; however, wireframes could include a settings interface for API key entry, a model selection dropdown, and a results dashboard with tabular and graphical representations of evaluation metrics.
🔧 Technical Considerations
Any technical considerations or implementation notes:
Requires backend changes
Requires frontend changes
Requires database changes
Requires third-party integration
Implementation notes: Ensure compliance with OpenRouter's API rate limits and authentication protocols. Handle potential latency variations across models and incorporate fallback mechanisms to default models in case of integration failures.
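The fallback behavior mentioned above could be sketched like this (a hypothetical helper; the function and parameter names are illustrative):

```python
def generate_with_fallback(prompt, call_model, model_chain):
    """Try each model in order and fall back to the next on failure
    (e.g. a rate limit, timeout, or auth error from OpenRouter).
    `call_model(model, prompt)` performs the actual API request."""
    last_error = None
    for model in model_chain:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:
            last_error = exc  # remember why this model failed, try the next
    raise RuntimeError(f"all models in {model_chain} failed: {last_error}")
```

The last entry in `model_chain` would be the platform's default model, so users always get a response even when a custom model is unavailable.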
🏷️ Component/Feature Area
Which component or feature area does this relate to?
Blog Writer
SEO Dashboard
Content Planning
Facebook Writer
LinkedIn Writer
Onboarding
Authentication
API
UI/UX
Performance
Other: Model Management
🎯 Priority
Critical (essential for core functionality)
High (significant value add)
Medium (nice to have)
Low (future consideration)
🔄 Alternatives Considered
Direct integrations with individual AI providers (e.g., OpenAI, Anthropic, or Google) were evaluated; however, these would require multiple separate implementations, increasing maintenance complexity. OpenRouter offers a unified interface for accessing diverse models, reducing development overhead while providing greater extensibility.
📚 Additional Context
Reference OpenRouter's official documentation for API specifications: https://openrouter.ai/docs. This integration aligns with industry trends toward model-agnostic platforms, as seen in similar tools like LangChain, which emphasize evaluation frameworks for AI performance benchmarking.
🤝 Contribution
Are you willing to contribute to implementing this feature?
Yes, I can help implement this
Yes, I can help with testing
Yes, I can help with documentation
No, but I can provide feedback
No, just suggesting the idea
@AJaySi commented on GitHub (Oct 14, 2025):
@doncat99
Thank you so much for the great feature suggestion. @Om-Singh1808 and @Ratna-Babu have been exploring Ollama and also Unsloth.
We are looking to achieve two things in the near future:
1). Fine-tuning small LLMs on the end user's digital presence, and also base models as SMEs for SEO, blogging, social media platforms, etc. Thus, ALwrity will have home-grown/fine-tuned SLMs and will be a lot cheaper to experiment with.
2). We want to do this to inch closer to content hyper-personalization through fine-tuning on end users' data; a basic implementation is present in onboarding. At present, this allows us to avoid asking for irritating inputs and to produce/mimic the end user's linguistic style from previously written articles.
3). You are absolutely right to point to OpenRouter as a solution to orchestrate ALwrity fine-tuned models for specific tasks.
But then, as CopilotKit is already integrated, maybe using AG-UI with ADK or Dify makes more sense. We also went down the CrewAI path, but AI agent frameworks are always expensive and overkill if one can design better workflows.
4). At present, our onboarding process collects the end user's website articles, GSC and social media accounts, and competitor data (in progress), and then generates a persona. Also, an SLM with better prompting and context produces better results than the best LLMs.
5). I agree with you on OpenRouter, but please also share your views on LiteLLM and Ollama (with custom routing)?
Please refer to previous discussion and maybe we can align this there : https://github.com/AJaySi/ALwrity/issues/287
@doncat99 commented on GitHub (Oct 14, 2025):
Thank you for your thoughtful response to my feature request.
I appreciate your proposal to incorporate model evaluation capabilities, aimed at selecting the most suitable model for specific scenarios or use cases. Regarding LiteLLM and Ollama, I agree they extend beyond mere model providers, offering robust tools for management, local deployment, and routing.
In my configuration, Google's ADK integrates LiteLLM to enable OpenRouter support (https://github.com/google/adk-python/issues/171).
A hybrid approach—leveraging OpenRouter for core routing while integrating LiteLLM for advanced evaluation and fine-tuning—could optimally support ALwrity's goals of cost efficiency and hyper-personalization. I am open to further dialogue on implementation.
@AJaySi commented on GitHub (Oct 14, 2025):
Thank you @doncat99, your hybrid approach makes a lot of sense.
Request your patience, while I clarify my doubts and seek your guidance:
We are in complete agreement on the following core principles:
User Choice: Informed users should be able to select their LLMs for the final content generation step, possibly presented via a dedicated column in the Step 4 workflow that includes OpenRouter options.
Hybrid Functionality: The end user should choose the model for the final draft, but the platform should use SLMs for iterative refinement and small edits (Copilot/Editor tasks).
Advanced Routing: Once we scale specialized fine-tuned models, advanced routing or an Agent framework will be essential.
Assumption 1: ALwrity's target audience is non-tech content creators, digital marketing professionals, solopreneurs, etc., who cannot compete in a biased online market. We need to keep it simple (KIS) for them.
Assumption 2: ALwrity is an AI-first, Copilot- and VUI (TBD)-based platform with multimodal content generation. As an SME digital marketing platform, we want to guide end users and abstract away all AI complexities, including prompting.
Assumption 3: Digital marketing is tough, and ALwrity as an SME platform will need to choose a lot of models for the end user: SEO, platform-specific, analytics, research, DB, editor-specific, copilot, etc. Left to them, this will prove to be too much for non-technical marketing end users.
Assumption 4: ALwrity is a complete AI content lifecycle platform. While it makes sense to choose the AI model for the final content draft, there is also iterative online research, GSC insights, competitor gap analysis, outline generation, outline refinement, draft generation, hallucination checking, assistive writing, and the AI editor.
Using one AI model for the whole content lifecycle would be too expensive, both financially and environmentally.
As a specific use case: in ALwrity onboarding, step 4, we generate an end-user persona based on home page content and structured style and linguistic analysis. The idea is to scale this to multiple blogs and social media content. In this step 4, the end user can see the results of content generated with/without the persona, provide feedback to tweak it, and accept/confirm it. This persona will improve with every piece of content generated, user feedback, and mem0 (or any AI memory layer).
In the above workflow, the non-technical end user can arrive at a persona without knowing the underlying AI model used. In the future, a fine-tuned persona-generation SLM will suffice. Also, stuffing the persona into system prompts won't scale. I am mindful of the environmental impact when end users simply throw 700B models at mundane tasks.
Should we first go down the path of fine-tuning on end-user digital assets and analytics with Unsloth? We have designed our onboarding to help gather all the data needed for fine-tuning a GPT.
We could then experiment with our fine-tuned models and route to them with OpenRouter? Thus, we would have fine-tuning with Unsloth and routing with OpenRouter/ADK, and not use LiteLLM?
@doncat99 commented on GitHub (Oct 14, 2025):
Thank you for your detailed reply and for clarifying your assumptions and questions. I appreciate the alignment on core principles like user choice, hybrid functionality, and advanced routing.
From a product manager's perspective, while non-technical end users should indeed be shielded from AI complexities to keep the platform simple (KIS principle), they deserve agency in selecting output quality levels without delving into model specifics. This can be achieved by mapping models to intuitive quality tiers (e.g., "Basic," "Standard," "Premium") based on factors like accuracy, speed, and cost. For instance, extend your existing TASK_LLM_CONFIGS structure with a "quality" key to reflect these relationships:
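A minimal sketch of such a mapping (the config shape and model IDs below are illustrative assumptions, not ALwrity's actual TASK_LLM_CONFIGS):

```python
# Hypothetical extension of TASK_LLM_CONFIGS: each task maps the
# user-facing quality tiers to concrete models behind the scenes.
TASK_LLM_CONFIGS = {
    "blog_draft": {
        "quality": {
            "Basic":    {"model": "google/gemini-flash",     "max_tokens": 2048},
            "Standard": {"model": "anthropic/claude-sonnet", "max_tokens": 4096},
            "Premium":  {"model": "openai/gpt-4o",           "max_tokens": 8192},
        },
    },
}

def resolve_model(task: str, quality: str = "Standard") -> str:
    """Translate a user-facing tier into the model ID to route to."""
    return TASK_LLM_CONFIGS[task]["quality"][quality]["model"]
```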
This allows users to select quality in the Step 4 workflow or final draft, while the platform routes to appropriate models behind the scenes, supporting evaluation and personalization without overwhelming them.
I must frankly note that the project code remains tightly coupled to Google Gemini from its initial 2024 version. Last year, I attempted to refactor it with a general LLM abstraction layer, achieving partial success before shifting focus to other commitments. To enable flexibility, I recommend decoupling Gemini from business logic and implementing a general model wrapper layer (e.g., using OpenAI-compatible interfaces) to facilitate seamless integration of diverse providers.
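One way to sketch that wrapper layer (a hypothetical interface; class and method names are my assumptions): business logic depends only on an abstract provider, and each backend, Gemini included, implements it. The OpenAI-compatible variant would cover OpenRouter and similar gateways.

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Provider-agnostic interface so business logic never imports a
    vendor SDK (e.g. google.generativeai) directly."""

    @abstractmethod
    def complete(self, prompt: str, **kwargs) -> str:
        ...

class OpenAICompatibleProvider(LLMProvider):
    """Wraps any OpenAI-compatible endpoint (OpenRouter, etc.); the
    injected `client` is expected to follow the OpenAI SDK surface."""

    def __init__(self, client, model: str):
        self.client = client
        self.model = model

    def complete(self, prompt: str, **kwargs) -> str:
        resp = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            **kwargs,
        )
        return resp.choices[0].message.content
```

Injecting the client also makes the provider trivial to stub out in tests, which helps when decoupling the existing Gemini calls incrementally.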
Regarding your question on prioritization: Yes, starting with fine-tuning based on end-user digital assets and analytics using Unsloth makes strategic sense, leveraging your onboarding data collection for efficient GPT-like model customization. This can then transition to experimentation and routing via OpenRouter or ADK, bypassing LiteLLM if it adds unnecessary complexity. This path aligns with environmental and cost considerations by reserving larger models for high-value tasks while using fine-tuned SLMs for mundane ones.
@AJaySi commented on GitHub (Oct 15, 2025):
Hello @doncat99
Thank you for the intellectually stimulating dialogue and for bearing with my monologues.
1). "This can be achieved by mapping models to intuitive quality tiers (e.g., "Basic," "Standard," "Premium") based on factors like accuracy, speed, and cost."
Should the mapping instead be with the content lifecycle phase and the models suited for it? For example, ALwrity divides the content lifecycle into: content planning (strategy, research, competitor analysis, calendar generation), content generation, publish, analytics, engage, and remarket.
For research, we depend on Google grounding, Exa, and Tavily and feed the results to the LLM; thus, the quality of the online research context matters more than LLM selection, if we agree that better context yields better results from most AI models.
Of the six phases above, your quality-tier mapping makes the most sense in the content generation phase.
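That split could be expressed as a per-phase policy, with the tier choice exposed only where it matters (a sketch; the phase names follow the comment above, and the model IDs are placeholders):

```python
# Only the generation phase exposes a user-selectable quality tier;
# the other lifecycle phases use fixed, cheap defaults (e.g. SLMs).
PHASE_MODEL_POLICY = {
    "planning":   {"model": "slm/research",  "user_selectable": False},
    "generation": {"model": "slm/draft",     "user_selectable": True},
    "publish":    {"model": "slm/metadata",  "user_selectable": False},
    "analytics":  {"model": "slm/analytics", "user_selectable": False},
    "engage":     {"model": "slm/social",    "user_selectable": False},
    "remarket":   {"model": "slm/social",    "user_selectable": False},
}

def model_for_phase(phase, user_choice=None):
    policy = PHASE_MODEL_POLICY[phase]
    if policy["user_selectable"] and user_choice:
        return user_choice  # honor the model/tier the user picked
    return policy["model"]
```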
Please review https://github.com/AJaySi/ALwrity/issues/287 and we can shift focus to OpenRouter. @Ratna-Babu has been a great addition, and I request him to take up OpenRouter support on priority, pretty please. @doncat99, I would request your guidance for Ratna on "decoupling Gemini from business logic and implementing a general model wrapper layer".
"I must frankly note that the project code remains tightly coupled to Google Gemini from its initial 2024 version."
My Bad, and I do feel ashamed about it.
Thank you so much & Regards.
@Ratna-Babu commented on GitHub (Oct 15, 2025):
I will check #299 and merge it with #288 if possible. After completing that, I will take a look at this.
@Om-Singh1808 commented on GitHub (Oct 15, 2025):
I have checked out everything here and I am ready to work on making ALwrity better.
Special thanks to @doncat99 for helping us with this.
@doncat99 commented on GitHub (Oct 15, 2025):
Indeed, I prefer Google Grounding as well. I prefer Google ADK and A2A protocol for multi-agent communication.
Below is my agent code for your reference.
@AJaySi commented on GitHub (Oct 16, 2025):
You should also check out Metaphor websets and Tavily AI. ALwrity uses Google grounding for SERP analysis.
@uniqueumesh and I experimented with CrewAI when ADK, A2A, MCP, et al. were not even coined. It's gathering dust somewhere in the codebase. The following are my reasons for agent-framework loathing:
If an AI agent is an AI plus tool calling, and collaboration among agents gives an AI agent team, then what am I doing as a software engineer?
I am required to build my own tooling, prompt-engineer the agents, and then throw a problem at them, hoping for the best (out-of-tokens). It's like a black box. Of course, iteratively, I will tweak each agent to shut up and talk to the other ones, get that from that agent, and also talk to the manager; if this is not good enough, do HITL too. This is too much uncertainty for me, and overkill.
I like my software to be deterministic, IO-driven, phased, and iteratively built, where each step produces a result I can predict and pass on. This is easily achieved when we don't make software for the AI agents, but for the end users.
Agent frameworks are too chatty. I have not come across an AI agent implementation that could not have been done with plain glue coding: passing the result from one AI to another, getting context from tooling, and moving on to the next step of shitty code.
A simple example: "Hey, agents, write me an SEO-optimized blog on AI agents that will rank in the top 10." Any AI will NOT refuse and will give you an answer. An AI agent team will work 100x harder, chatting and convincing you the post will go viral (and I can fly without my paraglider...).
Without an AI framework, we need to do web research, get your target audience, GSC, Bing analytics, check your existing blogs and trending topics, analyze competitors, run fact/hallucination checks, outline, draft, generate SEO metadata, produce supporting articles for other social media platforms, publish, analyze/monitor, edit/update, and remarket. To get the above done, AI is only 20%, and 80% is the existing shitty glue code the world is used to. So the idea is to make ALwrity a platform more intelligent than its AI; we only need AI to talk back in human and machine languages, the reasoning is in algorithms, and tool calling is shitty glue code.
ALwrity is an environmentally conscious, AI-first team; we will not give an extra joule of energy to AI and will starve them whenever possible. We will fine-tune our own SLMs and glue them together, without them even knowing that there is another AI in the workflow: let one small AI do one small thing, very well.
I will be working on this soon : https://openrouter.ai/docs/use-cases/oauth-pkce
@AJaySi commented on GitHub (Oct 30, 2025):
OK, so I will be supporting the Hugging Face Responses API: https://huggingface.co/docs/inference-providers/guides/responses-api
Note: This is implemented in the AI blog writer and committed; I will now change the onboarding process to use it.
Use: HF_TOKEN and GPT_PROVIDER=gemini|huggingface_response_api
"""
The Responses API (from OpenAI) provides a unified interface for model interactions with Hugging Face Inference Providers. Use your existing OpenAI SDKs to access features like multi-provider routing, event streaming, structured outputs, and Remote MCP tools.
"""