[PR #380] [CLOSED] Add competitor_analysis fallback for deep competitor onboarding scheduling #684

opened 2026-03-13 21:03:43 +03:00 by kerem · 0 comments

kerem (Owner) commented 2026-03-13 21:03:43 +03:00

📋 Pull Request Information

Original PR: https://github.com/AJaySi/ALwrity/pull/380
Author: @AJaySi
Created: 3/5/2026
Status: ❌ Closed

Base: main ← Head: codex/update-deep-competitor-scheduling-logic

📝 Commits (1)

0bc2f5c Add competitor_analysis fallback for deep competitor scheduling

📊 Changes

2 files changed (+212 additions, -5 deletions)

📝 backend/api/onboarding_utils/onboarding_completion_service.py (+70 -5)
➕ backend/tests/test_onboarding_completion_service.py (+142 -0)

📄 Description

Motivation

Ensure deep competitor analysis is scheduled even when research_preferences.competitors is empty by falling back to persisted CompetitorAnalysis records.
Normalize legacy CompetitorAnalysis records into the URL/domain format expected by the DeepCompetitorAnalysisTask payload so the analysis executor receives valid inputs.
Avoid silently skipping deep competitor scheduling when competitor data exists only in the alternate persistence (CompetitorAnalysis records), and emit explicit logs for observability.

Description

Added _normalize_competitor_identifier to convert a string or CompetitorAnalysis-style record (keys like competitor_url, url, website_url, competitor_domain, domain) into a normalized URL or domain string.
Added _get_deep_competitor_targets, which keeps research_preferences.competitors as the primary source and falls back to normalized, deduplicated entries from integrated_data["competitor_analysis"].
Integrated the fallback into complete_onboarding() so a DeepCompetitorAnalysisTask is scheduled when either source has entries and is skipped only when both are empty.
Added explicit logs indicating which source was used (research_preferences vs competitor_analysis) and the competitor count, and adjusted the skip warning message.
Added a unit test in backend/tests/test_onboarding_completion_service.py that mocks integration points and asserts that a deep competitor task is created and that its payload contains normalized competitor URLs/domains when only competitor_analysis exists.
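
The normalization and fallback described above can be sketched roughly as follows. This is not the code from the PR: the function bodies, the precedence among the record keys, the specific normalization rules (trimming, lowercasing the host, dropping a trailing slash), the nesting of `competitors` under `integrated_data["research_preferences"]`, and the returned source label are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple, Union
from urllib.parse import urlparse

# Keys named in the PR description; trying them in this order is an assumption.
_CANDIDATE_KEYS = ("competitor_url", "url", "website_url", "competitor_domain", "domain")


def normalize_competitor_identifier(entry: Union[str, Dict[str, Any], None]) -> Optional[str]:
    """Return a cleaned URL or domain string, or None if nothing usable is found."""
    if isinstance(entry, dict):
        # Pick the first candidate key that holds a truthy value.
        entry = next((entry[k] for k in _CANDIDATE_KEYS if entry.get(k)), None)
    if not isinstance(entry, str):
        return None
    value = entry.strip()
    if not value:
        return None
    if "://" in value:
        parsed = urlparse(value)
        if not parsed.netloc:
            return None
        # Lowercase the host, keep the path, drop a trailing slash.
        return f"{parsed.scheme}://{parsed.netloc.lower()}{parsed.path.rstrip('/')}"
    # Bare domain: lowercase and drop a trailing slash.
    return value.lower().rstrip("/")


def get_deep_competitor_targets(integrated_data: Dict[str, Any]) -> Tuple[List[str], str]:
    """Return (targets, source), preferring research_preferences over competitor_analysis."""
    prefs = integrated_data.get("research_preferences") or {}
    sources = (
        ("research_preferences", prefs.get("competitors") or []),
        ("competitor_analysis", integrated_data.get("competitor_analysis") or []),
    )
    for source, raw_entries in sources:
        seen = set()
        targets: List[str] = []
        for raw in raw_entries:
            normalized = normalize_competitor_identifier(raw)
            # Deduplicate while preserving the original order.
            if normalized and normalized not in seen:
                seen.add(normalized)
                targets.append(normalized)
        if targets:
            return targets, source
    # Both sources empty: the caller would log a warning and skip scheduling.
    return [], "none"
```

Returning the source label next to the targets lets the caller log which source was used and how many competitors were found, as the PR description mentions.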

Testing

Ran pytest -q backend/tests/test_onboarding_completion_service.py, which passed (1 test, 1 passed, 1 warning).
The new unit test exercises complete_onboarding() with mocked integration and database hooks and verifies the created task payload contains normalized competitors.
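
A test of this shape might look like the sketch below. It does not reproduce the PR's test: `schedule_deep_competitor_analysis`, the `task_store.create_task` interface, the `task_type` value, and the payload layout are hypothetical stand-ins for the scheduling step inside complete_onboarding(), included only so the example runs on its own.

```python
from unittest.mock import MagicMock


def schedule_deep_competitor_analysis(integrated_data, task_store):
    """Hypothetical stand-in for the scheduling step inside complete_onboarding()."""
    prefs = integrated_data.get("research_preferences") or {}
    # Primary source first; fall back to persisted competitor_analysis records.
    competitors = prefs.get("competitors") or [
        record.get("competitor_url") or record.get("domain")
        for record in integrated_data.get("competitor_analysis") or []
    ]
    competitors = [c for c in competitors if c]
    if not competitors:
        return None  # Both sources empty: nothing is scheduled.
    return task_store.create_task(
        task_type="deep_competitor_analysis",
        payload={"competitors": competitors},
    )


def test_fallback_schedules_task_from_competitor_analysis():
    task_store = MagicMock()  # Replaces the real database/task hooks.
    integrated_data = {
        "research_preferences": {"competitors": []},
        "competitor_analysis": [
            {"competitor_url": "https://example.com"},
            {"domain": "example.org"},
        ],
    }
    schedule_deep_competitor_analysis(integrated_data, task_store)
    task_store.create_task.assert_called_once()
    payload = task_store.create_task.call_args.kwargs["payload"]
    assert payload["competitors"] == ["https://example.com", "example.org"]


def test_no_task_when_both_sources_are_empty():
    task_store = MagicMock()
    result = schedule_deep_competitor_analysis({"competitor_analysis": []}, task_store)
    assert result is None
    task_store.create_task.assert_not_called()
```

Mocking the task store keeps the test independent of the database while still letting it inspect the exact payload passed to the scheduler.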

Codex Task: https://chatgpt.com/codex/tasks/task_e_69a9a27e7c848328b88fcf57c02d72e0

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.


kerem closed this issue and added the pull-request label 2026-03-13 21:03:43 +03:00
