mirror of
https://github.com/RayLabsHQ/gitea-mirror.git
synced 2026-04-25 15:25:55 +03:00
[GH-ISSUE #90] Error when trying to mirror a large number of repos #47
Originally created by @sunny-g on GitHub (Sep 7, 2025).
Original GitHub issue: https://github.com/RayLabsHQ/gitea-mirror/issues/90
Originally assigned to: @arunavo4 on GitHub.
When trying to mirror my GitHub account, I'm repeatedly running into an issue mirroring my starred repositories (4.6k). However, when just focusing on everything else, there doesn't seem to be any issue (thanks for the great project, btw).
What other info can I provide for you?
gitea-mirror: v3.5.4
@Tailscale-VPS commented on GitHub (Sep 8, 2025):
@sunny-g Please provide the repo URL.
Also the env variables you were using (redact sensitive info).
@sunny-g commented on GitHub (Sep 8, 2025):
Don't think I can provide a repo URL, as the job is failing just when trying to "mirror starred repositories" - here's a more complete log output when running the following configuration:
the service yaml:
@Tailscale-VPS commented on GitHub (Sep 8, 2025):
So it never started mirroring for you?
Can you try changing the batch value to 100 or 500? (I'm not an expert; this may overload your server/network/disk if it's low spec.)
Also, for testing you can change the schedule interval to 5 or 10 minutes, so you don't have to wait for hours to test.
If you can clean /mnt/user/appdata/gitea-mirror and start fresh, that's better; otherwise, deploy another temporary container.
Also, just to be sure: your repos haven't mirrored yet, right? They are in the imported state?
Because after the new update, creating the first user and then restarting the container will automatically start the repo mirror.
Not to troll or anything, but how much space do you have for archiving 4.6k repos 😅? Also, you are using auto mirroring; if I were you, I would definitely do them manually and slowly (so as not to get banned or blocked).
@sunny-g commented on GitHub (Sep 8, 2025):
Correct - I originally started with this config, enabling everything, and nothing was imported or migrated. I then nuked the gitea-mirror directory, started over with a barebones config, and was able to import AND migrate ~480 personal repos.
My own are imported and migrated (in gitea), and look great! The issue here is solely regarding "expanding" the space of imports and migrations to include starred repos.
No problem - I've had a github account since 2012 or something, and have TBs of space so not worried :)
How do I do this from within the UI? I'm seemingly not able to even just get the list of starred repos successfully ingested (though in some logs I can see that they are being retrieved)
Ultimately, it seems like "SQLite query expected 37133 values, received 102669" is suggesting a query/insert statement size issue, though I haven't dug into the codebase deeply enough to confirm. However, I'll try your suggestions as soon as I can. Thank you!
@Tailscale-VPS commented on GitHub (Sep 8, 2025):
To turn off auto mirroring, you need to check the env docs; there are certain env variables which, when used, will also enable auto mirroring (like the schedule/mirror interval, I believe).
If you are not able to even get the list of starred repos, I believe it's the massive 4.6k-repo import that is failing, probably due to SQLite constraints.
Here is what I think is the simplest way of checking:
Deploy a new container with auto mirroring off, then in the Web UI click on import from GitHub and check the logs. If it still gives the same error, we can convert this issue into an enhancement/bug. If that happens, then:
Probable solution:
Similar to the mirror batch we have in the schedule, we will need to implement a batch on import as well, OR, a bit easier, reuse the same env variable for both the mirror batch and the import batch.
The first option gives the user more flexibility, though, because import is not really disk or network intensive, so users can set a high import value (still within limits, since we are sending requests to GitHub), whereas the mirror batch depends solely on how good your Gitea server/network is.
Also, I believe we hit different APIs when importing versus mirroring. Importing is just information about your account, but mirroring a repo includes the information, codebase, releases, PRs, artifacts, etc.
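The fallback behaviour suggested here - one knob driving both batches unless an import-specific one is set - can be sketched as follows. The variable names MIRROR_BATCH and IMPORT_BATCH are hypothetical, not the project's actual configuration keys:

```python
import os

def batch_sizes(env=None):
    """Return (import_batch, mirror_batch) from an env mapping.

    IMPORT_BATCH falls back to MIRROR_BATCH, so a single variable can
    drive both, while still letting users raise the import batch
    independently (imports are API-bound, not disk/network intensive
    like mirroring). Names and the default of 10 are assumptions.
    """
    env = os.environ if env is None else env
    mirror_batch = int(env.get("MIRROR_BATCH", "10"))
    import_batch = int(env.get("IMPORT_BATCH", str(mirror_batch)))
    return import_batch, mirror_batch
```

With only MIRROR_BATCH=100 set, both batches become 100; adding IMPORT_BATCH=500 overrides the import side only.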
@arunavo4 commented on GitHub (Sep 9, 2025):
@sunny-g Please try out v3.6.0 and let me know if it fixes your issue. I have also added GitHub API rate limiting so that it can mirror your large number of repos.
@Jefferderp commented on GitHub (Sep 12, 2025):
I am currently experiencing this bug on v3.6.0, installed via docker-compose.yml. Please let me know what logs I can provide or troubleshooting steps to take. Thanks!
@arunavo4 commented on GitHub (Sep 13, 2025):
@Jefferderp how many repos do you have in total, and when does it happen? Does it happen at the start, or after how many mirrors? Can you provide some more context? Also, what is your config setup?
Because I am trying to replicate this and it has not occurred on my end.
@Jefferderp commented on GitHub (Sep 13, 2025):
@arunavo4 This is a fresh setup of gitea-mirror. I click "Import GitHub Data" for the first time, and the scan consistently runs until page 53, where it ends with the above error. I assume page 53 is the final page, so no error there; the error is thrown immediately afterwards.
GitHub says I have "5,000+" stars (wow!). Given that each request is for 100 repos, and there are 53 pages fetched, that comes out to ~5,300 repos. Seems like a match to me.
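That estimate checks out: GitHub's REST list endpoints cap per_page at 100, so the number of requests needed is just the ceiling of total items over page size. A quick sketch:

```python
import math

def pages_needed(total_items, per_page=100):
    # GitHub's REST API caps per_page at 100 for most list endpoints,
    # so a client must walk ceil(total / per_page) pages to see everything.
    return math.ceil(total_items / per_page)

# ~5,300 starred repos at 100 per page lines up with the 53 pages in the log.
print(pages_needed(5300))  # 53
```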
My docker-compose.yml is completely stock - all settings changed via the GUI. Screenshot:
The "Test Connection" buttons for GitHub and Gitea both return a "Successfully connected" message. No issues there.
I've added a few more starred repos in the past day, and the error message has changed a bit:
Consistently (at least 4 times now), the error triggers immediately after fetching page 53, where I assume it tries to iterate through the downloaded list of starred repos. Based on the timing of the error, I can tell this is some kind of internal issue. Each GitHub response takes ~2.5s, but the error triggers immediately after page 53 has been downloaded, which I strongly believe is the final page anyway. So not a networking issue, and not a rate limit (I can immediately retry the operation and download all 53 pages again).
I wish I could provide more context, but to my knowledge I have a pretty standard setup here.
Let me know if I can provide anything else, and thanks for your help!
@Jefferderp commented on GitHub (Sep 13, 2025):
In case it's helpful, I restarted the Docker container, and the following happened automatically on boot:
@arunavo4 commented on GitHub (Sep 13, 2025):
@Jefferderp thanks for the detailed info. The signed-in GitHub API is rate-limited at 5,000 requests/hr, and though I have implemented rate limiting, since I don't have that many repos it never fails on my end; this failure happens only when there is a high number of repos.
I will add more logging and see how this can be fixed.
@arunavo4 commented on GitHub (Sep 13, 2025):
@Jefferderp after checking, it looks like the high number of repos was exceeding SQLite's 999-parameter limit, so I have added batched inserts to the SQLite DB; that should solve the issue. Can you test out v3.7.0?
@Jefferderp commented on GitHub (Sep 13, 2025):
@arunavo4 That worked! Repos are mirroring now. Thank you so much :)
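For reference, the batched-insert approach described above can be sketched as follows, assuming a simplified two-column repos table rather than gitea-mirror's real schema:

```python
import sqlite3

SQLITE_VAR_LIMIT = 999  # conservative default; newer builds allow 32766
COLUMNS = 2             # columns per row in this simplified schema

def insert_in_batches(conn, rows, limit=SQLITE_VAR_LIMIT):
    # Derive the chunk size from the parameter cap divided by the number
    # of bound values per row, then issue one INSERT per chunk.
    rows_per_batch = max(1, limit // COLUMNS)  # 499 rows per INSERT here
    for start in range(0, len(rows), rows_per_batch):
        chunk = rows[start:start + rows_per_batch]
        placeholders = ",".join(["(?, ?)"] * len(chunk))
        flat = [value for row in chunk for value in row]
        conn.execute(f"INSERT INTO repos VALUES {placeholders}", flat)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (id INTEGER, name TEXT)")
insert_in_batches(conn, [(i, f"repo-{i}") for i in range(5000)])
print(conn.execute("SELECT COUNT(*) FROM repos").fetchone()[0])  # 5000
```

Each INSERT stays under the 999-parameter cap, so even thousands of repos load without tripping the limit.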
@arunavo4 commented on GitHub (Sep 14, 2025):
@Jefferderp Awesome
@sunny-g I think your problem should then also be solved now.
I will close this issue if you still have issues then feel free to reopen this.
@sunny-g commented on GitHub (Sep 14, 2025):
@arunavo4 seems good to go, thank you!