mirror of
https://github.com/healthchecks/healthchecks.git
synced 2026-04-25 06:55:53 +03:00
[GH-ISSUE #1054] ERROR: Too many open files #735
Originally created by @bilalvadion on GitHub (Sep 2, 2024).
Original GitHub issue: https://github.com/healthchecks/healthchecks/issues/1054
Healthcheck ping returned a 500 internal server error to my client app. Upon investigation, I found the following error:
OSError: [Errno 24] Too many open files:
Please find syslog subset in log file attached below.
too-many-open-files.log
I have tried increasing the ulimit, but the issue still persists.
Reference: https://stackoverflow.com/questions/39537731/errno-24-too-many-open-files-but-i-am-not-opening-files
Current ulimit is 65535.
Environment:
Ubuntu 20.04.6 LTS
Python 3.8
@cuu508 commented on GitHub (Sep 2, 2024):
Can you please check if the limit is in fact 65535:
Also, please check what files are open:
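As a rough sketch only (these are not the commands from the original comment), both checks can also be done from Python on Linux. The standard `resource` module reports the soft and hard open-file limits of the current process, and `/proc/<pid>/fd` holds one entry per open descriptor. The function names below are made up for illustration, and reading another user's `/proc/<pid>/fd` typically requires root.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def read_proc_limits(pid):
    """Return the 'Max open files' line from /proc/<pid>/limits (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            if line.startswith("Max open files"):
                return line.rstrip("\n")
    return None


def count_open_fds(pid="self"):
    """Count entries in /proc/<pid>/fd, one per open descriptor (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(read_proc_limits(os.getpid()))
    print(f"open descriptors: {count_open_fds()}")
```

The limit that matters is the one of the running server process, which can differ from what `ulimit -n` prints in an interactive shell.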
@bilalvadion commented on GitHub (Sep 2, 2024):
@cuu508 here are the command results:
@cuu508 commented on GitHub (Sep 2, 2024):
/proc/<PID>/limits shows the soft limit is 1024 and the hard limit is 524288. Neither is 65535, so something's not right with your ulimit usage.
Can you reproduce the "Too many open files" error? If yes, please run lsof -p <pid>, then send me the full output (you can send it privately to contact at healthchecks io). I'd like to figure out how the file handles are being spent.
@bilalvadion commented on GitHub (Sep 4, 2024):
I increased the soft limits using the steps below:
But this had an unexpected side effect on the healthchecks service: the error below repeats on every ping request:
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: ----------------------------------------
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: Exception happened during processing of request from ('xxx.xx.xx.xxx', 56782)
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: Traceback (most recent call last):
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: File "/usr/lib/python3.8/socketserver.py", line 316, in _handle_request_noblock
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: self.process_request(request, client_address)
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: File "/usr/lib/python3.8/socketserver.py", line 697, in process_request
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: t.start()
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: File "/usr/lib/python3.8/threading.py", line 852, in start
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: _start_new_thread(self._bootstrap, ())
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: RuntimeError: can't start new thread
Sep 4 00:00:00 ip-172-31-6-192 python3.8[3902760]: ----------------------------------------
Some other logs at the time of exception:
@cuu508 commented on GitHub (Sep 4, 2024):
I haven't run into the "RuntimeError: can't start new thread" error during request handling before, and don't know what might be causing it.
A few observations:
- manage.py runserver is a development server. For production use, consider switching to uwsgi or gunicorn.
- The lsof -p 3902760 | wc -l -> 2285 output seems high. On my dev system, with the development server running, there are only ~100 file handles open. Can I see the list of open files (same command, without wc -l)?
- There are python manage.py runserver processes started at different times. This looks suspicious to me. Do you perhaps have an old process taking over a port?
@bilalvadion commented on GitHub (Sep 4, 2024):
@cuu508 I have emailed you the lsof output on contact at healthchecks io.
We are using systemctl to manage our service. Given below is the configuration:
I am using healthchecks 2.10 and would be happy to migrate to the latest version. Is there a migration guide for that? We don't want to re-enter all the projects and integrations, and ideally the ping URLs should not change for our production clients.
@cuu508 commented on GitHub (Sep 4, 2024):
I received the full lsof logs, thanks!
They show ~2000 open TCP connections to port 8080. Normally, a client would close the connection after it is done with the request. And if it doesn't, the server would close it after some timeout. I guess manage.py runserver doesn't do that.
Regarding upgrade, there's no full guide, but a minimal version would be:
- Run manage.py migrate to apply database migrations.
- Run manage.py collectstatic and manage.py compress.
You can jump multiple releases, you do not need to do upgrades one-by-one.
If you are also upgrading the Python version, you will need to recreate the virtualenv. The virtualenv created with Python 3.8 will not work with Python 3.9+.
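The management commands named above could be chained in a small script, sketched here under assumptions. It covers only those three commands (`migrate`, `collectstatic`, `compress`) and none of the other upgrade steps. The function names, the `dry_run` flag, and the default `manage.py` path are hypothetical, not part of Healthchecks.

```python
import subprocess
import sys

# Only the three management commands mentioned in the comment above.
UPGRADE_COMMANDS = ["migrate", "collectstatic", "compress"]


def build_upgrade_commands(python=sys.executable, manage_py="manage.py"):
    """Return one argv list per management command, in the order given above."""
    return [[python, manage_py, name] for name in UPGRADE_COMMANDS]


def run_upgrade(dry_run=True, **kwargs):
    """Print each command; execute it too when dry_run is False."""
    for cmd in build_upgrade_commands(**kwargs):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError, stopping at the first failure.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_upgrade(dry_run=True)
```

Run it from the directory that contains `manage.py`, with the virtualenv's Python, and back up the database first. Note that `collectstatic` may ask for confirmation on the terminal.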
@bilalvadion commented on GitHub (Sep 5, 2024):
@cuu508 running on uWSGI fixed the issue. Possibly a Django dev server limitation. I am making a PR with minimal steps for uWSGI in the Production section.
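For anyone chasing a similar leak, the diagnosis in this thread came from counting open TCP connections in the lsof output. A minimal sketch of that tally follows. It assumes the default lsof column layout, and the sample text is invented for illustration (documentation-range IP addresses, a made-up PID and user). It is not the output from this issue.

```python
from collections import Counter

# Invented sample in the default `lsof -p <pid>` layout; not data from this issue.
SAMPLE = """\
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python3.8 12345 hc      3u  IPv4 100001      0t0  TCP *:8080 (LISTEN)
python3.8 12345 hc      5u  IPv4 100002      0t0  TCP 10.0.0.5:8080->203.0.113.7:56782 (ESTABLISHED)
python3.8 12345 hc      6u  IPv4 100003      0t0  TCP 10.0.0.5:8080->203.0.113.8:40112 (CLOSE_WAIT)
python3.8 12345 hc      7u  IPv4 100004      0t0  TCP 10.0.0.5:8080->203.0.113.9:40113 (CLOSE_WAIT)
python3.8 12345 hc      8r   REG    8,1     4096 2222 /srv/app/hc.sqlite
"""


def tally_tcp_states(lsof_text, local_port):
    """Count TCP sockets bound to local_port, grouped by connection state."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # TCP rows have "TCP" in the NODE column, followed by NAME and "(STATE)".
        if len(fields) < 10 or fields[7] != "TCP":
            continue
        name, state = fields[8], fields[9].strip("()")
        local = name.split("->")[0]  # local side of "local->remote"
        if local.endswith(f":{local_port}"):
            counts[state] += 1
    return counts


if __name__ == "__main__":
    for state, n in tally_tcp_states(SAMPLE, 8080).most_common():
        print(f"{state}: {n}")
```

A count that keeps growing while traffic stays flat suggests connections are not being closed, which matches what was observed here with the development server.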