[GH-ISSUE #187] mongodb timeouts (serviceapi crashes) #183

Closed
opened 2026-02-27 15:55:31 +03:00 by kerem · 7 comments
Owner

Originally created by @maydo on GitHub (Aug 30, 2018).
Original GitHub issue: https://github.com/RD17/ambar/issues/187

Hi,

I've been running Ambar for a few months now. It has to crawl over 2 million files.

setup:
4 smb shares > 4 crawlers
6 pipelines
ES_JAVA_OPTS=-Xms12g -Xmx12g
mongodb cacheSizeGB=4
Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz, 20 cores
40GB RAM
Ubuntu VM on SSDs

It has currently crawled ~750k files and was running fine so far.
A few days ago serviceapi started crashing about once per hour with this error:

MongoError: connection 6 to db:27017 timed out
at Function.MongoError.create (/node_modules/mongodb-core/lib/error.js:29:11)
at Socket.<anonymous> (/node_modules/mongodb-core/lib/connection/connection.js:186:20)
at Object.onceWrapper (events.js:313:30)
at emitNone (events.js:106:13)
at Socket.emit (events.js:208:7)
at Socket._onTimeout (net.js:420:8)
at ontimeout (timers.js:482:11)
at tryOnTimeout (timers.js:317:5)
at Timer.listOnTimeout (timers.js:277:5)

I have deleted all containers and images and re-pulled / recreated them, but I get the same issue.
How do I fix this?

kerem 2026-02-27 15:55:31 +03:00
  • closed this issue
  • added the wontfix label

@sochix commented on GitHub (Aug 30, 2018):

@maydo good question! I think you have 3 options:

  1. Play with the mongodb cacheSize; maybe it will help.
  2. Decrease the number of pipelines. It will help for sure, but file processing speed will decrease too.
  3. Try to create a mongodb cluster. I think it's possible, and it should only require changes in the docker-compose file.
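For options 1 and 2, the changes live in the docker-compose file. A rough sketch, assuming the stock Ambar service layout (the `cacheSizeGB` variable appears in this thread; the service and scale names are illustrative):

```yaml
# Sketch only: option 1, raise the WiredTiger cache on the db service.
db:
  environment:
    - cacheSizeGB=6          # was 4 in this setup
# Option 2: run fewer pipeline containers, e.g. scale 6 down to 4
# (assuming the pipeline service is named "pipeline"):
#   docker-compose up -d --scale pipeline=4
```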
Author
Owner

@maydo commented on GitHub (Aug 30, 2018):

@sochix thank you for the information.

I have just removed one pipeline and increased the mongo cache to 6 GB.

I will also try a mongodb cluster; I think it could work.
Here are some instructions for building the cluster with docker-compose:
https://dzone.com/articles/composing-a-sharded-mongodb-on-docker
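As a rough idea of what a clustered setup could look like in docker-compose, here is a minimal three-member replica-set sketch (a simpler first step than the full sharded cluster in that article; image tags, service names, and cache sizes are all illustrative assumptions):

```yaml
# Sketch only: three mongod members forming a replica set "rs0".
services:
  mongo1:
    image: mongo:3.6
    command: mongod --replSet rs0 --wiredTigerCacheSizeGB 2
  mongo2:
    image: mongo:3.6
    command: mongod --replSet rs0 --wiredTigerCacheSizeGB 2
  mongo3:
    image: mongo:3.6
    command: mongod --replSet rs0 --wiredTigerCacheSizeGB 2
# Once the containers are up, initiate the set from any member:
#   docker-compose exec mongo1 mongo --eval \
#     'rs.initiate({_id:"rs0", members:[{_id:0,host:"mongo1"},
#                                       {_id:1,host:"mongo2"},
#                                       {_id:2,host:"mongo3"}]})'
```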


@sochix commented on GitHub (Aug 30, 2018):

@maydo please keep this issue updated with your progress


@maydo commented on GitHub (Sep 4, 2018):

Hi, I think there are more issues here than just mongodb.

I have decreased to 2 pipelines.
I don't think that is the issue; the system has enough resources.

When I start the containers, they begin crawling through the shares.
It takes about 24 hours to re-crawl the ~750k files that have already been crawled before reaching the point where uncrawled files start.

It then crawls new files for 1-2 hours, and then all containers become unstable:
they report a "healthy" status, but you cannot restart, stop, or kill them. They are frozen.
The docker service can't be killed either.

You need to kill the whole VM.

The container logs all show connection issues:

{"log":"2018-09-04 08:05:05.106401: [error] [2] 0 HTTPConnectionPool(host='serviceapi', port=8081): Max retries exceeded with url: /api/logs (Caused by NewConnectionError('\u003curllib3.connection.HTTPConnection object at 0x7f878af321d0\u003e: Failed to establish a new connection: [Errno 111] Connection refused',))\n","stream":"stderr","time":"2018-09-04T08:05:05.109670518Z"}

{"log":"closing AMQP connection \u003c0.23443.0\u003e (172.18.0.6:37418 -\u003e 172.18.0.11:5672, vhost: '/', user: 'guest'):\n","stream":"stdout","time":"2018-09-04T01:17:25.023473373Z"}
{"log":"client unexpectedly closed TCP connection\n","stream":"stdout","time":"2018-09-04T01:17:25.023485073Z"}


@sochix commented on GitHub (Sep 6, 2018):

@maydo decrease the maximum file size to 15 MB and check whether everything runs smoothly
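For anyone else trying this, the limit is a crawler setting. Only the 15 MB value comes from this thread; the key name below is an assumption, so check your crawler configuration for the actual option:

```yaml
# Illustrative only: cap crawled file size at 15 MB (15 * 1024 * 1024 bytes).
crawler:
  max_file_size_bytes: 15728640
```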


@maydo commented on GitHub (Sep 7, 2018):

@sochix thank you for the tip.

It seems OK for now; it is crawling smoothly with 2 pipelines and the 15 MB file size limit.

Maybe you can add support for mongo clusters in future versions.


@stale[bot] commented on GitHub (Sep 22, 2018):

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
