Mirror of https://github.com/RD17/ambar.git, synced 2026-04-25 15:35:49 +03:00
[GH-ISSUE #192] ES keeps restarting, won't process any files #188
Originally created by @Triangulum9r on GitHub (Oct 11, 2018).
Original GitHub issue: https://github.com/RD17/ambar/issues/192
Elastic Search keeps restarting.
```
webapi_1    | Catastrophic failure! { Error: read ECONNRESET
webapi_1    |     at _errnoException (util.js:1022:11)
webapi_1    |     at TCP.onread (net.js:628:25)
webapi_1    |   cause: { Error: read ECONNRESET
webapi_1    |       at _errnoException (util.js:1022:11)
webapi_1    |       at TCP.onread (net.js:628:25)
webapi_1    |     code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' },
webapi_1    |   isOperational: true,
webapi_1    |   code: 'ECONNRESET',
webapi_1    |   errno: 'ECONNRESET',
webapi_1    |   syscall: 'read' }
webapi_1    | Catastrophic failure! { Error: read ECONNRESET
webapi_1    |     at _errnoException (util.js:1022:11)
webapi_1    |     at TCP.onread (net.js:628:25)
webapi_1    |   cause: { Error: read ECONNRESET
webapi_1    |       at _errnoException (util.js:1022:11)
webapi_1    |       at TCP.onread (net.js:628:25)
webapi_1    |     code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' },
webapi_1    |   isOperational: true,
webapi_1    |   code: 'ECONNRESET',
webapi_1    |   errno: 'ECONNRESET',
webapi_1    |   syscall: 'read' }
webapi_1    | Started on :::8080
```

RabbitMQ is crashing too:
```
2018-10-11 19:35:52 Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces
2018-10-11 19:35:52 crash_report
  initial_call: {auth,init,['Argument__1']}
  pid: <0.46.0>
  registered_name: []
  error_info: {exit,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,352}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}
  ancestors: [net_sup,kernel_sup,<0.34.0>]
  messages: []
  links: [<0.44.0>]
  dictionary: []
  trap_exit: true
  status: running
  heap_size: 610
  stack_size: 27
  reductions: 668
2018-10-11 19:35:52 supervisor_report
  supervisor: {local,net_sup}
  errorContext: start_error
  reason: {"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}
  offender: [{pid,undefined},{id,auth},{mfargs,{auth,start_link,[]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]
2018-10-11 19:35:52 supervisor_report
  supervisor: {local,kernel_sup}
  errorContext: start_error
  reason: {shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}}}
  offender: [{pid,undefined},{id,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]
2018-10-11 19:35:52 crash_report
  initial_call: {application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}
  pid: <0.33.0>
  registered_name: []
  error_info: {exit,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}}}}},{kernel,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,134}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}
  ancestors: [<0.32.0>]
  messages: [{'EXIT',<0.34.0>,normal}]
  links: [<0.32.0>,<0.31.0>]
  dictionary: []
  trap_exit: true
  status: running
  heap_size: 987
  stack_size: 27
  reductions: 175
2018-10-11 19:35:52 std_info
  application: kernel
  exited: {{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}}}}},{kernel,start,[normal,[]]}}
  type: permanent
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}}}}},{kernel,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbitmq
Crash dump is being written to: erl_crash.dump...
/usr/lib/rabbitmq/bin/rabbitmq-server: 51: /usr/lib/rabbitmq/bin/rabbitmq-server: cannot create /var/lib/rabbitmq/mnesia/rabbit@rabbit.pid: Permission denied
Failed to write pid file: /var/lib/rabbitmq/mnesia/rabbit@rabbit.pid
=INFO REPORT==== 11-Oct-2018::19:36:04 ===
Starting RabbitMQ 3.6.16 on Erlang 19.2.1
Copyright (C) 2007-2018 Pivotal Software, Inc.
Licensed under the MPL. See http://www.rabbitmq.com/
Logs: tty
Starting broker...
=INFO REPORT==== 11-Oct-2018::19:36:04 ===
node           : rabbit@rabbit
home dir       : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config
cookie hash    : s57KxGNCGYKGa241b3lTgg==
log            : tty
sasl log       : tty
database dir   : /var/lib/rabbitmq/mnesia/rabbit@rabbit
=INFO REPORT==== 11-Oct-2018::19:36:06 ===
Memory high watermark set to 3193 MiB (3348381696 bytes) of 7983 MiB (8370954240 bytes) total
=INFO REPORT==== 11-Oct-2018::19:36:06 ===
Enabling free disk space monitoring
=INFO REPORT==== 11-Oct-2018::19:36:06 ===
Disk free limit set to 50MB
=INFO REPORT==== 11-Oct-2018::19:36:06 ===
Limiting to approx 1048476 file handles (943626 sockets)
=INFO REPORT==== 11-Oct-2018::19:36:06 ===
FHC read buffering:  OFF
FHC write buffering: ON
=INFO REPORT==== 11-Oct-2018::19:36:06 ===
Waiting for Mnesia tables for 30000 ms, 9 retries left
=CRASH REPORT==== 11-Oct-2018::19:36:06 ===
  crasher:
    initial call: application_master:init/4
    pid: <0.158.0>
    registered_name: []
    exception exit: {{could_not_write_file,
                         "/var/lib/rabbitmq/mnesia/rabbit@rabbit/cluster_nodes.config",
                         eacces},
                     {rabbit,start,[normal,[]]}}
      in function  application_master:init/4 (application_master.erl, line 134)
    ancestors: [<0.156.0>]
    messages: [{'EXIT',<0.160.0>,normal}]
    links: [<0.156.0>,<0.31.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 1598
    stack_size: 27
    reductions: 98
  neighbours:
=INFO REPORT==== 11-Oct-2018::19:36:06 ===
    application: rabbit
    exited: {{could_not_write_file,"/var/lib/rabbitmq/mnesia/rabbit@rabbit/cluster_nodes.config",
                 eacces},
             {rabbit,start,[normal,[]]}}
    type: transient
2018-10-11 19:36:07 Error in process <0.3.0> with exit value:
{badarg,[{ets,lookup,[ac_tab,{env,rabbit,error_logger}],[]},{application_controller,get_env,2,[{file,"application_controller.erl"},{line,332}]},{rabbit,log_location,1,[{file,"src/rabbit.erl"},{line,893}]},{rabbit,boot_error,2,[{file,"src/rabbit.erl"},{line,786}]},{rabbit,start_it,1,[{file,"src/rabbit.erl"},{line,430}]},{init,start_em,1,[]},{init,do_boot,3,[]}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{{could_not_write_file,"/var/lib/rabbitmq/mnesia/rabbit@rabbit/cluster_nodes.config",eacces},{rabbit,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{{could_not_write_file,"/var/lib/rabbitmq/mnesia/rabbit@rabbit/cluster_nodes.config",eacces},{rabbit,start,[normal,[]]
Crash dump is being written to: erl_crash.dump...
=ERROR REPORT==== 11-Oct-2018::19:36:18 ===
Mnesia(rabbit@rabbit): ** ERROR ** (could not write core file: eacces)
 ** FATAL ** mnesia_tm crashed: {"Cannot read schema",
                                 "/var/lib/rabbitmq/mnesia/rabbit@rabbit/schema.DAT",
                                 {error,
                                     {file_error,
                                         "/var/lib/rabbitmq/mnesia/rabbit@rabbit/schema.DAT",
                                         eacces}}} state: [<0.123.0>]
```
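Every `eacces` in the log above points at a path under `/var/lib/rabbitmq`, which the compose file below maps to the host folder `~/midas/ambar`. A quick way to check whether that host directory is actually readable and writable by the container's `rabbitmq` user is a sketch like the following. The path and the idea of comparing numeric uids are assumptions based on the compose file, not part of Ambar's documentation; find the container user's uid with `docker exec <container> id rabbitmq`.

```shell
# Inspect ownership and permissions of the host directory backing the RabbitMQ
# volume. "eacces" in the Erlang log means the in-container user cannot read
# or write these paths. VOLUME_DIR is an assumption taken from the compose file.
VOLUME_DIR="${VOLUME_DIR:-$HOME/midas/ambar}"

if [ -d "$VOLUME_DIR" ]; then
    # -n prints numeric uid/gid, so they can be compared with the container user's ids
    ls -ldn "$VOLUME_DIR"
    # The cookie file RabbitMQ failed to read, per the log above:
    ls -ln "$VOLUME_DIR/.erlang.cookie" 2>/dev/null \
        || echo "no .erlang.cookie in $VOLUME_DIR"
else
    echo "volume dir $VOLUME_DIR does not exist yet"
fi
```

If the uid/gid printed for the directory differ from the container user's, that mismatch is consistent with every `eacces` above.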
Docker Compose:

```yaml
version: "2.1"
networks:
  internal_network:
services:
  db:
    restart: always
    networks:
      - internal_network
    image: ambar/ambar-mongodb:latest
    environment:
      - cacheSizeGB=2
    volumes:
      - ~/midas/ambar:/data/db
    expose:
      - "27017"
    ports:
      - "27017:27017"
  es:
    restart: always
    networks:
      - internal_network
    image: ambar/ambar-es:latest
    expose:
      - "9200"
    environment:
      - cluster.name=ambar-es
      - ES_JAVA_OPTS=-Xms2g -Xmx2g
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    cap_add:
      - IPC_LOCK
    volumes:
      - ~/midas/ambar:/usr/share/elasticsearch/data
  rabbit:
    restart: always
    networks:
      - internal_network
    image: ambar/ambar-rabbit:latest
    hostname: rabbit
    expose:
      - "15672"
      - "5672"
    volumes:
      - ~/midas/ambar:/var/lib/rabbitmq
  redis:
    restart: always
    sysctls:
      - net.core.somaxconn=1024
    networks:
      - internal_network
    image: ambar/ambar-redis:latest
    expose:
      - "6379"
  serviceapi:
    depends_on:
      redis:
        condition: service_healthy
      rabbit:
        condition: service_healthy
      es:
        condition: service_healthy
      db:
        condition: service_healthy
    restart: always
    networks:
      - internal_network
    image: ambar/ambar-serviceapi:latest
    expose:
      - "8081"
    environment:
      - mongoDbUrl=mongodb://db:27017/ambar_data
      - elasticSearchUrl=http://es:9200
      - redisHost=redis
      - redisPort=6379
      - rabbitHost=amqp://rabbit
      - langAnalyzer=ambar_en
  webapi:
    depends_on:
      serviceapi:
        condition: service_healthy
    restart: always
    networks:
      - internal_network
    image: ambar/ambar-webapi:latest
    expose:
      - "8080"
    ports:
      - "8080:8080"
    environment:
      - uiLang=en
      - mongoDbUrl=mongodb://db:27017/ambar_data
      - elasticSearchUrl=http://es:9200
      - redisHost=redis
      - redisPort=6379
      - serviceApiUrl=http://serviceapi:8081
      - rabbitHost=amqp://rabbit
  frontend:
    depends_on:
      webapi:
        condition: service_healthy
    # image: ambar/ambar-frontend:latest
    build:
      context: ./FrontEnd
      dockerfile: Dockerfile
    restart: always
    networks:
      - internal_network
    ports:
      - "80:80"
    expose:
      - "80"
    environment:
      - api=http://127.0.0.1:8080
  pipeline0:
    depends_on:
      serviceapi:
        condition: service_healthy
    image: ambar/ambar-pipeline:latest
    restart: always
    networks:
      - internal_network
    environment:
      - id=0
      - api_url=http://serviceapi:8081
      - rabbit_host=amqp://rabbit
```
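Note that all three stateful services (`db`, `es`, `rabbit`) mount the same host folder `~/midas/ambar`, so MongoDB, Elasticsearch, and RabbitMQ all write into one shared directory with conflicting ownership expectations. That is a plausible source of both the `eacces` errors and the Elasticsearch restarts. A sketch of the relevant fragment with one subdirectory per service (the subdirectory names are illustrative assumptions, not Ambar's documented layout):

```yaml
# Fragment only -- each stateful service gets its own host subdirectory.
services:
  db:
    volumes:
      - ~/midas/ambar/db:/data/db
  es:
    volumes:
      - ~/midas/ambar/es:/usr/share/elasticsearch/data
  rabbit:
    volumes:
      - ~/midas/ambar/rabbit:/var/lib/rabbitmq
```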
@Triangulum9r commented on GitHub (Oct 11, 2018):
Figured it out myself: I deleted and recreated the folder that the volumes were pointing to. This fixed the issue.
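The fix described above can be sketched as follows, with one extra safeguard: giving each stateful service its own subdirectory so the containers stop sharing a single data folder. The path and the commented-out uid are assumptions, not Ambar's documented procedure; adjust them to your setup, and note that this deletes all existing data.

```shell
# Stop the stack, recreate the data directory, then bring it back up.
# DATA_DIR is an assumption taken from the compose file's volume mounts.
DATA_DIR="${DATA_DIR:-$HOME/midas/ambar}"

# Stop the stack first (skipped if docker-compose is not on PATH)
if command -v docker-compose >/dev/null 2>&1; then
    docker-compose down || true
fi

rm -rf "$DATA_DIR"    # the "delete the folder" step -- destroys all indexed data!
mkdir -p "$DATA_DIR/db" "$DATA_DIR/es" "$DATA_DIR/rabbit"

# If eacces persists, hand the per-service dirs to the container users.
# (uid 999 is an assumption -- check with `docker exec <service> id`; needs root)
# chown -R 999:999 "$DATA_DIR/es" "$DATA_DIR/rabbit"

if command -v docker-compose >/dev/null 2>&1; then
    docker-compose up -d || true
fi
```

With per-service subdirectories the compose file's volume mappings must be updated to match (e.g. `~/midas/ambar/rabbit:/var/lib/rabbitmq`).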
@ajeebkp23 commented on GitHub (Oct 17, 2018):
@mas-dse-juremigi You may close this issue if your issue is resolved.
@zx2slow commented on GitHub (Jan 15, 2019):
I am seeing a similar issue on a new install:
`docker-compose ps`
`docker-compose.yml`
`docker-compose logs | tail -40`
@sanikolov commented on GitHub (May 13, 2019):
Yes, still happening with the latest images. It tends to occur when the file to be processed is large: in my case a 150 MB DjVu (which cannot be parsed by Tika) and a 120 MB EPUB (which Apache Tika can parse).
Is there a timeout that can be defined or tuned in the yml file?
The other option, of course, is to reduce file sizes, but then some files are left out.
Ideally the container ought to detect the infinite loop and move on to other files.
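I can't confirm that Ambar exposes a per-file processing timeout in the yml file. A workaround until then is to find the oversized files in the crawled directory and move them aside before indexing. A minimal sketch, where the path and the 100 MB threshold are illustrative assumptions:

```shell
# List files above a size threshold in the directory Ambar crawls, so they
# can be excluded or moved aside before indexing. CRAWL_DIR and LIMIT_MB
# are assumptions -- set them to your own crawl folder and cutoff.
CRAWL_DIR="${CRAWL_DIR:-.}"
LIMIT_MB=100

find "$CRAWL_DIR" -type f -size +"${LIMIT_MB}M" 2>/dev/null
```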