Description
I have a small Elasticsearch setup with a single node, which I'm writing to from Logstash using the jdbc input (paging enabled) and the aggregate filter. Since the data import involves processing about 60M rows from PostgreSQL, I'm running 10 independent Logstash instances (each with -w 1), each working on different data. All Logstash instances run on the same machine. After some time, one of the Logstash instances starts logging the following errors:
[2018-01-31T13:24:43,545][WARN ][logstash.outputs.elasticsearch] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/][Manticore::SocketException] Connection reset {:url=>http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/, :error_message=>"Elasticsearch Unreachable: [http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/][Manticore::SocketException] Connection reset", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[2018-01-31T13:24:43,569][ERROR][logstash.outputs.elasticsearch] Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/][Manticore::SocketException] Connection reset", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>2}
[2018-01-31T13:24:43,571][DEBUG][logstash.outputs.elasticsearch] Failed actions for last bad bulk request! {:actions=>[["index", {:_id=>nil (line truncated for brevity)
[2018-01-31T13:24:46,186][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/, :path=>"/"}
[2018-01-31T13:24:46,193][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/"}
[2018-01-31T13:25:02,342][WARN ][logstash.outputs.elasticsearch] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/][Manticore::SocketException] Connection reset {:url=>http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/, :error_message=>"Elasticsearch Unreachable: [http://XXXXXXX:xxxxxx@YYYYYYYYYYYY:9200/][Manticore::SocketException] Connection reset", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
(above lines repeated indefinitely)
From this moment on, _node/stats/pipelines for the faulty instance shows a constant events "out" count. The other instances keep writing data successfully, i.e. the doc count in Elasticsearch keeps growing. In the end, some Logstash instances finish properly and some end up in the state described above.
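For anyone trying to reproduce this, here is a rough sketch of how the stalled "out" counter can be spotted by sampling the monitoring API twice. It assumes the response has the shape `pipelines.<id>.events.out` (which is what I see on 6.x) and that each instance exposes its API on its own port (9600 by default, the next free port in the 9600-9700 range when several instances share a host). The function names and the sampling interval are made up for illustration.

```python
import json
import time
import urllib.request


def events_out(stats):
    """Map pipeline id -> events.out from a parsed _node/stats/pipelines response.

    Assumes the 6.x response shape: {"pipelines": {"<id>": {"events": {"out": N}}}}.
    """
    return {
        name: body.get("events", {}).get("out", 0)
        for name, body in stats.get("pipelines", {}).items()
    }


def stalled_pipelines(previous, current):
    """Return ids of pipelines whose 'out' counter did not grow between two samples."""
    return sorted(
        name
        for name, out in current.items()
        if name in previous and out <= previous[name]
    )


def fetch_stats(base_url):
    """Fetch and parse the pipeline stats of one Logstash instance."""
    with urllib.request.urlopen(base_url + "/_node/stats/pipelines", timeout=5) as resp:
        return json.load(resp)


def check_instance(base_url, interval_seconds=60):
    """Sample one instance twice and report pipelines that made no progress."""
    before = events_out(fetch_stats(base_url))
    time.sleep(interval_seconds)
    after = events_out(fetch_stats(base_url))
    return stalled_pipelines(before, after)
```

Calling `check_instance("http://localhost:9600")` against each instance's port would list the pipelines that emitted nothing during the interval. A stalled counter alone does not distinguish a stuck output from an input that has simply run out of rows, so it needs to be read together with the log output above.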
- Version: logstash-6.1.3
- Operating System: centos7
Any idea what could be wrong here? BTW, I don't see anything worrying in the Elasticsearch logs.