This repository was archived by the owner on Dec 15, 2021. It is now read-only.

Logstash (ELK) not streaming data in real time from DynamoDB, duplicating data upon restart, and fetching in seemingly random order #24

@ameyaloni

We're facing multiple issues with the ELK stack, and we suspect they stem from our Logstash configuration. The issues are as follows:

  1. Logstash, connected to DynamoDB Streams, isn't showing changes in real time, even though we explicitly set perform_stream => true in our Logstash configuration. Note: we do get the latest data if we restart Logstash (which runs in a Docker container). Could this be a cross-region issue, given that DynamoDB is in us-east-1 while Logstash and Elasticsearch are in us-west-1?

  2. Upon restarting Logstash, the entire DynamoDB table appears to be duplicated in Elasticsearch. DynamoDB has an item count of 70K+, while Elasticsearch reports more than double that number of searchable documents. Could this be because we have perform_stream => true in the config?

  3. Intermittently, the latest data does appear, but it is sandwiched between older records, as if the data were fetched in random order. Could this be due to multiple workers trying to log at the same time?

  4. We need the JSON message contents from DynamoDB as is. However, when we run Logstash, the output shows the data as "Stream Records". When we use log_format => "json_binary_as_text", we see the JSON message as we require. Is this sufficient?

Our Logstash configuration is as follows:

input {
  dynamodb {
    endpoint => "dynamodb.us-east-1.amazonaws.com"
    streams_endpoint => "streams.dynamodb.us-east-1.amazonaws.com"
    view_type => "new_image"
    perform_scan => true
    perform_stream => true
    publish_metrics => true
    table_name => "here-we-have-dynamodb-table-name"
    log_format => "json_binary_as_text"
  }
}
output {
  elasticsearch {
    hosts => "here-we-have-our-elasticsearch-endpoint-which-is-in-us-west-1"
  }
}

NOTE: There are no errors in the logs (docker logs --follow container-name).
Any help with these issues would be greatly appreciated.
