- Docker Container STDOUT/STDERR
- Multi-process container logs
- EC2 Instances
- System Logs
- Application Logs
- AWS CloudWatch
- AWS Kinesis
- Fast Retrieval
- Archived Storage
- Storage: ElasticSearch and S3
- Visualize: Kibana
- Manage: Curator
- Queue Management: LogStash
- Log Management: FileBeat
- Metric Collection: MetricBeat
We start at the beginning and work our way up the stack.
We are FileBeat. Our job is to consume log files and send them to LogStash. We should be efficient (written in Go!), and we should provide the appropriate mechanisms to guarantee delivery of what we send. We also have to ensure that our communication with LogStash and our local storage are fully encrypted for compliance reasons.
We are LogStash. Our job is to consume *Beat data and send it to ElasticSearch and S3. We need to ensure that a destination failure does not block the other. We also provide a choice between local memory or local filesystem queues of your required size. Our communication with our destinations and local storage must also be encrypted.
We are ElasticSearch. Our job is to store data and provide a very fault tolerant platform. We do cool things like store our data in really fast Lucene indices and provide horizontal scaling. Our communication with our cluster members and local storage must be encrypted. All clients that connect to us must use encryption.
We are Kibana. We provide a web UI that allows for fast searching of all our logs! We can do some cool dashboards too like Grafana! We also provide some nifty AI and monitoring functionality too (this should not be overlooked!). Our communication with ElasticSearch and with clients must be encrypted.
We are Curator. Our job is a necessary one, unless you wish to build automation around the administrative functions of the ElasticSearch API by yourself, alone, and crying. We automate snapshots (S3 backup), snapshot cleanup, index cleanup, and other fun things. We run with YAML files that define actions against index patterns. We should be dynamic and provide a central point of administration for any ElasticSearch cluster.
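As a sketch of those YAML action files, a hypothetical Curator action file that snapshots indices into an S3-backed repository and then prunes old indices might look like this (the repository name, index prefix, and 30-day retention window are assumptions, not values from this deck):

```yaml
# Hypothetical Curator action file: snapshot, then prune old indices.
actions:
  1:
    action: snapshot
    description: Snapshot logstash-* indices to the S3-backed repository
    options:
      repository: s3_backup        # assumed repository name, registered separately
      name: snapshot-%Y.%m.%d
      wait_for_completion: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
  2:
    action: delete_indices
    description: Delete logstash-* indices older than 30 days
    options:
      ignore_empty_list: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 30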
- EC2 ElasticSearch Cluster
- Chef Cookbook (SSL, DNS, Discovery, Updates, Volumes)
- S3 Cross-Region Replicated PHI Compliant Backup Bucket
- EC2 LogStash Cluster
- S3 Cross-Region Replicated PHI Compliant Storage Bucket
- ECS Beat containers and services
- ECS Curator container and service
- S3 YAML Configuration Bucket
- ECS Kibana container and service
- EC2 Beat Services via Chef Base Recipe
- EC2 ElasticSearch Cluster
- TLS Encryption for all network communication
- EBS Encryption for all AWS EC2 volumes
- Data Retention fulfilled via AWS S3
- ElasticSearch index snapshots
- LogStash raw logs
- Query via Athena or Kibana
- At least once delivery
- Fast access to short term data time span (X days)
- Access to long term storage archives
- Alpine / Phusion Passenger base images
- Supervisord process management
- Captures all process STDOUT/STDERR
- json-file logging driver (or CloudWatch, for the Kinesis alternative)
- EFS mounted container log directory
- Mounts /var/lib/docker/containers
- Globs /var/lib/docker/containers/*/*.log
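A minimal FileBeat configuration matching the layout above might look like the following sketch (the LogStash endpoint and certificate path are assumptions, and depending on the FileBeat version this section may be named `filebeat.prospectors` instead of `filebeat.inputs`):

```yaml
# Hypothetical filebeat.yml: ship Docker json-file logs to LogStash over TLS.
filebeat.inputs:
  - type: log
    paths:
      - /var/lib/docker/containers/*/*.log
    json.keys_under_root: true   # docker json-file driver wraps each line in JSON
    json.add_error_key: true

output.logstash:
  hosts: ["logstash.internal.example.com:5044"]    # assumed endpoint
  ssl.certificate_authorities: ["/etc/pki/ca.crt"] # assumed CA path
```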
- AWS ECS Daemon Task
- Pre-built Curator version with custom Docker ENTRYPOINT
- Consumes S3 key as Curator action.yml
- Consumes ENV vars as Curator config.yml
- Output to Slack
- Allows scheduling and single run usage
- Allows usage against any ElasticSearch cluster
- AWS t2.small EC2 instance
- Input: Beats
- Output: ElasticSearch and S3
- Pipeline: Parallel and Non-Blocking
- Queuing: Persistent Main, Memory Outputs
Arranged this way, a failure on the ElasticSearch backend will not block LogStash from sending to S3, and the opposite is true as well. We filter MetricBeat data out of S3 storage because, at this point, it is not a requirement.
The good news here is that LogStash pipelines are pretty configurable and can support many use cases.
elasticsearch and s3 pipeline:
Notice the pipeline input. It is a virtual address that other pipelines can send events to. Having Kibana manage the pipelines allows for easy administration of the running LogStash configuration.
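A sketch of that layout using LogStash's pipeline-to-pipeline communication (available in LogStash 6.5+; the ports, hosts, bucket name, and addresses below are assumptions, and TLS settings are elided for brevity):

```yaml
# Hypothetical pipelines.yml: a beats intake pipeline fans out to two
# independent pipelines so one failed output cannot block the other.
- pipeline.id: intake
  queue.type: persisted            # durable main queue
  config.string: |
    input { beats { port => 5044 } }
    output {
      pipeline { send_to => [es] }
      if [@metadata][beat] != "metricbeat" {   # field name varies by Beats version
        pipeline { send_to => [archive] }
      }
    }
- pipeline.id: es
  queue.type: memory
  config.string: |
    input { pipeline { address => es } }
    output { elasticsearch { hosts => ["https://es.internal:9200"] } }
- pipeline.id: archive
  queue.type: memory
  config.string: |
    input { pipeline { address => archive } }
    output { s3 { bucket => "my-log-archive" region => "us-east-1" } }
```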
- X-Pack Enabled
- Transport and HTTP Encryption
- Automated system_key management
- Supports separate data and log volumes (EFS, EBS, Instance)
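A sketch of the relevant elasticsearch.yml settings for the encryption and volume points above (certificate and mount paths are assumptions; exact setting names vary by X-Pack version):

```yaml
# Hypothetical elasticsearch.yml fragment enabling X-Pack TLS everywhere.
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.key: /etc/elasticsearch/certs/node.key
xpack.security.transport.ssl.certificate: /etc/elasticsearch/certs/node.crt
xpack.security.transport.ssl.certificate_authorities: ["/etc/elasticsearch/certs/ca.crt"]
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.key: /etc/elasticsearch/certs/node.key
xpack.security.http.ssl.certificate: /etc/elasticsearch/certs/node.crt
path.data: /mnt/es-data   # separate data volume (EFS, EBS, or instance store)
path.logs: /mnt/es-logs   # separate log volume
```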
- Currently investigating…
- Configure LogStash with Kinesis input
- Configure Output for Kinesis inputs
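Assuming the community logstash-input-kinesis plugin, the input side under investigation might be sketched as follows (the stream name, region, codec, and output endpoint are all assumptions):

```conf
# Hypothetical LogStash pipeline consuming a Kinesis stream.
input {
  kinesis {
    kinesis_stream_name => "application-logs"  # assumed stream name
    region              => "us-east-1"
    codec               => json                # assumed payload format
  }
}
output {
  elasticsearch { hosts => ["https://es.internal:9200"] }  # assumed endpoint
}
```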