Logstash: Add a general purpose gzip codec

Created on 1 Oct 2014 · 14 Comments · Source: elastic/logstash

Add gzip codec or file input option for gzipped files

Edit: Add a general purpose gzip codec which can be used in inputs and outputs

new plugin

All 14 comments

gzip codec is something we totally should have.

To make it work with file input, we'll have to fix how the file input is implemented. We need to do this improvement anyway, but it is a prerequisite for any gzip codec being usable on the file input.
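To illustrate why the input side matters: a codec is typically handed arbitrary chunks of bytes, so a gzip codec would have to decompress incrementally and buffer partial lines. The following is a minimal sketch of that idea in Python, not Logstash code; `GzipLineDecoder`, `feed`, `flush`, and `decode_in_chunks` are hypothetical names chosen for the example.

```python
import gzip
import zlib


class GzipLineDecoder:
    """Hypothetical streaming decoder: feed compressed chunks, get complete lines."""

    def __init__(self):
        # wbits=31 tells zlib to expect a gzip header and trailer.
        self._inflater = zlib.decompressobj(wbits=31)
        self._pending = b""

    def feed(self, chunk):
        """Decompress one chunk and return the lines completed by it."""
        self._pending += self._inflater.decompress(chunk)
        # Everything before the last newline is complete; keep the rest buffered.
        *lines, self._pending = self._pending.split(b"\n")
        return [line.decode("utf-8") for line in lines]

    def flush(self):
        """Return trailing text that did not end with a newline, if any."""
        self._pending += self._inflater.flush()
        tail, self._pending = self._pending, b""
        return [tail.decode("utf-8")] if tail else []


def decode_in_chunks(compressed, chunk_size=16):
    """Simulate an input delivering the compressed stream in small pieces."""
    decoder = GzipLineDecoder()
    events = []
    for start in range(0, len(compressed), chunk_size):
        events.extend(decoder.feed(compressed[start:start + chunk_size]))
    events.extend(decoder.flush())
    return events


if __name__ == "__main__":
    compressed = gzip.compress(b"first\nsecond\nthird")
    print(decode_in_chunks(compressed))
```

The decoder only works if it sees the compressed bytes in order from the start of the stream, which is the part an input that tails files line by line would not provide.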

<3 for the idea

@jordansissel thanks! Another thought is possibly a complete flag of some sort.
Let's say I have a directory of logs and I point logstash file input at the glob. What would be cool in some cases is "do something when Logstash is done processing them" like

  • gzip them
  • move them to an archive directory
  • delete
  • send email
  • anything

Basically like a shell exec

Of course you'd have to know that this directory of files is static and has no open file handles, but that's up to the admin to determine.
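A rough sketch of what such a "done" hook could look like, written in Python rather than as a Logstash option. The function name `process_then_archive` and its parameters are made up for the example, and it assumes the files are static, as described above.

```python
import glob
import os
import shutil


def process_then_archive(pattern, archive_dir, handle_line):
    """Read every file matching pattern to EOF, then move it to archive_dir."""
    os.makedirs(archive_dir, exist_ok=True)
    archived = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", encoding="utf-8") as source:
            for line in source:
                handle_line(line.rstrip("\n"))
        # The "done" action. Assumption: nothing else is still writing to the file.
        destination = os.path.join(archive_dir, os.path.basename(path))
        archived.append(shutil.move(path, destination))
    return archived
```

Moving the file is only one of the actions listed; gzipping, deleting, or sending a notification would go in the same place.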

The file input currently has no concept of "done processing them". Files are assumed to be live streams that live forever, and as a result have no end. Reaching EOF on a log file generally means "wait a while and more data will show up".

Unfortunately, this "files are live streams" assumption means that folks doing archival or backfilling with old, "complete" logs have no way to tell Logstash when a file is finished and reading should stop.
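The behavior can be sketched as a simple tail loop. This is an illustration in Python, not the file input's actual implementation; `follow` and the `idle_timeout` parameter are hypothetical and only show where a "this file is complete" signal would have to plug in.

```python
import time


def follow(path, idle_timeout=None, poll_interval=0.1):
    """Yield lines from path; at EOF, wait for more data to appear.

    idle_timeout is a hypothetical knob, not a Logstash option: when set,
    the generator stops after that many seconds without new data.
    """
    with open(path, "r", encoding="utf-8") as stream:
        last_data = time.monotonic()
        while True:
            line = stream.readline()
            if line:
                last_data = time.monotonic()
                yield line.rstrip("\n")
                continue
            # EOF: a live log may still grow, so "done" and "quiet" look the same.
            if idle_timeout is not None and time.monotonic() - last_data >= idle_timeout:
                return
            time.sleep(poll_interval)
```

With `idle_timeout=None` the loop never returns, which matches the "files live forever" model described above.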

Hello Everyone,

Has this been implemented?

@yukti nope, not implemented. PR welcome :)

This is a quick, not nicely implemented, working alternative:
https://github.com/tan-tan-kanarek/logstash-input-gzfile

Every time you comment +1 on this ticket, 25-75 emails are sent out. Instead, please use GitHub's "reaction" feature to +1 this issue. It looks like this:

image

I will delete the +1 comments now to dissuade this further. I appreciate y'all's eagerness for this feature.

I have deleted approximately 15 +1 comments.

Do we have any news about this? Using the pipe input, which is another official alternative, is probably inefficient.

Hi, how can we use the gzip_lines plugin to make Logstash read .gz files?

Hello Everyone,
Has this been implemented? 😄

As of this comment, not yet?
https://github.com/elastic/logstash/issues/1817#issuecomment-143258131

It's not quite what you all are talking about, but a grassroots codec has recently popped up on RubyGems: https://rubygems.org/gems/logstash-codec-json_gz

It is specific to GZIP'd JSON, but the version I downloaded was working well for me.
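For anyone wondering what "GZIP'd JSON" decoding amounts to, here is a minimal Python sketch. It is not the implementation of that gem, which I have not inspected; `decode_json_gz` is a hypothetical helper, and it assumes each payload is one complete gzip member containing a JSON object or array.

```python
import gzip
import json


def decode_json_gz(payload):
    """Decompress a gzip payload and parse it as JSON; return a list of events."""
    document = json.loads(gzip.decompress(payload).decode("utf-8"))
    # Normalize to a list so callers can treat one object or many the same way.
    return document if isinstance(document, list) else [document]
```

Because the whole payload is decompressed in one call, this approach fits message-style inputs better than large files.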

What about bzip2? It would make it possible to leverage parallel decompression, and it would probably be easier to track progress and resume processing where it left off if Logstash failed or stopped.
