Original comment by @kobelb:
Currently, each reporting export is stored in a single Elasticsearch document, and CSV exports are capped at a configurable 10 MB. We can raise this limit by splitting the result across multiple Elasticsearch documents and streaming them to the user so that they appear to be one file.
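A minimal sketch of the split-and-stream idea described above, not Kibana's actual implementation. The in-memory list of dicts stands in for Elasticsearch documents, and the names split_into_chunks, stream_report, report_id, chunk_index and content are all hypothetical:

```python
from typing import Dict, Iterable, Iterator, List


def split_into_chunks(report_id: str, content: bytes, max_chunk_bytes: int) -> List[Dict]:
    """Split one export into documents that each hold at most max_chunk_bytes."""
    if max_chunk_bytes <= 0:
        raise ValueError("max_chunk_bytes must be positive")
    chunks = []
    for index, start in enumerate(range(0, len(content), max_chunk_bytes)):
        chunks.append({
            "report_id": report_id,    # hypothetical field tying chunks to one report
            "chunk_index": index,      # hypothetical field recording chunk order
            "content": content[start:start + max_chunk_bytes],
        })
    return chunks


def stream_report(chunks: Iterable[Dict]) -> Iterator[bytes]:
    """Yield chunk contents in order, so the client receives a single file."""
    for chunk in sorted(chunks, key=lambda c: c["chunk_index"]):
        yield chunk["content"]
```

Because the chunks are cut and rejoined as raw bytes, a multi-byte character that straddles a chunk boundary is restored once the stream is concatenated. Yielding one chunk at a time also means the server never needs to hold the whole export in memory.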
For perspective, 10 MB is about seven 3.5-inch floppy disks.
Please increase the cap to something less Windows 95.
@matthew-b-b the cap is configurable. The setting is xpack.reporting.csv.maxSizeBytes
I've seen users successfully generate reports of around 200 MB. The real limitations are network capacity (web proxies) and Elasticsearch HTTP payload limits, and all of those are configurable as well.
I'm not sure this is a real issue.
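For anyone raising the cap mentioned above: the setting name implies a value in bytes, so a small helper can do the conversion. This is only an illustrative sketch; csv_max_size_setting is a hypothetical helper, and the only name taken from this thread is xpack.reporting.csv.maxSizeBytes:

```python
def csv_max_size_setting(megabytes: int) -> str:
    """Return a kibana.yml line that sets the CSV cap to `megabytes` MB."""
    if megabytes <= 0:
        raise ValueError("megabytes must be positive")
    size_in_bytes = megabytes * 1024 * 1024  # the setting is expressed in bytes
    return f"xpack.reporting.csv.maxSizeBytes: {size_in_bytes}"


print(csv_max_size_setting(200))  # prints xpack.reporting.csv.maxSizeBytes: 209715200
```

Raising this value alone may not be enough, since the proxy and Elasticsearch payload limits mentioned above would need to allow a request of that size too.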
@tsullivan I believe this is a real issue. We have numerous customers asking for larger exports, and I believe maxSizeBytes is still limited by the Elasticsearch payload limits you mentioned. The chunked export was always meant to be the next phase of CSV export, letting users export larger data sets from the UI and via Watcher with ease, while taking as many precautions as possible to reduce the impact on the cluster itself.
I believe the request is still valid and the user experience as defined by @kobelb is still common amongst our community.
cc: @AlonaNadler for awareness
@alexfrancoeur do you know if the Kibana team is working on enhancements to allow for chunked exports? The chunked export approach is the only maintainable approach to allowing for large CSV files to be downloaded.
Based on this blog post it looks like some improvements are being made to export to CSV.
https://www.elastic.co/blog/keeping-up-with-kibana-2019-04-15
We have dozens of users who would like to be able to export large CSV files, and the chunked export would be a perfect solution.
@tbone2sk we are not actively working on it atm, but we are preparing the reporting backend so that it will be easier to address in the future. The upcoming ability to export CSV from a saved search in a dashboard uses a different implementation that should help us chunk CSV exports in the future. How big are the files that the users in your org want to export? And out of curiosity, what do they do with these files after they export them?
Sounds great, I look forward to the enhancements!
@AlonaNadler
We have been using Kibana for the analysis of marketing data. We try to complete most of the analysis within the tool, but there are cases where Kibana is not as flexible as other tools, so a smaller dataset is exported from Kibana. Most of these files are larger than 200,000 records and some can contain a few million records. We would prefer to be able to export files as large as 1 GB.
Pinging @elastic/kibana-reporting-services (Team:Reporting Services)
Hi Elastic team,
We also have a use case where a CSV report of around 1 GB needs to be exported. The data is then used for analysis and tagging. Do you have any visibility into when the chunked export functionality will be available?
Thanks Alex, this definitely is a real issue. Chunking the data into smaller documents for storage also means relieving the memory pressure being put on the Kibana server to perform this feature. Doing this enhancement would greatly mature the CSV export feature.
Do you have any visibility into when the chunked export functionality will be available?
@i-aggarwal the Reporting Services team currently doesn't have a timeline for starting work on this.
In addition to breaking the exported content into as many documents as necessary, another idea that has come up is Base64-encoding or gzipping the chunks.
Also, using the Binary datatype for the content field in the Reporting document would probably be needed to store gzipped content.
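A rough sketch of how the gzip and Base64 ideas could combine, using only the Python standard library. Nothing here is Kibana code, and encode_chunk and decode_chunk are hypothetical names. It assumes the stored value would be a Base64 string, which is the form an Elasticsearch binary field accepts:

```python
import base64
import gzip


def encode_chunk(raw: bytes) -> str:
    """Gzip a chunk, then Base64-encode it so it can be stored as a string value."""
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decode_chunk(stored: str) -> bytes:
    """Reverse encode_chunk: Base64-decode, then gunzip."""
    return gzip.decompress(base64.b64decode(stored))
```

Base64 on its own grows the payload by about a third, so the saving comes from gzipping first. Repetitive CSV rows tend to compress well, which would also help against the payload limits discussed earlier.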
Also some code in the wild of folks working around the current state of CSV reporting:
https://github.com/fabiopipitone/elasticsearch-tocsv - builds a CSV straight from Elasticsearch data
https://github.com/Safecast/ingest/blob/master/db/generate_reports.rb - chunks a Kibana CSV report URL into monthly time slices
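The monthly time-slice workaround can be sketched as follows. This is not the code from the linked script, only an illustration of the idea, and monthly_slices is a hypothetical helper. Each slice would then become the time range of one smaller report request:

```python
from datetime import date
from typing import Iterator, Tuple


def monthly_slices(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield half-open (slice_start, slice_end) ranges that cover [start, end)."""
    current = start
    while current < end:
        # First day of the month after `current`.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        slice_end = min(next_month, end)
        yield current, slice_end
        current = slice_end


for slice_start, slice_end in monthly_slices(date(2019, 11, 15), date(2020, 2, 1)):
    print(slice_start, slice_end)
```

Half-open ranges mean no record is requested twice at a month boundary. The resulting files still have to be concatenated by the user, which is what a built-in chunked export would avoid.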