Kibana: Data Visualizer fails to import data without a timestamp

Created on 14 Apr 2020 · 7 comments · Source: elastic/kibana

Kibana version: 7.7.0 BC6

Elasticsearch version: 7.7.0 BC6

Server OS version: Windows 2012 Server

Browser version: Chrome (also IE11)

Browser OS version: Windows 10

Original install method (e.g. download page, yum, from source, etc.): zip files, default distribution

Describe the bug: If Data Visualizer is supposed to accept files without timestamps, it's not working in this version.

Steps to reproduce:

  1. From Kibana home, click Import a CSV, NDJSON, or log file
  2. Try uploading a file without timestamps. I'll attach the one I tried:
    xpack-ascii.txt

Error on screen:

File could not be read
Bad Request: [illegal_argument_exception] Could not find a timestamp in the sample provided

Expected behavior: It should ingest the data

Screenshots (if relevant):
[screenshot of the "File could not be read" error in the Data Visualizer]

Errors in browser console (if relevant):

DevTools failed to load SourceMap: Could not load content for chrome-extension://hdokiejnpimakedhajhdlcegeplioahd/sourcemaps/onloadwff.js.map: HTTP error: status code 404, net::ERR_UNKNOWN_URL_SCHEME
ml#/filedatavisualizer:342 Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'unsafe-eval' 'self'". Either the 'unsafe-inline' keyword, a hash ('sha256-P5polb1UreUSOe5V/Pv7tc+yeZuJXiOi/3fqhGsU7BE='), or a nonce ('nonce-...') is required to enable inline execution.

bootstrap.js:10 ^ A single error about an inline script not firing due to content security policy is expected!
kbn-ui-shared-deps.js:381 INFO: 2020-04-14T21:04:50Z
  Adding connection to https://localhost:5601/elasticsearch


4.plugin.js:1 overrides undefined
VM469:1 POST https://localhost:5601/api/ml/file_data_visualizer/analyze_file 400 (Bad Request)
(anonymous) @ VM469:1
_callee3$ @ commons.bundle.js:3
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ commons.bundle.js:3
_next @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
fetchResponse @ commons.bundle.js:3
_callee$ @ commons.bundle.js:3
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ commons.bundle.js:3
_next @ commons.bundle.js:3
Promise.then (async)
asyncGeneratorStep @ commons.bundle.js:3
_next @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
_callee2$ @ commons.bundle.js:3
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ commons.bundle.js:3
_next @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
(anonymous) @ commons.bundle.js:3
_callee$ @ 1.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 1.plugin.js:1
_next @ 1.plugin.js:1
(anonymous) @ 1.plugin.js:1
(anonymous) @ 1.plugin.js:1
http @ 1.plugin.js:1
analyzeFile @ 1.plugin.js:1
_callee3$ @ 4.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
loadSettings @ 4.plugin.js:1
_callee2$ @ 4.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
Promise.then (async)
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
loadFile @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
bo @ kbn-ui-shared-deps.js:342
vo @ kbn-ui-shared-deps.js:342
vl @ kbn-ui-shared-deps.js:342
t.unstable_runWithPriority @ kbn-ui-shared-deps.js:350
Hi @ kbn-ui-shared-deps.js:342
yl @ kbn-ui-shared-deps.js:342
ol @ kbn-ui-shared-deps.js:342
(anonymous) @ kbn-ui-shared-deps.js:342
t.unstable_runWithPriority @ kbn-ui-shared-deps.js:350
Hi @ kbn-ui-shared-deps.js:342
Gi @ kbn-ui-shared-deps.js:342
Yi @ kbn-ui-shared-deps.js:342
ce @ kbn-ui-shared-deps.js:342
Ln @ kbn-ui-shared-deps.js:342
Dn @ kbn-ui-shared-deps.js:342
On @ kbn-ui-shared-deps.js:342
(25 more frames omitted)
4.plugin.js:1 Error: Bad Request
    at Fetch._callee3$ (commons.bundle.js:3)
    at l (kbn-ui-shared-deps.js:288)
    at Generator._invoke (kbn-ui-shared-deps.js:288)
    at Generator.forEach.e.<computed> [as next] (kbn-ui-shared-deps.js:288)
    at asyncGeneratorStep (commons.bundle.js:3)
    at _next (commons.bundle.js:3)
_callee3$ @ 4.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 4.plugin.js:1
_throw @ 4.plugin.js:1
Promise.then (async)
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
loadSettings @ 4.plugin.js:1
_callee2$ @ 4.plugin.js:1
l @ kbn-ui-shared-deps.js:288
(anonymous) @ kbn-ui-shared-deps.js:288
forEach.e.<computed> @ kbn-ui-shared-deps.js:288
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
Promise.then (async)
asyncGeneratorStep @ 4.plugin.js:1
_next @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
loadFile @ 4.plugin.js:1
(anonymous) @ 4.plugin.js:1
bo @ kbn-ui-shared-deps.js:342
vo @ kbn-ui-shared-deps.js:342
vl @ kbn-ui-shared-deps.js:342
t.unstable_runWithPriority @ kbn-ui-shared-deps.js:350
Hi @ kbn-ui-shared-deps.js:342
yl @ kbn-ui-shared-deps.js:342
ol @ kbn-ui-shared-deps.js:342
(anonymous) @ kbn-ui-shared-deps.js:342
t.unstable_runWithPriority @ kbn-ui-shared-deps.js:350
Hi @ kbn-ui-shared-deps.js:342
Gi @ kbn-ui-shared-deps.js:342
Yi @ kbn-ui-shared-deps.js:342
ce @ kbn-ui-shared-deps.js:342
Ln @ kbn-ui-shared-deps.js:342
Dn @ kbn-ui-shared-deps.js:342
On @ kbn-ui-shared-deps.js:342

Provide logs and/or server output (if relevant):

Any additional context:

Labels: :ml, File Data Viz, bug

All 7 comments

Pinging @elastic/ml-ui (:ml)

Only highly structured formats like CSV and NDJSON are accepted without timestamps. The reason is that for semi-structured log files the definition of the first line of each message is the line containing the identified timestamp, so without a timestamp there’s no way to split the file into messages.
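
To illustrate, here is a minimal sketch against the find_file_structure API that backs this screen (as documented for 7.x); the host, port, and sample data are assumptions, and the Content-Type header follows the docs' curl examples:

    import urllib.request

    # Structured CSV with no timestamp column (invented sample data).
    csv_sample = b"product,quantity,price\nwidget,3,4.99\ngadget,7,12.50\n"

    req = urllib.request.Request(
        "http://localhost:9200/_ml/find_file_structure?pretty",
        data=csv_sample,
        headers={"Content-Type": "application/json"},  # header used in the docs' examples
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # Expect "format" : "delimited" and no timestamp field in the reply.
        print(resp.read().decode())

Sending a few lines of free text as the body instead should reproduce the "Could not find a timestamp in the sample provided" error seen above.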

We should probably spell this out more clearly in the docs. Currently it is buried away in https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-find-file-structure.html in the sentence:

For structured file formats, it is not compulsory to have a timestamp in the file.

I think we _could_ make it possible to support import of semi-structured log files without timestamps, by implementing elastic/kibana#38868 and elastic/elasticsearch#55219.

This feature was very useful indeed and was promoted in many of the Elasticsearch/Kibana tutorials and videos. It would be good to have it back, as I am now stuck on such a basic task. Is there any workaround?

@cspielmann are you complaining that the entire feature has disappeared, or specifically that it doesn't work for semi-structured log files without timestamps?

I believe the whole feature was accidentally made inaccessible on a basic license for one minor release and then fixed in the following patch release. There will be a separate issue for that somewhere if that's the problem you've got.

Only highly structured formats like CSV and NDJSON are accepted without timestamps. That has always been the case. We could do an enhancement for semi-structured log files without timestamps, but that has never been demonstrated in a video as it has never worked. So please be more specific about exactly what doesn't work for you.

Hello,
I am sorry, I didn't want to be mean. I really enjoy using your product and
like your support.
The fact is that today I am not able to reproduce (?!).
I am pushing a very simple CSV and it works.
Anyway, thanks for your reply.

I am sorry I didn't want to be mean

No problem, I didn't think you were being mean, it's just that I wasn't completely clear what didn't work for you.

It seems that you explained it in https://discuss.elastic.co/t/upload-csv-file-without-timestamp-to-kibana-with-ml-fails/257376.

What happened is that there was something about the CSV file that failed to upload which meant the file structure finder didn't think it was CSV. As a result, it tried to analyse it as semi-structured text, and currently that only works when a timestamp can be detected.

So, the next question is, why wasn't your CSV file recognized as a CSV file? There are a few possible reasons:

  1. Maybe there was some extra non-CSV data at the end of the file?
  2. Maybe there were too few fields per row for it to be auto-detected as CSV - see https://github.com/elastic/elasticsearch/issues/56325#issuecomment-630910723 for more discussion on this
  3. Maybe there were different numbers of fields on a few lines?

If it is reason 2 or 3, then you should upgrade to 7.10, where you will be able to take advantage of https://github.com/elastic/elasticsearch/pull/55735 and https://github.com/elastic/kibana/pull/74376. When the initial analysis fails for one of those reasons, you'll be able to open the overrides flyout and tell it that your file is CSV; up to 10% of the rows will then be allowed to have a column count that's inconsistent with the header row, and the file will still be imported as best it can be.
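
At the API level, the overrides flyout corresponds to query parameters on the same find_file_structure endpoint (parameter names as documented for 7.x). A variant of the earlier sketch that forces the delimited format, with "mydata.csv" as a placeholder path:

    import urllib.request

    # Force the CSV interpretation instead of relying on auto-detection;
    # %2C is a URL-encoded comma.
    params = "format=delimited&delimiter=%2C&has_header_row=true"
    with open("mydata.csv", "rb") as f:
        req = urllib.request.Request(
            f"http://localhost:9200/_ml/find_file_structure?{params}&pretty",
            data=f.read(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
    print(urllib.request.urlopen(req).read().decode())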

The other benefit of upgrading is that you'll get the explanation of why it wasn't considered to be CSV, for example, "row 375 had 19 columns whereas the header had 17". This is really hard to spot by eye in a big text file (although easier in a spreadsheet program).
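
Since that kind of inconsistency is hard to spot by eye, here is a small stand-alone sketch (not part of Kibana, and with deliberately simplistic CSV handling) that reports rows whose column count differs from the header:

    import csv
    import sys

    def check_column_counts(path):
        # Report every row whose column count differs from the header's.
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            expected = len(header)
            for line_no, row in enumerate(reader, start=2):
                if row and len(row) != expected:
                    print(f"row {line_no} had {len(row)} columns "
                          f"whereas the header had {expected}")

    if __name__ == "__main__":
        check_column_counts(sys.argv[1])

Running it over the offending file prints lines in the same style as the 7.10 explanations, e.g. "row 375 had 19 columns whereas the header had 17".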
