Presto: Jackson InvalidDefinitionException in Druid connector

Created on 25 Feb 2020  ·  25 comments  ·  Source: prestodb/presto

I compiled master at commit 1e7780e7a151f43b3107267b0bdedde0686ce511, set up a local Druid 0.17 instance, loaded its sample dataset (wikiticker-2015-09-12-sampled.json.gz), and then tested the Druid connector with a few statements:

1) "SHOW TABLES": worked
2) "DESC wikiticker": worked
3) "SELECT * FROM wikiticker LIMIT 10": didn't work.

The Druid datasource "wikiticker" was loaded via the Druid console following its "next step" wizard, and the query works fine in the Druid console too.

The Druid catalog config file looks like this:

connector.name=druid

druid.coordinator-url=http://localhost:8081
druid.broker-url=http://localhost:8082
druid.schema-name=druid

Here is the error stack in the Presto log:

2020-02-25T14:23:43.470+0800    ERROR   remote-task-callback-0  com.facebook.airlift.concurrent.BoundedExecutor Task failed
java.lang.IllegalArgumentException: com.facebook.presto.server.TaskUpdateRequest could not be converted to JSON
    at com.facebook.airlift.json.JsonCodec.toJsonBytes(JsonCodec.java:214)
    at com.facebook.presto.server.smile.JsonCodecWrapper.toBytes(JsonCodecWrapper.java:45)
    at com.facebook.presto.server.remotetask.HttpRemoteTask.sendUpdate(HttpRemoteTask.java:642)
    at com.facebook.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.joda.time.Interval and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: com.facebook.presto.server.TaskUpdateRequest["sources"]->com.google.common.collect.SingletonImmutableList[0]->com.facebook.presto.execution.TaskSource["splits"]->com.google.common.collect.SingletonImmutableSet[0]->com.facebook.presto.execution.ScheduledSplit["split"]->com.facebook.presto.metadata.Split["connectorSplit"]->com.facebook.presto.druid.DruidSplit["segmentInfo"]->com.facebook.presto.druid.metadata.DruidSegmentInfo["interval"])
    at com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:77)
    at com.fasterxml.jackson.databind.SerializerProvider.reportBadDefinition(SerializerProvider.java:1191)
    at com.fasterxml.jackson.databind.DatabindContext.reportBadDefinition(DatabindContext.java:313)
    at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.failForEmpty(UnknownSerializer.java:71)
    at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.serialize(UnknownSerializer.java:33)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeWithType(BeanSerializerBase.java:604)
    at com.facebook.presto.metadata.AbstractTypedJacksonModule$InternalTypeSerializer.serialize(AbstractTypedJacksonModule.java:115)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
    at com.fasterxml.jackson.databind.ser.std.CollectionSerializer.serializeContents(CollectionSerializer.java:145)
    at com.fasterxml.jackson.databind.ser.std.CollectionSerializer.serialize(CollectionSerializer.java:107)
    at com.fasterxml.jackson.databind.ser.std.CollectionSerializer.serialize(CollectionSerializer.java:25)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
    at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:119)
    at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:79)
    at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:18)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:480)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:400)
    at com.fasterxml.jackson.databind.ObjectWriter$Prefetch.serialize(ObjectWriter.java:1392)
    at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:1120)
    at com.fasterxml.jackson.databind.ObjectWriter.writeValueAsBytes(ObjectWriter.java:1017)
    at com.facebook.airlift.json.JsonCodec.toJsonBytes(JsonCodec.java:211)
    ... 6 more

Most helpful comment

Hi @skyahead @qwalker, sorry for the problem.
I am working on a fix: https://github.com/prestodb/presto/pull/14171
Limit pushdown, aggregation pushdown, and topN pushdown will be implemented in follow-up PRs.
I will debug more issues; give me a few days.

All 25 comments

@zhenxiao Really appreciate your work, hope you can help with this issue.

I didn't think we had a druid connector. Is that a private connector you wrote? Basically the task update request is trying to send something in the update that uses the Interval type from joda time. It doesn't know how to serialize that type, so the update fails. I suspect it would be something in the ConnectorTransactionHandle or ConnectorSplit for that connector.
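
For illustration only, here is a tiny standalone reproduction of that root cause, assuming jackson-databind and the optional jackson-datatype-joda module are on the classpath. It is not the Presto fix; it just shows that a plain ObjectMapper has no serializer for org.joda.time.Interval until a Joda-aware module is registered.

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.joda.JodaModule;
import org.joda.time.Interval;

public class IntervalJsonDemo
{
    public static void main(String[] args) throws Exception
    {
        Interval interval = Interval.parse("2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z");

        // A bare ObjectMapper fails on Interval with the same
        // InvalidDefinitionException seen in the stack trace above:
        // new ObjectMapper().writeValueAsString(interval);

        // Registering the Joda datatype module supplies a serializer for it.
        ObjectMapper mapper = new ObjectMapper().registerModule(new JodaModule());
        System.out.println(mapper.writeValueAsString(interval));
    }
}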

@rschlussel The Druid connector is from @zhenxiao and it's already in the current master branch.

Yep, thanks for trying it, @qwalker.
I am on a business trip right now and will look at it in a few days.
Could you please share your DESCRIBE table result?

@zhenxiao I'd love to help test since I've been waiting for the Druid connector for a long time :)

Here is the table description:

presto:druid> desc wikiticker;
     Column     |   Type    | Extra | Comment
----------------+-----------+-------+---------
 __time         | timestamp |       |
 added          | bigint    |       |
 channel        | varchar   |       |
 cityname       | varchar   |       |
 comment        | varchar   |       |
 countryisocode | varchar   |       |
 countryname    | varchar   |       |
 deleted        | bigint    |       |
 delta          | bigint    |       |
 isanonymous    | varchar   |       |
 isminor        | varchar   |       |
 isnew          | varchar   |       |
 isrobot        | varchar   |       |
 isunpatrolled  | varchar   |       |
 namespace      | varchar   |       |
 page           | varchar   |       |
 regionisocode  | varchar   |       |
 regionname     | varchar   |       |
 user           | varchar   |       |
(19 rows)

Query 20200225_042848_00014_v2fsx, FINISHED, 1 node
http://localhost:8080/ui/query.html?20200225_042848_00014_v2fsx
Splits: 19 total, 19 done (100.00%)
CPU Time: 0.0s total, 1.73K rows/s,  118KB/s, 25% active
Per Node: 0.0 parallelism,    46 rows/s, 3.18KB/s
Parallelism: 0.0
Peak Memory: 0B
0:00 [19 rows, 1.3KB] [46 rows/s, 3.18KB/s]

Ah sorry, there was an issue with my IntelliJ project and I couldn't see the presto-druid module. I think the problem is the Interval field in DruidSegmentInfo; there's no Jackson serialization defined for the Interval type.
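
A minimal sketch of one way a connector could sidestep this, assuming the split only needs the interval as an opaque value until read time: carry it as a plain String and parse it lazily. The class below is an illustrative stand-in, not the actual DruidSegmentInfo code.

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import org.joda.time.Interval;

public class SegmentInfoSketch
{
    private final String interval;

    @JsonCreator
    public SegmentInfoSketch(@JsonProperty("interval") String interval)
    {
        this.interval = interval;
    }

    @JsonProperty
    public String getInterval()
    {
        // Serialized as a plain String, so the generic JSON codec never sees org.joda.time.Interval.
        return interval;
    }

    public Interval toInterval()
    {
        // Parse only where the Joda Interval is actually needed, e.g. for segment pruning.
        return Interval.parse(interval);
    }
}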

Hi @qwalker, how about trying to select columns other than __time?

BTW, did you have a chance to try https://github.com/prestodb/presto/pull/14155?
It fixes timestamp support and adds predicate pushdown.

@zhenxiao I saw your pushdown PR was merged into master, so I tested with the latest master code, but now the server can't start. The error is:

com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Explicit bindings are required and com.facebook.presto.spi.relation.DeterminismEvaluator is not explicitly bound.
  while locating com.facebook.presto.spi.relation.DeterminismEvaluator
    for the 3rd parameter of com.facebook.presto.druid.DruidConnectorPlanOptimizer.<init>(DruidConnectorPlanOptimizer.java:67)
  at com.facebook.presto.druid.DruidModule.configure(DruidModule.java:37)

1 error
    at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
    at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:159)
    at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106)
    at com.google.inject.Guice.createInjector(Guice.java:87)
    at com.facebook.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:245)
    at com.facebook.presto.druid.DruidConnectorFactory.create(DruidConnectorFactory.java:67)
    at com.facebook.presto.connector.ConnectorManager.createConnector(ConnectorManager.java:364)
    at com.facebook.presto.connector.ConnectorManager.addCatalogConnector(ConnectorManager.java:222)
    at com.facebook.presto.connector.ConnectorManager.createConnection(ConnectorManager.java:214)
    at com.facebook.presto.connector.ConnectorManager.createConnection(ConnectorManager.java:200)
    at com.facebook.presto.metadata.StaticCatalogStore.loadCatalog(StaticCatalogStore.java:123)
    at com.facebook.presto.metadata.StaticCatalogStore.loadCatalog(StaticCatalogStore.java:98)
    at com.facebook.presto.metadata.StaticCatalogStore.loadCatalogs(StaticCatalogStore.java:80)
    at com.facebook.presto.metadata.StaticCatalogStore.loadCatalogs(StaticCatalogStore.java:68)
    at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:135)
    at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:77)

I was trying to run the latest code this afternoon and got the same error. Adding
binder.bind(DeterminismEvaluator.class).toInstance(context.getRowExpressionService().getDeterminismEvaluator());
in presto-druid/src/main/java/com/facebook/presto/druid/DruidConnectorFactory.java fixes it.
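
For context, the same workaround packaged as a small Guice module; the class and method names below are made up for illustration, and the binding line is exactly the one above, using the ConnectorContext that DruidConnectorFactory.create receives.

import com.facebook.presto.spi.ConnectorContext;
import com.facebook.presto.spi.relation.DeterminismEvaluator;
import com.google.inject.Module;

// Hypothetical helper: returns a module that could be passed to the connector's
// Bootstrap so DruidConnectorPlanOptimizer's DeterminismEvaluator dependency resolves.
public final class DruidExtraBindings
{
    private DruidExtraBindings() {}

    public static Module forContext(ConnectorContext context)
    {
        return binder -> binder.bind(DeterminismEvaluator.class)
                .toInstance(context.getRowExpressionService().getDeterminismEvaluator());
    }
}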

@skyahead I added that line and it worked too. Now I can run select * from table limit 10 successfully, but select count(*) from table still fails with that jackson serialization error. Does that work for you?

@qwalker I could not run any queries as I got another error: "Unable to create class com.facebook.presto.druid.metadata.DruidSegmentInfo from JSON response:". I am running against S3, and I am not sure if that is the reason.

BTW, I do not think limit and aggregation are pushed down yet. I was planning to try filters only.

Hi @skyahead @qwalker, sorry for the problem.
I am working on a fix: https://github.com/prestodb/presto/pull/14171
Limit pushdown, aggregation pushdown, and topN pushdown will be implemented in follow-up PRs.
I will debug more issues; give me a few days.

Hi @zhenxiao,
I have tried the new code and got the same error when running "select * from wikipedia limit 10":

2020-02-28T14:28:43.211+0800 ERROR remote-task-callback-14 com.facebook.airlift.concurrent.BoundedExecutor Task failed
java.lang.IllegalArgumentException: com.facebook.presto.server.TaskUpdateRequest could not be converted to JSON
at com.facebook.airlift.json.JsonCodec.toJsonBytes(JsonCodec.java:214)
at com.facebook.presto.server.smile.JsonCodecWrapper.toBytes(JsonCodecWrapper.java:45)
at com.facebook.presto.server.remotetask.HttpRemoteTask.sendUpdate(HttpRemoteTask.java:646)
at com.facebook.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.joda.time.Interval and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: com.facebook.presto.server.TaskUpdateRequest["sources"]->com.google.common.collect.SingletonImmutableList[0]->com.facebook.presto.execution.TaskSource["splits"]->com.google.common.collect.SingletonImmutableSet[0]->com.facebook.presto.execution.ScheduledSplit["split"]->com.facebook.presto.metadata.Split["connectorSplit"]->com.facebook.presto.druid.DruidSplit["segmentInfo"]->com.facebook.presto.druid.metadata.DruidSegmentInfo["interval"])
at com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:77)
at com.fasterxml.jackson.databind.SerializerProvider.reportBadDefinition(SerializerProvider.java:1191)
at com.fasterxml.jackson.databind.DatabindContext.reportBadDefinition(DatabindContext.java:313)
at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.failForEmpty(UnknownSerializer.java:71)

@zhenxiao Really appreciate your work, I hope the feature can be released as soon as possible.

> @skyahead I added that line and it worked too. Now I can run select * from table limit 10 successfully, but select count(*) from table still fails with that jackson serialization error. Does that work for you?

I tried the latest code in the master and still get the same error. Does that work for you? @qwalker

@zllclang I didn't test after that. I will wait a couple of weeks, and I believe it will all be good by then.

@zllclang The latest code works for me.

@zllclang Really appreciate it too. I need this for my project right now, and I am also hitting the same Interval serializer issue on select *, select count(*), and selects without __time.

Hi @rocar @zllclang @qwalker, sorry for the late reply.
Give me some time to fix it. I just added support for limit pushdown.

The select * issue seems to be that there is a count column in the wikipedia table, and when we translate the query into Druid SQL, Druid complains that count is a reserved word. Let me think more about how to fix it.

Does selecting without the count column have the same problem?

@zllclang I have tested with the following combinations:
select count(user) from wikipedia;
select user from wikipedia;
select namespace from wikipedia;
select * from wikipedia;
select count(*) from wikipedia;
select 1 from wikipedia;

All of the above SQL statements gave the same error:
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.joda.time.Interval and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: com.facebook.presto.server.TaskUpdateRequest["sources"]->com.google.common.collect.SingletonImmutableList[0]->com.facebook.presto.execution.TaskSource["splits"]->com.google.common.collect.SingletonImmutableSet[0]->com.facebook.presto.execution.ScheduledSplit["split"]->com.facebook.presto.metadata.Split["connectorSplit"]->com.facebook.presto.druid.DruidSplit["segmentInfo"]->com.facebook.presto.druid.metadata.DruidSegmentInfo["interval"])

Hi @rocar @zllclang @qwalker,
aggregation pushdown and limit pushdown are merged. Could you please test the latest master and keep me posted if you still see errors?

Hi @zhenxiao,
Thanks for the updates. I tested the latest master and still have the same error. I can't find any SQL that runs without this error yet.
It seems related to being unable to serialize the Interval in DruidSegmentInfo.java.

Hi @zhenxiao,
Thanks for your work. I still get an error when I execute "select count(*) from wikipedia;"; the log shows:
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.joda.time.Interval and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: com.facebook.presto.server.TaskUpdateRequest["sources"]->com.google.common.collect.SingletonImmutableList[0]->com.facebook.presto.execution.TaskSource["splits"]->com.google.common.collect.SingletonImmutableSet[0]->com.facebook.presto.execution.ScheduledSplit["split"]->com.facebook.presto.metadata.Split["connectorSplit"]->com.facebook.presto.druid.DruidSplit["segmentInfo"]->com.facebook.presto.druid.metadata.DruidSegmentInfo["interval"])

@zhenxiao
I did some tests and found the following issues:

(1) "select count(*) from table" fails, but "select count(*) from table limit 1" succeeds.
In com.facebook.presto.druid.DruidQueryGeneratorContext.toQuery(), when executing "select count(*) from table", the variable "pushdown" is still false, so the execution fails.
If I change "pushdown" to true, it succeeds.
(2) "select column, count(1) from table group by column limit 10" still returns more than 10 results.
In com.facebook.presto.druid.DruidQueryGeneratorContext.toQuery() at line 236:
if (!hasAggregation() && limit.isPresent()) {
    query += " LIMIT " + limit.getAsLong();
    pushdown = true;
}
This code drops "limit 10" because !hasAggregation() is false.

(3) "select count(1) from table group by column" fails.
In com.facebook.presto.druid.DruidQueryGeneratorContext.toQuery() at line 231:
String groupByExpression = groupByColumns.stream().map(x -> selections.get(x).getDefinition()).collect(Collectors.joining(", "));
This line looks up each group-by column among the selected fields (selections.get(x)), so it throws an exception when the group-by column is not in the select list.

(4) If the table contains a column named "count", then "select * from table" fails.
I found it executes "select count from table" rather than "select "count" from table" (see the identifier-quoting sketch below).
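
A hedged illustration of the quoting idea for issue (4), not the connector's actual generator: Druid SQL double-quotes identifiers, so quoting every column and table name keeps reserved words like count from breaking the generated query. The class and method names are made up.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public final class DruidSqlQuotingSketch
{
    private DruidSqlQuotingSketch() {}

    // Druid SQL identifiers are double-quoted; embedded quotes are doubled.
    static String quote(String identifier)
    {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    static String selectAll(String table, List<String> columns)
    {
        String projection = columns.stream()
                .map(DruidSqlQuotingSketch::quote)
                .collect(Collectors.joining(", "));
        return "SELECT " + projection + " FROM " + quote(table);
    }

    public static void main(String[] args)
    {
        // Prints: SELECT "__time", "count", "user" FROM "wikipedia"
        System.out.println(selectAll("wikipedia", Arrays.asList("__time", "count", "user")));
    }
}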

Thank you, @mqiang @rocar.
I am working on a fix for aggregation pushdown:
https://github.com/prestodb/presto/pull/14325
