Presto: How to use CarbonData Presto plugin with Presto 318?

Created on 17 Sep 2019 · 13 Comments · Source: prestosql/presto

Hi Friends,
I am planning to use the CarbonData file format with Presto 318, but I cannot find any CarbonData support in Presto 318.

Where can I confirm this?
What is the latest Presto version that supports CarbonData?

I am currently using Presto 217 with CarbonData, but Presto 217 has no cost-based optimizer (CBO). I want to use the CarbonData file format with a Presto version that supports CBO.

@joshk @dain

All 13 comments

We just started a thread on Slack: https://prestosql.slack.com/archives/CFLB9AMBN/p1568753291039600

If you're not already signed up, there's an invite link here: https://prestosql.io/community.html

@PranayMunshi1982 : Currently CarbonData works with PrestoDB 217. If you want to make it work with PrestoSQL 316, you can cherry-pick the open PR https://github.com/apache/carbondata/pull/3205

With these changes I have also tested it with 317, and it worked fine.
PrestoSQL will be officially supported in the next version of CarbonData. We cannot support it immediately because some users do not want to migrate to PrestoSQL yet.

Join Slack as @electrum suggested, and we can discuss further.

Hi @ajantha-bhat ,
Currently, PrestoSQL 320 does not work with CarbonData. Is there any plan to support it?
It seems that PrestoSQL changed its API?

Thanks.

Hi ,

In the latest CarbonData Presto integration code, CarbonData only has read support.
You can create a Carbon table from Spark or Hive (Hive write is in progress), or use the Carbon SDK, and then query it from Presto.

The latest master is integrated only with PrestoDB 0.217.

As many users have requested it, CarbonData now wants to support PrestoSQL as well.
I have currently raised a PR to integrate with PrestoSQL 316:

https://github.com/apache/carbondata/pull/3641. This PR will be merged this week; you can also cherry-pick it and try it before then.
Also, Carbon can support only one version of PrestoSQL at a time, so the PR currently targets only PrestoSQL 316 (some customers are using PrestoSQL 316).

But for you, I will work on PrestoSQL 320 and share a PR.

Upcoming plans:
a. Support writing CarbonData files from Presto.
b. Support the latest version of PrestoSQL.
c. Publish the latest benchmarking reports.
d. Instead of keeping the integration code in the CarbonData repo, try contributing it to the PrestoSQL community.

Thanks,
Ajantha


Hi @ajantha-bhat ,
I think Presto changes very frequently, so it would be great if plan (d) comes true.
For now, I'm looking forward to your upcoming plans (a) and (b).
Because I'm testing a Presto upgrade, could you make a PR for the latest Presto (not Presto 320)?

Thanks for your help,

https://github.com/apache/carbondata/pull/3662

@brucemen711 : I have created this PR and tested it locally. There was a small change in HiveModule and HiveSplit, so Carbon had to change its code.
Please take the latest CarbonData master code and cherry-pick this PR.
Try it and let me know.

Also, just curious: have you used CarbonData with Presto in production?

Hi,
I'm testing CarbonData for production (data warehouse, data mart).
But our company's requirements are pretty strict: it must work with Spark, Presto, and other tools.

Thanks for your PR, I will give it a try.

Hi,
For everyone: I tried Presto 320 and it works fine.

With Presto versions 329 and 330, it shows the error below. It seems that Presto changes its API frequently.

ERROR main io.prestosql.server.PrestoServer io.prestosql.spi.procedure.Procedure$Argument.<init>(Ljava/lang/String;Ljava/lang/String;)V
java.lang.NoSuchMethodError: io.prestosql.spi.procedure.Procedure$Argument.<init>(Ljava/lang/String;Ljava/lang/String;)V
at io.prestosql.plugin.hive.CreateEmptyPartitionProcedure.get(CreateEmptyPartitionProcedure.java:78)
at io.prestosql.plugin.hive.CreateEmptyPartitionProcedure.get(CreateEmptyPartitionProcedure.java:49)
at com.google.inject.internal.ProviderInternalFactory.provision(ProviderInternalFactory.java:85)
at com.google.inject.internal.BoundProviderFactory.provision(BoundProviderFactory.java:77)
at com.google.inject.internal.ProviderInternalFactory.circularGet(ProviderInternalFactory.java:59)
at com.google.inject.internal.BoundProviderFactory.get(BoundProviderFactory.java:61)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:168)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:39)
at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:211)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:182)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:109)
at com.google.inject.Guice.createInjector(Guice.java:87)
at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:242)
at org.apache.carbondata.presto.CarbondataConnectorFactory.create(CarbondataConnectorFactory.java:140)
at io.prestosql.connector.ConnectorManager.createConnector(ConnectorManager.java:341)
at io.prestosql.connector.ConnectorManager.createCatalog(ConnectorManager.java:203)
at io.prestosql.connector.ConnectorManager.createCatalog(ConnectorManager.java:195)
at io.prestosql.connector.ConnectorManager.createCatalog(ConnectorManager.java:181)
at io.prestosql.metadata.StaticCatalogStore.loadCatalog(StaticCatalogStore.java:88)
at io.prestosql.metadata.StaticCatalogStore.loadCatalogs(StaticCatalogStore.java:68)
at io.prestosql.server.PrestoServer.run(PrestoServer.java:129)
at io.prestosql.$gen.Presto_329____20200308_082819_1.run(Unknown Source)
at io.prestosql.server.PrestoServer.main(PrestoServer.java:72)
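
For anyone reading the trace above: a NoSuchMethodError at startup usually means a plugin was compiled against one version of a class and is running against another. Here the missing member is a constructor of Procedure$Argument taking two String parameters; (Ljava/lang/String;Ljava/lang/String;)V is the JVM descriptor for "two Strings, returns void". Below is a minimal sketch of how such a mismatch can be probed with reflection before loading a connector. OldArgument and NewArgument are hypothetical stand-ins, not the real Presto SPI classes, and the NewArgument signature is invented purely for illustration.

```java
import java.lang.reflect.Constructor;

final class ConstructorProbe {
    // Hypothetical stand-in for an SPI class as the plugin was compiled against.
    static class OldArgument {
        OldArgument(String name, String type) {}
    }

    // Hypothetical stand-in for the same class after an API change.
    // The (String, Class) signature is an assumption made for this sketch.
    static class NewArgument {
        NewArgument(String name, Class<?> type) {}
    }

    /** Returns true if cls declares a constructor with exactly these parameter types. */
    static boolean hasConstructor(Class<?> cls, Class<?>... parameterTypes) {
        try {
            Constructor<?> found = cls.getDeclaredConstructor(parameterTypes);
            return found != null;
        } catch (NoSuchMethodException e) {
            // Same condition that surfaces as NoSuchMethodError when linked code is called.
            return false;
        }
    }

    public static void main(String[] args) {
        // The two-String constructor exists on the old stand-in, not on the new one.
        System.out.println(hasConstructor(OldArgument.class, String.class, String.class));
        System.out.println(hasConstructor(NewArgument.class, String.class, String.class));
    }
}
```

With the real jars on the classpath, the same check could be pointed at the actual SPI class to tell quickly whether a given connector build matches the server version, instead of finding out from a failed catalog load.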

I hope @ajantha-bhat will work on plans (a) and (b).
With Presto 320, we get errors with decimal accuracy on the new Parquet format (.parq). Presto has recently fixed this, so we have to wait for an integration of CarbonData with the latest Presto.

Your initial message said you needed Presto 320, so I added Carbon support for Presto 320.

If you need the latest version, I need to recheck. @brucemen711

We are battle-testing our current system, and I found that Presto 320 has errors with decimal accuracy on the new Parquet format (.parq).
We cannot give up Parquet right now, so it would be great if you could support the latest Presto version.

Thanks for your support.

@brucemen711 : Hi, I have worked on CarbonData connector support for PrestoSQL 330.

Take the latest CarbonData master open source code, cherry-pick the PR below, and test it out.

https://github.com/apache/carbondata/pull/3662

Hi @ajantha-bhat ,
I will try it and give feedback.
Thanks.
