Presto: Variable support

Created on 22 Aug 2016  Â·  38Comments  Â·  Source: prestodb/presto

We're writing adhoc queries for data interested business users. Variable support or alike would be great, something like MySQL;

SET @user_email = '[email protected]';
SELECT *
FROM users
WHERE email_address = @user_email;

Alternatively parameters similar to Impala's implementation;

SELECT *
FROM users
WHERE email_address = $user_email;

In the meantime, I've come up with a hacky solution using WITH statements :(

WITH variables AS (SELECT '[email protected]' AS user_email)
SELECT *
FROM
  users u
  JOIN variables v ON u.email_address = v.user_email
enhancement syntax-needs-review

Most helpful comment

Thanks Rebecca, What you wrote makes sense. Also, while searching I came across also the following text about T-SQL: https://en.m.wikipedia.org/wiki/Transact-SQL The structure there is similar but uses @ before the variable name, i.e.: DECLARE @Myvar VARCHAR(10) SET @Myvar = 'a string' So I'm wondering if it makes sense to allow (but not force) the use of @ before the var name, for people coming from some other SQL flavors.

My GitHub username is Myvar so its very funny that i randomly got mentioned on some random project's issue, this happens why to often. :)

All 38 comments

when this is going to get merged and released?

When could it be ready ?

(PING)
I'm also interested in this feature. Any chance this could get added in the near future?
Thanks.

You can also use PREPARED STATEMENTS, see https://prestodb.io/docs/current/sql/prepare.html

Hi @kokosing
I tried and couldn't figure out how to use it as an alternative to SET, can you suggest the code to do it?

The above example would look like:

PREPARE my_select FROM
SELECT *
FROM users
WHERE email_address = ?;
EXECUTE my_select USING '[email protected]';
EXECUTE my_select USING '[email protected]';
DEALLOCATE PREPARE my_select;

Interesting, thanks. Could it be done with more than one argument? (?1, ?2 etc?)

Please see the second example from https://docs.starburstdata.com/latest/sql/execute.html

+1.. This would really be great to have.

+1 Would make my life much easier!

@kokosing The issue with using prepared statements is that it does not serve all usecases.
For example:

  • Let us say I am using presto via jupyter/Zeppelin notebooks and and execution 4 queries in order.
set var_a = 'xxx';
Q1: select var_a from T1;
Q2: select var_a from T2;
and so on...

And i need to see outputs after execution of each of those queries. Since prepared statements support just once command at a time, I will have to write a prepared statement for each one of them, which defeats the purpose.
Please let me know if i am somehow mistaken.
The above solution server the usecase of one query multiple values for params but now multiple queries with one value for params.
I can take this up, if this makes sense.

query variable will surely make my queries much shorter and human readable - +1!

PostgreSQL supports this on the client side (https://www.postgresql.org/docs/current/app-psql.html). Seems like a reasonable thing to do on presto cli, thoughts?

I'd love to be able to do something like below:

WITH CTE_p80 as (
  SELECT APPROX_PERCENTILE(<column>, 0.8) as col_p80
  FROM <table>
)

SET my_var_p80 = SELECT col_p80 FROM CTE_p80;

SELECT 
...
FROM
...
WHERE <colnew>   >= '${hiveconf:my_var_p80}'

I just need variables with numbers, such as

DAYS=180
...
WHERE (ds >= CAST(date_add('day', -DAYS, NOW()) as VARCHAR))

It's extremely paintful to have to edit this number in literally 6 different lines of my script every time PRESTO cluster times out because the query exceeds the standard interactive time limit.

+1, it would be very helpful

I'd propose to support the basic form of this, which seems to be what people are asking for most:
SET variable = constant.

If we can agree on this syntax we can move forward to discuss implementation. Once we settle on how this should be implemented, we can mark this as available-to-pickup and see whether anyone would volunteer to work on this.

This generally looks good to me. My only question is whether we should add another word to the syntax to distinguish it from SET SESSION/ SET ROLE/ SET PATH

Sounds good to me. Would there be a preference to
SET VAR
or
SET VARIABLE
?
(I'm leaning towards the first option)

On Mon, Aug 26, 2019, 19:00 Rebecca Schlussel notifications@github.com
wrote:

This generally looks good to me. My only question is whether we should add
another word to the syntax to distinguish it from SET SESSION/ SET ROLE/
SET PATH

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/prestodb/presto/issues/5918?email_source=notifications&email_token=AAHOJBR6BRG4CTBOT2KXZ73QGP4YNA5CNFSM4CNI7A6KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5EZ6FI#issuecomment-524918549,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAHOJBQ2L3WMKZUAAUE2QTTQGP4YNANCNFSM4CNI7A6A
.

set var sounds good to me

SQL spec seems to suggest it's just SET variable_name = variable_value. But I'm not that good at reading spec, so if someone else want to double check that would be great.

15.2
This Subclause modifies Subclause 14.5, “”, in ISO/IEC 9075-4.
Function
Assign a value to an SQL variable, SQL parameter, host parameter, or host variable.
Format
<singleton variable assignment> ::=
SET <assignment target> <equals operator> <assignment source>
[ <XML passing mechanism> ]

Do you have a link?

On Mon, Aug 26, 2019, 20:44 Rongrong Zhong notifications@github.com wrote:

SQL spec seems to suggest it's just SET variable_name = variable_value.
But I'm not that good at reading spec, so if someone else want to double
check that would be great.

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/prestodb/presto/issues/5918?email_source=notifications&email_token=AAHOJBXEE72JCWEIVR6TBDDQGQJAFA5CNFSM4CNI7A6KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5FDG3I#issuecomment-524956525,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAHOJBSTYM3BMNWAVANOB5DQGQJAFANCNFSM4CNI7A6A
.

Do you have a link?

Nope, SQL Spec is only available for money. 🤣

(From 9075-4:2011, which is the part on sql/psm, sections 14.4 and 14.5)

The standard also requires you to first declare your variables using

14.4 <SQL variable declaration> Function 
Declare one or more variables. 
Format 
<SQL variable declaration> ::=
  DECLARE <SQL variable name list> <data type> [ <default clause> ]
<SQL variable name list> ::=
  <SQL variable name> [ { <comma> <SQL variable name> }... ]
Syntax Rules 
1) The specified <data type> is the declared type of each variable declared by the <SQL variable declaration>. 
...
General Rules 
1) If <SQL variable declaration> contains <default clause> DC, then let DV be the <default option> contained in DC. Otherwise let DV be <null specification>. Let SV be the variable defined by the <SQL variable declaration>. The following SQL-statement is effectively executed: 
SET SV = DV 
Conformance Rules 
1) Without Feature P002, “Computational completeness”, conforming SQL language shall not contain a <SQL variable declaration>.

Then you have the variable assignment syntax. (There are a bunch of rules about it, but they aren't relevant if we only allow constants)

14.5 <assignment statement>
This Subclause is modified by Subclause 11.16, “<assignment statement>”, in ISO/IEC 9075-10. This Subclause is modified by Subclause 15.2, “<assignment statement>”, in ISO/IEC 9075-14.
Function
Assign a value to an SQL variable, SQL parameter, host parameter, or host variable.
Format
<assignment statement> ::=
    <singleton variable assignment>
  | <multiple variable assignment>
<multiple variable assignment> ::=
  SET <assignment target list> <equals operator> <assigned row>
<assignment target list> ::=
  <left paren> <assignment target> [ { <comma> <assignment target> }... ] <right paren>

  SET <assignment target> <equals operator> <assignment source>
<assignment target> ::=
    <target specification>
  | <modified field reference>
  | <mutator reference>
<assignment source> ::=
    <value expression>
  | <contextually typed source>
<contextually typed source> ::=
    <implicitly typed value specification>
  | <contextually typed row value expression>

So I guess the syntax should be as follows (though it's clunkier than a one line statement)

DECLARE my_var type
SET my_var=value

Thanks Rebecca,
What you wrote makes sense.

Also, while searching I came across also the following text about T-SQL:
https://en.m.wikipedia.org/wiki/Transact-SQL

The structure there is similar but uses @ before the variable name, i.e.:
DECLARE @myvar VARCHAR(10)
SET @myvar = 'a string'

So I'm wondering if it makes sense to allow (but not force) the use of @
before the var name, for people coming from some other SQL flavors.

On Mon, Aug 26, 2019, 22:52 Rebecca Schlussel notifications@github.com
wrote:

(From 9075-4:2011, which is the part on sql/psm, sections 14.4 and 14.5)

The standard also requires you to first declare your variables using

14.4 Function

Declare one or more variables.

Format

::=

DECLARE [ ]

::=

[ { }... ]

Syntax Rules

1) The specified is the declared type of each variable declared by the .

...

General Rules

1) If contains DC, then let DV be the contained in DC. Otherwise let DV be . Let SV be the variable defined by the . The following SQL-statement is effectively executed:

SET SV = DV

Conformance Rules

1) Without Feature P002, “Computational completeness”, conforming SQL language shall not contain a .

Then you have the variable assignment syntax. (There are a bunch of rules
about it, but they aren't relevant if we only allow constants)

14.5

This Subclause is modified by Subclause 11.16, “”, in ISO/IEC 9075-10. This Subclause is modified by Subclause 15.2, “”, in ISO/IEC 9075-14.

Function

Assign a value to an SQL variable, SQL parameter, host parameter, or host variable.

Format

::=

<singleton variable assignment>

|

::=

SET

::=

[ { }... ]

SET

::=

<target specification>

|

|

::=

<value expression>

|

::=

<implicitly typed value specification>

|

So I guess the syntax should be as follows (though it's clunkier than a
one line statement)

DECLARE my_var type

SET my_var=value

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/prestodb/presto/issues/5918?email_source=notifications&email_token=AAHOJBU54G35JRZPQIQU5GDQGQX6LA5CNFSM4CNI7A6KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5FPDEA#issuecomment-525005200,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAHOJBQXVTUBHAJVTAIR5HDQGQX6LANCNFSM4CNI7A6A
.

I was thinking about the DECLARE. It seems not necessary for type since we are only allowing constant, but it would solve the problem of distinguish from other SET statements. So I'm ok either ways. If we ever want to support other type of variables, it makes sense to introduce DECLARE from the beginning.

@talgalili I don't think @ is part of SQL spec so I don't think we need to require @. I didn't find any specification on the format of variable name. There is a mention that it should not be equivalent to any other SQL parameter name or column name. So maybe any identifier would do. What do you think @rschlussel?

Several things:

  1. I don't think @ should be required. But it would be nice to allow it.

  2. I support allowing DECLARE and then SET.
    I wonder if it makes sense to also allow it with the syntax:
    SET VAR
    So to make the process faster to write (since it saves us writing the var
    type).

WDYT?

On Tue, Aug 27, 2019, 01:31 Rongrong Zhong notifications@github.com wrote:

I was thinking about the DECLARE. It seems not necessary for type since
we are only allowing constant, but it would solve the problem of
distinguish from other SET statements. So I'm ok either ways. If we ever
want to support other type of variables, it makes sense to introduce
DECLARE from the beginning.

@talgalili https://github.com/talgalili I don't think @ is part of SQL
spec so I don't think we need to require @. I didn't find any
specification on the format of variable name. There is a mention that it
should not be equivalent to any other SQL parameter name or column name. So
maybe any identifier would do. What do you think @rschlussel
https://github.com/rschlussel?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/prestodb/presto/issues/5918?email_source=notifications&email_token=AAHOJBQQBVHZI27DZARIAXLQGRKUTA5CNFSM4CNI7A6KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5F4CQA#issuecomment-525058368,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAHOJBTTKBCALW42J4OL66TQGRKUTANCNFSM4CNI7A6A
.

I don't think VAR would be required either way (the grammar should parse fine without it). given that there is a specification for this feature, I think we shouldn't have var in the syntax. I would be okay leaving out the DECLARE and only requiring it later for non-constants, but we should be careful to make sure that plan is feasible.

Implementation-wise, I think this should be similar to prepared statements, since the server doesn't keep any session state. That means it requires changes both on the server side and on the client side.

I don't know what's the business with "@" in general and how to justify whether to allow it or not. It's not a valid identifier. If we want to allow it, what is a "variable name"?

The spec says a variable should be a <basic identifier chain>, which is just an identifier or period separated chain of identifiers. Identifiers only allow letters and numbers, so @ wouldn't be allowed

Hi, I'd like to check in on what the current process/consensus by the Presto team is on including variables is. Is there a thread I could follow?

I would love to see variables in Presto. This would save me a lot of headache.

@kaikalur Any thoughts on the syntax? @carlshan are you interested in working on this?

@rongrong just to say, I still hope this feature can be introduced.
I come across the need for it at least once a month (I just wrote a bunch of code that I shared with others that needed to be run regularly with different dates (DS) manually, and having this feature would have made the process more robust (since DS was used in several places in the query).

So I do hope someone can take over developing this feature (I would, accept that I have no familiarity with the code architecture or language, so it's not feasible that I'll work on it).

Is there an ETA for this to be available?

Nobody is actively working on this. Contribution is much appreciated!

Thanks Rebecca, What you wrote makes sense. Also, while searching I came across also the following text about T-SQL: https://en.m.wikipedia.org/wiki/Transact-SQL The structure there is similar but uses @ before the variable name, i.e.: DECLARE @Myvar VARCHAR(10) SET @Myvar = 'a string' So I'm wondering if it makes sense to allow (but not force) the use of @ before the var name, for people coming from some other SQL flavors.

My GitHub username is Myvar so its very funny that i randomly got mentioned on some random project's issue, this happens why to often. :)

LOL

On Mon, Apr 27, 2020, 12:25 Emile Badenhorst notifications@github.com
wrote:

Thanks Rebecca, What you wrote makes sense. Also, while searching I came
across also the following text about T-SQL:
https://en.m.wikipedia.org/wiki/Transact-SQL The structure there is
similar but uses @ before the variable name, i.e.: DECLARE @Myvar
https://github.com/Myvar VARCHAR(10) SET @Myvar
https://github.com/Myvar = 'a string' So I'm wondering if it makes
sense to allow (but not force) the use of @ before the var name, for people
coming from some other SQL flavors.

My GitHub username is Myvar so its very funny that i randomly got
mentioned in on some random project's issue, this happens why to often. :)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/prestodb/presto/issues/5918#issuecomment-619853437,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAHOJBQN3QGKKCYKNF3SQBLROVFRNANCNFSM4CNI7A6A
.

+1 would love to have this functionality!

Was this page helpful?
0 / 5 - 0 ratings