It is a minor breaking change, but should eliminate a lot of confusion.
I don鈥檛 look forward to explaining over and over 鈥淵our default RP isn鈥檛 the default RP, you鈥檝e set hourly as the default. All writes go to hourly by default. default is not written to unless you explicitly write to the default retention policy."
Possible names from a Slack convo:
I am OK with this change. I am curious how we want to message this such that it doesn't cause too much problems in the field. It's a semi-breaking change.
How about autogen_inf?
@otoolep I made #3746 so that users can work around any implications of this change.
sure, I vote autogen
We're keeping it default for now, but users now have the ability to change the name of the RP that gets created when they create a database #2676
@pauldix I strongly disagree. This is a terrible name for documentation and training, and there's no defensible reason to keep it as such long-term. I understand it is a breaking change, but I need this for 1.0.
If we're going to make a breaking change, I would make the change so that CREATE DATABASE requires arguments to specify what the "default" retention policy is.
That works for me. Creating a database is an infrequent operation, and explicitly requiring the details avoids any confusion. @rkuchan @gunnaraasen what do you think about making CREATE DATABASE always require the retention policy details?
https://github.com/influxdb/influxdb/issues/5166 is related to this discussion.
I'm fine with requiring a user-defined retention policy on database creation. To make it easier, we could only require the duration. If no RP name or replication are provided, we could set the name to the string of the duration, e.g. 2w, and replication to the current number of nodes. Then the minimal create database statement would be
CREATE DATABASE mydb DURATION 2d
This would create a mydb database with a default retention policy named 2d with a duration of 2d and a replication of 1 (in a single node context).
@gunnaraasen CREATE DATABASE mydb WITH DURATION 2d is already supported.
Full syntax:
CREATE DATABASE [IF NOT EXISTS] NOAA_water_database
[WITH
[DURATION 24hr]
[REPLICATION 1]
[NAME something_other_than_default]
]
If NAME is not specified, then default is used.
@pires it is, we're talking about making that the required syntax, which would be a breaking API change.
@jwilder @pauldix can we please please please get this for 0.13?
I really don't want to go to 1.0 with this in place. It is very confusing for docs and tutorials.
There are two changes discussed here:
autogenCREATE DATABASE to require NAMEI'm opposed to option 2. Option 1 is a breaking change to any code that assumes "default"should always exists and it's explicitly specifying that retention policy in writes or queries. If it suddenly disappears, that would be confusing.
One idea to prevent the breaking change could be to automatically convert default retention policies requests to autogen if default _does not_ exist and log that we converted it. The log message should tell the user that they need to update their query/write to specify autogen instead. If someone created a default retention policy, we would not auto-convert it.
@jwilder I too am opposed to option 2. Users shouldn't need to grok RPs to make a database.
However, I think the original decision to use default as the name of the auto-created RP was a mistake, and I don't think fixing that mistake should be avoided because it's a breaking change.
I do like your idea of mitigating the breaking change, and I also agree that default is a legal name for an RP, it just shouldn't be the one we create on behalf of the user. I think the log message should ask them to update the code from database.default.meaurement to database..measurement. The name of the DEFAULT retention policy should be irrelevant to the code that intends only to call the default. If the code intends to call the autogenerated RP, that should be updated to use autogen but that's a much rare use case.
How about if the default is just a blank name or a missing retention policy and we rely on the database to fill in the name? We could then do a subtle name of _default and the underscore would indicate that the name was private. The documentation doesn't have to tell the user the name is _default, but just tell them to leave the retention policy blank and the database will fill it in with the default retention policy name.
@jsternberg Imagine the use case where data is initially written to the DEFAULT policy, but then later a new DEFAULT policy is set. How would the data in _default be queried then? They would still need to reference _default explicitly. We have to remember that the DEFAULT policy will change, and the autogenerated policy just happens to be the DEFAULT when it is created. There's nothing special about its DEFAULT status and we shouldn't name it as such.
Ok, looking at the code a bit just changing the name should be really simple. The default retention policy for a database is stored with the database itself so the change won't affect currently existing databases. As long as the person omits the retention policy name, it will choose the correct one automatically as @beckettsean has said.
The only change that needs to happen is a change to the parser for the default retention policy name and a change to the meta service for setting the default retention policy when one isn't specified.
One thing I noticed when digging through the code was that the default retention policy name wasn't in a constant and was sprinkled around the code everywhere. I'm thinking of changing it to a constant variable but I'm not sure which package to put it in.
Either way, I'll put up a change and we'll iterate over it until we're all happy.
Sounds great!
Most helpful comment
@jwilder I too am opposed to option 2. Users shouldn't need to grok RPs to make a database.
However, I think the original decision to use
defaultas the name of the auto-created RP was a mistake, and I don't think fixing that mistake should be avoided because it's a breaking change.I do like your idea of mitigating the breaking change, and I also agree that
defaultis a legal name for an RP, it just shouldn't be the one we create on behalf of the user. I think the log message should ask them to update the code fromdatabase.default.meaurementtodatabase..measurement. The name of the DEFAULT retention policy should be irrelevant to the code that intends only to call the default. If the code intends to call the autogenerated RP, that should be updated to useautogenbut that's a much rare use case.