Clickhouse: alter table ttl doesn't work

Created on 12 Aug 2019  路  3Comments  路  Source: ClickHouse/ClickHouse

Describe the bug
ALTER TABLE ontime MODIFY TTL doesn't work, the outdated data is not deleted

How to reproduce
version: 19.8.3.8.

  • CREATE TABLE statements for all tables involved

CREATE TABLE ontime (\
Year UInt16,\
Quarter UInt8,\
Month UInt8,\
DayofMonth UInt8,\
DayOfWeek UInt8,\
FlightDate Date,\
UniqueCarrier FixedString(7),\
AirlineID Int32,\
Carrier FixedString(2),\
TailNum String,\
FlightNum String,\
OriginAirportID Int32,\
OriginAirportSeqID Int32,\
OriginCityMarketID Int32,\
Origin FixedString(5),\
OriginCityName String,\
OriginState FixedString(2),\
OriginStateFips String,\
OriginStateName String,\
OriginWac Int32,\
DestAirportID Int32,\
DestAirportSeqID Int32,\
DestCityMarketID Int32,\
Dest FixedString(5),\
DestCityName String,\
DestState FixedString(2),\
DestStateFips String,\
DestStateName String,\
DestWac Int32,\
CRSDepTime Int32,\
DepTime Int32,\
DepDelay Int32,\
DepDelayMinutes Int32,\
DepDel15 Int32,\
DepartureDelayGroups String,\
DepTimeBlk String,\
TaxiOut Int32,\
WheelsOff Int32,\
WheelsOn Int32,\
TaxiIn Int32,\
CRSArrTime Int32,\
ArrTime Int32,\
ArrDelay Int32,\
ArrDelayMinutes Int32,\
ArrDel15 Int32,\
ArrivalDelayGroups Int32,\
ArrTimeBlk String,\
Cancelled UInt8,\
CancellationCode FixedString(1),\
Diverted UInt8,\
CRSElapsedTime Int32,\
ActualElapsedTime Int32,\
AirTime Int32,\
Flights Int32,\
Distance Int32,\
DistanceGroup UInt8,\
CarrierDelay Int32,\
WeatherDelay Int32,\
NASDelay Int32,\
SecurityDelay Int32,\
LateAircraftDelay Int32,\
FirstDepTime String,\
TotalAddGTime String,\
LongestAddGTime String,\
DivAirportLandings String,\
DivReachedDest String,\
DivActualElapsedTime String,\
DivArrDelay String,\
DivDistance String,\
Div1Airport String,\
Div1AirportID Int32,\
Div1AirportSeqID Int32,\
Div1WheelsOn String,\
Div1TotalGTime String,\
Div1LongestGTime String,\
Div1WheelsOff String,\
Div1TailNum String,\
Div2Airport String,\
Div2AirportID Int32,\
Div2AirportSeqID Int32,\
Div2WheelsOn String,\
Div2TotalGTime String,\
Div2LongestGTime String,\
Div2WheelsOff String,\
Div2TailNum String,\
Div3Airport String,\
Div3AirportID Int32,\
Div3AirportSeqID Int32,\
Div3WheelsOn String,\
Div3TotalGTime String,\
Div3LongestGTime String,\
Div3WheelsOff String,\
Div3TailNum String,\
Div4Airport String,\
Div4AirportID Int32,\
Div4AirportSeqID Int32,\
Div4WheelsOn String,\
Div4TotalGTime String,\
Div4LongestGTime String,\
Div4WheelsOff String,\
Div4TailNum String,\
Div5Airport String,\
Div5AirportID Int32,\
Div5AirportSeqID Int32,\
Div5WheelsOn String,\
Div5TotalGTime String,\
Div5LongestGTime String,\
Div5WheelsOff String,\
Div5TailNum String\
) ENGINE = \
ReplicatedMergeTree('/clickhouse-test/tables/{shard}/ontime', '{replica}') \
PARTITION BY toYYYYMM(FlightDate) \
ORDER BY (Year, Quarter, Month, DayofMonth, DayOfWeek, FlightDate) \
SETTINGS index_granularity = 8192

  • Sample data for all these tables
    the ontime table data
  • Queries to run that lead to unexpected result
    ALTER TABLE ontime MODIFY TTL FlightDate + INTERVAL 12 Month
    then to force do merge by
    OPTIMIZE table ontime final;

Expected behavior
select count(*) from ontime
the number of rows reduced

Additional context
The show create table query shows that the ttl is actually modified, so it is not same bug as #5494

When table is created with ttl, after import data into the table, and run the OPTIMIZE query, the number of rows is reduced

bug comp-ttl

Most helpful comment

Initially TTL was implemented with (maybe a bit confusing) logic, that parts created before ALTER TABLE ... MODIFY TTL would never be dropped by TTL. It was changed in PR #6274 and now those old parts can be filtered by TTL with OPTIMIZE ... FINAL query. This patch will be available in 19.14 release.
NOTE: Even now extra merges with TTL won't assign automatically for parts, created before ALTER query. That data can be deleted only manually with OPTIMIZE ... FINAL query.

All 3 comments

Initially TTL was implemented with (maybe a bit confusing) logic, that parts created before ALTER TABLE ... MODIFY TTL would never be dropped by TTL. It was changed in PR #6274 and now those old parts can be filtered by TTL with OPTIMIZE ... FINAL query. This patch will be available in 19.14 release.
NOTE: Even now extra merges with TTL won't assign automatically for parts, created before ALTER query. That data can be deleted only manually with OPTIMIZE ... FINAL query.

@CurtizJ I think the new ttl behavior is still not good enough and is still confusing. It's better to purge out-dated data automatically. Maybe a new background scanner thread to scan parts to do the purge periodically. A part should have statistics about min/max value of a Date/Datetime column, so the scanner should not cost lots of resources to do the scan

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Was this page helpful?
0 / 5 - 0 ratings