Home Assistant release (hass --version):
0.61.1
Python release (python3 --version):
3.5.2
Component/platform:
Recorder / purge database
Description of problem:
Events not deleted from the events table
Expected:
Purge states AND events
Problem-relevant configuration.yaml entries and steps to reproduce:
Activate DEBUG for recorder in HA configuration
logger:
default: warn
logs:
homeassistant.components.recorder: debug
I have set keep_days: 14 to 13 to have some data to purge.
With mysql client:
select count(*) from states
217890
select count(*) from events
839980
Log output:
2018-01-17 15:20:16 DEBUG (Recorder) [homeassistant.components.recorder.purge] Deleted 15609 states
2018-01-17 15:20:28 DEBUG (Recorder) [homeassistant.components.recorder.purge] Deleted 0 events
2018-01-17 15:20:28 DEBUG (Recorder) [homeassistant.components.recorder.purge] DB engine driver: pymysql
select count(*) from states
202313 -> 15609 states deleted
select count(*) from events
840000 -> no event deleted
I tested this with HA on ubuntu 16.04 in venv, MySQL in docker container mysql:5.6
and in Hass.io with MariaDB addon. Both with the same results.
Just checked my logs and I see something as well. Not sure if it is related, but here it is:
2018-01-17 12:31:42 ERROR (Recorder) [homeassistant.components.recorder.util] Error executing query: (_mysql_exceptions.OperationalError) (1055, "Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'homeassistant.states.state_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by") [SQL: 'SELECT states.state_id AS states_state_id, states.event_id AS states_event_id, max(states.last_updated) AS max_1 \nFROM states GROUP BY states.entity_id'] (Background on this error at: http://sqlalche.me/e/e3q8)
Yep, that's something that has changed im MySQL 5.7.5:
https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by
I think I understand the problem: The state_id (and probably also the event_id) field are not aggregated, even though we are doing a group_by query. For some reason, I thought that if you have one max() aggregation in the fields, all non-aggregated fields get the value of the record selected by the max() aggregation. And seemingly this worked until recently (and still works for SQLite). I'll try to figure this out!
Just for your info.
I played a little with different DBMS docker containers..
If i was on mysql:5.6 i have seen the log messages like in my first post.
With mysql:5.7 i get the same error like @arsaboo .
In the moment i'm on mariadb:latest (thought i give it a try, cause it's used in hass.io as addon) and
i see the log messages like in my first post again.
Yep, we'll need to reformulate the queries a bit - again. I hope to come up with something over the next days. This will probably not make it into a bugfix release, as it is not as critical as the recorder crashing, but will hopefully ship with the next normal release.
Could you perhaps try if
https://github.com/home-assistant/home-assistant/compare/dev...tinloaf:sql_fix
solves the problem for you? This one should do a proper "largest-of" query, doesn't use aggregation anymore, but uses subqueries again. However, the subquery should be doable, even for MySQL.
@tinloaf Updated the purge.py file from your fix. Will update if I see the error again. Fingers crossed 🤞
If it goes well for you, I'll make that a PR tomorrow or so. :)
Still seeing the error :(
Log Details (ERROR)
Thu Jan 18 2018 13:39:37 GMT-0500 (Eastern Standard Time)
Error executing query: (_mysql_exceptions.OperationalError) (1055, "Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'homeassistant.states.state_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by") [SQL: 'SELECT states.state_id AS states_state_id, states.event_id AS states_event_id, max(states.last_updated) AS max_1 \nFROM states GROUP BY states.entity_id'] (Background on this error at: http://sqlalche.me/e/e3q8)
@arsaboo That's still the old SQL query. There is no group_by anymore in the updated purge.py. Something seems wrong here…?
Can confirm, no error anymore after @tinloaf 's fix at 5.7.
Have to wait a day to get some data to purge.
By tomorrow !
Weird....I placed it in custom_components/recorder/ here. Restarted again. Will keep you posted.
Ok.....no error with the fix, but the recorder is not being purged. I tested it out with {"keep_days": 2} and I can still see logbook entries from 1/12.
The problem with the events still exists.
2018-01-19 14:27:13 DEBUG (Recorder) [homeassistant.components.recorder.purge] Deleted 427 states
2018-01-19 14:27:13 DEBUG (Recorder) [homeassistant.components.recorder.purge] Deleted 0 events
2018-01-19 14:27:13 DEBUG (Recorder) [homeassistant.components.recorder.purge] DB engine driver: pymysql
Checked that there are events to purge before.
When i looked at the tables in the database, i found out that all rows in the states table have an empty event_id.
Could this be the problem?
The database was new created from HA when i played with the DBMS's yesterday.
Just to confirm, history is getting purged, only logbook is not getting purged.
PR #11802 should fix this problem. :)
@tinloaf , purge works now and deletes the events too.
But is this normal?
Most helpful comment
Yep, that's something that has changed im MySQL 5.7.5:
https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by
I think I understand the problem: The
state_id(and probably also theevent_id) field are not aggregated, even though we are doing agroup_byquery. For some reason, I thought that if you have onemax()aggregation in the fields, all non-aggregated fields get the value of the record selected by themax()aggregation. And seemingly this worked until recently (and still works for SQLite). I'll try to figure this out!