When I restart sentinel locally I get this in the log:
2016-07-20T21:15:46.371Z - info: backlog: processing 247 items, 500ms per item
2016-07-20T21:17:20.000Z - info: backlog: processing complete, 0 failures
2016-07-20T21:20:00.025Z - info: checking schedule again in 5 minutes
Next restart - same thing. It processes the same 247 items over and over again.
My last_valid_seq has no rows because all transitions are disabled so it should default to "now".
This isn't particularly expensive but it is a waste of cycles.
This also affects production instances - from production logs:
2016-07-08T22:48:29.940Z - info: 2016-07-08T22:48:29.940Z - info: backlog: processing 363838 items, 500ms per item
...
2016-07-09T01:12:48.325Z - info: 2016-07-09T01:12:48.324Z - info: backlog: processing 363869 items, 500ms per item
...
2016-07-10T00:00:20.815Z - info: 2016-07-10T00:00:20.815Z - info: backlog: processing 365168 items, 500ms per item
...
2016-07-10T18:54:10.155Z - info: 2016-07-10T18:54:10.155Z - info: backlog: processing 368087 items, 500ms per item
Although I couldn't find it in the docs, from my testing it appears that the since parameter is ignored when descending=true so the current backlog processor is always going to process all docs from now to the start of time.
Apparently @SCdF hit this earlier today so I'll try and catch him online tonight but here are my findings so far...
last_valid_seq till 0, while processing any new changes that come in from last_valid_seq.since param doesn't work with descending, and there's no parameter for the end of the sequence, the backlog processing is actually working from now till 0.last_valid_seq is fetched from the last_valid_seq view which is only updated when the transition is enabled and actually makes a modification to the doc. This means when all transitions are disabled (eg: a 2.x instance) the last_valid_seq is never updated.Some things we can do to improve this...
last_valid_seq whether or not the transition applies to the change. This will fix the case where disabled transitions cause the same docs to be checked every start. This will cause excessive doc updates and a high chance of conflicts, so we should consider storing the last_valid_seq in a separate doc, either as one massive doc or many small ones. Small ones probably scale better... last_valid_seq || 0.Some more analysis...
filter method which none of those transitions have.hasRun first, and the transition status is recorded when an error occurs. So even if the doc is updated the transition won't run again unless it's manually edited to remove the transition state. I wonder if the hasRun function should be changed to hasRunSuccessfully because the doc may have been changed in a way that means the transition succeeds.twilio_message transition should keep processing pending messages until they succeed. Currently it's only getting run on change and on backlog processing which is unreliable, so it won't handle situations where the server lacks internet access.PR at https://github.com/medic/medic-sentinel/pull/81
I've replaced the broken last_valid_seq view with a doc with id sentinel-meta-data that records the last seq the backlog processor got to. This starts at 0 so existing sites will start processing again but transitions are designed to be able to be run multiple times so other than a slight performance hit it should be fine.
I'll create issues for the other bugs listed above as this is the only blocker.
Batch processing a few hundred docs at a time would be nice but probably best to leave it for next time. LGTM
Looks like this has broken transitions? If I:
curl -i -u gateway:pass --data-urlencode 'message=P 4 Alice' --data-urlencode 'from=+447765902019' --data-urlencode 'sent_timestamp=1469027270' -X POST http://localhost:5988/api/v1/recordsWith prior codebases, it would generate tasks, and scheduled tasks, as well as fleshing out other parts of the document:

With the changes after this change, it doesn't look like sentinel finds the document that needs processing:

For acceptance testing restart sentinel multiple times and check the log to make sure it's progressing through the backlog not just starting from the start every time.
Most helpful comment
Apparently @SCdF hit this earlier today so I'll try and catch him online tonight but here are my findings so far...
last_valid_seqtill 0, while processing any new changes that come in fromlast_valid_seq.sinceparam doesn't work withdescending, and there's no parameter for the end of the sequence, the backlog processing is actually working fromnowtill 0.last_valid_seqis fetched from thelast_valid_seqview which is only updated when the transition is enabled and actually makes a modification to the doc. This means when all transitions are disabled (eg: a 2.x instance) thelast_valid_seqis never updated.Some things we can do to improve this...
last_valid_seqwhether or not the transition applies to the change. This will fix the case where disabled transitions cause the same docs to be checked every start. This will cause excessive doc updates and a high chance of conflicts, so we should consider storing thelast_valid_seqin a separate doc, either as one massive doc or many small ones. Small ones probably scale better...last_valid_seq|| 0.