When pressing LT() and another key in quick succession, the triggered layer can get stuck "on" until the next keypress.
Using git bisect I am fairly certain I isolated the problem to this commit https://github.com/qmk/qmk_firmware/commit/d0fb7005d51be7c876e63e87778d080c8a733a13.
Reproducing can be tricky since it seems to only be triggered within a very small window of time. I've attached a video demonstration. This was done with minimal changes to the default ergodox ez layout- I replaced the two large thumb keys with LT(SYMB,KC_SPC) (and the top right KC_MINS key with RESET for my convenience).
Then simply make ergodox_ez:default
https://aelius.keybase.pub/qmk%20debounce%20bug.mp4
Paging @alex-ong
@drashna was able to reproduce this, though not easily.
Thank you for your time!
Can confirm that it happens, but it's very hard to trigger for me.
Hey Aelius, thanks;
Are you able to confirm that it doesnt happen when you revert commit d0fb7005d51be7c876e63e87778d080c8a733a13?
Could you explain the video a bit? i'm guessing when the LED lights up, the triggered layer is stuck on?
Are you able to reproduce without LT()?
From a matrix.c point of view, (if its matrix.c/debounce.c's fault), this would mean that pressing key "a" and key "b" quickly results in key "a" getting stuck. In your case "a" and "b" are in different "rows" (ergodox_ez physical columns). Maybe using EK switchhitter, you'll see that "a" stays held down.
My guess is the optimization i made is too eagerly going into the skip processing countdown timers mode
@AeliusSaionji / @drashna Could you try this?
Change
if (*debounce_pointer != DEBOUNCE_ELAPSED) {
if (TIMER_DIFF(current_time, *debounce_pointer, MAX_DEBOUNCE) >= DEBOUNCE) {
*debounce_pointer = DEBOUNCE_ELAPSED;
} else {
counters_need_update = true;
}
}
to
if (*debounce_pointer != DEBOUNCE_ELAPSED) {
counters_need_update = true;
if (TIMER_DIFF(current_time, *debounce_pointer, MAX_DEBOUNCE) >= DEBOUNCE) {
*debounce_pointer = DEBOUNCE_ELAPSED;
}
}
This should make it less eager to stop debouncing.
eg:
diff --git a/quantum/debounce/eager_pr.c b/quantum/debounce/eager_pr.c
index 5b460f663..269738e15 100644
--- a/quantum/debounce/eager_pr.c
+++ b/quantum/debounce/eager_pr.c
@@ -63,11 +63,10 @@ void update_debounce_counters(uint8_t num_rows, uint8_t current_time) {
debounce_counter_t *debounce_pointer = debounce_counters;
for (uint8_t row = 0; row < num_rows; row++) {
if (*debounce_pointer != DEBOUNCE_ELAPSED) {
- if (TIMER_DIFF(current_time, *debounce_pointer, MAX_DEBOUNCE) >= DEBOUNCE) {
- *debounce_pointer = DEBOUNCE_ELAPSED;
- } else {
counters_need_update = true;
- }
+ if (TIMER_DIFF(current_time, *debounce_pointer, MAX_DEBOUNCE) >= DEBOUNCE) {
+ *debounce_pointer = DEBOUNCE_ELAPSED;
+ }
}
debounce_pointer++;
}
I think that should work. But yeah, applying now, and will let you know.
Nope, can still trigger it.
Looking at the matrix .... it looks like the issue is that the two keys are on the same row, actually.
Specifically, the thumb cluster keys. k5A and k2A are the ones that are most reliable for triggering this for me.
And confirmed that it only happens with eager per row debouncing.
Right, i didn't realise that the keys were on the same wired row.
That probably makes sense since they are sharing a counter. I remember when i originally mapped it out in my brain it would still work, but i guess not. Could you use share some ek-switch hitter logs with some annotations of what you actually pressed/released?
Are you able to confirm that it doesnt happen when you revert commit d0fb700?
Yep! Last known good commit: ffc82ebdb2ee00c14dd225eb057d209d4584a623.
Could you explain the video a bit? i'm guessing when the LED lights up, the triggered layer is stuck on?
Sorry, yes you're right. It is me demonstrating the layer getting stuck several times. I also hoped to convey a sense of the timing so that others might be able to replicate it.
Are you able to reproduce without LT()?
Yes- believe it or not I was writing my very first line of code for qmk (I'm new here, hello!) and I assumed the behavior was a problem with what I had written. Took me a while to figure out it wasn't my problem :wink:
In the following code the same bug happens but a different key is held. Instead of SPACE_MOD getting stuck, whatever other key I pressed is stuck instead. That key is then sent to the operating system repeatedly until I tap another key ("555555555555555555555555555555555555"). I have verified this problem does not happen when I try it under the last known good commit ffc82ebdb2ee00c14dd225eb057d209d4584a623.
enum custom_keycodes {
SPACE_MOD = SAFE_RANGE
};
static uint16_t key_check;
bool process_record_user(uint16_t keycode, keyrecord_t *record) {
switch (keycode) {
case SPACE_MOD:
if (record->event.pressed) {
key_check = 1;
layer_on(SYMB);
//should I return here?
} else {
layer_off(SYMB);
if (key_check == 1) {
tap_code(KC_SPC);
}
}
return false;
default:
key_check = 0;
return true;
}
}
Could you try this?
Tried it, didn't solve the issue :(
I'm not using the space mod thing, just the normal LT(1, KC_SPACE) but still triggers the same issue.
this also still happens if I remove all of the expander code, so it's a bug with Eager PR.
As for switch hitter:
37:01.0298 Space (0x20, BIOS 0x39) UP -> 70ms
37:01.0667 Space (0x20, BIOS 0x39) DOWN
37:01.0675 H (0x48, BIOS 0x23) DOWN
37:01.0683 Space (0x20, BIOS 0x39) UP -> 17ms
37:01.0692 H (0x48, BIOS 0x23) UP -> 17ms
37:02.0539 5 (0x35, BIOS 0x06) DOWN
37:02.0732 5 (0x35, BIOS 0x06) UP -> 193ms
37:04.0044 Space (0x20, BIOS 0x39) DOWN
37:04.0065 H (0x48, BIOS 0x23) DOWN
37:04.0090 H (0x48, BIOS 0x23) UP -> 26ms
37:04.0097 Space (0x20, BIOS 0x39) UP -> 53ms
37:04.0547 Space (0x20, BIOS 0x39) DOWN
37:04.0555 H (0x48, BIOS 0x23) DOWN
37:04.0563 H (0x48, BIOS 0x23) UP -> 8ms
37:04.0572 Space (0x20, BIOS 0x39) UP -> 25ms
37:05.0053 5 (0x35, BIOS 0x06) DOWN
37:05.0083 5 (0x35, BIOS 0x06) UP -> 30ms
That last Space is when it happened.
And I can only seem to trigger it when the keys are in the same "electrical" row.
The "space" key was the thumb cluster, LT(1, KC_SPC) (k5A). I was pressing "H"/"5". (K2A).
https://github.com/qmk/qmk_firmware/blob/master/keyboards/ergodox_ez/ergodox_ez.h#L114-L132
Thanks drashna,
Next is to narrow it down to counters_need_update or bool changed, could you test if forcing bool changed to true, or counter_needs_update to true (just one at a time) fixes the issue? Forcing both to true should definitely fix it, since thats the "previous" commit; it might be worth testing that too as a sanity check.
void debounce(matrix_row_t raw[], matrix_row_t cooked[], uint8_t num_rows, bool changed) {
uint8_t current_time = timer_read() % MAX_DEBOUNCE;
if (counters_need_update) {
update_debounce_counters(num_rows, current_time);
}
if (changed) {
transfer_matrix_values(raw, cooked, num_rows, current_time);
}
}
37:04.0572 Space (0x20, BIOS 0x39) UP -> 25ms
_37:04.5~ Space is held down (not reported by EK Switch hitter. Correct)_
37:05.0053 5 (0x35, BIOS 0x06) DOWN (we're in number layer)
37:05.0083 5 (0x35, BIOS 0x06) UP -> 30ms (we're still in number layer)
_37:05.0093~ Space is released (not reported by EK switch hitter, but QMK didn't pick it up. We're still in number layer)?_
Are those italics roughly right (for where you pressed/released space bar)?
Setting counters_need_update doesn't fix it. (still happens).
Setting changed to true seems to fix it.
Eg, I did this:
void debounce(matrix_row_t raw[], matrix_row_t cooked[], uint8_t num_rows, bool changed) {
uint8_t current_time = timer_read() % MAX_DEBOUNCE;
if (counters_need_update) {
update_debounce_counters(num_rows, current_time);
}
if (true) {
transfer_matrix_values(raw, cooked, num_rows, current_time);
}
}
And as for the order, yeah, that looks about right.
Though, I don't think that space should have registered, since it's a LT key
@AeliusSaionji confirmed that my above fix works.
Hmm i'm not sure why the fix works - if there is no change, there would never be a need to transfer_matrix_values... unless my code that determines bool changed on ergodox_ez is wrong :/
Unless this is another weird compilation/timing issue (transfer_matrix_values slows down the scan rate, since we are adding a whoel bunch of operations/scan)
Not sure. But to add to this... I "created" a just right side board for the ergodox, and removed the custom matrix. That still exhibited this behavior. So the issue is the eager_pr code.
As for using it.... I've not had issues using the modified code. I think for now, slowing down the scan rate by always running transfer_matrix_values may be acceptable.
That said, I'm more than willing to test stuff out.
Also, it may be wort adding a define to change the handling....
For instance:
void debounce(matrix_row_t raw[], matrix_row_t cooked[], uint8_t num_rows, bool changed) {
uint8_t current_time = timer_read() % MAX_DEBOUNCE;
if (counters_need_update) {
update_debounce_counters(num_rows, current_time);
}
#ifdef DEBOUNCE_EAGER_PR_SEND_ON_CHANGED // or anything else would be a better name than this
if (changed)
#endif
{
transfer_matrix_values(raw, cooked, num_rows, current_time);
}
}
This way, it can be configured
I "created" a just right side board for the ergodox, and removed the custom matrix.
Just confirming you're using the default qmk matrix.c in this case?
Yeah its definitely a bug in eager_pr, i'm still not certain why eager_pr performs any differently to eager_pk (apart from the whole sharing counters thing; i still haven't rationalized in my head how there could be a difference apart from a DEBOUNCE (5ms) time difference on certain input sequences)
Correct. Custom matrix is turned off, all the pins are defined, and correctly set, using the config as specified in the custom matrix, actually. And verified with switch hitter. You can see the board here, for verification:
https://github.com/drashna/qmk_firmware/tree/aelius/keyboards/handwired/half_dox
As for the difference, I'm not sure. Though for me, it seems to just trigger if both keys are in the same "row".
Also, I think you already posted the corresponding comic for this issue, in discord.
If you wanted to open a PR with the fix, I'll see about getting it merged..
Also, on a (un)related note, I'd love to see more debouncing options, so I can test and break them. :)
(but no hurries on that).
I'd rather figure out a better fix than ignoring changed; i've finally flashed a default matrix.c and reproduced it on my home qmk keyboard. if there is a fix for now, it'd probably be flat out ignoring changed rather than using a #define, since we haven't found a keyboard where the issue isn't present when changed is actually being used.
Okay. Well, hopefully, you find a fix for the issue!
Alright, i've got a working fix - my theory is that we need an extra transfer_matrix_values when debouncing changes from counters_need_update to !counters_need_update, probably due to the whole shared counter thing. This keeps changed in use most of the time and overrides it for this corner case. Not sure if also needed for eager_pk.
will do a formal pull request so you can test
Can we close this?