When inserting a block, if the block has input fields then the initial focus is set by default on the first input field, otherwise it's set on the block wrapper.
See focusTabbable() in BlockListBlock:
Note: "input fields" in this context are only:
- <input> elements that support the text selection API
- <textarea> elements
- contenteditable elements

See isTextField()
However, the current implementation assumes the first tabbable element is always an "input field". This is true for the Gutenberg default blocks but it's not guaranteed to be true for custom blocks.
For example, what if the first tabbable is a button element?

Initial focus is set on the "input field" after the button, thus skipping an important part of the UI.
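To make the problem concrete, here is a minimal sketch of the selection logic as described above. It is a simplified model, not the actual Gutenberg code: plain objects stand in for DOM nodes, and `isTextFieldLike`, `pickInitialFocus` and the `onlyTextFields` option are hypothetical names used only for illustration. The list of input types is an assumption based on the types that support the text selection API.

```javascript
// Assumption: input types that support the text selection API.
const TEXT_INPUT_TYPES = new Set( [ 'text', 'search', 'url', 'tel', 'password' ] );

// Simplified stand-in for isTextField(); `el` is a plain object, not a DOM node.
function isTextFieldLike( el ) {
	if ( el.contentEditable === true ) {
		return true;
	}
	if ( el.tag === 'textarea' ) {
		return true;
	}
	return el.tag === 'input' && TEXT_INPUT_TYPES.has( el.type || 'text' );
}

// Models the described behaviour: keep only text fields, take the first one,
// and fall back to the block wrapper when nothing matches.
function pickInitialFocus( wrapper, tabbables, { onlyTextFields = true } = {} ) {
	const candidates = onlyTextFields
		? tabbables.filter( isTextFieldLike )
		: tabbables;
	return candidates.length > 0 ? candidates[ 0 ] : wrapper;
}

// A custom block whose first tabbable element is a button.
const wrapper = { tag: 'div', id: 'wrapper' };
const tabbables = [
	{ tag: 'button', id: 'upload' },
	{ tag: 'input', type: 'text', id: 'caption' },
];

// With the filter, the button is skipped and the text input is chosen.
console.log( pickInitialFocus( wrapper, tabbables ).id );
// Without the filter, the first tabbable element (the button) is chosen.
console.log( pickInitialFocus( wrapper, tabbables, { onlyTextFields: false } ).id );
```

In this model, dropping the filter is enough to land on the button, which matches the expectation for a simple, non-nested block.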
Same if the first tabbable element is, for example, one of:
isTextField()
In all these cases, initial focus will be set on the wrong element.
I'd suggest removing the filtering by isTextField() to start with. I don't see a reason for it to be there, unless I'm missing something.
Also to consider: a way to give developers more control on initial focus. This could certainly be addressed separately.
In short: removing the line .filter( isTextField ) would be nice 🙂
I tested a bit with nested blocks, and it seems to me things are not so simple.
When there are inner blocks, the logic should be adjusted, as right now inner blocks are:
Imagine a block based on inner blocks. The block adds by default two editable fields, for example:

Because of 1, removing .filter( isTextField ) will focus the inner blocks container, which is the first focusable thing within the block. Instead, we'd want the first focusable form control / contenteditable to be focused.
Also, when adding a _second_ Q/A pair, I see focus go to the answer. Expected: focus goes to the _first_ form field / editable, i.e. the question.
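The expectation above can be sketched as follows. This is only an illustration of the desired selection rule, not a proposed patch: plain objects stand in for DOM nodes, and `isFormControlOrEditable`, `firstControl` and the element ids are hypothetical. It also assumes the tabbables are already listed in document order.

```javascript
// Assumption: tags treated as form controls for the purpose of this sketch.
const FORM_CONTROL_TAGS = new Set( [ 'input', 'textarea', 'select', 'button' ] );

// True for form controls and contenteditable elements, false for
// focusable wrappers such as an inner blocks container.
function isFormControlOrEditable( el ) {
	return FORM_CONTROL_TAGS.has( el.tag ) || el.contentEditable === true;
}

// Pick the first form control / contenteditable, skipping focusable
// containers; fall back to the given element when none is found.
function firstControl( tabbables, fallback ) {
	const match = tabbables.find( isFormControlOrEditable );
	return match || fallback;
}

// A Q/A block built from inner blocks, in document order.
const parentWrapper = { tag: 'div', id: 'qa-block' };
const inOrder = [
	{ tag: 'div', id: 'inner-blocks-container' },
	{ tag: 'p', id: 'question', contentEditable: true },
	{ tag: 'p', id: 'answer', contentEditable: true },
];

// Simply taking the first tabbable would select the container.
console.log( inOrder[ 0 ].id );
// Skipping containers selects the question instead.
console.log( firstControl( inOrder, parentWrapper ).id );
```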
For the inner blocks, see the original change in #10545
Fix #9212 which is causing the inner blocks to be selected when we select the parent.
Looking a bit more into this, it seems the case of nested blocks is not so simple.
Basically, when inserting a block that initially also inserts some nested blocks, focusTabbable() runs when both the parent block and the child block mount.
This leads to unexpected results that actually depend on which elements are initially inside the parent and the child. For example, when inserting a block that has some default inner blocks:
Example 1:
Example 2:
Also, we'd need to remove .filter( isTextField ) to get all the tabbables, but this would also take into account the entire UI of the children.
The desired behavior would be:
Would be great if that could be improved.
I have a custom block that uses multiple TextControl fields, followed by an InnerBlocks component containing three blocks that each contain InnerBlocks again. On inserting the block, the focus jumps to the first block inside the last of the three inner blocks, so the user needs to scroll up to see the top of the block (it is an address block that should give the user the option to add multiple website URLs, email addresses and phone numbers).