Html: Programmatically setting focus navigation start point?

Created on 2 Mar 2020 · 28 Comments · Source: whatwg/html

For a while, @robdodson and I have been noting use cases where setting the focus navigation start point, as opposed to the active element, would be preferable.

One motivating example is skip links - currently the technique may require setting focus to a non-interactive element purely to effectively set the focus navigation start point.

Another example is focus management for transient UI - for example, a side menu which appears as a result of a keyboard shortcut. It may not be appropriate to focus any specific element in the newly visible UI, but the user should be able to easily move focus within that UI, even if the UI is not modal.

Strawman API proposal:

el.focus({ setActiveElement: false });

This would unfocus the current active element, and set the focus navigation start point to el.

Additional naming proposals from @othermaciej in https://github.com/whatwg/html/issues/5326#issuecomment-601899518:

element.setSequentialFocusStartPoint();
element.startSequentialFocusHere();
element.setFocusStartPoint();
document.setFocusStartPoint(element); // maybe more obvious that this clears focus,
                                      // & if there was a getter it'd be on document

More name suggestions from @muan in https://github.com/whatwg/html/issues/5326#issuecomment-603541518

document.setSequentialFocusStartingPoint(element);
document.setTabFocusStartingPoint(element);
document.setSequentialNavigationStartingPoint(element);
document.setNavigationStartingPoint(element);

Another example from @muan in https://github.com/whatwg/html/issues/5326#issuecomment-598971825:

I've been working on adding focus management to the file list on GitHub's repository page. Currently, when a user navigates through the file directories with a keyboard, each navigation drops the user's focus back to the top of the page.

The most basic solution would be to set the file table as the start point. However, with the current recommended pattern, we have to go through 5 steps:

  1. Mark the file table as the start point
  2. Set tabindex="-1" on the file table
  3. Call focus() on the file table
  4. Ensure the focus outline does not apply to this element
  5. Install a one-time blur handler to remove the tabindex
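
For illustration, the workaround steps listed above might look roughly like the sketch below. The function name `moveStartPointWithTabindexHack` is made up for this example, and it assumes `el` is a non-interactive container (such as a div or table) rather than a native control.

```javascript
// Sketch of today's workaround: temporarily make `el` focusable, focus it,
// hide its outline, and undo everything once focus moves elsewhere.
// Assumes `el` is a non-interactive container, not a native control.
function moveStartPointWithTabindexHack(el) {
  // If the element already carries a tabindex, focus it and leave it untouched.
  if (el.hasAttribute('tabindex')) {
    el.focus();
    return;
  }
  const previousOutline = el.style.outline;
  el.setAttribute('tabindex', '-1'); // make it programmatically focusable
  el.style.outline = 'none';         // suppress the focus ring on the container
  el.addEventListener('blur', () => {
    // One-time cleanup when focus leaves the container.
    el.removeAttribute('tabindex');
    el.style.outline = previousOutline;
  }, { once: true });
  el.focus();
}
```

Even in this small form, the helper has to special-case elements that already have a tabindex, and the inline outline override does not help when the operating system forces focus indication.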

With this new API, it'll be:

  1. Call table.focus({ setActiveElement: false })

This one-line solution would immediately make the experience 10 times better.

accessibility addition/proposal editing focus forms

All 28 comments

cc @whatwg/a11y

Can you describe the negative impacts of setting focus to a non-interactive element? I kind of was under the impression that's why almost everything is programmatically focusable, is so as to accomplish the use cases you describe here. But I guess something is not good enough about the existing solution?

I worry about having a separate active element and focus navigation starting point, which seems like it would be confusing. So I think it's worth getting a sense of what's wrong with the current coupling of the two so we can evaluate the tradeoff.

for clarity, what is the focus navigation start point? the equivalent of the reading position if you were using a screen reader? is it something internal to the browser?

I kind of was under the impression that's why almost everything is programmatically focusable, is so as to accomplish the use cases you describe here.

Most things are not programmatically focusable, unless you add tabindex=-1, so I'm not sure what you mean here.

This would work on _any_ element, without requiring an opt-in on the element.

Can you describe the negative impacts of setting focus to a non-interactive element?

It can trigger confusing focus styling/indication - both from :focus but additionally in some situations focus indication cannot be opted out of (such as when an operating system preference to always show focus is enabled).

Focus indication is only helpful (IMO) when the focused element has some kind of keyboard interactivity.

I worry about having a separate active element and focus navigation starting point, which seems like it would be confusing.

Apologies, I was unclear about my intent there: my intent was that it would still blur the active element, so document.activeElement would be body.

Most things are not programmatically focusable, unless you add tabindex=-1, so I'm not sure what you mean here.

Right, I meant if you add that. So we're comparing this new primitive vs. using the existing focus primitive on a tabindex=-1 element.

It can trigger confusing focus styling/indication - both from :focus but additionally in some situations focus indication cannot be opted out of (such as when an operating system preference to always show focus is enabled).

:focus-visible is meant to handle this, right?

Focus indication is only helpful (IMO) when the focused element has some kind of keyboard interactivity.

In the case of skip links (for example), it seems like the keyboard interactivity is "you can use the keyboard to navigate focus within the main content". E.g. pressing Tab takes you to the first focusable piece of main content. Similarly for side menus.

Perhaps I'm not understanding what you mean by keyboard interactivity?

Apologies, I was unclear about my intent there: my intent was that it would still blur the active element, so document.activeElement would be body.

I see. This does seem to reduce the potential confusion by making it similar to how browsers seem to behave in the existing case, e.g. of clicking on a non-click-focusable element. So, this complexity does already exist in the platform.

The question is whether it's worth exposing this complexity to JavaScript, and thus letting authors trigger it instead of users. I can see how when used for good, it probably aligns with user expectations. But it feels a bit like we're solving the problem of "people are using :focus instead of the new :focus-visible technology" by introducing yet another new technology and hoping they'll use that instead?

Ultimately what I'm not quite grasping is why it's better for users to have the body focused than to have the container element focused, in these cases. This is likely just a matter of me not having enough experience in these areas, so please take all of this in the spirit of me trying to strengthen your case, and not just being skeptical or resistant to change.

Perhaps I'm not understanding what you mean by keyboard interactivity?

As in an interactive control that can be/is operated via keyboard...

Setting aside whether this functionality is useful or necessary, I think a way to set focus navigation start point without actually focusing anything should be a new method, not a new parameter to focus. The proposal overloads focus beyond its plain meaning. Often in such cases a new method is better. Furthermore, whole new methods are easier to feature test for, and it seems likely this functionality would need fallback when not supported.
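
As a rough illustration of the feature-testing point, a dedicated method could be detected with a plain `typeof` check. The method name `setFocusStartPoint` is only one of the names floated in this thread and does not exist in any browser; `setStartPoint` and its `fallback` parameter are likewise made up for this sketch.

```javascript
// Sketch: use a (hypothetical) document.setFocusStartPoint when present,
// otherwise hand the element to a caller-supplied fallback.
function setStartPoint(doc, el, fallback) {
  if (typeof doc.setFocusStartPoint === 'function') {
    doc.setFocusStartPoint(el); // hypothetical API, not shipped anywhere
    return 'native';
  }
  fallback(el); // e.g. the tabindex="-1" + focus() workaround
  return 'fallback';
}
```

In a page this might be called as `setStartPoint(document, table, (el) => { el.setAttribute('tabindex', '-1'); el.focus(); })`.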

@othermaciej

I think a way to set focus navigation start point without actually focusing anything should be a new method, not a new parameter to focus. The proposal overloads focus beyond its plain meaning.

The reason I proposed overloading focus() is that initially my thought was that focus() on an unfocusable element should have this behaviour, but that would break any code that assumes such a call is a complete no-op.

I take your point about overloading focus beyond its current meaning, but in practice focus() is already being used this way when focus is sent to a non-interactive element in order to simulate this exact behaviour.

Adding a new method is definitely a possibility, although it risks being stuck forever in a naming bikeshed, for what I see as debatable value.

@domenic

It can trigger confusing focus styling/indication - both from :focus but additionally in some situations focus indication cannot be opted out of (such as when an operating system preference to always show focus is enabled).

:focus-visible is meant to handle this, right?

  1. Not in the case where the operating system preference to always show focus is enabled
  2. :focus-visible heuristics can be "fooled" by using keyboard shortcuts to show modal UI:
    Suppose you use a keyboard shortcut in a page to show a piece of modal UI, and focus is moved to a container div in order to simulate moving focus navigation start point to that div. Should :focus-visible match, or not? How could you know reliably?

... keyboard interactivity is "you can use the keyboard to navigate focus within the main content". E.g. pressing Tab takes you to the first focusable piece of main content. Similarly for side menus.

I agree with @bkardell's framing. Meaningful keyboard interactivity in this case is, approximately, "handles keyboard events" (modulo event delegation).

@othermaciej Apologies for not addressing this in my comment above:

seems likely this functionality would need fallback when not supported.

This is a good point.

The fallback, I believe, would default to either focusing the element (if it is focusable), or a no-op (if it is not).

Since the latter is what we want to avoid, we would need to seriously consider how to address that if we didn't add a new method.
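
If the option-bag route were taken, support could in principle be probed the same way authors already probe the real `preventScroll` option: pass an object whose getter records whether the browser read it. This is only a sketch; `supportsFocusOption` is a made-up helper, and `setActiveElement` remains a proposal rather than a shipped option.

```javascript
// Sketch: detect whether focus() reads a given member of its options bag.
// Pass a detached scratch element so the probe call does not move focus.
function supportsFocusOption(scratchEl, optionName) {
  let wasRead = false;
  const probe = {};
  Object.defineProperty(probe, optionName, {
    enumerable: true,
    get() {
      wasRead = true;   // the implementation looked at this option
      return undefined; // behave as if the option was not supplied
    },
  });
  scratchEl.focus(probe);
  return wasRead;
}
```

In a page, `supportsFocusOption(document.createElement('div'), 'setActiveElement')` would be the probe. It works, but it is noticeably more awkward than checking for a new method.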

@alice I find I am confused about the situation. What is the current effect of focus() on a non-focusable element? Is it to move the focus navigation position without setting focus? Or is it a no-op? I understood your two recent comments to say both of those different things, so I'm probably not understanding correctly.

(Or is the distinction that some elements are focusable but non-interactive? And focusing interactive elements has an undesirable side effect? In which case, I think a new method is still a better design.)

@othermaciej Sorry for the confusion, let me clarify the comments which probably read that way:

in practice focus() is already being used this way when focus is sent to a non-interactive element in order to simulate this exact behaviour.

That looks like this:

<!-- non-interactive element with tabindex -->
<div id="container" style="display: none" tabindex="-1">
  <!-- interactive content goes in here -->
</div>

function openModal() {
    container.style.display = "block";
    // focus the container to allow moving focus into interactive content
    container.focus();
}

So focus() on container is being used to get a similar effect to moving the FNSP, but only moves the FNSP in practice because the (non-interactive) element is also being focused.

The fallback, I believe, would default to either focusing the element (if it is focusable), or a no-op (if it is not).

The former case looks like:

<button id="button">Focusable element</button>

// if the setActiveElement option isn't supported, this will focus button as usual
button.focus({ setActiveElement: false });

The latter looks like:

<div id="container">  <!-- no tabindex -->
    <!-- interactive content goes here -->
</div>

// if the setActiveElement option isn't supported, this is a no-op:
// focus stays on the previous activeElement
container.focus({ setActiveElement: false });

Hope that clarifies things!

OK. So part of this is avoiding the need to carefully prepare a non-interactive focusable element, such as by setting tabindex=-1 on a container. Other than that, is there an undesirable side effect from focusing such an element? Does it do something unwanted other than setting the focus navigation position?

@othermaciej

is there an undesirable side effect from focusing such an element?

I address that in earlier comments:

https://github.com/whatwg/html/issues/5326#issuecomment-593748135

It can trigger confusing focus styling/indication - both from :focus but additionally in some situations focus indication cannot be opted out of (such as when an operating system preference to always show focus is enabled).

Focus indication is only helpful (IMO) when the focused element has some kind of keyboard interactivity.

https://github.com/whatwg/html/issues/5326#issuecomment-594067753

:focus-visible is meant to handle this, right?

Not in the case where the operating system preference to always show focus is enabled.

:focus-visible heuristics can be "fooled" by using keyboard shortcuts to show modal UI:
Suppose you use a keyboard shortcut in a page to show a piece of modal UI, and focus is moved to a container div in order to simulate moving focus navigation start point to that div. Should :focus-visible match, or not? How could you know reliably?

wondering naively if all that would be required (but of course, throughout all user agents) is changing the behavior of focus() itself to be more like "if it's a focusable element, move active element and focus navigation start point; if the target isn't focusable (it's not an interactive control, or an arbitrary element blessed with tabindex) just move the focus navigation start point and unfocus the currently active element" ?

i.e. i can't currently think of a situation where i'd need to move focus navigation start point to a focusable element without wanting to also set focus to it (unless i wanted to park the start point "just before" it, but then i'd generally want to target something preceding)
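
To make the suggested behaviour concrete, here is a toy model of it in plain JavaScript. It is not browser code: `proposedFocus`, the `focusable` flag and the returned `{ activeElement, startPoint }` shape are all invented for this sketch, and 'body' stands in for "nothing focused".

```javascript
// Toy model of the suggested focus() semantics, not real browser behaviour.
// `target` is a plain object such as { id: 'main', focusable: false }.
function proposedFocus(target) {
  if (target.focusable) {
    // Focusable target: active element and start point move together.
    return { activeElement: target.id, startPoint: target.id };
  }
  // Non-focusable target: blur the active element, move only the start point.
  return { activeElement: 'body', startPoint: target.id };
}
```

Under this model `proposedFocus({ id: 'main', focusable: false })` leaves the body as the active element while tabbing would continue from 'main'.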

Rob and I thought it might be an issue if existing code expects focus() on an unfocusable element to be a no-op. Hence needing to opt-in with the extra parameter.

Setting tabindex=-1 on the focus target (and removing focus indication CSS) should be all you need if JavaScript is available to call el.focus().

If JavaScript is not available and you want to set a different "start point", a new tag or attribute would be needed and processed by the browser.

For instance, let's say that I have a static site with several deep-linking pages. User starts on the homepage and gets the normal experience (start point is document.body). Then uses a link to navigate to a subpage. With "start point" set, while the referrer is the same origin and location is not the baseUrl, focus begins at the "start point" (I imagine it to be main or whatever "skip to main content" points to).

What this looks like in code could be <meta rel="start-point" content="#main">. There should also be a user setting, "prefers-natural-start-point" that ignores the directive.

Setting tabindex=-1 on the focus target (and removing focus indication CSS) should be all you need if JavaScript is available to call el.focus().

but then the thing has focus, which is distinct from it being where the focus start point is.

i'd have to test, so i may be talking rubbish here, but from memory when a container with tabindex="-1" receives focus, AT will start to read out the entirety of the content, unbroken (or at least it will start to do so), and that behavior is, i seem to remember, different from moving the focus start point/reading cursor somewhere (in that a user in the latter case can still decide at any point to stop, backtrack, etc, which is not the case when the AT was reading out all the content of the thing that has focus).

@AutoSponge

Setting tabindex=-1 on the focus target (and removing focus indication CSS) should be all you need if JavaScript is available to call el.focus().

It's true that this works today, but we would argue that it's kind of a hack that developers have to be taught because there is no standard way of setting the focus start point. Internally, browsers have the ability to move the focus start point and it would be useful to expose this to developers. Otherwise, on a large app, there might be many instances where they need to sprinkle outline: none; throughout their CSS to emulate this effect.

As someone who has found himself setting focus to <div tabindex="-1"/> with outline: none before when building UI components in design systems for clients, I see great benefit in having a specific, non-hacky feature built into HTML for this situation.

Example cases I can think of: modal window, expand/collapse functionality and tabs (if not following ARIA Authoring Practices for focus management).

  • In all of those cases, moving focus to a containing div rather than an interactive element within it, helps to build a more abstract component that needs to know little about its contents. As the person building the component (and the focus management), you don't necessarily know what kinds of contents consumers of your component (e.g. other teams) are going to include, so moving focus to a containing div is usually the safest bet.
  • In all of those cases, as a front-end developer, I would expect pushback from my designer counterparts if there was visible focus indication, because the indicated area can be quite large and making it visible can be confusing for people who don't use their keyboard (Kind of a use case for focus-visible, but I agree with both of @alice's earlier comments on that)

I agree that this would be a very helpful API to have.

Setting tabindex="-1" has too many side effects, as @alice mentioned, and they turn something that should be as simple as "now you start from here" into an annoying "focus exception" where styles don't apply and the element isn't "interactive" in the sense that native _interactive content_ is.

Using the current pattern, developers have to consider the following steps:

  1. Marking an element to be the start point
  2. Setting tabindex="-1" on the element
  3. Calling focus()
  4. Ensuring the focus outline does not apply to this element
  5. Installing a one-time blur handler to remove the tabindex

Making this pattern reusable is very difficult for a large-scale website. Of these steps, I'd say step 1 alone is already a big task and quite a burden to maintain. And for steps 2 to 5, the developer has to handle the conditional cases where the start point is already an interactive element vs. not. This is further complicated by how an "interactive element" isn't always "focusable" (https://github.com/whatwg/html/issues/4464).


For example, I've been working on adding focus management to the file list on GitHub's repository page. Currently, when a user navigates through the file directories with a keyboard, each navigation drops the user's focus back to the top of the page.

The most basic solution would be to set the file table as the start point. However, with the current recommended pattern, we have to go through the 5 steps above. With this new API, it'll be:

  1. Call table.focus({ setActiveElement: false })

This one-line solution would immediately make the experience 10 times better.

If we want to further enhance the experience, we can write more code to track which directory the user came from and is going to, and move focus to the links themselves, but that's immediately a much more complex solution, and it requires developers to know exactly what goes into this piece of UI, so it won't be shareable with other parts of the site. The proposed solution, by contrast, can easily be applied to other places with or without interactive content in them.

I think this convenience is much needed, considering that it is very rare for businesses to put development resources into designing a keyboard- or screen-reader-specific experience.


With regard to using focus() vs. introducing a new method, in my opinion this deserves a new method. Putting aside "overloading focus beyond its plain meaning", as @othermaciej mentioned, this also adds confusing expectations like "should a focus event get fired?" I'd assume not, but didn't I just call focus() on something?

For clarity's sake, I think it'd be best to separate their responsibilities, especially considering focus is already a complex domain.

@muan Thank you so much for this detailed motivating example! Would you mind if I copied parts of it up to the issue description (with a link to your comment)?

Regarding introducing a new method, good point about focus events, I hadn't considered that.

What might a good method name be for the new method?

This also prompts me to think we might want an API to _get_ the focus navigation start point (just like we can get the active element), as well; that might play into the design.

Thank you so much for this detailed motivating example! Would you mind if I copied parts of it up to the issue description (with a link to your comment)?

Not at all. I’m glad it’s helpful.

What might a good method name be for the new method?

Big question 😬. To start with, I hope that it’ll communicate not just setting the start point, but also the fact that focus will be taken away from the current active element. I’ll read through the spec to see if I can think of something that makes sense.

Some possible names that may not be ideal, just for a starting point for discussion:

element.setSequentialFocusStartPoint();
element.startSequentialFocusHere();
element.setFocusStartPoint();
document.setFocusStartPoint(element); // maybe more obvious that this clears focus, and if there was a getter it would be on document

I really like document.setFocusStartPoint(element). It makes it clear that this is a document-level operation and has more to do with the document flow than the element. Though it'd still be nice to be explicit about the type of focus this deals with (yes, this got very long).

document.setSequentialFocusStartingPoint(element);

Or perhaps we can use "tab"?

The name "tab index" comes from the common use of the Tab key to navigate through the focusable elements. The term "tabbing" refers to moving forward through sequentially focusable focusable areas. – tabindex spec

document.setTabFocusStartingPoint(element);

Or perhaps not even mention focus since nothing is getting focus? focus is only what would happen if you start tabbing from there. ("sequential navigation search algorithm")

document.setSequentialNavigationStartingPoint(element);
document.setNavigationStartingPoint(element);

I captured these name suggestions in the top comment.

I think my pick so far is document.setFocusStartPoint(el), although the downside for me of any of these compared to my original proposal (which I agree has downsides of its own) is that it's much less clear that it'll blur the current active element.

Or perhaps we can use "tab"?

I'd shy away from this, as moving the focus start point also has an effect for screen reader users, and they will use cursor/reading keys - not just TAB - from that point on to navigate/read content.

I like the direction this issue is heading, but wanted to call out that it seems like not all browsers have separate state to track the location of a focus navigation starting point.

The canonical example given in the HTML spec of how a user can set a focus navigation starting point is by clicking somewhere.

Clicking somewhere also moves the selection, and in Firefox it looks like the location of selection is what represents the concept of the focus navigation starting point (either that or moving selection also moves the focus navigation starting point such that the two locations cannot be distinguished).

So one candidate for the API you want could be Selection.collapse - seems to work today in Firefox.
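
A minimal sketch of that idea, using the standard `getSelection()` and `Selection.collapse(node, offset)` calls. The helper name `collapseSelectionAt` is made up, and whether collapsing the selection also moves the focus navigation starting point is browser-dependent, as described above.

```javascript
// Sketch: collapse the selection to the start of `node`. Per the comment
// above, this appears to also move the focus navigation starting point in
// Firefox; other browsers may behave differently.
function collapseSelectionAt(win, node) {
  const selection = win.getSelection();
  if (!selection) {
    return false; // getSelection() may return null in some contexts
  }
  selection.collapse(node, 0); // caret at offset 0 inside `node`
  return true;
}
```

In a page this would be called as `collapseSelectionAt(window, table)`. Note that it also discards any text selection the user had made.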

Some other benefits of using selection to represent the focus navigation start point:

  1. Selection also determines where the first active match will be if the user performs a find-on-page operation. Maybe one API to set the user's "point of interest" might be easier to use than an API that sets the independent states that should follow the user's point of interest.
  2. Selection represents its position with a range. Test cases like this one that remove the currently focused element or a user clicking on the text between two links make me think that the API shape should track the focus navigation starting point with something like a range and not an element. This is how it's implemented in Chromium today.

So some questions to consider:

  1. Should there be an API to set a focus navigation starting point which is separate from selection? An alternative is that we standardize selection as the focus navigation starting point.
  2. If we do create a separate API, should repositioning selection be something that updates the focus navigation starting point?
  3. Also, if we create a separate API, should the inputs be a node and offset pair?

Here's a test page if you want to try out the interaction between selection and the focus navigation starting point.
