Currently the HTML Standard seems a bit US-centric:
The autocomplete attribute supports the street-address type for an unstructured street address or address-line1, address-line2 and address-line3 for the respective lines.
In Germany and several other countries, websites typically ask for the street name and a house number. In Spain, websites ask for the floor number and door number (I am not a Spanish speaker, I have seen "Num.", "Piso", "Letra", "Esc." on simyo.es for example). Some Brazilian sites ask for a neighborhood (which we might express with an address-levelX but that might be a bit underspecified because I am not aware of a canonical mapping of entities in specific countries).
Should we start a discussion on defining more entities around addresses? I think that just by supporting a street-name, house-number and apartment-number, we could cover a lot of ground.
cc @mnoorenberghe
Chrome seems to somehow fill it correctly sometimes, see https://i.imgur.com/QRNxfXL.png (segmueller.de).
This is very rare though and probably depends on the IDs given. Usually, I'd say with a 90% probability, the "street" field from forms gets filled with the street name and the house number.
The house number field is then empty and I have to manually edit it, which is unfortunate and could be fixed via the spec.
@annevk / maintainers: Do you need any example websites to show/clarify the issue?
In Japan, we don’t even have street addresses. Instead, the equivalent is a series of numbers that starts with a district number, and then a block number, and then what’s nominally a specific building number.
I say “nominally” because it’s common for several buildings to actually have the same “building number” (due, e.g., to cases where an older, larger building on a piece of land gets demolished and replaced with two or more smaller buildings on the same land).
So in Japan, in addition to a “building number”, it’s often also necessary to specify a building name.
When the autocomplete attribute was being defined, I remember we specifically discussed the Japanese-addressing case — as well as other locales/cases with addressing schemes that don’t use street names or that need something more specific than a street address — and (as far as I recall), the relevant set of tokens now in the spec were decided on because trying to define a richer set of tokens to reflect all the possible cases/locales would have ended up with a set that was just too unwieldy in practice.
@carstenhag I think we mainly need to hear from implementers to what extent they are interested in putting work into this as this would also require UI that's aware that if you change the country the address format changes. (E.g., Firefox only supports a single address field at the moment as far as I can tell.)
Thank you for sharing the historic context. I appreciate that there is a lot of complexity in international addresses that I am not aware of.
I wonder whether we can still come up with a practical solution that addresses the problem that websites put street and house number into separate fields. We have looked at a few hundred websites in Brazil, Spain, France, Germany and India (focused on non-US websites; acknowledging that this is by no means a representative sample) and noticed this challenge quite frequently in all of these countries but India.
Assuming that websites do not change their UI to support the autocomplete attribute, I wonder how we feel about adding special cases for "a lot but not all countries". I would propose that addresses are too complex to cover all countries with all nuances so we can only get closer to "correct" but never reach it 100%.
Another syntactic idea I had was to go with sub-attributes. Something like
address-line1[street-name] and address-line1[house-number]. Would you feel different about this?
This could also be used for https://github.com/whatwg/html/issues/4987 as a format specifier cc-exp[mm/yy] to express an expiration in format "MM/YY".
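To make the bracket idea concrete, here is a minimal sketch of how such a token could be split into a base field and a qualifier. The bracket syntax is only the proposal from this comment, not anything in the spec, and `parse_token` and `AutofillToken` are hypothetical names used for illustration.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical syntax proposed in this thread (not part of the HTML Standard):
# a base token with an optional bracketed qualifier, e.g.
# "address-line1[street-name]" or "cc-exp[mm/yy]".
TOKEN_RE = re.compile(r"^(?P<base>[a-z0-9-]+)(?:\[(?P<qualifier>[^\[\]]+)\])?$")


class AutofillToken(NamedTuple):
    base: str
    qualifier: Optional[str]


def parse_token(token: str) -> AutofillToken:
    """Split a token into its base field and optional bracketed qualifier."""
    match = TOKEN_RE.match(token.strip().lower())
    if match is None:
        raise ValueError(f"not a recognised token shape: {token!r}")
    return AutofillToken(match.group("base"), match.group("qualifier"))
```

A user agent that does not understand the qualifier could, under this sketch, fall back to the base token alone, which is one possible argument for the bracket form.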
@annevk Regarding implementer interest: I am working for Google on Chrome.
@rmondello Do you know anybody who works in this space on WebKit?
cc @whatwg/i18n
FWIW, this may also affect the purpose attribute in APA's Personalization work.
cc @whatwg/a11y per above comment. Does make me wonder how many attributes we need to state the same thing.
@rxaviers and I were chatting about internationalized address formatting. I wonder if the JavaScript Intl API could help by giving data about the required fields for a locale. @sffc
Given the role autocomplete has in WCAG SC 1.3.5: Identify Input Purpose, I am a fan of any new values that are more internationalized and can benefit more users.
From the user perspective, as long as browsers support it with auto-fill values, then all good.
From the dev perspective, to @sideshowbarker's comment above, too many options and we can expect many devs to use the wrong ones, blunting the accessibility benefit.
Postal addresses are famously complicated to internationalize, due to the wide variation in formats between (and sometimes within) countries. In addition, there can be differences in what application authors prefer (in terms of the level of granularity they wish to store/process/validate).
There are several ways this might be addressed. On the one hand, we could try to enumerate all postal address components globally such that page authors can always specify the component they mean--door number, prefecture, administrative unit, floor number, etc. As noted by @xfq and @aardrian this could be of use to assistive technologies. On the other hand, we might try to address only specific problems with additions (at a disadvantage to users in countries that need something different).
A key thing to notice is that the language/locale of the page is not the same thing as the country of an address and the form used to collect a specific address needs to be tied to the country it ships to.
@littledan This has less to do with locale than it does with country/region (although there is some locale influence). I would suggest that postal address parts and their regional association be suggested to CLDR as an addition. Intl could then leverage this.
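As a rough sketch of what "postal address parts and their regional association" could look like as data: the table below is keyed on the region of the address, not on the page's locale. The entries only restate examples given earlier in this thread; the component names are hypothetical, and none of this is CLDR data or a complete description of any country's format.

```python
from typing import Dict, List

# Illustrative only: components that commenters in this thread reported sites
# asking for, keyed by the region of the *address* (not the page language).
# Component names are hypothetical and the table is far from complete.
PARTS_BY_REGION: Dict[str, List[str]] = {
    "DE": ["street-name", "house-number"],
    "ES": ["floor-number", "door-number"],
    "BR": ["neighborhood"],
    "JP": ["district-number", "block-number", "building-number", "building-name"],
}


def parts_for(address_region: str) -> List[str]:
    """Return the narrower parts known for a region, else the broad field."""
    return PARTS_BY_REGION.get(address_region.upper(), ["street-address"])
```

Falling back to the broad `street-address` field for unknown regions matches the spec's existing advice to prefer broader fields.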
I'll bring this up at W3C I18N WG's next telecon, but we're off this week due to IUC43.
Speaking of WCAG's repurposing (perversion) of the attribute... there are certain situations where an input accepts two different types of information (such as "Username or email address"), which I think currently can't be expressed. Would it be a massive complication to allow multiple values like this to be used (in order of preference, perhaps)?
@patrickhlauke see #4445.
Some quick comments here, although it looks like there's a lot of good discussion on the key points already...
Another syntactic idea I had was to go with sub-attributes. Something like
address-line1[street-name] and address-line1[house-number]. Would you feel different about this?
FWIW I would prefer address-line1-street-name and address-line1-house-number.
From the user perspective, as long as browsers support it with auto-fill values, then all good.
Indeed. I think that's the key point, is whether there are actors in the ecosystem who would actually leverage this. It sounds like Chrome might take the steps @annevk describes (i.e., change their "user information" UI to have separate street-name/house-number fields when you're in the given country, so that it can successfully autofill later).
Given the precedent in https://github.com/whatwg/html/issues/3745#issuecomment-487088177, we allow fairly liberal implementer support signals for expanding the autofill vocabulary. So I think the main goal of the discussion here is to gather input from everyone (as this thread has been doing) to make sure the design is reasonable, even if only one browser has immediate plans to do the UI work necessary.
I think this also speaks to the question of how much work we want to do in creating more autofill tokens for more classes of addresses, e.g. Japanese addresses. The answer, IMO, is as much work as the ecosystem wants to put in. Not just browsers, either. E.g. if extensions, or AT vendors, or similar communicated that they could serve a good number of users by introducing such new autofill tokens, then I think the spec should be a reasonable clearinghouse for coordinating those efforts, and having design discussions among multiple parties to shake out any cross-cutting issues.
Please excuse my rambling…
Do we have any evidence that the sites that currently use separate fields for these values actually require them to be separate fields? Or is it just a local preference?
@carstenhag I think we mainly need to hear from implementers to what extent they are interested in putting work into this as this would also require UI that's aware that if you change the country the address format changes. (E.g., Firefox only supports a single address field at the moment as far as I can tell.)
We show a single address-line textarea in preferences but can split it into up to 3 lines when filling. Likewise for the phone number field, we show one box but can split into many different combinations of <input> when necessary due to all the different tel-* tokens (we don't support all of them).
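For illustration, a minimal sketch of splitting one stored multi-line street address across up to three `address-lineN` fields. This is not Firefox's actual implementation; `split_street_address` is a hypothetical helper, and folding overflow lines into the last field is an assumption of this sketch.

```python
from typing import List


def split_street_address(street_address: str, max_lines: int = 3) -> List[str]:
    """Split a stored multi-line street address into exactly max_lines fields."""
    lines = [ln.strip() for ln in street_address.splitlines() if ln.strip()]
    if len(lines) > max_lines:
        # Assumption: fold overflow into the last field so nothing is dropped.
        lines = lines[: max_lines - 1] + [", ".join(lines[max_lines - 1:])]
    # Pad with empty strings so every address-lineN field gets a value.
    return lines + [""] * (max_lines - len(lines))
```

Going in this direction is straightforward because line breaks give an unambiguous split point; street name and house number inside a single line have no such delimiter.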
Another syntactic idea I had was to go with sub-attributes. Something like
address-line1[street-name] and address-line1[house-number]. Would you feel different about this?
FWIW I would prefer address-line1-street-name and address-line1-house-number.
I'm guessing the reason for the square bracket syntax was to allow it to work with address-lineN with N from 1 to 3. Would you want to instead add all 6 tokens since I don't think these components are always on line 1?
Indeed. I think that's the key point, is whether there are actors in the ecosystem who would actually leverage this. It sounds like Chrome might take the steps @annevk describes (i.e., change their "user information" UI to have separate street-name/house-number fields when you're in the given country, so that it can successfully autofill later).
The UI change is the easiest part to deal with… the harder part is being able to convert in both directions between address-lineN and the subdivisions being proposed. We dealt with a similar problem with less complexity for the different tel-* tokens and it wasn't nice… I think we still don't support all of the tel-* tokens as a result of the complexity. The reason you need to convert in both directions is because the user could have first saved that address-line as one field but needs to autofill in the separate components (and vice versa). I think it will be very hard for UAs to handle that without creating duplicates but other than somehow pushing sites to move away from these fields (unless of course they need them for shipping calculations or something like that), I don't see how we can nicely solve this problem. I'm skeptical that Intl APIs could even be defined to do the transformations in both directions.
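A small sketch of why the two directions are not symmetric. It assumes a "street name, then house number" order and a deliberately naive trailing-number heuristic; `join_line` and `split_line` are hypothetical helpers, not anything a UA is known to ship.

```python
import re
from typing import Optional, Tuple

# Assumption: the house number is a trailing run of digits with an optional
# letter suffix or range ("12", "12a", "12-14"), preceded by the street name.
TRAILING_NUMBER_RE = re.compile(
    r"^(?P<street>.+?)\s+(?P<number>\d+[a-zA-Z]?(?:[-/]\d+[a-zA-Z]?)?)$"
)


def join_line(street_name: str, house_number: str) -> str:
    """Components to a single line: easy once the ordering is fixed."""
    return f"{street_name} {house_number}"


def split_line(address_line: str) -> Optional[Tuple[str, str]]:
    """Single line to components: returns None when the heuristic cannot tell."""
    match = TRAILING_NUMBER_RE.match(address_line.strip())
    if match is None:
        return None
    return match.group("street"), match.group("number")
```

A line saved in "number first" order, such as `12 Example Road`, does not match this pattern at all, so the same profile entry can fill one form and fail on another. That is the duplicate-creating case described above, and a real implementation would need per-region rules rather than one regex.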
My main concern with this proposal is that it could make the use of these narrower fields more popular in the future even if not all UAs can handle them properly. Can we add them to the spec but mark it deprecated from the beginning? :P That would be similar to how we have https://compat.spec.whatwg.org/ where we standardize the web as it is, not as we want it (I realize this isn't a perfect analogy since sites probably aren't using these specific tokens now).
The spec already has the following text which is relevant but I'm not sure most people would notice it:
Generally, authors are encouraged to use the broader fields rather than the narrower fields, as the narrower fields tend to expose Western biases. For example, while it is common in some Western cultures to have a given name and a family name, in that order (and thus often referred to as a first name and a surname), many cultures put the family name first and the given name second, and many others simply have one name (a mononym). Having a single field is therefore more flexible.
Maybe we should expand on that and explicitly annotate which autocomplete tokens should be avoided in favour of broader ones? Also, in this case I don't think "Western biases" is applicable.
I'm very interested in hearing how other UAs plan to handle the bidirectional data transformation issue… I'm also interested in hearing arguments for why we should codify this pattern rather than leave it up to UA heuristics to figure out (which is the status quo). Why do we want to pave this cow path rather than encourage change?
P.S. I wonder if authors ever handle this by listening for insertReplacementText input events on the address-lineN field and moving the appropriate sub-components to their own fields…
What is the status of this ticket?
We as a company use autofill options for our forms. We use the postcode combined with the house number (and an optional house number extension) to autocomplete the street and city data. This is for Dutch websites. The autocomplete service we use is https://pro6pp.nl/en.
I would like separate attributes for the house number and the house number extension.