Introduce amp-form
attribute initialize-from-url
to populate form inputs
on page load from URL query parameter values.
Feature request (#24533) and work-in-progress pull request (#24671) already exist, but it was suggested this is significant enough of a change to warrant an I2I.
(none)
When a user navigates to a search results URL like /search?q=my+search
, it is typically expected that the search text input on the page is populated with the search term, in this case my search
. It is already possible to populate an <amp-list>
via query parameters thanks to AMP variable substitution with QUERY_PARAM
, so it would be great if AMP allowed setting default values of form inputs by checking the URL. For example, if new <amp-form>
attribute initialize-from-url
existed, then inputs under the form could have their default value set from the URL GET parameters.
It is possible output each form input using amp-list
. The value
attributes of inputs can then be set with mustache, e.g. <input name="search" type="text" value="{{search}}">
. This requires a special endpoint created to echo back query parameters. Example: https://ampsupport.wompmobile.com/echoquery/dictionaryformat?q=QUERY_PARAM(q)
. This has obvious disadvantages:
placeholder
and fallback
in amp-list
cannot contain the same input or else the form risks having multiple inputs with the same name
attribute (they are all in the DOM even if they are not all visible)Minimal example with <amp-list>
workaround incorporated:
android.cmphys.com/temp/amp-form-ex.html?q=my+search
Worth noting, the reason the inputs are initialized to my+search
is due to bug #24534 which has PR #24626 waiting for review.
/cc @ampproject/wg-approvers @caroqliu
Does applying the URL values trigger change events on the inputs that can be reacted to via amp-bind/amp-script?
@cramforce Yes. See PR #24671 . If any form input is updated the following event is fired:
const formChangeEvent = createCustomEvent(this.win_, AmpEvents.FORM_VALUE_CHANGE, null, {bubbles: true});
targetChange.dispatchEvent(formChangeEvent);
One of the TODO items on that PR is "Verify it is sufficient to fire one amp form update event at the end of form initialization instead of one per input updated".
I'm still not sure that this change is significant enough to warrant a full design document, but I'm at least familiarizing myself with the expectations and can write one if it would help answer questions like these.
I'm worried that this leads to jumping on page load.
CC @choumx @dvoytenko
Yea, changing input values is ok as long as they don't affect size or position. But anything that can cause content jumping must not be allowed.
This is also probably the wrong usage of FORM_VALUE_CHANGE
, which is currently used to indicate form "dirtiness" AKA modified by user. A dynamic default value is not dirty.
Re: the amp-list
alternative, we can avoid the network call if we support rendering from local data. This would also address your other FR #24832.
Something like:
<amp-state data-amp-replace="QUERY_PARAM" id=local_data>
{
"foo": "QUERY_PARAM(q)"
}
</amp-state>
<amp-list src="amp://local_data">
<template type=amp-mustache>
<input value="{{foo}}">
</template>
</amp-list>
Probably needs more thought to discourage excessive client-side rendering.
Just saw your PR. A targeted, form-specific feature seems simpler and reasonable too.
Though we need to watch out for content jumping due to CSS selectors like :checked
.
Thanks for your input @choumx .
Interesting, I intentionally used FORM_VALUE_CHANGE
because I do consider the form dirty once inputs are modified from their default values -- default being the values hard-coded in the HTML.
I appreciate the idea in your <amp-list>
workaround, but it's probably still not optimal. You cannot include a placeholder
with inputs of the same name
, so you will certainly see missing content while the <amp-list>
component loads and runs.
cc @ampproject/wg-ui-and-a11y and @caroqliu
@ampproject/wg-caching What are the considerations regarding query params leaking to forms? Is this a security concern here?
@ampproject/wg-caching What are the considerations regarding query params leaking to forms? Is this a security concern here?
Interesting, I don't think I'm knowledgeable enough to answer that fully.
Do components have access to the URL already? IOW, does this change give components access to data it doesn't already have?
It seems from the design review that there is concern that it should be given a security review.
Do components have access to the URL already? IOW, does this change give components access to data it doesn't already have?
@honeybadgerdontcare Others may be able to provide additional details, but I believe this summarizes the discussion from the design review:
Components have had access to certain parameters in the URL for quite a while thanks to variable substitutions: QUERY_PARAM(abc)
amp-list
, amp-pixel
, and other components can forward query parameters to another server without user interaction on page loadinput[type=hidden][data-amp-replace]
in forms requires form submission to send param values to another serverWhere a security review could be warranted is that the suggested implementation <form data-initialize-from-url>
no longer individually identifies query parameters that can be extracted from the URL and sent with a form submission. Rather, all supported fields (<select>
, <textarea>
, and most <input>
s that can be set with .value
) under the form that have an input with name
attribute matching a URL parameter could be pre-populated.
@mattwomple could we require a specific prefix for these query parameters, e.g. amp-form-
, for substitution to be done to their corresponding fields? It's unlikely that such a prefix would be used elsewhere as query parameters and that the parameter is intentionally being used for this purpose.
could we require a specific prefix for these query parameters?
@honeybadgerdontcare I suppose I'm not sure who this benefits.
data-initialize-from-url
, they are already making the statement that each field within the form is fair game for populating via URL. If anything, I'd suggest we introduce the opposite flag for individual fields: data-omit-from-url-initialization
. It seems to me that a blacklist would be less cumbersome to implement than a whitelist (if needed at all).Perhaps I misunderstand the implementation here. I was under the impression that there may be URL query parameters (on the original document) that shouldn't be accessible to components. For example a unique id for analytics but shouldn't be shared to form fields that could be sent to another server. By using an explicit prefix then those query parameters are stating they know they may be accessed by components such as amp-form
and it's fields and potentially shared to another URL.
Ah, sure, good example. Then yes, either whitelisting or blacklisting individual fields would be worthwhile!
One question I had: Should this be turned on on the AMP Cache at all?
One question I had: Should this be turned on on the AMP Cache at all?
It's reasonable to suggest that this feature doesn't need to work when pages are served from the AMP cache. No publisher is going to want to share a link like https://www.bing.com/amp/s/example.com/search?q=my+search
over something from their own domain. The only use case I can see for this feature on the AMP cache is that a page is designed to submit a form to itself (relative action
), but since the AMP cache rewrites relative actions to be absolute, this is a no-go anyways. I'm fine with limiting this feature to publisher domains.
Follow-up to Design Review 2019-10-23 (#24591) and other comments
extensions/amp-form/0.1/amp-form.js
the right place to handle form initialization if attribute data-initialize-from-url
is present?amp-form
; it is not an AMP custom element.action
target of a GET form submission. For example, if page /search
includes a GET form that submits to /search/results
, then that form submission will append query parameters, not a fragment, to the action
.<input type=checkbox>
and <input type=radio>
checked-state can be detected with CSS :checked
. What about CSS selectors [value=]
, :empty
, :valid
, and :invalid
for other form fields?.value
on supported form fields does not alter those nodes' attributes or children, with the exception of <input type="hidden">
. So, :empty
CSS selectors will not match any initialized fields and [value=]
could only match <input type="hidden">
. :valid
and :invalid
selectors _do_ match fields modified with .value
, so page content could be altered by specially designed CSS on page load. Given that pseudo selectors :valid
and :invalid
are as powerful as :checked
in writing CSS logic, there seems no reason to omit <input type=checkbox>
and <input type=radio>
from the initialize-from-url feature (if it is approved).amp-list
can already change the page content based on query param with src="/results?q=QUERY_PARAM(q)"
. Further, only crawlers willing to run JavaScript will ever see the distinct content.name
s with amp-form-
so that it is clear they are designed to be initialized from URL? Or could we use per-element opt-in via a new attribute like data-allow-initialization
?The example page included in PR #24671 has been updated to match these design considerations. Direct link to sample page using pr-deploy-bot. Notably, it shows some sample layout hacking on page load that could be (ab)used if this feature is accepted.
I'm not sure how to reconcile these two statements:
a. It is desirable to have search engines index the same page with different query strings.
b. It is not necessary to enable this on the cache.
If you include a link to /results?q=couches
and a search engine indexes that and wants to display the results in a viewer, then we would need this feature enabled on the cache, no?
Is there a particular harm to letting this run on the cache (or even an unparticular "feels a bit fishy")? If so, then I guess we have two options; which one of these is preferred?
a. Write a cache-side transformer to remove this attribute
b. Runtime disables this feature when it senses it is running on a cache
Regarding security implications, I'm not a security expert. But I'm reminded of the "mass assignment" feature in Ruby on Rails, which I believe has been a source of vulnerabilities in the past. Opting into the feature (with the form attribute) is a minimum, but it may be worth exploring ideas for how to make sure this isn't a footgun (e.g. definitely don't modify <input type=hidden>
s). Sorry I don't have more time to think about this; gotta run. Maybe there are some good ideas to borrow from Rails history.
- If you include a link to
/results?q=couches
and a search engine indexes that and wants to display the results in a viewer, then we would need this feature enabled on the cache, no?
Well, remember that desktop computers are still a thing, so that's a reasonable market share that depends on indexing without caring about AMP caches. And if you live in my fantasy world where signed exchanges will – in reasonable time? – have widespread support, then the AMP cache concern becomes moot; the amp page is no longer tied to the search domain.
- Is there a particular harm to letting this run on the cache (or even an unparticular "feels a bit fishy")? If so, then I guess we have two options; which one of these is preferred?
The approach I've taken in PR #24671 is to bail out if a proxy (cache) origin is detected:
https://github.com/ampproject/amphtml/pull/24671/commits/17caa8f79ad074da773997ca0291b450c6bacab6#diff-4f7d7ef94609d4d079c3fde230f0d924R1222
- Regarding security implications, I'm not a security expert. But I'm reminded of the "mass assignment" feature in Ruby on Rails, ... (e.g. definitely don't modify
<input type=hidden>
s)
The stakes are lower here since we're only talking about client-side programming. The only real concern is "can a user submit a form without knowing what they are submitting?" I argue that this is already possible today, and in AMP. Drop an amp-list
inside a form and let it generate <input type="hidden" name="secret" value="{secret}">
via template. All you need to support this amp-list
is an echo endpoint, e.g.
Similarly, there already exists a feature in AMP to populate hidden inputs just before form submission; see amp-form: Variable Substitutions.
Outside of AMP, this same feature can be accomplished with a few lines of JavaScript (error handling omitted for brevity):
let url = new URL(location.href);
for (let name of url.searchParams.keys())
document.querySelector(`[name=${name}]`).value = url.searchParams.get(name);
So, I really don't think we're letting any new dragons loose with this feature request.
- If you include a link to
/results?q=couches
and a search engine indexes that and wants to display the results in a viewer, then we would need this feature enabled on the cache, no?Well, remember that desktop computers are still a thing, so that's a reasonable market share that depends on indexing without caring about AMP caches. And if you live in my fantasy world where signed exchanges will – in reasonable time? – have widespread support, then the AMP cache concern becomes moot; the amp page is no longer tied to the search domain.
Fair enough. I'd still argue it'd be useful to support this on cache so that mobile in-viewer results are displayed the same as on-origin, but if folks are concerned about the safety of running this feature on-cache, I would just make sure we document this limitation.
- Regarding security implications, I'm not a security expert. But I'm reminded of the "mass assignment" feature in Ruby on Rails, ... (e.g. definitely don't modify
<input type=hidden>
s)The stakes are lower here since we're only talking about client-side programming. The only real concern is "can a user submit a form without knowing what they are submitting?" I argue that this is already possible today, and in AMP.
I think your statement of the concern is correct. (In the extreme case, the publisher may style the submit button to look like a link, and the user may not think they are submitting anything.)
Agreed it's already possible (maybe with amp-script, too?), but I think it's worth designing what's easy/hard, since people will mostly do what's easy.
It sounds like you're saying we can defer security countermeasures until the server-side, since that's where the potentially dangerous effect actually takes place. I can see that argument, but also an argument for defense-in-depth when it's possible and doesn't hurt DX too much.
Looking at Rails' response to its own high-profile mass assignment vulnerabilities, I see three things:
Regarding whitelisting/blacklisting, I guess there are several ways to go. I'll throw out my strawman but I'm not tied too strongly to it:
input[type=hidden]
. (What about form fields that are simply visually hidden with display:none
or visibility:hidden
or opacity:0
or z-index:-infinity
?)data-also-initialize
attribute.data-only-initialize
attribute, then don't set any fields that don't have that attribute.(Sorry if I'm retreading old ground! I came to this thread late.)
Thanks @twifkak, all of your comments are appreciated. I believe the field whitelisting performed in PR #24671 satisfies the bulk of your whitelisting-related suggestions: each field must have attribute data-allow-initialization
to be considered for initialization (code link). Do you agree?
I understand this thread and that PR have seen plenty of comments and code changes, so I'll work on adding documentation to the PR soon. Then, everyone can verify the implementation matches the design goals. Thanks for your patience.
Thanks @twifkak, all of your comments are appreciated. I believe the field whitelisting performed in PR #24671 satisfies the bulk of your whitelisting-related suggestions: each field must have attribute
data-allow-initialization
to be considered for initialization (code link). Do you agree?
Ah, yes, I agree. I hadn't read the PR. Thanks!
PR #24671 now includes unit tests and documentation. It has been moved out of draft status. Reviews appreciated! pr-deploy-bot sample
👏
Most helpful comment
Follow-up to Design Review 2019-10-23 (#24591) and other comments
extensions/amp-form/0.1/amp-form.js
the right place to handle form initialization if attributedata-initialize-from-url
is present?amp-form
; it is not an AMP custom element.action
target of a GET form submission. For example, if page/search
includes a GET form that submits to/search/results
, then that form submission will append query parameters, not a fragment, to theaction
.<input type=checkbox>
and<input type=radio>
checked-state can be detected with CSS:checked
. What about CSS selectors[value=]
,:empty
,:valid
, and:invalid
for other form fields?.value
on supported form fields does not alter those nodes' attributes or children, with the exception of<input type="hidden">
. So,:empty
CSS selectors will not match any initialized fields and[value=]
could only match<input type="hidden">
.:valid
and:invalid
selectors _do_ match fields modified with.value
, so page content could be altered by specially designed CSS on page load. Given that pseudo selectors:valid
and:invalid
are as powerful as:checked
in writing CSS logic, there seems no reason to omit<input type=checkbox>
and<input type=radio>
from the initialize-from-url feature (if it is approved).amp-list
can already change the page content based on query param withsrc="/results?q=QUERY_PARAM(q)"
. Further, only crawlers willing to run JavaScript will ever see the distinct content.name
s withamp-form-
so that it is clear they are designed to be initialized from URL? Or could we use per-element opt-in via a new attribute likedata-allow-initialization
?The example page included in PR #24671 has been updated to match these design considerations. Direct link to sample page using pr-deploy-bot. Notably, it shows some sample layout hacking on page load that could be (ab)used if this feature is accepted.