Html: Add TextTrackCue end time representing end of media

Created on 17 Feb 2020 · 33 Comments · Source: whatwg/html

Use Case

A user wants to display content which is synchronised to a web media object and remains visible from the cue start time until the media finishes playing. To do this, they use a metadata TextTrack and VTTCue.

In the case of live streaming, the end of media time is unknown and there is no value of TextTrackCue.endTime that can represent this.

Some specific use cases include:

  • Video chapter metadata for live streams where the end time is not known in advance, or may be changed during the video (e.g., a sports game running overtime or a newscast that's extended due to breaking news).

  • WebVMT is a metadata format for annotating video with synchronized geolocation information. With a live video stream, a map annotation cue may be created to mark the entry to a particular location, but the time of exit is not known in advance.

Proposal

It is proposed that a TextTrackCue.endTime value of Infinity be used to represent the end of media time. This is a simple extension of the existing HTML standard where media.duration equal to Infinity represents the duration of an unbounded stream.

Example

// Display cue from startTime to end of media
var textTrack = mediaElement.addTextTrack('metadata');
var cue = new VTTCue(startTime, Infinity, 'A cue with unbounded end time');
textTrack.addCue(cue);

Related Issues

cc @rjksmith @eric-carlson


All 33 comments

The specific proposed change here is to replace double endTime in TextTrackCue with unrestricted double (See https://html.spec.whatwg.org/multipage/media.html#dom-texttrackcue-endtime)
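To illustrate why the IDL type matters, here is a simplified model of how the two Web IDL types treat a JavaScript value. The helper names (`toIdlDouble`, `toIdlUnrestrictedDouble`, `accepts`) are hypothetical and exist only for this sketch; browsers do this conversion internally in their bindings.

```javascript
// Simplified model of Web IDL conversion for the two attribute types.
// Helper names are hypothetical, not part of any browser API.
function toIdlDouble(value) {
  const n = Number(value);
  // `double` rejects NaN, +Infinity and -Infinity with a TypeError.
  if (!Number.isFinite(n)) {
    throw new TypeError('The provided double value is non-finite.');
  }
  return n;
}

function toIdlUnrestrictedDouble(value) {
  // `unrestricted double` accepts every Number, including NaN and ±Infinity.
  return Number(value);
}

// Returns true if the conversion succeeds, false if it throws.
function accepts(convert, value) {
  try {
    convert(value);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(accepts(toIdlDouble, Infinity));             // false
console.log(accepts(toIdlUnrestrictedDouble, Infinity)); // true
```

Note that `unrestricted double` on its own also admits NaN and -Infinity, so any narrower restriction would have to come from explicit checks or spec prose.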

cc @whatwg/media

This was discussed on the W3C Media Working Group call, 12 May 2020, notes here.

This may not be necessary. There are workarounds:

  • You can choose a very long duration, longer than the expected streaming event
  • You can update the end time when you know when to end a cue
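The two workarounds above might look something like the following. This is only a sketch: `createOpenEndedCue`, `closeCue` and `FAR_FUTURE_SECONDS` are hypothetical names, and the ten-year placeholder is an arbitrary assumption. The cue constructor is passed in as a parameter so the code also runs outside a browser; in a page it would be `VTTCue`.

```javascript
// Arbitrary finite placeholder (about ten years, in seconds), assumed to be
// longer than any realistic streaming event.
const FAR_FUTURE_SECONDS = 10 * 365 * 24 * 60 * 60;

// Workaround 1: create the cue with a very long, but finite, end time.
function createOpenEndedCue(CueCtor, startTime, text) {
  return new CueCtor(startTime, FAR_FUTURE_SECONDS, text);
}

// Workaround 2: once the real end is known, overwrite the placeholder.
function closeCue(cue, knownEndTime) {
  if (!Number.isFinite(knownEndTime)) {
    throw new RangeError('knownEndTime must be a finite number of seconds');
  }
  cue.endTime = knownEndTime;
  return cue;
}

// Browser usage (assumes a media element in `mediaElement`):
//   const track = mediaElement.addTextTrack('metadata');
//   const cue = createOpenEndedCue(VTTCue, 12, 'Chapter 2');
//   track.addCue(cue);
//   // ...later, when the chapter turns out to end at 95 s:
//   closeCue(cue, 95);
```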

You can choose a very long duration, longer than the expected streaming event

You can't get longer than Infinity!

Sure, but Infinity isn't currently covered in the spec whereas a long
duration of, e.g. years into the future is. So no change to the spec would
be required.

On Fri, Jun 12, 2020, 9:07 PM Nigel Megitt notifications@github.com wrote:

You can choose a very long duration, longer than the expected streaming
event

You can't get longer than Infinity!


@silviapfeiffer I agree that there's a workaround, which is fine when this is an edge case. However, displaying a cue in perpetuity is a common use case for WebVMT, which escalates the requirement.

WebVMT cue design extends the WebVTT definition to allow cue end time to be optional, i.e. there is no cue end, which is particularly important for capture and live streaming.

As the current WHATWG spec already supports media.duration = Infinity for unbounded streams, the proposed solution is to extend this to also cover cue end time, which is a minimal change and acknowledges the requirement.

Another use case is:

  • Broadcasters wishing to distribute media via web channels could use a perpetual cue to display channel branding over video, akin to YouTube.

Harmonising with the media spec makes a lot of sense to me. I wasn't aware
Infinity had already been added there.

On Sun, Jun 14, 2020 at 1:03 AM Rob Smith notifications@github.com wrote:

@silviapfeiffer https://github.com/silviapfeiffer I agree that there's
a workaround, which is fine when this is an edge case. However, displaying
a cue in perpetuity is a common use case for WebVMT, which escalates the
requirement.

WebVMT cue design extends the WebVTT definition to allow cue end time to
be optional
https://w3c.github.io/sdw/proposals/geotagging/webvmt/#webvmtcues, i.e.
there is no cue end, which is particularly important for capture and live
streaming.

As the current WHATWG spec already supports media.duration = Infinity for
unbounded streams
https://html.spec.whatwg.org/multipage/media.html#offsets-into-the-media-resource,
the proposed solution is to extend this to also cover cue end time, which
is a minimal change and acknowledges the requirement.

Another use case is:

  • Broadcasters wishing to distribute media via web channels could use
    a perpetual cue to display channel branding over video, akin to YouTube.


As just discussed on the call, some scenarios that I think we need a defined behaviour for:

  • end time = Infinity, media end time = unknown. Cue end handler never gets called, right?
  • end time = Infinity, media end time = known and in the future. Does the cue end handler get called?
  • end time = Infinity, media end time = known and now. Does the cue end handler get called?
  • end time = Infinity, media end time = known and in the past. If the cue end handler has not yet been called, does it get called?
  • end time is updated from Infinity to a value later than the current media time. Cue end handler does not get called, right?
  • end time is updated from Infinity to a value earlier than or equal to the current media time. Cue end handler should get called, right?

Here's my 2c worth.

While end time=Infinity, there's no logical reason that I could think of to
call the cue end handler, ever. I must be missing something if you think
otherwise.

Once updated to a normal end time, existing logic applies.
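One way to explore the scenarios listed earlier under this reading is a small model of cue activation. This is a deliberately simplified sketch and not the HTML "time marches on" algorithm: it treats a cue as active when `startTime <= position < endTime`, and `isActive` and `createCueTracker` are hypothetical names used only here.

```javascript
// A cue is treated as active when startTime <= position < endTime.
// With endTime = Infinity the second comparison is true for any finite position.
function isActive(cue, position) {
  return cue.startTime <= position && position < cue.endTime;
}

// Hypothetical tracker: call update() whenever the playback position or the
// cue's times change; it reports enter and exit transitions.
function createCueTracker(cue, handlers) {
  let active = false;
  return {
    update(position) {
      const nowActive = isActive(cue, position);
      if (nowActive && !active) handlers.onenter(position);
      if (!nowActive && active) handlers.onexit(position);
      active = nowActive;
    },
  };
}

const log = [];
const cue = { startTime: 10, endTime: Infinity };
const tracker = createCueTracker(cue, {
  onenter: (t) => log.push(`enter@${t}`),
  onexit: (t) => log.push(`exit@${t}`),
});

tracker.update(5);    // before the cue starts: nothing fires
tracker.update(10);   // cue becomes active
tracker.update(1e9);  // still active: no finite position reaches Infinity
cue.endTime = 2e9;    // end time updated to later than the current position
tracker.update(1e9);  // still active, no exit
cue.endTime = 500;    // end time updated to earlier than the current position
tracker.update(1e9);  // cue becomes inactive, exit fires

console.log(log); // [ 'enter@10', 'exit@1000000000' ]
```

In this model, what happens when the media itself ends is left open, which is the question the rest of the discussion turns to.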

On Tue, Jun 16, 2020, 2:01 AM Nigel Megitt notifications@github.com wrote:

As just discussed on the call, some scenarios that I think we need a
defined behaviour for:

  • end time = Infinity, media end time = unknown. Cue end handler never
    gets called, right?
  • end time = Infinity, media end time = known and in the future. Does
    the cue end handler get called?
  • end time = Infinity, media end time = known and now. Does the cue
    end handler get called?
  • end time = Infinity, media end time = known and in the past. If the
    cue end handler has not yet been called, does it get called?
  • end time is updated from Infinity to a value later than the current
    media time. Cue end handler does not get called, right?
  • end time is updated from Infinity to a value earlier than or equal
    to the current media time. Cue end handler should get called, right?


That was my thinking too, @silviapfeiffer, but during the call I _think_ I heard another suggestion: that Infinity is a proxy for "whenever the (currently unknown) end of media happens". On that reading, while the media duration is unknown, the media is ongoing, and the cue end time is Infinity, the end handler does not get called; but if the media is somehow deemed to have ended (and no other change happens), _then_ the end handler is called.

Another perhaps inter-related view was put, by @eric-carlson I think, that cues are _part_ of the media, so if there are ongoing cues after the other media components, audio, video etc have ended, then the media overall cannot have ended. By this reasoning, a media element with an associated text track that includes a cue whose end time is Infinity could never end.

Here's a link to the discussion from 2020-06-15.

The key points from yesterday's discussion in the Media Timed Events Task Force come from the definitions:

  1. An unbounded media stream has a finite duration that monotonically increases over time, i.e. initially 0, then 1 after a second and so on. WHATWG provides a definition for duration which includes media.duration = Infinity to represent the (finite) length of the media resource. The media duration is defined as the time at the end of media resource in seconds (which may be unknown - see below).
  2. There is no identified requirement for handling times differently in bounded and unbounded streams. Furthermore, such discrimination could cause an anomaly in the live capture use case:

    • During media capture, recording the (unknown) end time of the media resource as Infinity is valid. However, at the instant that the capture ends, the stream becomes bounded (by definition) causing the valid recorded end times to become invalid which is anomalous.

@silviapfeiffer I disagree, though I think I've also made a mistake with my choice of wording in the proposal.

The intention is to allow a cue with an unknown end time which can be represented by endTime = Infinity and which is consistent with the current WHATWG media.duration = Infinity definition. Hence, a clearer title for this issue might be: Add TextTrackCue end time representing an unknown time. Apologies for the confusion.

To address your point above, the cue end handler will be called as the cue has a finite (but unknown) duration. Assuming that the cue is active at the current media playback time, two ways it can become inactive and call the end handler are:

  1. The media player is rewound to the start at time 0, or to any time before the cue start.
  2. The media player is cleared or reset in some way, e.g. a playlist skips to the next media file, a user-loaded media resource is cleared, or the media player is closed.

In both cases, the end handler should be called to make the cue inactive. Note that the cue is not necessarily displayed within the media player itself, e.g. WebVMT cues can be overlaid on a separate map display.

More generally, I can't think of a single example when the end handler is not called for an active cue, though I may have overlooked something.

the cue end handler will be called as the cue has a finite (but unknown) duration.

Infinity is finite?

Assuming that the cue is active at the current media playback time, two ways it can become inactive and call the end handler are:

  1. The media player is rewound to the start at time 0, or to any time before the cue start.
  2. The media player is cleared or reset in some way, e.g. a playlist skips to the next media file, a user-loaded media resource is cleared, or the media player is closed.

Why would the cue change event fire when the media player is cleared? That event is only fired during the time marches on steps, which I believe only happen during playback and after seeking

Infinity is finite?

No, that's not what I mean, but I think it's a source of confusion. The current WHATWG spec _represents_ the duration of an unbounded stream with Infinity: media.duration = Infinity. Do you believe that an unbounded stream literally has an infinite duration?

Why would the cue change event fire when the media player is cleared? That event is only fired during the time marches on steps, which I believe only happen during playback and after seeking

I presume that all associated resources are released when a media player closes, otherwise it would leak memory. It seems reasonable to call the cue end handler as part of this release process. If not, how does the player convey that an active cue should no longer be displayed when that resource is released?

If not, how does the player convey that an active cue should no longer be displayed when that resource is released?

I think web apps can implement this scenario using the media element's ended event and iterating over the text track's activeCues.
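A minimal sketch of that approach follows, assuming the app supplies its own `dismiss` callback to stop displaying a cue. `dismissActiveCuesOnEnded` is a hypothetical helper name; `ended`, `activeCues`, `addEventListener` and `removeEventListener` are the standard media element and text track APIs.

```javascript
// Hypothetical helper: when the media element fires "ended", pass every cue
// that is still active to `dismiss` so the app can stop displaying it.
function dismissActiveCuesOnEnded(media, track, dismiss) {
  const onEnded = () => {
    // activeCues is a live, array-like list (and null while the track is
    // disabled), so copy it into a plain array before iterating.
    const cues = track.activeCues ? Array.from(track.activeCues) : [];
    cues.forEach((cue) => dismiss(cue));
  };
  media.addEventListener('ended', onEnded);
  // Return a function that removes the listener again.
  return () => media.removeEventListener('ended', onEnded);
}

// Browser usage (assumes a media element in `mediaElement`):
//   const track = mediaElement.addTextTrack('metadata');
//   dismissActiveCuesOnEnded(mediaElement, track, (cue) => hideOverlay(cue));
// where hideOverlay stands for the app's own (hypothetical) rendering code.
```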

That makes sense, so perhaps we're talking about the same idea in different terms. Is that described in the WHATWG spec?

Apart from the triggering event, is there any difference in the process of handling an active cue that should no longer be displayed because:

  1. The current playback time exceeds the cue end time, e.g. playback continued
  2. The current playback time precedes the cue start time, e.g. media rewound
  3. The media resource has been released, e.g. media removed

Apart from the triggering event, is there any difference in the process of handling an active cue that should no longer be displayed because:

  1. The current playback time exceeds the cue end time, e.g. playback continued
  2. The current playback time precedes the cue start time, e.g. media rewound
  3. The media resource has been released, e.g. media removed

Cases 1 and 2 are the same, as is seeking to change the current playback position (time marches on).

I assume 3. means changing the element's source? If so, there is no active cue handling per se, because time does not advance.

Thanks @eric-carlson. Yes, changing the video element's source is a good description of 3.

I understand that a key concern is how a proposed change will affect cue state transitions within time marches on, so will focus on that.

In the light of this discussion, I'll draft revised wording for @chrisn's proposed TextTrackCue.endTime = Infinity to explain its purpose and highlight the harmonisation with media.duration = Infinity. My feeling is that 'unbounded cue' is a descriptive name which helps to clarify its meaning.

Thanks to all for the constructive feedback.

Proposal (Revised)

It is proposed that a TextTrackCue.endTime value of Infinity be used to represent an unbounded time, i.e. an unspecified future time. This is a simple extension of the existing HTML standard where media.duration = Infinity represents the duration of an unbounded stream, which is consistent with the definition of unbounded time.

Analysis

A TextTrackCue has three key properties: startTime, endTime and content which is rendered while the cue is active. The cue lifecycle can be considered in five steps with respect to the current playback position:

  1. Before the startTime, the cue is inactive after time marches on is applied;
  2. At the startTime, the cue becomes active after time marches on is applied;
  3. After the startTime and before the endTime, the cue is active after time marches on is applied;
  4. At the endTime, the cue becomes inactive after time marches on is applied;
  5. After the endTime, the cue is inactive after time marches on is applied.

During normal playback, i.e. the usual monotonic increase of the current playback position, steps 1, 3 & 5 represent steady states, and steps 2 & 4 represent transition states where the cue becomes active or inactive. Latency in updating cues is handled by time marches on, which allows the user agent to "catch up." Insight can be gained by considering the response of a media player to an instantaneous change of the current playback position, i.e. a seek.

An unbounded cue, i.e. a cue with an unbounded endTime, should be handled in the same way as a cue with an endTime greater than the end of media time, which is already permissible within the current HTML spec. An unbounded cue can never progress to steps 4 & 5, though it should be noted that an active unbounded cue may become inactive if the current playback position seeks backwards to step 1, e.g. if the media is rewound to a position before the cue's startTime.

The current HTML spec already allows cues to have startTime and endTime values greater than the end of media time or less than the start of media time, i.e. negative - see note under text track cue, which make some of the above five steps unreachable. Definition of the unbounded cue is entirely consistent with the existing HTML specification.
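The lifecycle in this analysis can be sketched as a small classification function. This is an illustration under the simplifying assumption that steps 2 and 3 collapse into "active" and steps 4 and 5 into "after"; `cuePhase` is a hypothetical name, and plain objects stand in for real cues.

```javascript
// Classify a playback position against a cue's start and end times.
function cuePhase(cue, position) {
  if (position < cue.startTime) return 'before'; // step 1
  if (position < cue.endTime) return 'active';   // steps 2 and 3
  return 'after';                                // steps 4 and 5
}

const bounded = { startTime: 10, endTime: 20 };
const unbounded = { startTime: 10, endTime: Infinity };

console.log(cuePhase(bounded, 25));                 // "after"
console.log(cuePhase(unbounded, 25));               // "active"
console.log(cuePhase(unbounded, Number.MAX_VALUE)); // "active"
console.log(cuePhase(unbounded, 5));                // "before" (e.g. after rewinding)
```

No finite position is ever greater than or equal to Infinity, so the "after" branch is unreachable for the unbounded cue, just as it is for a cue whose endTime lies beyond the end of the media.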

Thanks for writing this up so clearly, @rjksmith. This matches my understanding, and is consistent with what we're proposing for DataCue for media in-band events with unbounded duration.

Given the discussion above, I've drafted changes to WHATWG/HTML for the proposed unbounded TextTrackCue endTime in a branch on the Away Team repo with help from @chrisn

The Pull Request is now ready and I'd be grateful if browser implementers could comment on interest, tests and bugs to allow me to complete the following section for inclusion with the PR. Many thanks for your support.

cc @eric-carlson, @mounirlamouri, @foolip, @jyavenard

  • [ ] At least two implementers are interested (and none opposed):

    • …

    • …

  • [ ] [Tests](https://github.com/web-platform-tests/wpt) are written and can be reviewed and commented upon at:

    • …

  • [ ] [Implementation bugs](https://github.com/whatwg/meta/blob/master/MAINTAINERS.md#handling-pull-requests) are filed:

    • Chrome: …

    • Firefox: …

    • Safari: …

(See WHATWG Working Mode: Changes for more details.)

Review of https://github.com/whatwg/html/pull/5953 would be great, especially from possible future implementers. To summarize it, it adds positive infinity as a possible value of https://html.spec.whatwg.org/#text-track-cue-end-time and that appears to be the only change really needed. I'd be interested to hear of possible complications. @eric-carlson @alastor0325 @fsoder WDYT?

Sounds unproblematic as a whole. Open-ended intervals are not uncommon in similar settings. Problems (w/ infinities) tend to arise when needing to do arithmetic, but those cases ought to be limited for this case (i.e. perhaps when dealing with playbackRate). I guess that counts as "not opposed" =).
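As an illustration of the kind of arithmetic that could need care, consider estimating how long until a cue ends at a given playbackRate. This is a sketch only: `secondsUntilCueEnd` is a hypothetical helper, not something a browser exposes, and it assumes `position` and `playbackRate` are finite numbers.

```javascript
// Hypothetical helper: wall-clock seconds until a cue ends, given the current
// playback position and playbackRate.
function secondsUntilCueEnd(cue, position, playbackRate) {
  // Already at or past the end time (never true for an Infinity end time).
  if (position >= cue.endTime) return 0;
  // An unbounded cue, or paused/reverse playback, never reaches the end time.
  if (cue.endTime === Infinity || playbackRate <= 0) return Infinity;
  return (cue.endTime - position) / playbackRate;
}

console.log(secondsUntilCueEnd({ startTime: 0, endTime: 20 }, 10, 2));       // 5
console.log(secondsUntilCueEnd({ startTime: 0, endTime: Infinity }, 10, 2)); // Infinity

// Without explicit guards, infinities can turn into NaN or -Infinity:
console.log(Infinity - Infinity); // NaN
console.log(Infinity * 0);        // NaN
console.log(Infinity / -1);       // -Infinity
```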

I did realize after reviewing this yesterday that a companion change in WebVTT will be needed. The constructor defined in https://w3c.github.io/webvtt/#the-vttcue-interface needs to change to allow +Infinity as a value of the endTime argument, similarly throwing a TypeError if it's -Infinity or NaN.

@rjksmith is that something you'd be up for doing as well?

The final piece of this will be tests in web-platform-tests.
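The constructor behaviour described in this comment also suggests how a page might feature-test for unbounded cues. The sketch below is an assumption about how such a test could look, not an established pattern: `supportsUnboundedEndTime` is a hypothetical helper, and `FiniteOnlyCue` and `UnboundedCue` are stand-in classes that mimic the old and the proposed constructor checks so the example runs outside a browser.

```javascript
// Hypothetical feature test: returns true if the given cue constructor
// accepts an end time of +Infinity.
function supportsUnboundedEndTime(CueCtor) {
  try {
    return new CueCtor(0, Infinity, '').endTime === Infinity;
  } catch (e) {
    // A constructor whose endTime argument is a plain `double` throws a
    // TypeError for non-finite values.
    return false;
  }
}

// Stand-in for a constructor that only accepts finite end times.
class FiniteOnlyCue {
  constructor(startTime, endTime, text) {
    if (!Number.isFinite(endTime)) throw new TypeError('non-finite endTime');
    Object.assign(this, { startTime, endTime, text });
  }
}

// Stand-in for the proposed behaviour: allow +Infinity, but still reject
// NaN and -Infinity.
class UnboundedCue {
  constructor(startTime, endTime, text) {
    if (Number.isNaN(endTime) || endTime === -Infinity) {
      throw new TypeError('endTime must be finite or +Infinity');
    }
    Object.assign(this, { startTime, endTime, text });
  }
}

console.log(supportsUnboundedEndTime(FiniteOnlyCue)); // false
console.log(supportsUnboundedEndTime(UnboundedCue));  // true
// In a browser: supportsUnboundedEndTime(VTTCue)
```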

This sounds fine to me as well.

The proposal looks good to me, thank you.

Excellent, sounds like we have enough implementer buy-in, I'll update https://github.com/whatwg/html/pull/5953 to reflect.

Thanks @foolip and all who commented. I'll work with @rjksmith to update the PR based on your feedback, and the corresponding changes to VTTCue.

Thanks @chrisn! For WebVTT, check out https://github.com/w3c/webvtt/pull/491 first, because that's changing the exact same text you'll need to update, so would conflict. No changes have been made to WebVTT in a while, so it looks like the process is a bit rusty and needs some greasing.

I did realize after reviewing this yesterday that a companion change in WebVTT will be needed. The constructor defined in https://w3c.github.io/webvtt/#the-vttcue-interface needs to change to allow +Infinity as a value of the endTime argument, similarly throwing a TypeError if it's -Infinity or NaN.

@rjksmith is that something you'd be up for doing as well?

The final piece of this will be tests in web-platform-tests.

I've raised a Pull Request for WebVTT (https://github.com/w3c/webvtt/pull/493) though it looks as if I may not have sufficient permissions.

VTTCue.endTime type has been updated to allow +Infinity but I don't believe any change is necessary to explicitly throw exceptions from the VTTCue constructor, as it inherits endTime from TextTrackCue which is covered by https://github.com/whatwg/html/pull/5953.

No conflict with https://github.com/w3c/webvtt/pull/491.

@chrisn Raised bugs against Chrome, Firefox & Safari and updated details in #5953.

@foolip How can I help with the web-platform-tests?
