Documentation: Add Captions, Subtitles, and Interactive Transcripts

Created on 15 Jan 2019 · 10 Comments · Source: Islandora/documentation

An obvious suggestion is to add closed captioning, subtitle, and interactive transcript support for video playback.

The question is which closed captioning formats to support and which, if any, should be the default.

And should there be an example in the vagrant build?

From what I can see, the most popular formats are SRT and WebVTT.

WebVTT
Supported by Video.js.
Can carry caption numbers and metadata (embedded in the VTT file).
Can specify font, color, text formatting, and placement.

SRT
No longer supported by Video.js.
Supports basic text formatting (bold, italic, underline) and placement.
Linguists often prefer to translate directly in SRT because it has fewer markup elements.
Compatible with most subtitle processing programs.
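
Because the two formats are so close, simple SRT files can usually be turned into WebVTT by adding the `WEBVTT` header and changing the millisecond separator in the timing lines from a comma to a period. The following is a minimal sketch under that assumption; `srt_to_vtt` is a hypothetical helper name, and it does not handle SRT-specific positioning or styling extensions.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
_SRT_TIMING = re.compile(
    r"(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT text (illustrative helper only)."""
    # Normalise line endings and drop a leading byte-order mark if present.
    text = srt_text.replace("\r\n", "\n").replace("\r", "\n").lstrip("\ufeff")
    # WebVTT uses "." instead of "," before the milliseconds.
    text = _SRT_TIMING.sub(r"\1.\2 --> \3.\4", text)
    # A WebVTT file must start with the "WEBVTT" header and a blank line.
    # The SRT cue counters are left in place; WebVTT reads them as cue identifiers.
    return "WEBVTT\n\n" + text.strip() + "\n"
```

Only the timing lines are rewritten, so commas inside the caption text itself are left untouched.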

Some other formats to note


CAP – This is a common subtitle/caption file format for broadcast media. It was developed by Cheetah International.
CPT.XML – XML format used for encoding captions into Flash video. It originated in the caption-embedding software Captionate.
DFXP – This is the most common format used for captioning Flash video. It’s a timed-text format that was developed by W3C and stands for “Distribution Format Exchange Profile.”
EBU.STL – This is a common subtitle/caption file format for PAL broadcast media. It was developed by the European Broadcast Union.
QT – Caption format used for QuickTime video or audio. It was developed by Apple.
RT – RealText captions for RealMedia video or audio.
SAMI (SMI) – Used for Windows Media video or audio. It was developed by Microsoft and stands for “Synchronized Accessible Media Interchange.”
SBV – This is a YouTube caption file format that stands for “SubViewer.” It’s what you get when you download captions from YouTube. It’s a text format that is very similar to SRT.
SCC – Popular standard used for Line 21 broadcast closed captions, web media, DVD, as well as subtitles for iTunes, iPods, iPads, and iPhones. It was originally developed by Sonic and stands for “Scenarist Closed Caption.”
SRT – This is the most common subtitle/caption file format, especially for YouTube or Facebook captioning. It is a text format that originated in the DVD-ripping software SubRip and stands for “SubRip Subtitle” file.
STL – Used for DVD Studio Pro. It was developed by Spruce Technologies and known as “Spruce Subtitle File.”
WebVTT – Caption format for HTML5 media players.
ITT – iTunes Timed Text
WMP.TXT – Windows Media
ADBE – Adobe

Accessibility

All 10 comments

@DonRichards

This is of interest for our local use cases, especially with respect to accessibility and oral histories. WebVTT seems to be the standard supported by the W3C: https://www.w3.org/TR/webvtt1/.

If we agree on WebVTT, then we can consider extending this: https://www.drupal.org/project/videojs. We would need to add tracks as media, then the field formatter plugin needs to access those tracks.

@DonRichards @dannylamb @MarcusBarnes

I have done some initial work to see how we can include caption tracks in the Video.js player.
https://github.com/Natkeeran/islandora/tree/videojs_overrides

It uses the videojs module and overrides its theme hook to add the track. The videojs Drupal module is designed to support that use case, so we don't have to touch the module.

Testing

  • Install this branch or sub module: https://github.com/Natkeeran/islandora/tree/videojs_overrides
  • Install the videojs Drupal module
  • Change the video file field to use the Video.js field formatter at http://localhost:8000/admin/structure/media/manage/video/display and http://localhost:8000/admin/structure/media/manage/video/display/source
  • Add the vtt extension/MIME type to the file field and re-import the configuration: field.field.media.file.field_media_file.yml
  • Create a video repository item (e.g. https://github.com/digitalutsc/islandora_web_annotations/tree/7.x/tests/fixtures/video)
  • Add a vtt file (sample provided below)
WEBVTT

00:00.000 --> 00:30.000
<v Different Speaker>This is the transcript text content.

00:30.000 --> 01:00.000
<v Speaker2>speaker 2 transcript

01:00.000 --> 01:45.000
<v Speaker 3>speaker 3 transcript

(You will need to clear the cache to see changes take effect at various steps.)
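
For the interactive transcript side of this issue, the same cues can be read into start time, end time, speaker, and text. The sketch below is a rough illustration, not a full WebVTT parser: `Cue` and `parse_cues` are hypothetical names, and it only understands timing lines and `<v Speaker>` voice tags like the ones in the sample above.

```python
import re
from typing import List, NamedTuple, Optional


class Cue(NamedTuple):
    start: float            # seconds from the start of the video
    end: float              # seconds from the start of the video
    speaker: Optional[str]  # name from a leading <v ...> tag, if any
    text: str               # cue text with tags stripped


# Timestamps are "mm:ss.ttt" with an optional leading "hh:".
_STAMP = r"(?:\d+:)?\d{2}:\d{2}\.\d{3}"
_TIMING = re.compile(r"^(" + _STAMP + r") --> (" + _STAMP + r")")
_VOICE = re.compile(r"^<v(?:\.[^\s>]+)*\s+([^>]+)>")


def _seconds(stamp: str) -> float:
    parts = stamp.split(":")
    hours = int(parts[0]) if len(parts) == 3 else 0
    return hours * 3600 + int(parts[-2]) * 60 + float(parts[-1])


def parse_cues(vtt_text: str) -> List[Cue]:
    """Extract cues from WebVTT text (simplified, illustrative only)."""
    cues = []
    # Cue blocks are separated by one or more blank lines.
    blocks = re.split(r"\n{2,}", vtt_text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            timing = _TIMING.match(line)
            if not timing:
                continue  # header, identifier, or other non-timing line
            payload = "\n".join(lines[i + 1:])
            voice = _VOICE.match(payload)
            speaker = voice.group(1).strip() if voice else None
            text = re.sub(r"<[^>]+>", "", payload).strip()
            cues.append(Cue(_seconds(timing.group(1)),
                            _seconds(timing.group(2)), speaker, text))
            break
    return cues
```

A transcript viewer could then list each cue's speaker and text and seek the player to `cue.start` when a line is clicked.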

Questions

  • Is this (videojs) the Drupal module we want to use to bring in Video.js? It does not seem to provide an option to use a local video.js library.
  • What is the best way to add the logic to pull in the vtt track files associated with a node before passing that info to the Twig template? We will also need the language info/codes.
  • Should it be a sub module on its own, or should the logic go somewhere else?
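
To make the second question more concrete: whatever gathers the tracks ultimately needs a file URL, a language code, and a label per track, since those map onto the `src`, `srclang`, and `label` attributes of the HTML `<track>` element. The sketch below only illustrates that data shape in Python; it is not Drupal code, and `Track` and `render_tracks` are hypothetical names. Marking the first track as `default` is an arbitrary choice made for the example.

```python
from html import escape
from typing import Iterable, NamedTuple


class Track(NamedTuple):
    src: str               # URL of the .vtt file
    srclang: str           # language code, e.g. "en"
    label: str             # name shown in the player's caption menu
    kind: str = "captions"


def render_tracks(tracks: Iterable[Track]) -> str:
    """Render one <track> element per caption file (illustrative only)."""
    lines = []
    for index, track in enumerate(tracks):
        # Assumption for this sketch: the first track is the default one.
        default = " default" if index == 0 else ""
        lines.append(
            f'<track kind="{escape(track.kind)}" src="{escape(track.src)}" '
            f'srclang="{escape(track.srclang)}" label="{escape(track.label)}"{default}>'
        )
    return "\n".join(lines)
```

The resulting markup would sit inside the `<video>` (or `video-js`) element that the template renders.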

daaaaaaaaaaaang @Natkeeran

@Natkeeran To answer your questions,

1) The Video.js library "should" be able to handle the vtt files, but I'm not completely sure. I guess it might be an issue if we're including an older version of Video.js.

2) I think a simple standardized vtt file would work, with a naming convention similar to the one for thumbnails. Something like "captions.vtt" would be good and easy to understand at a glance.

3) I don't think vtt integration should be a separate submodule; it might give the wrong impression (that accessibility isn't integrated).

I may have misunderstood your first question. I'm not completely sure on your other questions.

I think this might cover the transcripts viewer & editor as another solution https://www.drupal.org/project/transcript

ok so related to the tech call from 1/13/21 and my recent PR: https://github.com/Islandora/islandora_defaults/pull/44
we decided to check out a drupal core issue + patch: https://www.drupal.org/files/issues/2019-10-07/3056714-16.patch and https://www.drupal.org/project/drupal/issues/3056714

this was fairly straightforward to apply

  1. i applied the patch via composer
  2. added a new field to the video media type - the new field type is media track
  3. copied the template from the patch core/themes/classy/templates/field/file-video.html.twig into the place in bartik that was rendering the template core/themes/bartik/templates/classy/field/file-video.html.twig
  4. added a video node and related media.
  5. waited for service file to be generated
  6. attached vtt in the media track field and specified language
  7. went back to view node
  8. voila captions

A few notes:

  1. not sure why i had to copy the template over - the patch should probably apply to all the core shipped themes?
  2. i believe it was @rosiel who said that the languages would then be dependent on the enabled languages in the drupal site, and that is 100% the case. in my opinion that isn't a good thing: it's highly likely we'd have captions for more languages than we make our site fully translatable into.

Subtitles in EBU-STL format:
Is there any plugin for videojs?

@matiaszumbo,

There was an earlier work-in-progress PR that used videojs, but it was abandoned. You could possibly change your theme to override the video field Twig to include the requisite libraries and use the video-js tag instead of the video tag, but I haven't tried it myself.

Also, according to their docs, videojs only supports WebVTT. I know YouTube will accept EBU-STL files, but I haven't found any indication that browsers can read them. Perhaps you should try converting them?

@seth-shaw-unlv
Yes, maybe converting the .stl files to .vtt is the better option.
