Timeline Notation v1.0

The file format specification

Note: This section assumes basic computer programming knowledge.

Timeline files are written in Timeline Notation. No monkey business here: Timeline Notation is plain JSON.

Example

{
  "timeline_version": 1,
  "rss_feed": "https://guysmileyshow.com/rss",
  "rss_item_guid": "https://guysmileyshow.com/podcast/ep1/1.mp3",
  "horizontal_image": {
    "file": "https://guysmileyshow.com/podcast/ep1/1.jpg",
    "alt": "Guy Smiley in spotlight"
  },
  "audio_duration": 1879200,
  "speakers": [
    {
      "id": 0,
      "name": "Guy Smiley",
      "image": {
        "url": "https://guysmileyshow.com/podcast/speakers/guy.jpg"
      }
    },
    {
      "id": 1,
      "name": "Oscar the Grouch",
      "image": {
        "url": "https://guysmileyshow.com/podcast/speakers/oscar.jpg"
      }
    }
  ],
  "annotations": [
    {
      "type": "chapter",
      "timestamp": 0,
      "title": "Introduction",
      "image": {
        "file": "http://guysmileyshow.com/demo_episode/guysmiley.jpg",
        "caption": "Guy Smiley in spotlight"
      },
      "summary": "The beginning of another great episode. Guy introduces our guest.",
      "annotations": [
        {
          "type": "music",
          "timestamp": 32000,
          "artist": "Pete Seeger ft. Oscar the Grouch",
          "title": "Garbage",
          "album": "Pete Seeger & Brother Kirk Visit Sesame Street",
          "image": {
            "file": "http://guysmileyshow.com/demo_episode/boxer.JPG"
          }
        }
      ]
    },
    {
      "type": "chapter",
      "timestamp": 48910,
      "title": "Interview with Oscar",
      "image": {
        "file": "http://guysmileyshow.com/demo_episode/oscar.jpg",
        "caption": "Guy Smiley frowning for camera"
      },
      "summary": "Oscar joins Guy Smiley in the studio.",
      "annotations": [
        {
          "type": "quote",
          "timestamp": 67880,
          "attribution": "Oscar the Grouch",
          "text": "I actually like garbage."
        },
        {
          "type": "transcript",
          "timestamp": 67880,
          "speaker": 1,
          "text": "It started out as method acting, but now I actually like garbage.",
          "words": [
            {
              "text": "It",
              "offset": 0
            },
            {
              "text": "started out as",
              "offset": 300
            },
            {
              "text": "method acting,",
              "offset": 1510
            },
            {
              "text": "but now I",
              "offset": 1901
            },
            {
              "text": "actually like",
              "offset": 2178
            },
            {
              "text": "garbage.",
              "offset": 3100
            }
          ]
        }
      ]
    }
  ]
}
          

Spec

Top level

timeline_version (required, integer) is the Timeline Notation version that the file uses. This will be important for applications to know as the system evolves. This documentation describes Version 1.

rss_feed (optional, string) is the URL of the RSS feed that this timeline is associated with. This gives an application a way to associate this timeline file with a show, even if the show's RSS feed does not link to the timeline file.

rss_item_guid (optional, string) is the GUID for the episode, as specified in the RSS feed. Every episode in an RSS feed has a <guid> element within its <item> element. Applications can use this to associate the timeline with an <item>, even if the RSS feed does not link to the timeline file.

audio_duration (required, integer) is the duration of the audio file, in milliseconds.

annotations (optional, array) is a list of annotations to be placed on the timeline.

speakers (optional, array) is a list of people who speak in the episode.

horizontal_image (optional, object) is a horizontal (landscape-oriented) image meant to be used at a large size. It is not part of the timeline as such, but it is included in the timeline file because podcast RSS feeds typically do not include an episode-specific image, and when they do, it is a square image, which is not ideal. The horizontal_image object can have:

url (optional, string) is a URL to the image file.

alt (optional, string) is like the alt attribute of an HTML <img> element. It is not normally displayed; instead, it can be read aloud to the listener using a voice synthesizer. The player may display it if the image is slow to load. It can also help search engine visibility if the content is displayed on the web.
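As a rough illustration of the top-level rules above, the following Python sketch checks the two required integers and the types of the optional properties. The function name check_top_level and its error messages are made up for this example and are not part of Timeline Notation.

```python
import json


def check_top_level(timeline: dict) -> list:
    """Return a list of problems found in the top-level properties.

    Hypothetical helper: the name and the messages are illustrative only.
    """
    problems = []
    # The two properties the spec marks as required, both integers.
    for key in ("timeline_version", "audio_duration"):
        value = timeline.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"{key} must be an integer")
    # Optional properties are only checked when they are present.
    optional = {
        "rss_feed": str,
        "rss_item_guid": str,
        "horizontal_image": dict,
        "annotations": list,
        "speakers": list,
    }
    for key, kind in optional.items():
        if key in timeline and not isinstance(timeline[key], kind):
            problems.append(f"{key} must be a {kind.__name__}")
    return problems


doc = json.loads('{"timeline_version": 1, "audio_duration": 1879200}')
print(check_top_level(doc))  # []
print(check_top_level({"timeline_version": "1"}))
```

An empty list means the top level looks well-formed; nested arrays would still need their own checks.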

Annotations array

Each item in the annotations array is an annotation on the timeline. Annotation items can have:

timestamp (required, integer) is the time in the audio program that the annotation is attached to, in milliseconds from the start of the audio.

type (optional, string) specifies the type of annotation. The valid values are chapter, music, transcript, quote or link. If there is no type specified, it is a generic annotation.
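One way an application might read these two properties is sketched below in Python. It assumes timestamps are in milliseconds, like audio_duration. The helper names format_ms and describe, and the choice to label unrecognized types rather than reject them, are assumptions of this example.

```python
KNOWN_TYPES = {"chapter", "music", "transcript", "quote", "link"}


def format_ms(ms: int) -> str:
    """Format a millisecond value as H:MM:SS, dropping the fraction."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def describe(annotation: dict) -> str:
    """Return a short label such as '0:00:48 chapter' for one annotation."""
    kind = annotation.get("type")
    if kind is None:
        label = "generic"  # no type specified
    elif kind in KNOWN_TYPES:
        label = kind
    else:
        label = f"unrecognized type {kind!r}"  # assumption: report, do not fail
    return f"{format_ms(annotation['timestamp'])} {label}"


print(describe({"type": "chapter", "timestamp": 48910}))  # 0:00:48 chapter
print(describe({"timestamp": 1879200}))                   # 0:31:19 generic
```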

The other properties available for annotations depend on the type:

Annotation type: Chapter

Chapter annotations are a special kind of annotation that can contain nested content: a chapter can have its own annotations array as a child. In addition to the standard properties available for all annotations, each chapter annotation can have:

title (required, string)

timestamp_end (optional, integer) is a timestamp where the chapter closes. If not specified, applications should assume the chapter closes when the next sibling annotation starts. timestamp_end is for special cases where the Timeline publisher wants the chapter to end earlier.

summary (optional, string)

image (optional, object) can have:

url (required, string) is a URL to an image file.

alt (optional, string) (See the description under horizontal_image if you are unsure what "alt" means.)

caption (optional, string)
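The timestamp_end rule can be sketched in Python as follows. Two things here are assumptions not stated in the spec: that an annotations array is sorted by timestamp, and that the last chapter in an array closes at a fallback value supplied by the caller (for the top-level array, audio_duration is a natural choice). The name chapter_spans is hypothetical.

```python
def chapter_spans(annotations: list, fallback_end: int) -> list:
    """Return (title, start, end) for each chapter in one annotations array."""
    spans = []
    for index, item in enumerate(annotations):
        if item.get("type") != "chapter":
            continue
        if "timestamp_end" in item:
            # The publisher asked for the chapter to close early.
            end = item["timestamp_end"]
        elif index + 1 < len(annotations):
            # Otherwise the chapter closes when the next sibling starts.
            end = annotations[index + 1]["timestamp"]
        else:
            # Assumption: the last chapter runs to the caller's fallback.
            end = fallback_end
        spans.append((item["title"], item["timestamp"], end))
    return spans


chapters = [
    {"type": "chapter", "timestamp": 0, "title": "Introduction"},
    {"type": "chapter", "timestamp": 48910, "title": "Interview with Oscar"},
]
print(chapter_spans(chapters, 1879200))
# [('Introduction', 0, 48910), ('Interview with Oscar', 48910, 1879200)]
```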

Annotation type: Music

Music annotations are used to display the info about the music being played.

artist (optional, string)

album (optional, string)

title (optional, string)

url (optional, string) is the URL to the artist's website or web-store.

image (optional, object) can have:

url (optional, string) is a URL to an image file.

Annotation type: Transcript

Transcript annotations can be thought of as paragraphs in a transcription. You'll want to create a new transcript annotation any time the speaker changes.

speaker (optional, integer) is the ID of the speaker. IDs and speaker names are specified in the speakers array.

text (optional, string) is the full paragraph or block of text.

words (optional, array) is an array of the full text broken into timestamped words or phrases.
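To show how speaker IDs tie transcript annotations to the speakers array, here is a small Python sketch that walks the nested annotations and prints each transcript paragraph with its speaker's name. The function names and the "Unknown speaker" fallback are assumptions of this example.

```python
def iter_annotations(annotations: list):
    """Yield every annotation, descending into nested annotations arrays."""
    for item in annotations:
        yield item
        yield from iter_annotations(item.get("annotations", []))


def transcript_lines(timeline: dict) -> list:
    """Return 'Name: text' strings for every transcript annotation."""
    # Build an ID-to-name lookup from the speakers array.
    names = {s["id"]: s["name"] for s in timeline.get("speakers", [])}
    lines = []
    for item in iter_annotations(timeline.get("annotations", [])):
        if item.get("type") == "transcript":
            # speaker is optional, so fall back to a placeholder name.
            speaker = names.get(item.get("speaker"), "Unknown speaker")
            lines.append(f"{speaker}: {item.get('text', '')}")
    return lines


timeline = {
    "speakers": [{"id": 1, "name": "Oscar the Grouch"}],
    "annotations": [
        {
            "type": "chapter",
            "timestamp": 48910,
            "title": "Interview with Oscar",
            "annotations": [
                {
                    "type": "transcript",
                    "timestamp": 67880,
                    "speaker": 1,
                    "text": "I actually like garbage.",
                }
            ],
        }
    ],
}
print(transcript_lines(timeline))
# ['Oscar the Grouch: I actually like garbage.']
```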

Annotation type: Quote

The quote annotation type is used for magazine-like pull quotes.

text (required, string) is the quoted text. Should not include quotation marks.

attribution (optional, string) is the name of the person being quoted.

Annotation type: Link

Link annotations are meant as very simple, easy-to-click web links. Applications may choose to follow links automatically and load previews of content, as Facebook and Twitter do.

text (optional, string) is some text to accompany the link. Typically the entire block will be hyperlinked.

url (optional, string) is the URL that the link points to.

Annotation type: [No type specified]

Annotations with no type are the workhorse annotations of Timeline. They can carry text, an image, or both:

text (optional, string)

image (optional, object) can have:

url (optional, string) is a URL to an image file.

alt (optional, string) (See the description under horizontal_image if you are unsure what "alt" means.)

caption (optional, string)
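The required properties listed for the annotation types above can be collected into a small lookup table. The Python sketch below is one possible way to check a single annotation; the table and function names are hypothetical, and nested properties such as a chapter image's url are not checked.

```python
# Required properties per type, taken from the lists above.
# Types that are not listed here add no required properties of their own.
REQUIRED_BY_TYPE = {
    "chapter": ("title",),
    "quote": ("text",),
}


def missing_properties(annotation: dict) -> list:
    """Return the required property names that the annotation lacks."""
    # timestamp is required for every annotation, whatever its type.
    required = ("timestamp",) + REQUIRED_BY_TYPE.get(annotation.get("type"), ())
    return [key for key in required if key not in annotation]


quote = {"type": "quote", "timestamp": 67880, "attribution": "Oscar the Grouch"}
print(missing_properties(quote))             # ['text']
print(missing_properties({"timestamp": 5}))  # []
```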

Words array

Transcript annotations can have a words array that places words or phrases on the timeline. Each word can have its own offset, or several words can be grouped into one phrase with a single offset. Each element in a words array can have:

text (required, string) is a spoken word or phrase.

offset (required, integer) is the word or phrase's location on the timeline, given in milliseconds from the start of the parent transcript annotation. Offsets are relative to the parent annotation because future versions of Timeline will likely locate annotations by acoustic fingerprint instead of by timestamp. Individual words and phrases are the exception: locating a fingerprint takes a lot of processing power, and an episode is likely to contain thousands of words. Future Timeline programs will therefore use fingerprints to locate the parent annotation block, then timed offsets for the words and phrases within that block.
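Because offsets are relative to the parent transcript annotation, a player has to add the parent's timestamp to get a position in the audio. The Python sketch below shows that arithmetic and a lookup of the word being spoken at a given playback position. It assumes the words array is sorted by offset, which the spec does not state, and the function names are hypothetical.

```python
from bisect import bisect_right


def absolute_word_times(transcript: dict) -> list:
    """Return (absolute_ms, text) pairs for a transcript annotation."""
    start = transcript["timestamp"]
    return [(start + word["offset"], word["text"]) for word in transcript.get("words", [])]


def word_at(transcript: dict, position_ms: int):
    """Return the word or phrase being spoken at position_ms, or None."""
    timed = absolute_word_times(transcript)
    times = [t for t, _ in timed]
    # Index of the last word that starts at or before the position.
    index = bisect_right(times, position_ms) - 1
    return timed[index][1] if index >= 0 else None


transcript = {
    "type": "transcript",
    "timestamp": 67880,
    "words": [
        {"text": "It", "offset": 0},
        {"text": "started out as", "offset": 300},
        {"text": "method acting,", "offset": 1510},
    ],
}
print(absolute_word_times(transcript))
# [(67880, 'It'), (68180, 'started out as'), (69390, 'method acting,')]
print(word_at(transcript, 69000))  # started out as
```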

Speakers array

Each item in the speakers array is a person who speaks in the episode. Each speaker can have:

id (required, integer) is a number used to identify this speaker in transcript annotations.

name (required, string) is the name of the speaker.

image (optional, object) is an image of the speaker. It can have:

url (optional, string) is a URL to an image file.
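Since transcript annotations refer to speakers by id, each id presumably needs to be unique within the speakers array. The spec does not say so explicitly, so treat that as an assumption. A minimal Python check, with a hypothetical function name, might look like this:

```python
from collections import Counter


def duplicate_speaker_ids(speakers: list) -> list:
    """Return the speaker IDs that appear more than once, sorted."""
    counts = Counter(speaker["id"] for speaker in speakers)
    return sorted(speaker_id for speaker_id, n in counts.items() if n > 1)


speakers = [
    {"id": 0, "name": "Guy Smiley"},
    {"id": 1, "name": "Oscar the Grouch"},
]
print(duplicate_speaker_ids(speakers))  # []
```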