SubRip Text
SRT (SubRip Text) is a very simple subtitle format.
It allows only one subtitle on screen at a time and defines (almost) no styling.
Some SRT renderers (notably VSFilter) do allow multiple subtitles on screen at once, as well as advanced styling, but neither is part of the format as usually described. (It is unclear whether a formal specification exists at all.)
Many SRT renderers do not attempt any automatic line breaking, so keep each line relatively short, usually around 40-50 characters at most.
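As a rough illustration of keeping lines short, the sketch below breaks a long subtitle into lines using Python's standard `textwrap` module. The function name `wrap_subtitle` and the default width of 42 are assumptions chosen for the example, not part of the format. The length count includes any markup in the text.

```python
import textwrap

def wrap_subtitle(text, width=42):
    """Split text into lines of at most `width` characters, breaking at spaces."""
    # textwrap.wrap never returns a line longer than `width`
    # (words longer than `width` are split).
    return textwrap.wrap(text, width=width)

lines = wrap_subtitle(
    "This is a fairly long subtitle sentence that would not fit on one line", 40
)
print("\n".join(lines))
```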
File Format
The SRT file format is line- and section-based.
Each subtitle in an SRT file appears in a numbered section, which has a start time, an end time and one or more subtitle lines. Sections are separated by one or more blank lines. As a consequence, a subtitle cannot contain blank lines.
This is an example of an SRT section:
    1
    00:00:04,070 --> 00:00:10,047
    This is the <b>first</b> line,
    And this is the <i>second</i> line
The section numbers must be continuous and start at 1 (one). The sections must appear in numbered order in the file, and must also be sorted by start time. Two sections cannot overlap in time.
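The rules above can be checked mechanically. The following is a minimal sketch, assuming the timing line has exactly the shape `HH:MM:SS,mmm --> HH:MM:SS,mmm` shown in the example. The names `parse_timing` and `check_sections` are hypothetical helpers, not part of any existing library.

```python
import re

# Assumed timing-line shape: HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING_RE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})$"
)

def parse_timing(line):
    """Return (start_ms, end_ms) for a timing line, or None if it does not match."""
    match = TIMING_RE.match(line.strip())
    if match is None:
        return None
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
    start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
    end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
    return start, end

def check_sections(sections):
    """Check a list of (number, start_ms, end_ms) tuples; return a list of problems."""
    problems = []
    prev_start = prev_end = None
    for expected, (number, start, end) in enumerate(sections, start=1):
        if number != expected:
            problems.append(f"section {number}: expected number {expected}")
        if prev_start is not None and start < prev_start:
            problems.append(f"section {number}: not sorted by start time")
        if prev_end is not None and start < prev_end:
            problems.append(f"section {number}: overlaps the previous section")
        prev_start, prev_end = start, end
    return problems

print(parse_timing("00:00:04,070 --> 00:00:10,047"))
print(check_sections([(1, 0, 1000), (3, 500, 2000)]))
```

The check only compares each section with the one directly before it, which is enough when the sections are sorted by start time.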
Three override tags are supported: Bold, Italics and Underline. They use the common HTML-style tag pairs <b>bold</b>, <i>italics</i> and <u>underline</u>.
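A program that cannot render styling may simply want to drop these tags. A minimal sketch, assuming only the three tag pairs named above need handling and that tag names may appear in either case (the helper name `strip_tags` is made up for this example):

```python
import re

# Matches <b>, </b>, <i>, </i>, <u>, </u> in either case.
TAG_RE = re.compile(r"</?[biu]>", re.IGNORECASE)

def strip_tags(text):
    """Remove bold, italics and underline tags, leaving the text itself intact."""
    return TAG_RE.sub("", text)

print(strip_tags("This is the <b>first</b> line,"))
```

Any other markup a renderer might accept is left untouched by this pattern.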
Parsing
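This is an idea for a state machine that parses SRT. Transitions happen on reading and analysing a line. In the transitions below, next=N names the state to move to, counting the states from 1 in the order listed.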
1. Start, looking for first subtitle
   - Blank line: next=1
   - Line number: next=2
   - Anything else: bail
2. Want timestamps
   - Timestamps line: next=3, store timestamps
   - Anything else: bail
3. Reading subtitle, first line of text
   - Blank line: next=5
   - EOF: complete
   - Anything else: next=4, append line text to current subtitle text
4. Reading subtitle, following line of text
   - Blank line: next=5
   - EOF: complete
   - Anything else: next=4, append line break and line text to current subtitle text
5. Reading subtitle, on blank line
   - Blank line: next=5, append line break to current subtitle text
   - Line number: next=2
   - EOF: complete
   - Anything else: next=4, append two line breaks and line text to current subtitle text
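This parser ignores the actual values of the subtitle index lines: they only have to be present and need not be sequential in any way. Unlike the strict format described above, it allows blank lines inside individual subtitles, parsing and keeping them. The one restriction is that a subtitle cannot contain a blank line followed by a line containing only a decimal number, since that line would be read as the start of the next subtitle.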
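The state machine above could be sketched in Python roughly as follows. Several details are assumptions made for the example and are not stated in the description: the name `parse_srt`, the exact timing-line pattern, reaching EOF in state 1 counting as an empty but valid file, reaching EOF in state 2 counting as an error, and trailing line breaks collected in state 5 being dropped when a subtitle is completed.

```python
import re

NUMBER_RE = re.compile(r"^\d+$")
# Assumed timing-line shape: H:MM:SS,mmm --> H:MM:SS,mmm (one or more hour digits)
TIMING_RE = re.compile(
    r"^(\d+):(\d{2}):(\d{2}),(\d{3}) --> (\d+):(\d{2}):(\d{2}),(\d{3})$"
)

def _to_ms(h, m, s, ms):
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)

def parse_srt(text):
    """Parse SRT text into a list of (start_ms, end_ms, subtitle_text) tuples."""
    subtitles = []
    state = 1
    start = end = None
    body = ""
    for raw in text.splitlines():
        line = raw.strip()
        if state == 1:  # start, looking for first subtitle
            if line == "":
                continue
            if not NUMBER_RE.match(line):
                raise ValueError(f"expected subtitle number, got {line!r}")
            state = 2
        elif state == 2:  # want timestamps
            match = TIMING_RE.match(line)
            if match is None:
                raise ValueError(f"expected timestamps, got {line!r}")
            groups = match.groups()
            start, end = _to_ms(*groups[:4]), _to_ms(*groups[4:])
            body = ""
            state = 3
        elif state == 3:  # first line of text
            if line == "":
                state = 5
            else:
                body += line
                state = 4
        elif state == 4:  # following line of text
            if line == "":
                state = 5
            else:
                body += "\n" + line
        else:  # state 5, on blank line
            if line == "":
                body += "\n"
            elif NUMBER_RE.match(line):
                # Assumption: drop trailing line breaks before storing.
                subtitles.append((start, end, body.rstrip("\n")))
                state = 2
            else:
                body += "\n\n" + line
                state = 4
    # EOF handling
    if state == 2:
        raise ValueError("unexpected end of file after subtitle number")
    if state in (3, 4, 5):
        subtitles.append((start, end, body.rstrip("\n")))
    return subtitles

sample = "1\n00:00:04,070 --> 00:00:10,047\nfirst line\nsecond line\n\n5\n00:00:11,000 --> 00:00:12,500\nA\n\nB\n"
for entry in parse_srt(sample):
    print(entry)
```

Note that the second section in the sample is numbered 5 and contains a blank line. Both are accepted, matching the lenient behaviour described above.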
