
Media Server design documents

The video and audio provider roles should be separated out of the Aegisub main process. Instead, video and audio data should be served over a socket interface (or possibly named pipes on Windows). There are three reasons to do this:

  • Aegisub will be protected against unstable providers, "allowing" them to crash without taking the entire application down.
  • It will make it possible to use Avisynth as a provider on UNIX-like systems, by running an Avisynth server process under Wine.
  • Media can potentially be served over a network.

To-do

  1. Write requirements document
  2. Outline a design based on requirements
  3. Define the protocol
  4. While defining the protocol, implement it in parallel to test ideas in practice

Requirements

Not formalised yet; these should probably be written as a numbered list of individual requirements.

The media server runs in a separate process, optionally on a different machine. A stream-oriented reliable transport is used to communicate between the media server and the consumer.

To keep implementations simple, the protocol must be strictly synchronous request-reply. The server never sends anything to the consumer unless the consumer has first sent a request. The server must respond to every request immediately. If it cannot fulfil the request immediately, it must instead send a response indicating that the request is being processed but cannot be fulfilled at the current time. The consumer may then poll the server until the request can be fulfilled. The consumer must not send a request to the server while another request is still outstanding.
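The request-reply rule with polling can be sketched from the consumer's side as follows. This is only an illustration: the protocol has not been defined yet, so the status names (OK, PENDING, FAILED), the Consumer class and the FakeServer stand-in are all hypothetical, and the sketch assumes that a poll is simply the same request sent again.

```python
import time
from typing import Callable, Tuple

# Hypothetical status codes; the real protocol is not defined yet.
OK, PENDING, FAILED = "ok", "pending", "failed"

Reply = Tuple[str, bytes]


class Consumer:
    """Sends one request at a time and polls while the server says PENDING."""

    def __init__(self, transport: Callable[[bytes], Reply]):
        # transport sends a request and returns the server's immediate reply
        self._transport = transport
        self._outstanding = False

    def request(self, payload: bytes, max_polls: int = 10, delay: float = 0.0) -> bytes:
        # Refuse to start a second request while one is still in progress.
        if self._outstanding:
            raise RuntimeError("a request is already outstanding")
        self._outstanding = True
        try:
            for _ in range(max_polls):
                status, data = self._transport(payload)
                if status == OK:
                    return data
                if status == FAILED:
                    raise IOError("server could not fulfil the request")
                time.sleep(delay)  # PENDING: wait, then poll again
            raise TimeoutError("request still pending after max_polls attempts")
        finally:
            self._outstanding = False


class FakeServer:
    """Stand-in server that answers PENDING a few times before it is ready."""

    def __init__(self, polls_needed: int):
        self._remaining = polls_needed

    def __call__(self, payload: bytes) -> Reply:
        if self._remaining > 0:
            self._remaining -= 1
            return PENDING, b""
        return OK, b"data-for:" + payload
```

For example, Consumer(FakeServer(polls_needed=2)).request(b"frame 5") receives two PENDING replies and then the data on the third attempt.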

A media server must never display any GUI by itself, because it might be running on a remote machine which the user might not have access to, and which might be headless. Any GUI that is required must be implemented inside Aegisub.

Communication between the server and consumer happens in sessions. A session represents an open media resource. A session may be associated with at most one connection to the media server. A connection to the media server may be associated with at most one session.
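The one-to-one relation between sessions and connections could be enforced on the server with a small registry like the one below. This is a sketch under assumptions: the SessionRegistry class and the use of plain identifiers for connections and sessions are hypothetical, and nothing here is part of a defined protocol.

```python
class SessionRegistry:
    """Tracks which connection owns which session, at most one of each."""

    def __init__(self):
        self._session_of = {}     # connection id -> session id
        self._connection_of = {}  # session id -> connection id

    def attach(self, connection_id, session_id):
        # A connection may be associated with at most one session ...
        if connection_id in self._session_of:
            raise ValueError("connection already has a session")
        # ... and a session with at most one connection.
        if session_id in self._connection_of:
            raise ValueError("session already has a connection")
        self._session_of[connection_id] = session_id
        self._connection_of[session_id] = connection_id

    def detach(self, connection_id):
        """Break the association and return the session id that was freed."""
        session_id = self._session_of.pop(connection_id)
        del self._connection_of[session_id]
        return session_id

    def session_for(self, connection_id):
        # Returns None when the connection has no session.
        return self._session_of.get(connection_id)
```

Keeping both directions of the mapping makes each of the two "at most one" checks a single dictionary lookup.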

Aegisub requires media requests to be idempotent: requesting a range of media data from the server must always return the same data within the same session. If the server cannot guarantee this at the time the request is made, the request must fail.

A media server may implement audio access in one of two ways: random access or linear access. A server providing random access audio guarantees that it is possible to seek to a specific audio sample and read a specific number of audio samples, and that this operation will always be idempotent. A server providing linear access cannot guarantee reliable seeking. In this case Aegisub will download all audio from the server into a local cache, which can then provide the accurate seeking Aegisub requires.
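For the linear access case, the local cache can be pictured as follows: the consumer reads the whole stream once, in order, and afterwards serves sample-accurate reads from its own copy. This is a minimal sketch; the LocalAudioCache class, its in-memory storage and the fixed bytes_per_sample parameter are assumptions made for illustration, and a real cache might well be kept on disk instead.

```python
from typing import Iterable


class LocalAudioCache:
    """Downloads a linear audio stream once, then allows random access."""

    def __init__(self, chunks: Iterable[bytes], bytes_per_sample: int):
        # Consume the linear stream from start to end, exactly once.
        self._data = b"".join(chunks)
        self._bytes_per_sample = bytes_per_sample

    @property
    def sample_count(self) -> int:
        return len(self._data) // self._bytes_per_sample

    def read(self, start: int, count: int) -> bytes:
        """Return `count` samples beginning at sample index `start`."""
        if start < 0 or count < 0 or start + count > self.sample_count:
            raise IndexError("requested samples are out of range")
        begin = start * self._bytes_per_sample
        end = (start + count) * self._bytes_per_sample
        # Slicing the stored bytes always yields the same result for the
        # same arguments, so reads from the cache are idempotent.
        return self._data[begin:end]
```

Because the cached bytes never change after the download, repeated reads of the same range are guaranteed to match, which is the property the linear server itself could not promise.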

The consumer may cache data without the server's knowledge. This is always safe because requests are idempotent. Such a cache can be used to hide network delays.
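A consumer-side cache of this kind can be very simple, since a repeated request within a session is guaranteed to yield the same data. The sketch below is illustrative only: the CachingReader class and the fetch(start, count) callable are hypothetical names, and the cache is unbounded, which a real implementation would probably want to limit.

```python
from typing import Callable, Dict, Tuple


class CachingReader:
    """Remembers replies so that repeated requests never reach the server."""

    def __init__(self, fetch: Callable[[int, int], bytes]):
        self._fetch = fetch  # hypothetical call that asks the server for a range
        self._cache: Dict[Tuple[int, int], bytes] = {}
        self.server_requests = 0  # counts how often the server was actually asked

    def read(self, start: int, count: int) -> bytes:
        key = (start, count)
        if key not in self._cache:
            # Cache miss: ask the server once and keep the reply.
            self.server_requests += 1
            self._cache[key] = self._fetch(start, count)
        # Safe to reuse: idempotence means the server would answer the same.
        return self._cache[key]
```

The cache must be discarded when the session ends, because the idempotence guarantee only holds within a single session.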

The actual media data will initially be transferred uncompressed, although some kind of compression might be added later to accommodate slow links.

Initially, a server may close a session if the associated connection is lost. The plan, however, is to eventually allow a session to live without a connection, and to allow a new connection to attach to an existing, unclaimed session. This would accommodate unstable network links where connections may be dropped at random.
