webvid - web video downloader daemon

Copyright 2009 Antti Ajanki <antti.ajanki@iki.fi>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or (at
your option) any later version. See the file COPYING for more
information.

Description
-----------

webvid is a server that downloads web pages from certain media sharing
websites, such as YouTube or Google Video, and transforms them into an
XML-based menu format. See README for the full list of supported
sites.

The server listens on port 2357 for connections from clients (the VDR
plugin or webvi, the command line client). The client asks for a menu
page or media file using a reference it has received from the server
earlier. The server connects to the upstream web site, downloads a web
page, and transforms it into a simple XML format, which is described
below. The client then shows the menu to the user, who may choose to
follow another link.

Command line parameters
-----------------------

-h, --help                      show this help message and exit
-p PORT, --port=PORT            serve in PORT
-r PATH, --servicepath=PATH     load service definitions from PATH
-d, --daemonize                 run in the background
-l LOG, --logfile=LOG           write log messages to file LOG. The
                                default is to write to stdout.

The Debian package uses the config file
/etc/default/vdr-plugin-webvideo for specifying the command line
options.

Communication protocol
----------------------

The client and the server communicate using an HTTP-like protocol. The
client sends a request that consists of three parts separated by
spaces: command, reference, and protocol version. The request ends
with the four-byte sequence CR LF CR LF (0x0D 0x0A 0x0D 0x0A). The
command is one of the words listed below, reference is the target of
the command (usually taken from the response to an earlier request),
and the protocol version must be WVTP/1.0.
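The request format described above can be sketched in a few lines of
Python. This is only an illustration, not code from webvid: the
function name build_request is hypothetical, and encoding the request
as UTF-8 is an assumption (this document does not name an encoding for
the request line). GET and "/mainmenu" are described later in this
document.

```python
PROTOCOL_VERSION = "WVTP/1.0"


def build_request(command: str, reference: str) -> bytes:
    """Encode one request: command, reference, version, then CR LF CR LF."""
    # The three parts are separated by single spaces, so a space inside
    # the command or the reference would make the request ambiguous.
    if " " in command or " " in reference:
        raise ValueError("command and reference must not contain spaces")
    line = f"{command} {reference} {PROTOCOL_VERSION}\r\n\r\n"
    # Assumption: the request is sent as UTF-8 bytes.
    return line.encode("utf-8")


if __name__ == "__main__":
    # Request the main menu, the usual first request of a client.
    print(build_request("GET", "/mainmenu"))
```

The returned bytes could then be written to a socket connected to the
server port (2357 by default).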

The server's response starts with a status line, which is followed by
optional header lines, an empty line to indicate the end of the
headers, and an optional message body. The lines end with CR LF. The
status line consists of a protocol version identifier (WVTP/1.0), a
space, a three-digit status code, a space, and a short human-readable
status message. Each header line consists of a name followed by a
colon and the header value. The semantics of the message body depend
on the request. Two headers must be present if the body is non-empty:
Content-Type, which specifies the MIME content type of the body, and
Content-Length, which contains the length of the body in bytes.

Requests and responses
----------------------

GET reference WVTP/1.0

The GET command retrieves an object identified by reference. Legal
values for reference can be extracted from <ref> and <object> nodes in
a <wvmenu> object received in response to an earlier GET request. See
below for the specification of the <wvmenu> XML objects.

If the reference is the value of a <ref> node, a successful GET
request returns a <wvmenu> object. If the reference is the value of an
<object> node, a successful GET request returns a <mediaurl> object.

reference can also be the string "/mainmenu" (without quotes). In this
case the result is a <wvmenu> object that lists links to all video
services supported by the server. This is usually the first thing the
client shows to the user.

If the request is successful, the server responds with status code 200
and returns a <wvmenu> or <mediaurl> object in the message body.

DOWNLOAD url WVTP/1.0

The DOWNLOAD command retrieves a media file from the Internet. url is
the URL of a video or audio file or an ASX playlist.

If successful, the server responds with status code 200 and returns
the media file in the message body. If url points to an ASX playlist,
the server reads the playlist and returns the (first) media file
referenced in the playlist in the message body.
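A client that has read a complete response to a GET or DOWNLOAD
request could split it into its parts roughly as follows. This is a
minimal sketch under stated assumptions, not the parser used by the
real clients: the function name parse_response is hypothetical, the
status line and headers are assumed to be UTF-8 text, and header names
are matched exactly as written.

```python
def parse_response(raw: bytes):
    """Split a complete response into (code, message, headers, body)."""
    # An empty line (CR LF CR LF) separates the headers from the body.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("response has no empty line ending the headers")
    # Assumption: status line and headers are UTF-8 text.
    lines = head.decode("utf-8").split("\r\n")

    # Status line: version, space, three-digit code, space, message.
    parts = lines[0].split(" ", 2)
    if len(parts) != 3:
        raise ValueError("malformed status line: %r" % lines[0])
    version, code, message = parts
    if version != "WVTP/1.0" or len(code) != 3 or not code.isdigit():
        raise ValueError("malformed status line: %r" % lines[0])

    # Header lines: name, colon, value.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()

    if body:
        # Content-Length must be present when the body is non-empty.
        body = body[: int(headers["Content-Length"])]
    return int(code), message, headers, body


if __name__ == "__main__":
    # The Content-Type value here is only an illustrative guess.
    sample = (b"WVTP/1.0 200 OK\r\n"
              b"Content-Type: text/xml\r\n"
              b"Content-Length: 9\r\n\r\n<wvmenu/>")
    print(parse_response(sample))
```

A real client reading from a socket would first read up to the empty
line, then use Content-Length to decide how many body bytes to wait
for; the sketch assumes the whole response is already in memory.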

STREAM url WVTP/1.0

The STREAM command returns a URL of a media object.

If successful, the server responds with status code 200 and returns
the URL as a <mediaurl> XML object in the message body. If url points
to an ASX playlist, the server reads the playlist and the returned
value is the (first) URL in the playlist. Otherwise the returned value
is url without any changes.

CLOSE 0 WVTP/1.0

Closes the connection. The server does not respond. The second
parameter (0) is ignored.

Error status codes
------------------

If the server cannot complete a request, it responds with one of the
following status codes describing the error. The message body will be
empty.

400 Bad request
404 Not Found
500 Internal Server Error
501 Not Implemented
502 Protocol version not supported
503 XSLT transformation failed
504 Read error while receiving data from upstream
505 Timeout on upstream server
506 Couldn't connect to upstream server

Menu description language
-------------------------

The navigation menus retrieved by a GET request are encoded as XML
objects. An example is shown below:

<wvmenu>
  <title>Page title</title>

  <link>
    <label>Label of the link</label>
    <ref>navigation-reference</ref>
    <object>media-object-reference</object>
  </link>

  <textarea>
    <label>very long text possibly
spanning several lines
    </label>
  </textarea>

  <textfield id="search_query">
    <label>Search terms</label>
  </textfield>

  <itemlist id="search_sort">
    <label>Sort by</label>
    <item value="">Relevance</item>
    <item value="video_date_uploaded">Date Added</item>
    <item value="video_view_count">View Count</item>
    <item value="video_avg_rating">Rating</item>
  </itemlist>

  <button>
    <label>Send</label>
    <submission>base-submission-reference</submission>
  </button>
</wvmenu>

The root node <wvmenu> has a child <title>, which will be shown as the
title of the menu, and one or more of the <link>, <textfield>,
<itemlist>, <textarea>, and <button> tags.

<link> is a navigation link.
It must have a child <label>, which is the name of the link, and at
least one of <ref> (a navigation reference) or <object> (a media
object reference, which can be used in a future DOWNLOAD or STREAM
request).

<textarea> defines a multiline text widget.

<textfield> defines a control for entering a string.

<itemlist> defines a control for selecting one of the specified
<item>s.

<button> is a link that can be used to submit the state of the
textfields and itemlists to the server. When a button is selected, the
client sends a GET request with a reference that encodes the values of
the textfields and itemlists on the same page. The reference is
constructed as follows: the base is the value of the <submission>
node. If there is at least one textfield or itemlist control, a "?" is
appended to the reference. The values of the textfields and itemlists
are then appended as "key=value" pairs separated by "&". The key is
the id attribute of a <textfield> or <itemlist> tag, and the value is
a UTF-8 encoded representation of the text in a textfield control or
the content of the value attribute of the currently selected item in
an itemlist.

Media URLs
----------

Media objects retrieved by GET or STREAM requests are XML objects. An
example follows:

<mediaurl>
  <title>Name of the video</title>
  <url priority="50">http://www.example.com/video.flv</url>
  <url priority="10">http://www.example.com/lowresvideo.flv</url>
</mediaurl>

The value of <title> is the name of the video or audio file. There may
be several <url> nodes. They should all refer to the same content (for
example, versions with different resolutions or backup copies on
different servers). The priority attributes are used to sort the URLs.
The URL with the highest priority should be tried first, and only if
it fails should the others be tried in decreasing order of priority.
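
Two client-side steps described above, building the reference sent
when a <button> is selected and ordering the URLs of a <mediaurl>
object, might look like this in Python. Both function names are
hypothetical. Percent-encoding the values is an assumption added here
so that a value containing a space does not break the space-separated
request line; the text above only requires the values to be UTF-8
encoded. The sketch also assumes that the priority attribute values
are quoted, as well-formed XML requires, and treats a missing priority
as 0.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def build_submission_reference(base, values):
    """Append key=value pairs to the value of the <submission> node.

    values is a list of (id, value) tuples, one per textfield or
    itemlist control on the page.
    """
    if not values:
        # No controls on the page: the base is used unchanged.
        return base
    # Assumption: values are percent-encoded as UTF-8 (quote's default).
    pairs = [f"{key}={quote(value, safe='')}" for key, value in values]
    return base + "?" + "&".join(pairs)


def urls_by_priority(mediaurl_xml):
    """Return the URLs of a <mediaurl> object, highest priority first."""
    root = ET.fromstring(mediaurl_xml)
    urls = []
    for node in root.findall("url"):
        # Assumption: a <url> without a priority attribute counts as 0.
        priority = int(node.get("priority", "0"))
        urls.append((priority, (node.text or "").strip()))
    urls.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in urls]


if __name__ == "__main__":
    print(build_submission_reference(
        "base-submission-reference",
        [("search_query", "cat videos"),
         ("search_sort", "video_view_count")]))
    print(urls_by_priority(
        '<mediaurl><title>Name of the video</title>'
        '<url priority="10">http://www.example.com/lowresvideo.flv</url>'
        '<url priority="50">http://www.example.com/video.flv</url>'
        '</mediaurl>'))
```

A client would try the URLs in the returned order and move on to the
next one only when the current one fails.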