Requirements
- Two-way streaming
- Multi-modal: beyond audio, video and text, include arbitrary synchronised streams of data.
- Support adaptive bitrate for different bandwidths
- Allow for both tightly and loosely bound tracks
- Identical format for live streaming over a network and file playback.
- Must allow server-side packet filtering, so that a client can apply data filters to control stream bandwidth. Such filters include bitrate, channel and track selection.
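The filtering requirement could be sketched roughly as follows. This is only an illustration: `PacketInfo`, `ClientFilter` and `accepts` are hypothetical names, the channel is reduced to a plain integer, and "empty set means accept everything" is an assumed convention rather than part of the format.

```cpp
#include <cstdint>
#include <set>

// Hypothetical summary of the fields a server would inspect per packet.
struct PacketInfo {
    uint8_t streamID;
    int channel;      // stands in for ftl::codecs::Channel (assumption)
    uint8_t bitrate;  // larger value = higher rate
};

// Hypothetical client-supplied filter. An empty set accepts everything.
struct ClientFilter {
    std::set<uint8_t> streams;
    std::set<int> channels;
    uint8_t max_bitrate = 255;

    // Returns true if the packet should be forwarded to this client.
    bool accepts(const PacketInfo &p) const {
        if (!streams.empty() && streams.count(p.streamID) == 0) return false;
        if (!channels.empty() && channels.count(p.channel) == 0) return false;
        return p.bitrate <= max_bitrate;
    }
};
```

A server would evaluate `accepts` for each outgoing packet and drop those that fail, so a client on a slow link can lower `max_bitrate` or subscribe to fewer channels.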
Suggested Changes
- Rename frame to source and streamID to group
- Remove definition from packet, forcing it to be extracted from codec data
- Reconsider naming of bitrate to reflect a more general conception?
- More precisely define flags, moving some of the current flags into another metadata mechanism. Perhaps have codec specific flags and stream management flags.
- Make group (or streamID) at least 16bit
Implemented Versions
The Header structure is written and read first, directly and with no encoding. For version 2 or greater it is followed by the IndexHeader structure. The structures below are for version 4.
struct Header {
    const char magic[4] = {'F','T','L','F'};
    uint8_t version = 4;
};
struct IndexHeader {
    int64_t reserved[8];
};
All subsequent data is MsgPack encoded and consists of a sequence of tuples, each pairing a StreamPacket with a Packet (in that order). These packets specify stream, channel and codec information, allowing different kinds of data to be included in a stream.
struct StreamPacket {
    int64_t timestamp;
    uint8_t streamID;
    uint8_t frame_number;
    ftl::codecs::Channel channel;
};
A stream packet locates a data packet within the overall context. It identifies which stream the packet belongs to and which channel within that stream, along with a timestamp to position it. frame_number is used as an initial offset in the event that multiple packets are required to represent all sources. It is normally 0 unless there are more than 9 cameras and the tiling mechanism exceeds hardware decoding limits.
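To make the MsgPack encoding concrete, here is a hand-rolled sketch that serialises a stream packet as a 4-element MsgPack array in field declaration order. That array layout is an assumption, as are the names `StreamPacketSketch`, `putInt64`, `putUint8` and `encodeStreamPacket`, and the channel is reduced to a plain integer. A real MsgPack library would normally pick the smallest integer encoding for the timestamp; the fixed int 64 form used here is still valid MsgPack.

```cpp
#include <cstdint>
#include <vector>

// Simplified mirror of StreamPacket (channel as an integer, by assumption).
struct StreamPacketSketch {
    int64_t timestamp;
    uint8_t streamID;
    uint8_t frame_number;
    uint8_t channel;
};

// MsgPack "int 64": marker 0xd3 followed by 8 big-endian bytes.
inline void putInt64(std::vector<uint8_t> &out, int64_t v) {
    out.push_back(0xd3);
    for (int shift = 56; shift >= 0; shift -= 8)
        out.push_back(static_cast<uint8_t>(static_cast<uint64_t>(v) >> shift));
}

// MsgPack positive fixint for 0..127, otherwise "uint 8" (marker 0xcc).
inline void putUint8(std::vector<uint8_t> &out, uint8_t v) {
    if (v > 0x7f) out.push_back(0xcc);
    out.push_back(v);
}

// Encodes the packet as a fixarray of 4 elements (assumed layout).
inline std::vector<uint8_t> encodeStreamPacket(const StreamPacketSketch &p) {
    std::vector<uint8_t> out;
    out.push_back(0x94);  // fixarray, 4 elements
    putInt64(out, p.timestamp);
    putUint8(out, p.streamID);
    putUint8(out, p.frame_number);
    putUint8(out, p.channel);
    return out;
}
```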
struct Packet {
    ftl::codecs::codec_t codec;
    ftl::codecs::definition_t definition; // To be removed in version 5
    uint8_t frame_count;
    uint8_t bitrate;
    uint8_t flags;
    std::vector<uint8_t> data;
};
The packet structure provides details relevant to the encoding. codec is one of: JPG, PNG, H264, HEVC, JSON, MSGPACK, RAW and some others. definition identifies one of a predefined set of resolutions, such as 1920x1080. frame_count is the total number of frames contained, as tiles, within this packet. bitrate indicates the encoding quality; the same channel can have multiple versions at different bitrates, and a larger number means a higher rate. flags should be 0 or set to codec specific values. data contains the raw encoded data for the frame.
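The interaction between frame_number and frame_count can be sketched as below, assuming that the tiles in a packet correspond to consecutive source indices starting at frame_number. That numbering is an interpretation of the text above, and `sourceRange` and `containsSource` are hypothetical helper names.

```cpp
#include <cstdint>
#include <utility>

// Half-open range [first, last) of source indices carried by one packet,
// assuming tiles are numbered consecutively from frame_number.
inline std::pair<int, int> sourceRange(uint8_t frame_number, uint8_t frame_count) {
    int first = frame_number;
    return std::make_pair(first, first + frame_count);
}

// True if the given source index is one of the tiles in the packet.
inline bool containsSource(uint8_t frame_number, uint8_t frame_count, int source) {
    std::pair<int, int> r = sourceRange(frame_number, frame_count);
    return source >= r.first && source < r.second;
}
```

Under this assumption, 12 cameras could be split into one packet with frame_number 0 and frame_count 9, and a second packet with frame_number 9 and frame_count 3.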
Version 0
No longer valid. Did not have the 64 byte index header and had different packet structures.
Version 1
- Additional fields in StreamPacket and Packet structures. Not backwards compatible.
Version 2
- Add 64 bytes of reserved space after header. To be used for indexing.
- Packet flags is unused in this version.
Version 3
- Packet flags must be 0 or a codec specific value. The RGB flag indicates that video is RGB rather than BGR encoded (HEVC or H264 codec).
- Calibration and pose should be injected and use msgpack codec
Version 4
- Replace block_count and block_total with frame_count and frame_number
- Use tiling to merge multiple sources into a single video frame.
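The version history above can be summarised as a small feature table that a reader might consult after parsing the header. The struct and function names here are hypothetical, and the sketch only encodes what the version notes state.

```cpp
#include <cstdint>

// Features implied by the version notes (hypothetical summary type).
struct VersionFeatures {
    bool valid;         // version 0 is no longer valid
    bool index_header;  // 64 reserved bytes after the header (version 2+)
    bool packet_flags;  // flags may hold codec specific values (version 3+)
    bool tiling;        // frame_count / frame_number and tiling (version 4+)
};

// Maps a header version number to the features it enables.
inline VersionFeatures featuresFor(uint8_t version) {
    return VersionFeatures{version >= 1, version >= 2, version >= 3, version >= 4};
}
```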