diff --git a/BSP.markdown b/BSP.markdown index 975d77b..4855e24 100644 --- a/BSP.markdown +++ b/BSP.markdown @@ -1,6 +1,6 @@ -BSP is an application layer data synchronisation protocol for delay-tolerant networks. It can operate over any transport that can deliver a simplex stream of bytes from a sender to a recipient on a best-effort basis, meaning that streams may be delayed, lost, reordered or duplicated by the transport layer. BSP does not ensure the confidentiality, authenticity or integrity of streams; that is the responsibility of a transport layer security protocol such as [BTP](BTP). +BSP is an application layer data synchronisation protocol for delay-tolerant networks. It can operate over any transport that can deliver a simplex stream of bytes from a sender to a recipient on a best-effort basis, meaning that streams may be delayed, lost, reordered or duplicated by the transport. BSP does not ensure the confidentiality, authenticity or integrity of streams; that is the responsibility of a transport layer security protocol such as [BTP](BTP). -BSP synchronises data between two devices referred to as the local and remote peers. The data to be synchronised consists of messages posted to channels. A message is simply a sequence of bytes, and a channel is simply a set of messages. +BSP synchronises data between two devices referred to as the local and remote peers. The data to be synchronised consists of messages, organised into sets called channels. From BSP's point of view, a message is simply a sequence of bytes, and a channel is simply a set of messages. Each channel on the local peer belongs to an application. The application decides which messages stored on the local peer should be synchronised to the remote peer (the sharing policy), and which messages synchronised from the remote peer should be stored on the local peer (the storage policy). 
@@ -10,19 +10,19 @@ BSP uses a cryptographic hash function, H(m), with an output length of hash_len ### Channel identifiers -Each channel has a unique identifier hash_len bytes long. This identifier is supplied by the application and is not interpreted by BSP. To prevent collisions, the identifier must either be randomly generated, or must be the cryptographic hash of an application data structure describing the channel. If a hash is used, a randomly generated application identifier hash_len bytes long must be prepended to the application data structure before hashing to prevent collisions between applications with similar data structures. +Each channel has a unique identifier hash_len bytes long. This identifier is supplied by the application and is not interpreted by BSP. To prevent collisions, the identifier must either be random, or be the cryptographic hash of an application data structure describing the channel. If a hash is used, a random application identifier hash_len bytes long must be prepended to the application data structure before hashing to prevent collisions between applications with similar data structures. ### Message format -Each message consists of one or more blocks. Each block is block_len bytes long, except the last, which may be shorter. (We require that block_len <= 2^15.) The blocks form the leaves of a binary hash tree. Each parent node consists of the concatenated hashes of its children. If the number of blocks is not a power of two, some parent nodes will only have one child. +Each message consists of one or more blocks. Each block is block_len bytes long, except the last block of the message, which may be shorter. (We require that block_len is not more than 2^15.) The blocks form the leaves of a binary hash tree. Each parent node consists of the concatenated hashes of its children. If the number of blocks in the message is not a power of two, some parent nodes will only have one child. 
The message's unique identifier is calculated by hashing a message header concatenated with the root hash of the tree. The message header consists of the channel identifier, a timestamp, the message length and the message type. The timestamp is a 64-bit integer representing seconds since the Unix epoch. (All integers in BSP are big-endian.) The message length is a 64-bit integer representing the length of the message in bytes. The message type is a single byte that is supplied by the application and is not interpreted by BSP. -Each block also has a unique identifier, which is calculated by hashing the message header concatenated with a block header and the hash of the block itself. The block header consists of the block number and a list of hashes called the path. +Each block has a unique identifier, which is calculated by hashing the message header concatenated with a block header and the hash of the block itself. The block header consists of the block number and a list of hashes called the path. -The block number is a 64-bit integer starting from zero for the first block of the message. The path consists of the hashes of the siblings of the block's ancestors in the hash tree. If the number of blocks is not a power of two, some ancestors may not have siblings. The positions of any nonexistent siblings can be calculated from the message length and the block number. +The block number is a 64-bit integer starting from zero for the first block of the message. The path consists of the hashes of the siblings of the block's ancestors in the hash tree. If the number of blocks in the message is not a power of two, some of the block's ancestors may not have siblings. The positions of any such ancestors can be calculated from the message length and the block number, so the number of hashes in the path does not need to be recorded explicitly. A block accompanied by its message and block headers is called a portable block. 
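The identifier rules above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the reference implementation: SHA-256 stands in for H (so hash_len is 32), nodes are paired left to right at each level, the path is ordered from the leaf level towards the root and includes the block's own sibling, and all function names are hypothetical.

```python
import hashlib
import struct


def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for BSP's hash function, so hash_len = 32.
    return hashlib.sha256(data).digest()


def message_header(channel_id: bytes, timestamp: int, length: int, msg_type: int) -> bytes:
    # Channel identifier, 64-bit timestamp, 64-bit length, 1-byte type (big-endian).
    return channel_id + struct.pack(">QQB", timestamp, length, msg_type)


def tree_levels(blocks):
    # Returns the hashes level by level, leaves first; the last level is [root hash].
    if not blocks:
        raise ValueError("a message consists of one or more blocks")
    levels = [[H(b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # A parent node is the concatenation of its children's hashes;
        # a parent with one child contains just that child's hash.
        levels.append([H(b"".join(prev[i:i + 2])) for i in range(0, len(prev), 2)])
    return levels


def block_path(levels, number: int):
    # Collects the sibling hash at each level, skipping nonexistent siblings.
    path, idx = [], number
    for level in levels[:-1]:
        sibling = idx ^ 1
        if sibling < len(level):
            path.append(level[sibling])
        idx //= 2
    return path


def message_id(header: bytes, root_hash: bytes) -> bytes:
    return H(header + root_hash)


def block_id(header: bytes, number: int, path, block: bytes) -> bytes:
    block_header = struct.pack(">Q", number) + b"".join(path)
    return H(header + block_header + H(block))


def root_from_path(block: bytes, number: int, path, num_blocks: int) -> bytes:
    # Recomputes the root hash from one block and its path. num_blocks follows
    # from the message length and block_len, so gaps in the path are implicit.
    h, idx, width, hashes = H(block), number, num_blocks, iter(path)
    while width > 1:
        if (idx ^ 1) < width:
            sibling = next(hashes)
            h = H(h + sibling) if idx % 2 == 0 else H(sibling + h)
        else:
            h = H(h)  # this ancestor has no sibling
        idx //= 2
        width = (width + 1) // 2
    return h


# Usage: a 2560-byte message split into three blocks with block_len = 1024.
message = bytes(range(256)) * 10
blocks = [message[i:i + 1024] for i in range(0, len(message), 1024)]
levels = tree_levels(blocks)
header = message_header(b"\x00" * 32, 1_700_000_000, len(message), 0)
mid = message_id(header, levels[-1][0])
```

With three blocks, the last block has no sibling at the leaf level, so its path holds one hash while the paths of the first two blocks hold two. A recipient can recover the root hash, and from it the message identifier, from any single portable block via `root_from_path`.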
A portable block is valid if the message length, block number, path and block length are consistent with each other. A valid portable block contains all the information needed to calculate the message identifier. Any valid portable blocks that produce the same message identifier are guaranteed to be consistent with each other. @@ -34,11 +34,11 @@ The local and remote peers synchronise data by sending simplex streams of bytes * Bits 8-15: Record type * Bits 16-31: Length of the payload in bytes as a 16-bit integer -A stream may contain records of any type in any order. If the recipient does not recognise the protocol version or record type, the recipient skips to the next record in the stream. +A stream may contain records of any type in any order. If the recipient does not recognise a record's protocol version or record type, the recipient skips to the next record in the stream. The current version of the protocol is 1, with five record types: -**0: OFFER** - The payload consists of one or more block identifiers. This record informs the recipient that the sender holds the listed blocks and wishes to know whether to send them. +**0: OFFER** - The payload consists of one or more block identifiers. This record informs the recipient that the sender holds the listed blocks and asks the recipient whether to send them. **1: REQUEST** - The payload consists of one or more block identifiers. This record asks the recipient to send the listed blocks. @@ -48,7 +48,7 @@ The current version of the protocol is 1, with five record types: **4: RESET** - The payload is empty. This record asks the recipient to discard any information about which blocks the sender holds. -The local peer is said to hold a block if it is storing the block and sharing it with the remote peer. If the local peer is storing a block but not sharing it with the remote peer, the local peer acts as though it were not storing the block. 
+The local peer is said to hold a block if it is storing the block and sharing it with the remote peer according to the application's sharing policy. If the local peer is storing a block but not sharing it with the remote peer, the local peer acts as though it were not storing the block. ### Synchronisation state @@ -60,7 +60,7 @@ The local peer stores the following synchronisation state for each block it hold * Send count - The number of times the block has been offered or sent to the remote peer * Send time - A timestamp indicating when the block can next be offered or sent to the remote peer; measured in seconds since the Unix epoch and initialised to 0 -The local peer also stores a list of block identifiers that have been offered by the remote peer and not yet acknowledged or requested by the local peer. The length of this list should be bounded, and the local peer should discard the oldest items if the maximum length is reached. +The local peer also stores a list of block identifiers that have been offered by the remote peer and not yet acknowledged or requested by the local peer. The length of this list should be bounded, and the local peer should discard the oldest identifiers if the maximum length is reached. ### Interactive mode and batch mode @@ -86,8 +86,8 @@ A block is not offered or sent until its send time is reached. ### Retransmission -Whenever the local peer offers or sends a block it updates the block's send count and send time. BSP does not specify how the send time should be updated, except that the updates should increase exponentially with the send count. The local peer may base the updates on measurements of the transport's round-trip time and round-trip time variance, as in TCP, or it may use any other method. A block may be offered or sent over many different transports, and the method of updating the send time should take this into account. 
+Whenever the local peer offers or sends a block it updates the block's send count and send time. BSP does not specify how the send time should be updated, except that the amount by which it is updated should increase exponentially with the send count. The local peer may base the updates on measurements of the transport's round-trip time and round-trip time variance, as in TCP, or it may use any other method. ### Resetting -If the local peer crashes while synchronising with the remote peer, the local peer may fail to store blocks that it has already acknowledged. (This can happen even if the local peer waits for the blocks to be stored before acknowledging them, as there are many layers of buffers between the application and the storage medium.) The remote peer will no longer offer or send blocks that the local peer has acknowledged, so the peers may remain out of sync indefinitely. If the local peer detects that it has crashed, it should send a reset record to reset the remote peer's knowledge of which blocks the local peer holds. +If the local peer crashes, it may fail to store blocks that it has already acknowledged. (This can happen even if the local peer waits for the blocks to be stored before acknowledging them, as there are many layers of buffers between the application and the physical storage medium.) The remote peer will no longer offer or send blocks that the local peer has acknowledged, so the peers may remain out of sync indefinitely. If the local peer detects that it has crashed, it should send a reset record to reset the remote peer's information about which blocks the local peer holds.
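The retransmission rule described under "Retransmission" leaves the update method open, requiring only that the increment grows exponentially with the send count. One possible reading is sketched below; the function name, the `base_delay` parameter and its 60-second default are assumptions made for illustration and do not come from BSP.

```python
import time


def update_send_state(send_count, now=None, base_delay=60):
    """Return the new (send_count, send_time) after offering or sending a block.

    base_delay is a hypothetical tuning value in seconds. An implementation
    could instead derive it from the transport's measured round-trip time.
    """
    if now is None:
        now = int(time.time())  # seconds since the Unix epoch
    send_count += 1
    # The increment doubles with each transmission: 60 s, 120 s, 240 s, ...
    delay = base_delay * 2 ** (send_count - 1)
    return send_count, now + delay
```

For example, `update_send_state(0, now=1000)` gives `(1, 1060)`, and a block already sent three times gets `(4, 1480)` from `update_send_state(3, now=1000)`.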