Introduction
The C++ language is strongly typed; programs express data structures and algorithms in terms of types. However, many categories of algorithms operate instead on data represented as raw bytes, such as:
- Encryption and decryption
- Compression and decompression
- JSON parsing and serialization
- Network protocols, such as HTTP and WebSocket
Algorithms are often combined; a network application may need to remove the framing from a WebSocket stream, apply decompression to the unframed payloads, then parse the decompressed payloads as serialized JSON. We define common vocabulary types and concepts to facilitate interoperability between libraries, in order that such algorithms may be composed.
Contiguous Buffers
The fundamental representation of unstructured bytes is the contiguous buffer,
characterized by a possibly const pointer to a region of memory with a defined
number of valid bytes. These are represented as objects of type const_buffer
and mutable_buffer.
struct const_buffer
{
    void const* data() const noexcept;
    std::size_t size() const noexcept;
};

struct mutable_buffer
{
    void* data() const noexcept;
    std::size_t size() const noexcept;
};
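As a minimal sketch of how these two types might be used, the following defines self-contained stand-ins with the same data() and size() members. The constructors, the conversion from mutable_buffer to const_buffer, and the copy_bytes helper are assumptions made for illustration; they are not taken from the library's actual declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for the library type: a read-only view of `size()` bytes.
// The constructors are assumed here so the example is self-contained.
struct const_buffer
{
    const_buffer() noexcept : p_(nullptr), n_(0) {}
    const_buffer(void const* p, std::size_t n) noexcept : p_(p), n_(n) {}
    void const* data() const noexcept { return p_; }
    std::size_t size() const noexcept { return n_; }
private:
    void const* p_;
    std::size_t n_;
};

// Stand-in for the library type: a writable view of `size()` bytes.
struct mutable_buffer
{
    mutable_buffer() noexcept : p_(nullptr), n_(0) {}
    mutable_buffer(void* p, std::size_t n) noexcept : p_(p), n_(n) {}
    void* data() const noexcept { return p_; }
    std::size_t size() const noexcept { return n_; }
    // Assumption: a writable region may always be viewed as read-only.
    operator const_buffer() const noexcept { return const_buffer(p_, n_); }
private:
    void* p_;
    std::size_t n_;
};

// Hypothetical helper: copy as many bytes as fit from src into dst and
// return the number of bytes copied. Neither buffer owns its memory.
inline std::size_t copy_bytes(mutable_buffer dst, const_buffer src) noexcept
{
    assert(dst.data() != nullptr || dst.size() == 0);
    assert(src.data() != nullptr || src.size() == 0);
    std::size_t const n = dst.size() < src.size() ? dst.size() : src.size();
    if(n != 0)
        std::memcpy(dst.data(), src.data(), n);
    return n;
}
```

Because a buffer is only a pointer and a length, passing one by value costs almost nothing, and the caller remains responsible for keeping the underlying memory alive.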
This representation is helpful, yet it does not suffice for every use case.
For example, an algorithm which must add framing data to caller-provided
contiguous buffers would need to perform costly reallocations and copies to
represent the result as a single buffer before passing it on to the next
algorithm. Operating-system level facilities which transact raw bytes on file
descriptors or sockets often provide a scatter/gather interface: an ordered
list of zero or more contiguous buffers. Linux provides the type iovec for
this purpose. In C++ we may consider using a container to represent a
sequence of buffers:
std::list< const_buffer > bs1;
std::vector< mutable_buffer > bs2;
std::array< const_buffer, 3 > bs3;
There are some caveats with this approach:
- list is expensive to copy.
- vector can be resized at runtime yet allocates memory to do so.
- array is fixed in size. Inserting additional buffers requires metaprogramming and a redeclaration of the container type.
- Choosing a single type as a vocabulary type precludes user-defined types.
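To make the container approach concrete, here is a small sketch of algorithms written once against any of the three containers shown above. The const_buffer stand-in and the helper names total_size and flatten are hypothetical and exist only for this example. The flatten function also shows the kind of copy that a single-buffer interface would force on the caller.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Stand-in for the library type, with assumed constructors.
struct const_buffer
{
    const_buffer() noexcept : p_(nullptr), n_(0) {}
    const_buffer(void const* p, std::size_t n) noexcept : p_(p), n_(n) {}
    void const* data() const noexcept { return p_; }
    std::size_t size() const noexcept { return n_; }
private:
    void const* p_;
    std::size_t n_;
};

// Hypothetical helper: total number of bytes across every buffer in c.
template<class Container>
std::size_t total_size(Container const& c)
{
    std::size_t n = 0;
    for(auto const& b : c)
        n += b.size();
    return n;
}

// Hypothetical helper: copy every buffer, in order, into one contiguous
// vector. This is the allocation and copy that scatter/gather avoids.
template<class Container>
std::vector<unsigned char> flatten(Container const& c)
{
    std::vector<unsigned char> out;
    out.reserve(total_size(c));
    for(auto const& b : c)
    {
        auto const* p = static_cast<unsigned char const*>(b.data());
        out.insert(out.end(), p, p + b.size());
    }
    assert(out.size() == total_size(c));
    return out;
}
```

The templates accept std::list, std::vector, and std::array alike, which hints at why a concept-based definition is more flexible than committing to one container type.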
Sequence Requirements
The approach taken by this library is the same as that used in the popular Boost.Asio network library: define the concepts ConstBufferSequence and MutableBufferSequence for representing buffer sequences with these semantics:
- A buffer sequence is cheap to copy.
- A copy of a buffer sequence refers to the same underlying memory regions.
- Buffer sequences model bidirectional ranges whose value type is convertible to const_buffer, or to mutable_buffer for sequences whose contents are modifiable.
- The types const_buffer and mutable_buffer are also considered to be buffer sequences.
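The syntactic part of these requirements can be approximated with a type trait. The sketch below is an illustration only: the trait names, the stand-in buffer types, and the detection technique are assumptions and do not reproduce the library's own definitions. The semantic requirements (cheap copies that refer to the same memory) cannot be checked by the compiler and are left to documentation.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>
#include <utility>
#include <vector>

// Stand-ins for the library types, with assumed constructors.
struct const_buffer
{
    const_buffer() noexcept : p_(nullptr), n_(0) {}
    const_buffer(void const* p, std::size_t n) noexcept : p_(p), n_(n) {}
    void const* data() const noexcept { return p_; }
    std::size_t size() const noexcept { return n_; }
private:
    void const* p_;
    std::size_t n_;
};

struct mutable_buffer
{
    mutable_buffer() noexcept : p_(nullptr), n_(0) {}
    mutable_buffer(void* p, std::size_t n) noexcept : p_(p), n_(n) {}
    void* data() const noexcept { return p_; }
    std::size_t size() const noexcept { return n_; }
    operator const_buffer() const noexcept { return const_buffer(p_, n_); }
private:
    void* p_;
    std::size_t n_;
};

// Small detection helpers (void_t written out for pre-C++17 compilers).
template<class...> struct make_void { using type = void; };
template<class... Ts> using void_t = typename make_void<Ts...>::type;
template<class T>
using iter_t = decltype(std::begin(std::declval<T const&>()));

// Primary template: T is not a sequence of Buffer.
template<class T, class Buffer, class = void>
struct is_buffer_sequence : std::false_type {};

// T is a copyable range with at least bidirectional iterators whose
// value type converts to Buffer.
template<class T, class Buffer>
struct is_buffer_sequence<T, Buffer, void_t<iter_t<T>>>
    : std::integral_constant<bool,
        std::is_copy_constructible<T>::value &&
        std::is_base_of<std::bidirectional_iterator_tag,
            typename std::iterator_traits<iter_t<T>>::iterator_category>::value &&
        std::is_convertible<
            typename std::iterator_traits<iter_t<T>>::value_type,
            Buffer>::value>
{};

// The buffer types themselves count as sequences of length one.
template<> struct is_buffer_sequence<const_buffer, const_buffer> : std::true_type {};
template<> struct is_buffer_sequence<mutable_buffer, const_buffer> : std::true_type {};
template<> struct is_buffer_sequence<mutable_buffer, mutable_buffer> : std::true_type {};

template<class T> using is_const_buffer_sequence = is_buffer_sequence<T, const_buffer>;
template<class T> using is_mutable_buffer_sequence = is_buffer_sequence<T, mutable_buffer>;

static_assert(is_const_buffer_sequence<std::vector<mutable_buffer>>::value,
    "mutable buffers are readable");
static_assert(!is_mutable_buffer_sequence<std::vector<const_buffer>>::value,
    "const buffers are not writable");
static_assert(!is_const_buffer_sequence<std::forward_list<const_buffer>>::value,
    "forward-only ranges are rejected");
```

An algorithm could use such a trait in a static_assert or an enable_if condition to reject unsuitable arguments at compile time.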
Algorithms which use buffer sequences have these additional requirements:
Buffer Pairs
Buffer sequences of length two, that is, buffer sequences consisting of up to
two separate contiguous memory regions, occur in design patterns with sufficient
frequency that the library provides a custom implementation for representing
them. Objects of type const_buffer_pair or mutable_buffer_pair are
buffer sequences of length two. These are the types of sequences returned by
the circular_buffer dynamic buffer, discussed later.
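One plausible reason a circular buffer needs exactly two regions is wraparound: the readable bytes may run to the end of the storage and continue at its start. The sketch below illustrates that idea under stated assumptions. The names buffer_pair_sketch and readable_region are hypothetical, and the code is not the library's const_buffer_pair or circular_buffer implementation.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the library type, with assumed constructors.
struct const_buffer
{
    const_buffer() noexcept : p_(nullptr), n_(0) {}
    const_buffer(void const* p, std::size_t n) noexcept : p_(p), n_(n) {}
    void const* data() const noexcept { return p_; }
    std::size_t size() const noexcept { return n_; }
private:
    void const* p_;
    std::size_t n_;
};

// Hypothetical two-element buffer sequence. It holds only two small
// views, so it is cheap to copy and needs no allocation.
class buffer_pair_sketch
{
    const_buffer b_[2];
public:
    buffer_pair_sketch(const_buffer b0, const_buffer b1) noexcept
    {
        b_[0] = b0;
        b_[1] = b1;
    }
    const_buffer const* begin() const noexcept { return b_; }
    const_buffer const* end() const noexcept { return b_ + 2; }
    const_buffer const& operator[](std::size_t i) const noexcept
    {
        assert(i < 2);
        return b_[i];
    }
};

// Hypothetical helper: describe `len` readable bytes that start at offset
// `pos` in a ring of `capacity` bytes. The second buffer is empty unless
// the region wraps past the end of the storage.
inline buffer_pair_sketch readable_region(
    unsigned char const* base,
    std::size_t capacity,
    std::size_t pos,
    std::size_t len) noexcept
{
    assert(pos < capacity && len <= capacity);
    std::size_t const until_end = capacity - pos;
    std::size_t const first = len < until_end ? len : until_end;
    return buffer_pair_sketch(
        const_buffer(base + pos, first),
        const_buffer(base, len - first));
}
```

A caller iterates the pair like any other buffer sequence; an empty second buffer simply contributes zero bytes.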