A quick tour to a high-throughput low-latency logging library.
#include <binlog/binlog.hpp>
int main()
{
BINLOG_INFO("Hello {}!", "World");
std::ofstream logfile("hello.blog", std::ofstream::out|std::ofstream::binary);
binlog::consume(logfile);
}
Compile and run this program: it will produce a binary logfile, hello.blog
.
The logfile can be converted to text using bread
:
$ bread hello.blog
INFO [10/05 20:05:30.176362854] Hello World!
Compared to other log libraries, Binlog is very fast for two reasons:
First, Binlog is using asynchronous logging. The log events are first copied to
a fast-access, lock-free intermediate storage, before consume
writes
them to the logfile.
Second, Binlog produces a structured, binary log.
This means, in the application, arguments and timestamps are not converted to text,
and are not substituted into a format string. These happen in a later stage offline,
while reading the logs (i.e: when running bread
).
As a bonus, structured logfiles are smaller, more flexible (e.g: text representation is customizable),
and easier to data mine.
Binlog provides a macro interface for logging.
The most basic log macros are: BINLOG_TRACE
, BINLOG_DEBUG
, BINLOG_INFO
, BINLOG_WARNING
, BINLOG_ERROR
and BINLOG_CRITICAL
,
one for each severity. Usage:
BINLOG_INFO("Result: {}", 42);
// Outputs: INFO Result: 42
Each basic log macro takes a format string, and zero or more arguments.
The format string can contain {}
placeholders for the arguments.
The number of placeholders in the format string and the number of arguments must match,
and it is enforced by a compile time check.
Events are timestamped using std::chrono::system_clock
.
The set of loggable argument types includes primitives, containers, pointers,
pairs and tuples, enums and adapted user defined types - as shown below.
Standard containers of loggable value_type
are loggable by default:
std::vector<int> vec{1,2,3};
std::array<int, 3> arr{4,5,6};
std::forward_list<int> lst{7,8,9};
BINLOG_INFO("Sequence containers: {} {} {}", vec, arr, lst);
// Outputs: Sequence containers: [1, 2, 3] [4, 5, 6] [7, 8, 9]
std::set<int> set{4,8,15,16,23,42};
std::map<char, std::string> map{{'a', "alpha"}, {'b', "beta"}};
BINLOG_INFO("Associative containers: {} {}", set, map);
// Outputs: Associative containers: [4, 8, 15, 16, 23, 42] [(a, alpha), (b, beta)]
Aside the standard containers, any container-like type is loggable that satisfies the following constraints:
begin()
and end()
value_type
of the iterator is loggableC style arrays of loggable elements can be wrapped in a view if the array size is known, to be logged as a container:
int array[] = {1, 2, 3};
BINLOG_INFO("Array: {}", binlog::array_view(array, 3));
// Outputs: Array: [1, 2, 3]
Containers of characters (e.g: std::string
or std::vector<char>
) are logged just
like any other container, but have a special text representation:
std::string str = "String";
BINLOG_INFO("Hello {}!", str);
// Outputs: Hello String!
const char*
arguments, because of established convention, are assumed to
point to null terminated strings, therefore logged and displayed accordingly.
Unfortunately, sometimes even char*
is used to refer to a null terminated string
(e.g: strerror). Binlog has no way to know whether the passed argument
is a string (therefore calling strlen
on it is safe) or just a pointer to a single character.
To remain on the safe side, by default, Binlog assumes the latter.
A few ways to workaround this:
const char*
, instead of char*
. If char*
is given, assign it to a const char*
variable before loggingchar*
by string_view
char*
as a string:#include <binlog/char_ptr_is_string.hpp>
char* error = geterror(EBADF);
BINLOG_INFO("Error: {}", error);
// Outputs: Error: Bad file descriptor
Raw and standard smart pointers pointing to a loggable element_type
are loggable by default:
int* ptr = nullptr;
std::unique_ptr<int> uptr(std::make_unique<int>(1));
std::shared_ptr<int> sptr(std::make_shared<int>(2));
BINLOG_INFO("Pointers: {} {} {}", ptr, uptr, sptr);
// Outputs: Pointers: {null} 1 2
If the pointer is valid, it gets dereferenced and the pointed object will be logged.
If the pointer is empty (i.e: it points to no valid object),
no value is logged, and in the text log it will be shown as {null}
.
Logging of weak_ptr
is not supported, those must be .lock()
-ed first.
To log the address pointed by the pointer (instead of the object it points to),
apply binlog::address
or cast the pointer to a void*
:
BINLOG_INFO("Raw pointer value: {} {}", binlog::address(any_pointer), void_pointer);
// Outputs: Raw pointer value: 0xF777123 0xF777123
Aside the standard pointers, any pointer or optional-like type is loggable that satisfies the following constraints:
bool
namespace mserialize { namespace detail {
template <typename T> struct is_optional<boost::optional<T>> : std::true_type {};
}}
Standard pair and tuple with loggable elements are loggable by default:
std::pair<int, char> p{1, 'a'};
std::tuple<std::string, bool, int> t{"foo", true, 2};
BINLOG_INFO("Pair: {}, Tuple: {}", p, t);
// Outputs: Pair: (1, a), Tuple: (foo, true, 2)
Aside the standard pair and tuple, any tuple-like type is loggable that satisfies the following constraints:
T<E...>
.E
of E...
are loggableget<N>(t)
namespace mserialize { namespace detail {
template <typename... T>
struct is_tuple<boost::tuple<T...>> : std::true_type {};
}}
Enums are loggable by default, serialized and shown as their underlying type:
enum Enum { Alpha = 123, Beta = 124 };
BINLOG_INFO("Enum: {}", Alpha);
// Outputs: Enum: 123
To make the log easier to read, enums can be adapted: adapted enums are still serialized as their underlying type, but in the log, the enumerator name is shown:
enum AdaptedEnum { Gamma, Delta };
BINLOG_ADAPT_ENUM(AdaptedEnum, Gamma, Delta)
BINLOG_INFO("Adapted enum: {}", Delta);
// Outputs: Adapted enum: Delta
The BINLOG_ADAPT_ENUM
must be called in global scope, outside of any namespace.
If an enumerator is omitted, the underlying value will be shown
instead of the omitted enumerator name.
Both scoped and unscoped enums can be adapted.
For scoped enums, the enumerators must be prefixed with the enum name, as usual:
enum class ScopedEnum { Epsilon, Phi };
BINLOG_ADAPT_ENUM(ScopedEnum, Epsilon, Phi)
BINLOG_INFO("Scoped enum: {}", ScopedEnum::Epsilon);
// Outputs: Scoped enum: Epsilon
The maximum number of enumerators is limited to 100.
To adapt enum types shadowed by non-type declarations (e.g: a field), the following workaround is required:
struct Nest {
enum Nested
{
Bird
} Nested; // Note: field name is same as enum name
};
// BINLOG_ADAPT_ENUM(Nest::Nested, Bird) // Doesn't work, type is shadowed by field.
typedef enum Nest::Nested NestedT; // workaround: use elaborated type specifier // NOLINT
BINLOG_ADAPT_ENUM(NestedT, Bird)
User defined types outside the categories above can be still logged, if adapted:
struct Foo
{
int a = 0;
std::string b;
bool c() const { return true; }
};
BINLOG_ADAPT_STRUCT(Foo, a, b, c)
BINLOG_INFO("My foo: {}", Foo{1, "two"});
// Outputs: My foo: Foo{ a: 1, b: two, c: true }
BINLOG_ADAPT_STRUCT
takes a typename, and a list of data members or getters.
Data members are:
Getters are:
The member list does not have to be exhaustive, e.g: mutex members can be omitted, those will not be logged.
The member list can be empty. The maximum number of members is limited to 100.
BINLOG_ADAPT_STRUCT
must be called in global scope, outside of any namespace.
The type must not be recursive, e.g: Foo
can't have a to be logged Foo*
typed member.
User defined types that derive from base classes already adapted for logging can be adapted, without enumerating the base members again:
struct Base { int a = 0; };
struct Derived : Base { int b = 1; int c = 2; };
BINLOG_ADAPT_STRUCT(Base, a)
BINLOG_ADAPT_DERIVED(Derived, (Base), b, c)
BINLOG_INFO("Hierarchy: {}", Derived{});
// Outputs: Hierarchy: Derived{ Base{ a: 0 }, b: 1, c: 2 }
The second argument of BINLOG_ADAPT_DERIVED
must be a list of adapted base classes, multiple inheritance is supported.
User defined templates can be adapted for logging in a similar way:
template <typename A, typename B, std::size_t C>
struct Triplet
{
A a{};
B b{};
std::array<int, C> c{};
};
BINLOG_ADAPT_TEMPLATE((typename A, typename B, std::size_t C), (Triplet<A,B,C>), a, b, c)
For more information see the Mserialize documentation on Adapting custom types and Adapting user defined recursive types for visitation.
std::chrono::system_clock::time_point
objects can be logged after adoption by including a header file:
#include <binlog/adapt_stdtimepoint.hpp> // must be included to log std::chrono::system_clock::time_point
std::chrono::system_clock::time_point now = std::chrono::system_clock::now();
BINLOG_INFO("Now: {}", now);
// Outputs: Now: 2021-03-16 18:05:02.123456000
The object is serialized as a single number (nanoseconds since epoch), and pretty printed as a human readable timestamp.
The pretty printed text format is affected by the -d
flag when using bread.
If %d
(localtime) appears first in the -f
(format) parameter, the time point is converted to producer-local timezone
(this is the default). If %u
(UTC) appears first, the time point is shown in UTC.
Certain helper types of std::chrono::duration
can be logged after adoption by including a header file:
#include <binlog/adapt_stdduration.hpp> // must be included to log std::chrono::duration helper types
BINLOG_INFO("{} {} {}", std::chrono::nanoseconds{1}, std::chrono::microseconds{2}, std::chrono::milliseconds{3});
// Outputs: 1ns 2us 3ms
BINLOG_INFO("{} {} {}", std::chrono::seconds{4}, std::chrono::minutes{5}, std::chrono::hours{6});
// Outputs: 4s 5m 6h
std::error_code
objects can be logged after adoption by including a header file:
#include <binlog/adapt_stderrorcode.hpp> // must be included to log std::error_code
std::error_code ec;
BINLOG_INFO("ec: {}", ec);
// Outputs: ec: Success
The error code is serialized in terms of the message
member, therefore it may result in memory allocations.
If C++17 is available, the standard optional with a loggable value_type
can be made loggable:
#include <binlog/adapt_stdoptional.hpp> // must be included to log std::optional, requires C++17
std::optional<int> opt(123);
std::optional<int> emptyOpt;
BINLOG_INFO("Optionals: {} {}", opt, emptyOpt);
// Outputs: Optionals: 123 {null}
If C++17 is available, standard filesystem types can be made loggable:
#include <binlog/adapt_stdfilesystem.hpp> // must be included to log std::filesystem types, requires C++17
const std::filesystem::path p("/path/to/file");
BINLOG_INFO("std::filesystem::path: {}", p);
// Outputs: std::filesystem::path: /path/to/file
The supported filesystem types are: path
, directory_entry
, file_type
, perms
, space_info
, file_status
.
On platforms where path::value_type
is char
(e.g: POSIX platforms), path
objects are serialized using the native
member,
therefore no extra memory allocation is required.
On platforms where path::value_type
is not char
(e.g: on Windows), path
objects are serialized using the string
member,
which may result in extra memory allocations.
If C++17 is available, standard variant types can be made loggable:
#include <binlog/adapt_stdvariant.hpp> // must be included to log std::variant types, requires C++17
std::variant<int, std::string, float> v{"works"};
std::variant<std::monostate> mono;
BINLOG_INFO("std::variant: {}, monostate: {}, valueless_by_exception: {}", v, mono, valueless);
// Outputs: std::variant: works, monostate: {null}, valueless_by_exception: {null}
To log a variant, every alternative must be loggable. The number of alternatives must be less than 256. Monostate and valueless-by-exception variants are supported.
The binary logfile produced by Binlog is not human-readable,
it has to be converted to text first. The bread
program
reads the given binary logfile, converts it to text
and writes the text to the standard output.
$ bread logfile.blog > logfile.txt
The format of the text representation can be customized using command line switches:
$ bread -f "%S [%d] %n %m (%G:%L)" -d "%m/%d %H:%M:%S.%N" logfile.blog
The events of the logfile can be sorted by their timestamp using -s
.
The complete input is consumed first, then sorted and printed in one go.
$ bread -s logfile.blog
If no input file is specified, or it is -
, bread
reads from the standard input,
acting like a filter that converts a binary log stream to text. This allows reading
compressed logfiles:
$ zcat logfile.blog.gz | bread
With piping, a live logfile that is being written by the application can be read, even if log rotation is configured:
$ tail -c0 -F logfile.blog | bread
To customize the output and for further options, see the builtin help:
$ bread -h
If the application crashes, because of the asynchronous logging employed by
Binlog, it is possible that the most interesting log events, those directly
preceding the crash, are still in the intermediate queues, waiting to be
consumed and flushed to the logfile. The brecovery
program takes a coredump
(memory dump, crash dump, the memory image of the application at the time of the
crash), searches for metadata and log queues, and writes recovered content to the specified file:
$ brecovery application.core recovered.blog
The recovered logfile is a binary logfile, that can be read using bread
:
$ bread recovered.blog
The first section, Hello World shows a very simple example, where the data flow might not be clear, due to the hidden state, which makes the example short. Consider the following, more explicit example:
#include <binlog/Session.hpp>
#include <binlog/SessionWriter.hpp>
#include <binlog/advanced_log_macros.hpp>
#include <fstream>
#include <iostream>
int main()
{
binlog::Session session;
binlog::SessionWriter writer(session);
BINLOG_INFO_W(writer, "Hello {}!", "World");
std::ofstream logfile("hello.blog", std::ofstream::out|std::ofstream::binary);
session.consume(logfile);
if (! logfile)
{
std::cerr << "Failed to write hello.blog\n";
return 1;
}
std::cout << "Binary log written to hello.blog\n";
return 0;
}
Session
is a log stream. SessionWriter
can add log events to such a stream.
Multiple writers can write a single session concurrently,
as between the session and each writer there's a queue of bytes.
In the first example, the default instances of these types were used implicitly,
provided by default_session()
and default_thread_local_writer()
.
BINLOG_INFO_W
is the same as BINLOG_INFO
, except that it takes
an additional writer as the first argument, which it will use
to add the event instead of the default writer.
The log event, created by this macro, is first serialized into the
queue of the writer
, upon invocation. If this is the first
call to this macro, the metadata associated with this event
is also added to the session of the writer.
This macro is available for each severity, i.e:
BINLOG_TRACE_W
, BINLOG_DEBUG_W
, BINLOG_INFO_W
, BINLOG_WARNING_W
, BINLOG_ERROR_W
and BINLOG_CRITICAL_W
.
Serialization is done using the Mserialize library.
When session.consume
is called, first the available metadata is
consumed and written to the provided stream. Then each writer queue
is polled for data. Available data is written to the provided stream
in batches. At the end of the program, the health of the output stream
is checked, to make sure errors are not swallowed (e.g: disk full).
To make the source of the events easier to identify, writers can be named. Writer names appear in the converted text output. This feature can be used to distinguish the output of different threads.
writer.setName("w1");
BINLOG_INFO_W(writer, "Hello named writer");
// Outputs: w1 Hello named writer
It might be desirable to change the verbosity of the logging runtime. Setting the minimum severity of the session disables production of events with lesser severities:
#include <binlog/Severity.hpp>
session.setMinSeverity(binlog::Severity::warning);
// trace, debug, info severities are disabled
If an event of disabled severity is given to a writer, it will be discarded without effect, and the log arguments will not be evaluated.
// after setting min severity to warning above:
BINLOG_INFO("Call f: {}", f()); // f will not be called
To separate the log events coming from different components of the application, a category can be attached to them:
#include <binlog/advanced_log_macros.hpp>
BINLOG_INFO_C(my_category, "This is a categorized event");
// Outputs: my_category This is a categorized event
BINLOG_INFO_C
is the same as BINLOG_INFO
, except that it takes
an additional category name as the first argument, which will be the
category of the event. The category name can be any valid identifier,
that must be available compile time. The name of the default category is main
.
This macro is available for each severity, i.e:
BINLOG_TRACE_C
, BINLOG_DEBUG_C
, BINLOG_INFO_C
, BINLOG_WARNING_C
, BINLOG_ERROR_C
and BINLOG_CRITICAL_C
.
The BINLOG_<SEVERITY>_W
and BINLOG_<SEVERITY>_C
macros can be combined:
BINLOG_INFO_WC(writer, my_category, "My writer, my category");
// Outputs: my_category W1 My writer, my category
As above, there's one for each severity, i.e:
BINLOG_TRACE_WC
, BINLOG_DEBUG_WC
, BINLOG_INFO_WC
, BINLOG_WARNING_WC
, BINLOG_ERROR_WC
and BINLOG_CRITICAL_WC
.
Regardless the exact log macro being used (BINLOG_<SEVERITY>*
), when an event is created,
it is first put into the queue of the writer. When session.consume(ostream)
is called,
these queues are polled and the acquired data is written to the given ostream
.
If the writer is unable to write the queue, because it is full, it creates a new one,
and closes the old. Therefore, a balance of new event frequency, queue size and consume frequency
must be established. For applications built around a main loop, it might be suitable to
consume the logs at the end of each loop iteration, sizing the queue according to the
estimated amount of data one iteration produces:
binlog::Session session;
const std::size_t queueCapacityBytes = 1 << 20;
binlog::SessionWriter writer(session, queueCapacityBytes);
std::string input;
while (std::getline(std::cin, input))
{
processInput(input, writer); // logs using `writer`
session.consume(logfile);
}
For different kind of applications, calling consume
periodically in a dedicated thread
or task can be an option.
Log rotation can be achieved by simply changing the output stream passed to Session::consume
.
Session
does not know or care about the underlying device of the stream.
However, metadata is not added automatically to the new file (as Session
does not know
that a rotation happened). To make the new file self contained (i.e: readable without the
metadata in the old file), old metadata has to be added via Session::reconsumeMetadata
:
if (std::rename("rotate.blog", "rotate.1.blog") != 0)
{
std::cerr << "Failed to rename rotate.blog to rotate.1.blog\n";
return 2;
}
logfile.close();
logfile.open("rotate.blog", std::ofstream::out|std::ofstream::binary);
// Metadata is now moved to rotate.1.blog, rotate.blog is empty.
// To make rotate.blog stand alone, re-add metadata:
binlog::default_session().reconsumeMetadata(logfile);
The key feature of Binlog is producing structured, binary logfiles.
However, in some cases, writing text output directly might be still desirable.
This is easy to do using TextOutputStream
:
#include <binlog/TextOutputStream.hpp> // requires binlog library to be linked
#include <binlog/binlog.hpp>
#include <iostream>
int main()
{
BINLOG_INFO("Hello Text Output!");
binlog::TextOutputStream output(std::cout);
binlog::consume(output);
return 0;
}
TextOutputStream
requires the Binlog library to be linked to the application.
Session::consume
takes a single target only, but it is easy to multiplex the log stream
to produce multiple output streams. With EventFilter
, it is possible to route
different kinds of events to different outputs, efficiently. In this example,
the complete binary log is written to a logfile, but high severity events are also
written to the standard error as text:
#include <binlog/EventFilter.hpp>
#include <binlog/TextOutputStream.hpp> // requires binlog library to be linked
// Write complete binlog output to `binary`,
// and also write error and above events to `text` - as text.
class MultiOutputStream
{
public:
MultiOutputStream(std::ostream& binary, std::ostream& text)
:_binary(binary),
_text(text),
_filter([](const binlog::EventSource& source) {
return source.severity >= binlog::Severity::error;
})
{}
MultiOutputStream& write(const char* buffer, std::streamsize size)
{
_binary.write(buffer, size);
try
{
_filter.writeAllowed(buffer, std::size_t(size), _text);
}
catch (const std::runtime_error& ex)
{
std::cerr << "Failed to convert buffer to text: " << ex.what() << "\n";
}
return *this;
}
private:
std::ostream& _binary;
binlog::TextOutputStream _text;
binlog::EventFilter _filter;
};
Usage is like before:
std::ofstream logfile(path, std::ofstream::out|std::ofstream::binary);
MultiOutputStream output(logfile, std::cerr);
binlog::consume(output);
Logging in global destructor context:
The BINLOG_<SEVERITY>
and BINLOG_<SEVERITY>_C
macros that do not take a writer argument
access global state: a function local static writer and a session. Depending on the initialization order,
in a global destructor context, that state might be no longer valid,
yielding to undefined behavior. Possible resolutions include:
BINLOG_<SEVERITY>_W
,
and make sure the initialization of its writer argument is sequenced before
the initialization of the affected global objectsTimestamps: By default, Binlog uses std::chrono::system_clock
to create timestamps,
and assumes that this clock measures UTC time (without leap seconds). This is required as of C++20.
The Binlog test suite tests if this requirement holds on a given platform.