google-cloud-cpp

Architecture

This document describes the high-level architecture of the Google Cloud C++ Client libraries. Its main audience is developers and contributors who make changes and additions to these libraries. If you want to familiarize yourself with the code in the google-cloud-cpp project, you are in the right place.

While we expect that users of the libraries may find this document useful, it does not change or define the public API. You can use it to understand how things work, or perhaps to troubleshoot problems. You should not depend on the implementation details described here when writing your application. Only the public API is stable; the rest is subject to change without notice.

What these libraries do

The goal of the libraries is to provide idiomatic C++ libraries to access services in Google Cloud Platform. All services are in scope. As of 2024-03 we have over 100 libraries, covering most Google Cloud Platform services.

What do we mean by idiomatic? We mean that C++ developers will find the APIs familiar, or “natural”, that these APIs will fit well with the rest of the C++ ecosystem, and that very few new “concepts” are needed to understand how to use these libraries.

More specifically, the functionality offered by these libraries includes:

Where is the code?

Each library is found in a subdirectory of /google/cloud/. The name of the subdirectory is chosen to match the most distinctive part of the service name. For example, /google/cloud/storage contains the code for the Google Cloud Storage service.

Within each directory you will find (almost always) the same structure:

The remaining directories are implementation details, or at least only intended to be interesting for google-cloud-cpp developers and contributors.

The *Client classes

The main interface for each library is the ${library}::*Client class. Roughly speaking each “client” corresponds to a service in the .proto definitions for the service. It is common for services to have separate service definitions for “admin” operations, such as creating a new instance of the service, or changing permissions, as opposed to “normal” operations, such as inserting a new row, or publishing a new message. When “admin” services are defined, there are separate ${library}::*Client objects.

These are some examples:

Generally these classes are very “thin”; they take function arguments from the application, package them in a lightweight structure, and then forward the request to the ${library}::*Connection class.

It is important to know that, almost always, each *Client member function makes exactly one RPC. That “almost” is (as the saying goes) “load bearing”; the devil is, as usual, in the details.
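The forwarding pattern can be sketched as follows. This is a minimal illustration, not code from the libraries: WidgetClient, WidgetConnection, GetWidgetRequest, and InMemoryWidgetConnection are hypothetical names, and a real connection would issue an RPC rather than build the response locally.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical names for illustration only; not part of google-cloud-cpp.
struct GetWidgetRequest {
  std::string name;
};

struct Widget {
  std::string name;
  std::string description;
};

// The interface that performs the actual work (for example, one RPC).
class WidgetConnection {
 public:
  virtual ~WidgetConnection() = default;
  virtual Widget GetWidget(GetWidgetRequest const& request) = 0;
};

// A "thin" client: it packages the arguments into a lightweight structure
// and forwards the call to the connection.
class WidgetClient {
 public:
  explicit WidgetClient(std::shared_ptr<WidgetConnection> connection)
      : connection_(std::move(connection)) {}

  Widget GetWidget(std::string name) {
    GetWidgetRequest request{std::move(name)};
    return connection_->GetWidget(request);
  }

 private:
  std::shared_ptr<WidgetConnection> connection_;
};

// A trivial in-memory connection, standing in for a real implementation.
class InMemoryWidgetConnection : public WidgetConnection {
 public:
  Widget GetWidget(GetWidgetRequest const& request) override {
    return Widget{request.name, "stored in memory"};
  }
};
```

An application would create the client with `WidgetClient client(std::make_shared<InMemoryWidgetConnection>());` and call `client.GetWidget("widgets/42")`. Keeping the client this thin means all interesting behavior lives behind the connection interface, where it can be replaced.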

The *Connection classes

Connections serve two functions:

Most of the time there is a 1-1 mapping between a FooClient and the corresponding FooConnection.

Because *Connection classes are intended for customers to use (at least as mocks), they are part of the public API.

Typically there are three concrete versions of the *Connection interface:

| Name | Description |
| --- | --- |
| *Impl | An implementation using the *Stub layer |
| *Tracing | Instruments each retry loop with an OpenTelemetry trace span |
| Mock*Connection | An implementation using googlemock |

Only Mock*Connection is part of the public API. It is used by application developers that want to test their code using a mocked behavior for the *Client class.
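As a rough sketch of why a replaceable connection helps with testing, the example below tests application code without contacting any service. All names are hypothetical. To keep the sketch free of dependencies it uses a hand-written fake rather than googlemock, so it illustrates the idea behind Mock*Connection and not its actual interface.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names for illustration only; not part of google-cloud-cpp.
class GreeterConnection {
 public:
  virtual ~GreeterConnection() = default;
  virtual std::string Greet(std::string const& name) = 0;
};

class GreeterClient {
 public:
  explicit GreeterClient(std::shared_ptr<GreeterConnection> connection)
      : connection_(std::move(connection)) {}

  std::string Greet(std::string const& name) {
    return connection_->Greet(name);
  }

 private:
  std::shared_ptr<GreeterConnection> connection_;
};

// Hand-written stand-in for a Mock*Connection class. It records each call
// and returns a canned response, without contacting any service.
class FakeGreeterConnection : public GreeterConnection {
 public:
  std::string Greet(std::string const& name) override {
    calls.push_back(name);
    return "canned response";
  }

  std::vector<std::string> calls;
};

// Application code under test: it depends only on the client.
std::string WelcomeMessage(GreeterClient client, std::string const& user) {
  return "[" + client.Greet(user) + "]";
}
```

A test creates a `std::make_shared<FakeGreeterConnection>()`, passes `GreeterClient(fake)` to `WelcomeMessage()`, and then inspects both the returned string and `fake->calls`.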

In some cases you may find a fourth implementation, used to implement clients over HTTP and gRPC Transcoding. This class is also not part of the public API.

| Name | Description |
| --- | --- |
| *Rest*Impl | An implementation using the *Rest*Stub layer |

The *Stub classes

The *Stub classes wrap the *Stub generated by gRPC. They provide several functions:

The *Stub classes are typically organized as a (small) stack of Decorators, which simplifies their testing.

| Layer | Description |
| --- | --- |
| *Logging | Optional *Stub decorator that logs each request and response |
| *Metadata | Injects resource metadata headers for routing |
| *RoundRobin | Round-robins over several *Stubs; not all libraries have one |
| *Tracing | Instruments each RPC with an OpenTelemetry trace span |
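The decorator stack can be illustrated with a small sketch. The names EchoStub, DefaultEchoStub, LoggingEchoStub, and MakeEchoStub are hypothetical. The innermost layer here returns a computed string, where a real one would call the gRPC-generated stub, and the "log" is just a vector so the behavior is easy to observe.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names for illustration only; not part of google-cloud-cpp.
class EchoStub {
 public:
  virtual ~EchoStub() = default;
  virtual std::string Echo(std::string const& request) = 0;
};

// The innermost layer; a real one would call the gRPC-generated stub.
class DefaultEchoStub : public EchoStub {
 public:
  std::string Echo(std::string const& request) override {
    return "echo: " + request;
  }
};

// A decorator: same interface, wraps a child, adds exactly one behavior.
class LoggingEchoStub : public EchoStub {
 public:
  LoggingEchoStub(std::shared_ptr<EchoStub> child,
                  std::shared_ptr<std::vector<std::string>> log)
      : child_(std::move(child)), log_(std::move(log)) {}

  std::string Echo(std::string const& request) override {
    log_->push_back("request=" + request);
    auto response = child_->Echo(request);
    log_->push_back("response=" + response);
    return response;
  }

 private:
  std::shared_ptr<EchoStub> child_;
  std::shared_ptr<std::vector<std::string>> log_;
};

// Assemble the stack; the logging layer is optional.
std::shared_ptr<EchoStub> MakeEchoStub(
    bool enable_logging, std::shared_ptr<std::vector<std::string>> log) {
  std::shared_ptr<EchoStub> stub = std::make_shared<DefaultEchoStub>();
  if (enable_logging) {
    stub = std::make_shared<LoggingEchoStub>(std::move(stub), std::move(log));
  }
  return stub;
}
```

Because each layer implements the same interface and adds a single behavior, each one can be tested in isolation by giving it a fake child.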

For services where we have enabled REST support, there is a parallel set of *Rest*Stub classes. These implement the same functionality, but make calls using HTTP and gRPC Transcoding.

The Options class(es)

Many functions need a way for a user to specify optional settings. This was traditionally done with distinct classes, like spanner::ConnectionOptions, spanner::QueryOptions, or storage::ClientOptions. These classes often had very different interfaces and semantics (e.g., some included a meaningful default value, others didn’t). The new, recommended way to represent options of all varieties is using the google::cloud::Options class.

Any function that needs to accept optional settings should do so by accepting an instance of google::cloud::Options, and by documenting which option classes are expected so that users know how the function can be configured. Functions that accept the old-style option classes can continue to exist and should forward to the new Options-based overload. These old functions need not even be deprecated because they should work just fine. However, to avoid burdening users with unnecessary decisions, functions should clearly document that the Options overload is to be preferred.

Each setting for an Options instance is a unique type. To improve discoverability of the available option types, we should limit the places where users have to look for them to common_options.h, grpc_options.h, and (preferably) a single <product>/options.h file (e.g., spanner/options.h). It’s OK to introduce additional options files, but keep discoverability in mind.
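To make “each setting is a unique type” concrete, here is a simplified stand-in written with the C++17 standard library only. MiniOptions, EndpointOption, and MaxAttemptsOption are hypothetical names, and the sketch is not the google::cloud::Options implementation. It only shows how a type can act as the key for a setting.

```cpp
#include <any>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

// Simplified stand-in for illustration; not the google::cloud::Options
// implementation. Each option is a distinct struct naming its value type.
struct EndpointOption {
  using Type = std::string;
};
struct MaxAttemptsOption {
  using Type = int;
};

class MiniOptions {
 public:
  // Store a value, keyed by the option type T.
  template <typename T>
  MiniOptions& set(typename T::Type value) {
    values_[std::type_index(typeid(T))] = std::move(value);
    return *this;
  }

  template <typename T>
  bool has() const {
    return values_.count(std::type_index(typeid(T))) != 0;
  }

  // Returns the stored value, or a value-initialized one if unset.
  template <typename T>
  typename T::Type get() const {
    auto it = values_.find(std::type_index(typeid(T)));
    if (it == values_.end()) return typename T::Type{};
    return std::any_cast<typename T::Type>(it->second);
  }

 private:
  std::unordered_map<std::type_index, std::any> values_;
};
```

With this sketch, `MiniOptions{}.set<EndpointOption>("localhost:8080").set<MaxAttemptsOption>(3)` builds a bag of settings, and `get<MaxAttemptsOption>()` reads one back. Using a type as the key lets new settings be added without changing the container, which is why the set of header files that declare option types matters for discoverability.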

Instances of Options do not contain any default values. Defaults should be computed by a service-specific function, such as spanner_internal::DefaultOptions(). This function (or a related one) is used by our implementations to augment the (optionally) user-provided Options instance with appropriate defaults for the given service. Defaults should be computed in the user-facing function that accepted the Options argument so that all the internal implementation functions lower in the stack can simply accept the Options by const& and can assume it’s properly populated. The user-facing function that documented to the user which options it accepts should also call google::cloud::internal::CheckExpectedOptions<...>(...) in order to help users diagnose option-related issues in their code.
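The flow described above (the user-facing function computes defaults once, and the internal functions assume a populated object) might look roughly like the sketch below. It uses a plain struct with std::optional members instead of Options to stay self-contained. WidgetOptions, Describe, DescribeInternal, and the default values are all hypothetical, and this DefaultOptions only imitates the role of a function such as spanner_internal::DefaultOptions().

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical names and values for illustration only; not part of
// google-cloud-cpp.
struct WidgetOptions {
  std::optional<std::string> endpoint;
  std::optional<int> max_attempts;
};

// Fill in only the settings the user left unset.
WidgetOptions DefaultOptions(WidgetOptions options) {
  if (!options.endpoint) options.endpoint = "widgets.example.com";
  if (!options.max_attempts) options.max_attempts = 3;
  return options;
}

// Internal function: accepts the options by const& and assumes that every
// setting is already populated.
std::string DescribeInternal(WidgetOptions const& options) {
  return *options.endpoint +
         " attempts=" + std::to_string(*options.max_attempts);
}

// User-facing function: computes the defaults once, at the top of the stack.
std::string Describe(WidgetOptions user_options = {}) {
  auto const options = DefaultOptions(std::move(user_options));
  return DescribeInternal(options);
}
```

Calling `Describe()` uses every default, while setting only `max_attempts` before the call overrides that one value and keeps the default endpoint. Applying defaults in one place keeps the lower layers free of “is this set?” checks.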

Deviations from the “normal” Architecture

Pub/Sub

Pub/Sub generally follows these patterns, but there is substantial code outside the main classes to implement a few features:

Spanner

Spanner implements some key features in the spanner_internal::SessionPool.

Storage

In Storage the *Connection classes are in the storage::internal namespace, which forces our users to reach into the internal namespace to mock things. There is an open bug to fix this. It would involve moving all the *Request and *Response classes out of storage::internal. Some of the member functions in these classes should not be part of the public API. In short, the changes are more involved than a simple git mv.