Getting started with protobuf (Protocol Buffers)
Protocol Buffers, also known as protobuf, is the most commonly used IDL (Interface Definition Language) for gRPC. It is also a high-performance, compact binary wire format. Google invented it and uses it internally to communicate with its network services at very high speed.
Why use protobuf instead of JSON and XML?
JSON and XML are the formats most commonly used to send and receive messages in REST APIs and RPC methods. Of the two, JSON is the more popular because it is flexible, efficient, platform-neutral, and human-readable. In some cases, though, these formats are not fast enough or lightweight enough for transmitting data between systems. XML in particular produces serialized messages that are big, bloated, and slow to parse.
When you serialize (encode) a protobuf message, it is converted into a binary format that is typically significantly smaller than the equivalent JSON.
In addition, it is much faster to parse and encode, and the compiler generates strongly typed objects, which makes the data easier to work with.
Other properties include:
- less ambiguous, with explicit data types
- smaller
- faster
- serializes and deserializes structured data for communication in binary form
- one trade-off: as a compact binary format, it doesn’t achieve JSON’s level of human-readability
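To make the size difference concrete, here is a minimal sketch that hand-encodes one small message in the protobuf wire format and compares it with the same data as JSON. The Person message in the comment, the helper names (encodeVarint, key, encodePerson), and the sample values are assumptions chosen for illustration; real code would use classes generated by protoc rather than a hand-rolled encoder.

```javascript
// Illustrative message, assumed for this sketch:
//   message Person { string name = 1; int32 id = 2; bool isEmployed = 3; }

// Encode a non-negative integer as a base-128 varint.
function encodeVarint(value) {
  const out = [];
  while (value > 0x7f) {
    out.push((value & 0x7f) | 0x80); // low 7 bits, continuation bit set
    value = Math.floor(value / 128);
  }
  out.push(value);
  return out;
}

// A field key is (fieldNumber << 3) | wireType, itself encoded as a varint.
function key(fieldNumber, wireType) {
  return encodeVarint((fieldNumber << 3) | wireType);
}

function encodePerson(person) {
  const nameBytes = Array.from(new TextEncoder().encode(person.name));
  return Uint8Array.from([
    ...key(1, 2), ...encodeVarint(nameBytes.length), ...nameBytes, // string: length-delimited
    ...key(2, 0), ...encodeVarint(person.id),                      // int32: varint
    ...key(3, 0), person.isEmployed ? 1 : 0,                       // bool: varint 0 or 1
  ]);
}

const person = { name: 'Ada', id: 150, isEmployed: true };
const binary = encodePerson(person);
const json = new TextEncoder().encode(JSON.stringify(person));

console.log('protobuf bytes:', binary.length); // 10
console.log('JSON bytes:', json.length);       // 41
```

Most of the saving comes from the field names: JSON repeats them in every message, while the binary form carries only small field numbers.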
What is in a proto file…
The first step when working with protocol buffers is to define the structure for the data you want to serialize in a proto file: this is an ordinary text file with a .proto extension.
The first line of the file declares the version of the syntax:
syntax = "proto3";
This specifies that you are using proto3 syntax; if you omit it, the protobuf compiler assumes you are using proto2. It must be the first non-empty, non-comment line of the file.
You can add an optional package specifier to a .proto file to prevent name clashes between protocol message types.
package foo.bar;
Here, foo.bar is the package name; you choose it yourself.
The next step is to define the messages. Protocol buffer data is structured as messages, where each message is a small logical record of information containing a series of name-value pairs called fields.
Each scalar field has a type that you specify in the .proto file, and each of these types maps to a corresponding type in the automatically generated class. The full list of scalar types is in the official protocol buffers language guide.
message Person {
  string name = 1;
  int32 id = 2;
  bool isEmployed = 3;
}
Then, once you’ve specified your data structures, you use the protocol buffer compiler protoc to generate data access classes in your preferred language(s) from your proto definition. These provide simple accessors for each field, like name() and set_name(), as well as methods to serialize/parse the whole structure to/from raw bytes. So, for instance, if your chosen language is C++, running the compiler on the example above will generate a class called Person. You can then use this class in your application to populate, serialize, and retrieve Person protocol buffer messages.
Because the proto file is a contract, both the client and the server need to have the same proto file. It acts as the intermediary contract that lets the client call any function the server makes available.
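One way to see why both sides need the same definition: the bytes on the wire carry only field numbers and wire types, not field names or exact types. The sketch below decodes a hand-written byte sequence for the Person message with and without the schema. The personSchema object and the readVarint and decode helpers are hypothetical stand-ins for what generated code does, and only the two wire types needed here are handled.

```javascript
// Wire-format bytes for Person { name: "Ada", id: 150, isEmployed: true }.
const bytes = Uint8Array.from([0x0a, 0x03, 0x41, 0x64, 0x61, 0x10, 0x96, 0x01, 0x18, 0x01]);

// The receiver's copy of the contract: field number -> name and type,
// mirroring the Person message in the .proto file.
const personSchema = {
  1: { name: 'name', type: 'string' },
  2: { name: 'id', type: 'int32' },
  3: { name: 'isEmployed', type: 'bool' },
};

// Read a base-128 varint starting at pos; return [value, nextPos].
function readVarint(buf, pos) {
  let result = 0, shift = 0, byte;
  do {
    byte = buf[pos++];
    result += (byte & 0x7f) * 2 ** shift;
    shift += 7;
  } while (byte & 0x80);
  return [result, pos];
}

function decode(buf, schema) {
  const message = {};
  let pos = 0;
  while (pos < buf.length) {
    let keyValue, value;
    [keyValue, pos] = readVarint(buf, pos);
    const fieldNumber = keyValue >>> 3;
    const wireType = keyValue & 0x07;
    if (wireType === 0) {            // varint
      [value, pos] = readVarint(buf, pos);
    } else if (wireType === 2) {     // length-delimited, treated as a UTF-8 string here
      let length;
      [length, pos] = readVarint(buf, pos);
      value = new TextDecoder().decode(buf.subarray(pos, pos + length));
      pos += length;
    } else {
      throw new Error(`wire type ${wireType} is not handled in this sketch`);
    }
    const field = schema[fieldNumber];
    if (!field) {
      message[`field_${fieldNumber}`] = value; // no contract: only the number is known
    } else {
      message[field.name] = field.type === 'bool' ? value !== 0 : value;
    }
  }
  return message;
}

console.log(decode(bytes, personSchema)); // { name: 'Ada', id: 150, isEmployed: true }
console.log(decode(bytes, {}));           // { field_1: 'Ada', field_2: 150, field_3: 1 }
```

Without the schema, the receiver still gets the values but cannot tell what they mean, or that the last one is a boolean.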
Service definition
If you want to use your message types with an RPC (Remote Procedure Call) system, you can define an RPC service interface in a .proto file and the protocol buffer compiler will generate service interface code and stubs in your chosen language. So, for example, if you want to define an RPC service with a method that takes your HelloRequest and returns a HelloResponse, you can define it in your .proto file as follows:
service HelloService {
  rpc SayHello (HelloRequest) returns (HelloResponse);
}
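The sketch below is a rough picture of what "service interface code and stubs" means in practice. It is not the code protoc or gRPC actually generates: the in-process transport, the HelloServiceClient class, and the name and message fields on the request and response are all assumptions made for illustration, and JSON stands in for the protobuf binary encoding to keep the example short.

```javascript
// Hypothetical stand-ins for generated message (de)serialization.
// Real stubs would use the protobuf binary encoding instead of JSON.
const encode = (obj) => new TextEncoder().encode(JSON.stringify(obj));
const decode = (bytes) => JSON.parse(new TextDecoder().decode(bytes));

// Server side: the application implements the service interface.
const helloServiceImpl = {
  sayHello(request) {
    return { message: `Hello, ${request.name}!` };
  },
};

// Hypothetical in-process transport standing in for the network:
// it looks up the method, decodes the request, and encodes the response.
function makeTransport(impl) {
  return async (methodName, requestBytes) => {
    const handler = impl[methodName];
    if (!handler) throw new Error(`unknown method ${methodName}`);
    return encode(handler(decode(requestBytes)));
  };
}

// Client side: the stub exposes one method per rpc in the service
// and hides serialization and transport from the caller.
class HelloServiceClient {
  constructor(transport) {
    this.transport = transport;
  }

  async sayHello(request) {
    const responseBytes = await this.transport('sayHello', encode(request));
    return decode(responseBytes);
  }
}

const client = new HelloServiceClient(makeTransport(helloServiceImpl));
client.sayHello({ name: 'Ada' }).then((response) => console.log(response.message));
```

The caller only sees a method named after the rpc; the stub takes care of turning the request into bytes and the response bytes back into an object.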
Protobuf compiler
To generate the Java, Python, C++, Go, Ruby, Objective-C, or C# code you need to work with the message types defined in a .proto file, you need to run the protocol buffer compiler protoc on the .proto file.
Installation of protoc (protobuf compiler)
Linux, using apt or apt-get, for example:
$ apt install -y protobuf-compiler
$ protoc --version # Ensure compiler version is 3+
macOS, using Homebrew:
$ brew install protobuf
$ protoc --version # Ensure compiler version is 3+
The protocol buffer compiler, protoc, compiles .proto files, which contain service and message definitions, and generates source files in the language selected by its arguments, in this case JavaScript (js). Recent protoc releases no longer bundle the JavaScript generator, so --js_out may additionally require the separate protoc-gen-js plugin.
protoc --proto_path=protos --js_out=import_style=commonjs,binary:build/ fileName.proto
By default, the compiler generates code with Closure-style imports. If you specify a library option when running the compiler, the compiler creates a single .js file with your specified library name. Otherwise the compiler generates a .js file for each message in your .proto file. The names of the output files are computed by taking the library value or message name (lowercased), with the following changes:
- The extension .js is added.
- The proto path (specified with the --proto_path= or -I command-line flag) is replaced with the output path (specified with the --js_out= flag).
In the command above, fileName.proto is the name of the proto file to compile. Because the command passes import_style=commonjs, the compiler instead writes one file per .proto file, named after it with a _pb.js suffix.
Putting it all together
Let us define a proto file, login.proto:
syntax = "proto3";
message Login {
  string userName = 1;
  string password = 2;
}
Now let us compile it with the protobuf compiler:
$ protoc --js_out=import_style=commonjs,binary:. login.proto
protoc has generated login_pb.js from login.proto for you. Now you can use it anywhere you want, like this:
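// Requires the protobuf runtime: npm install google-protobuf
const pb = require('./login_pb');

// Serialization
const data = { userName: 'Isaac', password: 'Newton' };
const msg = new pb.Login();
// The JS generator lowercases the field name userName before building
// accessor names, so the accessors are setUsername/getUsername.
msg.setUsername(data.userName);
msg.setPassword(data.password);
const bytes = msg.serializeBinary();

// Deserialization
const msg2 = pb.Login.deserializeBinary(bytes);
console.log(msg2.getUsername(), msg2.getPassword());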
The serialized data you get back is a Uint8Array.
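Since the result is raw bytes, printing it directly is not very informative. A small sketch of two common ways to handle such a value in Node.js follows; the byte values are a hand-written stand-in for what serializeBinary() would return for a message whose first string field is "Isaac", not output captured from the generated code.

```javascript
// Stand-in for the Uint8Array returned by serializeBinary():
// field 1 (length-delimited), length 5, then the UTF-8 bytes of "Isaac".
const bytes = Uint8Array.from([0x0a, 0x05, 0x49, 0x73, 0x61, 0x61, 0x63]);

// Hex is handy for logging and debugging.
const hex = Buffer.from(bytes).toString('hex');
console.log(hex); // 0a054973616163

// Base64 is a common way to carry binary data through text-only channels.
const base64 = Buffer.from(bytes).toString('base64');

// The receiver turns the text back into bytes before deserializing.
const restored = new Uint8Array(Buffer.from(base64, 'base64'));
console.log(restored.length === bytes.length); // true
```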
Summary:
Protobuf is an ideal format for data serialization. It is much smaller than JSON and allows interfaces to be defined explicitly. Investing a small amount of time in it pays off quickly.
Thank You :)