[Original] Protocol Buffer Basics: C++, Chinese Translation (Google Protocol Buffers Tutorial in Chinese)

Protocol Buffer Basics: C++


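Note: this is my own translation. It may be inaccurate and may contain errors, but it should be basically understandable, and I hope it helps. (Please credit the source when reposting: this article is from learnhard's blog: http://www.codelast.com/ http://blog.csdn.net/learnhard/)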


This tutorial provides a basic C++ programmer's introduction to working with protocol buffers. By walking through creating a simple example application, it shows you how to


·         Define message formats in a .proto file.


·         Use the protocol buffer compiler.


·         Use the C++ protocol buffer API to write and read messages.




 


This isn't a comprehensive guide to using protocol buffers in C++. For more detailed reference information, see the Protocol Buffer Language Guide, the C++ API Reference, the C++ Generated Code Guide, and the Encoding Reference.





Why Use Protocol Buffers?


The example we're going to use is a very simple "address book" application that can read and write people's contact details to and from a file. Each person in the address book has a name, an ID, an email address, and a contact phone number.


How do you serialize and retrieve structured data like this? There are a few ways to solve this problem:


·         The raw in-memory data structures can be sent/saved in binary form. Over time, this is a fragile approach, as the receiving/reading code must be compiled with exactly the same memory layout, endianness, etc. Also, as files accumulate data in the raw format and copies of software that are wired for that format are spread around, it's very hard to extend the format.


·         You can invent an ad-hoc way to encode the data items into a single string – such as encoding 4 ints as "12:3:-23:67". This is a simple and flexible approach, although it does require writing one-off encoding and parsing code, and the parsing imposes a small run-time cost. This works best for encoding very simple data.


·         Serialize the data to XML. This approach can be very attractive since XML is (sort of) human readable and there are binding libraries for lots of languages. This can be a good choice if you want to share data with other applications/projects. However, XML is notoriously space intensive, and encoding/decoding it can impose a huge performance penalty on applications. Also, navigating an XML DOM tree is considerably more complicated than navigating simple fields in a class normally would be.
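The second option in the list above, an ad-hoc string encoding, can be sketched as follows. This is only an illustration of the "12:3:-23:67" idea; the helper names EncodeInts and DecodeInts are made up for this sketch and do not come from any library. It shows the one-off encoding and parsing code that this approach requires.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: joins integers with ':',
// e.g. {12, 3, -23, 67} becomes "12:3:-23:67".
std::string EncodeInts(const std::vector<int>& values) {
  std::ostringstream out;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i != 0) out << ':';
    out << values[i];
  }
  return out.str();
}

// Hypothetical helper: the hand-written parser this approach forces on you.
// Returns false if any piece is not a valid integer.
bool DecodeInts(const std::string& text, std::vector<int>* values) {
  assert(values != nullptr);
  values->clear();
  std::istringstream in(text);
  std::string piece;
  while (std::getline(in, piece, ':')) {
    std::size_t used = 0;
    try {
      const int value = std::stoi(piece, &used);
      if (used != piece.size()) return false;  // trailing junk such as "3x"
      values->push_back(value);
    } catch (const std::exception&) {
      return false;  // empty piece, non-numeric text, or out of range
    }
  }
  return true;
}
```

Even this small format already needs its own error handling, and adding a field later means changing both functions and every reader in step.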


Protocol buffers are the flexible, efficient, automated solution to solve exactly this problem. With protocol buffers, you write a .proto description of the data structure you wish to store. From that, the protocol buffer compiler creates a class that implements automatic encoding and parsing of the protocol buffer data with an efficient binary format. The generated class provides getters and setters for the fields that make up a protocol buffer and takes care of the details of reading and writing the protocol buffer as a unit. Importantly, the protocol buffer format supports the idea of extending the format over time in such a way that the code can still read data encoded with the old format.




Where to Find the Example Code


The example code is included in the source code package, under the "examples" directory. Download it here.


Defining Your Protocol Format


To create your address book application, you'll need to start with a .proto file. The definitions in a .proto file are simple: you add a message for each data structure you want to serialize, then specify a name and a type for each field in the message. Here is the .proto file that defines your messages, addressbook.proto.


package tutorial;

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

message AddressBook {
  repeated Person person = 1;
}


As you can see, the syntax is similar to C++ or Java. Let's go through each part of the file and see what it does.


The .proto file starts with a package declaration, which helps to prevent naming conflicts between different projects. In C++, your generated classes will be placed in a namespace matching the package name.


Next, you have your message definitions. A message is just an aggregate containing a set of typed fields. Many standard simple data types are available as field types, including bool, int32, float, double, and string. You can also add further structure to your messages by using other message types as field types – in the above example the Person message contains PhoneNumber messages, while the AddressBook message contains Person messages. You can even define message types nested inside other messages – as you can see, the PhoneNumber type is defined inside Person. You can also define enum types if you want one of your fields to have one of a predefined list of values – here you want to specify that a phone number can be one of MOBILE, HOME, or WORK.


 


The " = 1", " = 2" markers on each element identify the unique "tag" that field uses in the binary encoding. Tag numbers 1-15 require one less byte to encode than higher numbers, so as an optimization you can decide to use those tags for the commonly used or repeated elements, leaving tags 16 and higher for less-commonly used optional elements. Each element in a repeated field requires re-encoding the tag number, so repeated fields are particularly good candidates for this optimization.
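The one-byte saving for tags 1-15 follows from how the key in front of each field is written. The sketch below assumes the layout described in the Encoding Reference, where the key is a base-128 varint holding the tag number shifted left by three bits, combined with a three-bit wire type. The function names VarintSize and KeySize are made up for this illustration and are not part of the protobuf library.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes a value occupies when written as a base-128 varint
// (7 payload bits per byte).
int VarintSize(std::uint64_t value) {
  int bytes = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++bytes;
  }
  return bytes;
}

// Size of the key written before a field's value. The key is the varint
// (field_number << 3) | wire_type, with the wire type in the low 3 bits.
int KeySize(std::uint32_t field_number, std::uint32_t wire_type) {
  assert(field_number >= 1 && wire_type <= 7);
  return VarintSize((static_cast<std::uint64_t>(field_number) << 3) | wire_type);
}
```

With three bits taken by the wire type, only four bits of a single byte remain for the tag number, which is why 15 is the largest tag whose key fits in one byte. Under the same arithmetic, tags 16 through 2047 need two bytes.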




 Each field must be annotated with one of the following modifiers:


·         required: a value for the field must be provided, otherwise the message will be considered "uninitialized". If libprotobuf is compiled in debug mode, serializing an uninitialized message will cause an assertion failure. In optimized builds, the check is skipped and the message will be written anyway. However, parsing an uninitialized message will always fail (by returning false from the parse method). Other than this, a required field behaves exactly like an optional field.


·         optional: the field may or may not be set. If an optional field value isn't set, a default value is used. For simple types, you can specify your own default value, as we've done for the phone number type in the example. Otherwise, a system default is used: zero for numeric types, the empty string for strings, false for bools. For embedded messages, the default value is always the "default instance" or "prototype" of the message, which has none of its fields set. Calling the accessor to get the value of an optional (or required) field which has not been explicitly set always returns that field's default value.


·         repeated: the field may be repeated any number of times (including zero). The order of the repeated values will be preserved in the protocol buffer. Think of repeated fields as dynamically sized arrays.
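The behaviour of an optional field with a declared default can be modelled in a few lines. The class below is a hand-written stand-in, not the code protoc generates; the name PhoneNumberModel and its private members are assumptions made for this sketch. It only imitates how `optional PhoneType type = 2 [default = HOME];` is described above: the accessor returns the default until the field is set.

```cpp
#include <cassert>

// Hand-written model for illustration only, NOT generated code.
enum PhoneType { MOBILE = 0, HOME = 1, WORK = 2 };

class PhoneNumberModel {
 public:
  bool has_type() const { return has_type_; }
  // Returns the declared default (HOME) while the field is unset.
  PhoneType type() const { return has_type_ ? type_ : HOME; }
  void set_type(PhoneType value) {
    type_ = value;
    has_type_ = true;
  }
  void clear_type() { has_type_ = false; }

 private:
  PhoneType type_ = HOME;
  bool has_type_ = false;
};

// Usage sketch: an unset field reads as the default, and explicitly
// setting it (even to the default value) marks it as present.
void DefaultValueDemo() {
  PhoneNumberModel phone;
  assert(!phone.has_type() && phone.type() == HOME);
  phone.set_type(HOME);
  assert(phone.has_type());
}
```

The point of the model is that reading a field and asking whether it was set are separate questions, which is why the tutorial code later checks has_email() before printing the address.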




Required Is Forever: You should be very careful about marking fields as required. If at some point you wish to stop writing or sending a required field, it will be problematic to change the field to an optional field – old readers (translator's note: the side that reads and parses the messages) will consider messages without this field to be incomplete and may reject or drop them unintentionally. You should consider writing application-specific custom validation routines for your buffers instead. Some engineers at Google have come to the conclusion that using required does more harm than good; they prefer to use only optional and repeated. However, this view is not universal.


You'll find a complete guide to writing .proto files – including all the possible field types – in the Protocol Buffer Language Guide. Don't go looking for facilities similar to class inheritance, though – protocol buffers don't do that.




Compiling Your Protocol Buffers


Now that you have a .proto, the next thing you need to do is generate the classes you'll need to read and write AddressBook (and hence Person and PhoneNumber) messages. To do this, you need to run the protocol buffer compiler protoc on your .proto:


1.    If you haven't installed the compiler, download the package and follow the instructions in the README.


2.    Now run the compiler, specifying the source directory (where your application's source code lives – the current directory is used if you don't provide a value), the destination directory (where you want the generated code to go; often the same as $SRC_DIR), and the path to your .proto. In this case, you...:


protoc -I=$SRC_DIR --cpp_out=$DST_DIR $SRC_DIR/addressbook.proto


Because you want C++ classes, you use the --cpp_out option – similar options are provided for other supported languages.


This generates the following files in your specified destination directory:


·         addressbook.pb.h, the header which declares your generated classes.


·         addressbook.pb.cc, which contains the implementation of your classes.




 


The Protocol Buffer API


Let's look at some of the generated code and see what classes and functions the compiler has created for you. If you look in addressbook.pb.h, you can see that you have a class for each message you specified in addressbook.proto. Looking closer at the Person class, you can see that the compiler has generated accessors for each field. For example, for the name, id, email, and phone fields, you have these methods:


  // name
  inline bool has_name() const;
  inline void clear_name();
  inline const ::std::string& name() const;
  inline void set_name(const ::std::string& value);
  inline void set_name(const char* value);
  inline ::std::string* mutable_name();

  // id
  inline bool has_id() const;
  inline void clear_id();
  inline int32_t id() const;
  inline void set_id(int32_t value);

  // email
  inline bool has_email() const;
  inline void clear_email();
  inline const ::std::string& email() const;
  inline void set_email(const ::std::string& value);
  inline void set_email(const char* value);
  inline ::std::string* mutable_email();

  // phone
  inline int phone_size() const;
  inline void clear_phone();
  inline const ::google::protobuf::RepeatedPtrField< ::tutorial::Person_PhoneNumber >& phone() const;
  inline ::google::protobuf::RepeatedPtrField< ::tutorial::Person_PhoneNumber >* mutable_phone();
  inline const ::tutorial::Person_PhoneNumber& phone(int index) const;
  inline ::tutorial::Person_PhoneNumber* mutable_phone(int index);
  inline ::tutorial::Person_PhoneNumber* add_phone();


As you can see, the getters have exactly the same name as the field, in lowercase, and the setter methods begin with set_. There are also has_ methods for each singular (required or optional) field, which return true if that field has been set (translator's note: "singular" here presumably means "non-repeated"). Finally, each field has a clear_ method that un-sets the field back to its empty state.


While the numeric id field just has the basic accessor set described above, the name and email fields have a couple of extra methods because they're strings – a mutable_ getter that lets you get a direct pointer to the string, and an extra setter. Note that you can call mutable_email() even if email is not already set; it will be initialized to an empty string automatically. If you had a singular message field in this example, it would also have a mutable_ method but not a set_ method.


 Repeated fields also have some special methods – if you look at the methods for the repeated phone field, you'll see that you can


·         check the repeated field's _size (in other words, how many phone numbers are associated with this Person).


·         get a specified phone number using its index.


·         update an existing phone number at the specified index.


·         add another phone number to the message which you can then edit (repeated scalar types have an add_ that just lets you pass in the new value).




 


For more information on exactly what members the protocol compiler generates for any particular field definition, see the C++ generated code reference.




Enums and Nested Classes


The generated code includes a PhoneType enum that corresponds to your .proto enum. You can refer to this type as Person::PhoneType and its values as Person::MOBILE, Person::HOME, and Person::WORK (the implementation details are a little more complicated, but you don't need to understand them to use the enum).


 


The compiler has also generated a nested class for you called Person::PhoneNumber. If you look at the code, you can see that the "real" class is actually called Person_PhoneNumber, but a typedef defined inside Person allows you to treat it as if it were a nested class. The only case where this makes a difference is if you want to forward-declare the class in another file – you cannot forward-declare nested types in C++, but you can forward-declare Person_PhoneNumber.
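The typedef arrangement and the forward-declaration point can be reproduced with plain C++. The two classes below are hand-written stand-ins that only mirror the naming described above; they are not the generated code, and the function HasNumber is made up for this sketch.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Another header can forward-declare the top-level class. Writing
// `class Person::PhoneNumber;` here would not compile, because Person
// itself has not been defined yet.
class Person_PhoneNumber;

// Pointers and references to the incomplete type are fine in declarations.
bool HasNumber(const Person_PhoneNumber* phone);

// Stand-ins for the generated classes (hand-written for this sketch).
class Person_PhoneNumber {
 public:
  std::string number;
};

class Person {
 public:
  typedef Person_PhoneNumber PhoneNumber;  // lets callers write Person::PhoneNumber
};

bool HasNumber(const Person_PhoneNumber* phone) {
  assert(phone != nullptr);
  return !phone->number.empty();
}

// Both spellings name the same class.
static_assert(std::is_same<Person::PhoneNumber, Person_PhoneNumber>::value,
              "Person::PhoneNumber is an alias for Person_PhoneNumber");
```

In everyday code the nested spelling reads better; the underscore name matters mainly in headers that want to avoid including the generated header.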




Standard Message Methods


Each message class also contains a number of other methods that let you check or manipulate the entire message, including:


·         bool IsInitialized() const;: checks if all the required fields have been set.


·         string DebugString() const;: returns a human-readable representation of the message, particularly useful for debugging.


·         void CopyFrom(const Person& from);: overwrites the message with the given message's values.


·         void Clear();: clears all the elements back to the empty state.


These and the I/O methods described in the following section implement the Message interface shared by all C++ protocol buffer classes. For more info, see the complete API documentation for Message.




 


Parsing and Serialization


Finally, each protocol buffer class has methods for writing and reading messages of your chosen type using the protocol buffer binary format. These include:


·         bool SerializeToString(string* output) const;: serializes the message and stores the bytes in the given string. Note that the bytes are binary, not text; we only use the string class as a convenient container.


·         bool ParseFromString(const string& data);: parses a message from the given string.


·         bool SerializeToOstream(ostream* output) const;: writes the message to the given C++ ostream.


·         bool ParseFromIstream(istream* input);: parses a message from the given C++ istream.


These are just a couple of the options provided for parsing and serialization. Again, see the Message API reference for a complete list.
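The remark that the string holds binary bytes rather than text has practical consequences, sketched below with the standard library only. The four sample bytes are arbitrary data chosen for the illustration, not a real encoded message, and the function names MakeBinaryPayload and RoundTrip are made up.

```cpp
#include <cassert>
#include <cstring>
#include <ios>
#include <sstream>
#include <string>

// Builds a std::string holding arbitrary bytes, including an embedded zero,
// as serialized output may. The explicit length keeps the 0x00 byte.
std::string MakeBinaryPayload() {
  const char raw[] = {'\x08', '\x00', '\x12', '\x01'};
  return std::string(raw, sizeof(raw));
}

// Copies the bytes through a stream using unformatted write() and read().
// Formatted extraction (operator>>) would skip or stop at whitespace bytes.
std::string RoundTrip(const std::string& bytes) {
  std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
  stream.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
  std::string result(bytes.size(), '\0');
  stream.read(&result[0], static_cast<std::streamsize>(result.size()));
  assert(stream.gcount() == static_cast<std::streamsize>(bytes.size()));
  return result;
}
```

Here size() reports four bytes while strlen() on the same data stops at the first zero and reports one, so the length should always be taken from the string object and never from C string functions.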




 


Protocol Buffers and O-O Design: Protocol buffer classes are basically dumb data holders (like structs in C++); they don't make good first class citizens in an object model. If you want to add richer behaviour to a generated class, the best way to do this is to wrap the generated protocol buffer class in an application-specific class. Wrapping protocol buffers is also a good idea if you don't have control over the design of the .proto file (if, say, you're reusing one from another project). In that case, you can use the wrapper class to craft an interface better suited to the unique environment of your application: hiding some data and methods, exposing convenience functions, etc. You should never add behaviour to the generated classes by inheriting from them. This will break internal mechanisms and is not good object-oriented practice anyway.
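A wrapper of the kind recommended here might look like the sketch below. PersonMessage is a hand-written stand-in so that the example compiles on its own; in a real program its place would be taken by the generated tutorial::Person. The wrapper class Contact and its methods are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <string>

// Stand-in for a generated message class, NOT generated code. Its has_email()
// is simplified to an emptiness check.
class PersonMessage {
 public:
  bool has_email() const { return !email_.empty(); }
  const std::string& email() const { return email_; }
  void set_email(const std::string& value) { email_ = value; }
  const std::string& name() const { return name_; }
  void set_name(const std::string& value) { name_ = value; }

 private:
  std::string name_;
  std::string email_;
};

// Hypothetical application-specific wrapper: it holds the message as a member
// (composition) instead of inheriting from it.
class Contact {
 public:
  explicit Contact(const PersonMessage& message) : message_(message) {}

  // Convenience function the message class does not offer.
  std::string DisplayLabel() const {
    if (!message_.has_email()) return message_.name();
    return message_.name() + " <" + message_.email() + ">";
  }

  // Controlled mutation: reject obviously invalid input before storing it.
  bool SetEmail(const std::string& email) {
    if (email.find('@') == std::string::npos) return false;
    message_.set_email(email);
    return true;
  }

  // Read-only access for code that needs to serialize the wrapped message.
  const PersonMessage& message() const { return message_; }

 private:
  PersonMessage message_;
};

// Usage sketch.
void ContactDemo() {
  PersonMessage message;
  message.set_name("Ada");
  Contact contact(message);
  assert(contact.DisplayLabel() == "Ada");
}
```

Because the message is a private member, the application decides which setters to expose and can add validation, while serialization still goes through the untouched message object.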


Writing A Message


Now let's try using your protocol buffer classes. The first thing you want your address book application to be able to do is write personal details to your address book file. To do this, you need to create and populate instances of your protocol buffer classes and then write them to an output stream.


Here is a program which reads an AddressBook from a file, adds one new Person to it based on user input, and writes the new AddressBook back out to the file again. The parts which directly call or reference code generated by the protocol compiler are highlighted.




#include <iostream>
#include <fstream>
#include <string>
#include "addressbook.pb.h"
using namespace std;

// This function fills in a Person message based on user input.
void PromptForAddress(tutorial::Person* person) {
  cout << "Enter person ID number: ";
  int id;
  cin >> id;
  person->set_id(id);
  // Discard the rest of the current input line (at most 256 characters, up to
  // and including the newline) so the leftover '\n' from reading the ID does
  // not end the getline() call below. Translator's tip: pass a large first
  // argument, e.g. cin.ignore(1024, '\n'), so that in practice only the
  // delimiter decides where discarding stops.
  cin.ignore(256, '\n');

  cout << "Enter name: ";
  getline(cin, *person->mutable_name());

  cout << "Enter email address (blank for none): ";
  string email;
  getline(cin, email);
  if (!email.empty()) {
    person->set_email(email);
  }

  while (true) {
    cout << "Enter a phone number (or leave blank to finish): ";
    string number;
    getline(cin, number);
    if (number.empty()) {
      break;
    }

    tutorial::Person::PhoneNumber* phone_number = person->add_phone();
    phone_number->set_number(number);

    cout << "Is this a mobile, home, or work phone? ";
    string type;
    getline(cin, type);
    if (type == "mobile") {
      phone_number->set_type(tutorial::Person::MOBILE);
    } else if (type == "home") {
      phone_number->set_type(tutorial::Person::HOME);
    } else if (type == "work") {
      phone_number->set_type(tutorial::Person::WORK);
    } else {
      cout << "Unknown phone type.  Using default." << endl;
    }
  }
}

// Main function:  Reads the entire address book from a file,
//   adds one person based on user input, then writes it back out to the same
//   file.
int main(int argc, char* argv[]) {
  // Verify that the version of the library that we linked against is
  // compatible with the version of the headers we compiled against.
  GOOGLE_PROTOBUF_VERIFY_VERSION;

  if (argc != 2) {
    cerr << "Usage:  " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
    return -1;
  }

  tutorial::AddressBook address_book;

  {
    // Read the existing address book.
    fstream input(argv[1], ios::in | ios::binary);
    if (!input) {
      cout << argv[1] << ": File not found.  Creating a new file." << endl;
    } else if (!address_book.ParseFromIstream(&input)) {
      cerr << "Failed to parse address book." << endl;
      return -1;
    }
  }

  // Add an address.
  PromptForAddress(address_book.add_person());

  {
    // Write the new address book back to disk.
    fstream output(argv[1], ios::out | ios::trunc | ios::binary);
    if (!address_book.SerializeToOstream(&output)) {
      cerr << "Failed to write address book." << endl;
      return -1;
    }
  }

  // Optional:  Delete all global objects allocated by libprotobuf.
  google::protobuf::ShutdownProtobufLibrary();

  return 0;
}


Notice the GOOGLE_PROTOBUF_VERIFY_VERSION macro. It is good practice – though not strictly necessary – to execute this macro before using the C++ Protocol Buffer library. It verifies that you have not accidentally linked against a version of the library which is incompatible with the version of the headers you compiled with. If a version mismatch is detected, the program will abort. Note that every .pb.cc file automatically invokes this macro on startup.




Also notice the call to ShutdownProtobufLibrary() at the end of the program. All this does is delete any global objects that were allocated by the Protocol Buffer library. This is unnecessary for most programs, since the process is just going to exit anyway and the OS will take care of reclaiming all of its memory. However, if you use a memory leak checker that requires that every last object be freed (translator's note: for example, valgrind), or if you are writing a library which may be loaded and unloaded multiple times by a single process, then you may want to force Protocol Buffers to clean up everything.


Reading A Message


Of course, an address book wouldn't be much use if you couldn't get any information out of it! This example reads the file created by the above example and prints all the information in it.




#include <iostream>
#include <fstream>
#include <string>
#include "addressbook.pb.h"
using namespace std;

// Iterates through all people in the AddressBook and prints info about them.
void ListPeople(const tutorial::AddressBook& address_book) {
  for (int i = 0; i < address_book.person_size(); i++) {
    const tutorial::Person& person = address_book.person(i);

    cout << "Person ID: " << person.id() << endl;
    cout << "  Name: " << person.name() << endl;
    if (person.has_email()) {
      cout << "  E-mail address: " << person.email() << endl;
    }

    for (int j = 0; j < person.phone_size(); j++) {
      const tutorial::Person::PhoneNumber& phone_number = person.phone(j);

      switch (phone_number.type()) {
        case tutorial::Person::MOBILE:
          cout << "  Mobile phone #: ";
          break;
        case tutorial::Person::HOME:
          cout << "  Home phone #: ";
          break;
        case tutorial::Person::WORK:
          cout << "  Work phone #: ";
          break;
      }
      cout << phone_number.number() << endl;
    }
  }
}

// Main function:  Reads the entire address book from a file and prints all
//   the information inside.
int main(int argc, char* argv[]) {
  // Verify that the version of the library that we linked against is
  // compatible with the version of the headers we compiled against.
  GOOGLE_PROTOBUF_VERIFY_VERSION;

  if (argc != 2) {
    cerr << "Usage:  " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
    return -1;
  }

  tutorial::AddressBook address_book;

  {
    // Read the existing address book.
    fstream input(argv[1], ios::in | ios::binary);
    if (!address_book.ParseFromIstream(&input)) {
      cerr << "Failed to parse address book." << endl;
      return -1;
    }
  }

  ListPeople(address_book);

  // Optional:  Delete all global objects allocated by libprotobuf.
  google::protobuf::ShutdownProtobufLibrary();

  return 0;
}


Extending a Protocol Buffer


Sooner or later after you release the code that uses your protocol buffer, you will undoubtedly want to "improve" the protocol buffer's definition. If you want your new buffers to be backwards-compatible, and your old buffers to be forward-compatible – and you almost certainly do want this – then there are some rules you need to follow. In the new version of the protocol buffer:


·         you must not change the tag numbers of any existing fields.


·         you must not add or delete any required fields.


·         you may delete optional or repeated fields.


·         you may add new optional or repeated fields but you must use fresh tag numbers (i.e. tag numbers that were never used in this protocol buffer, not even by deleted fields).


(There are some exceptions to these rules, but they are rarely used.)


If you follow these rules, old code will happily read new messages and simply ignore any new fields. To the old code, optional fields that were deleted will simply have their default value, and deleted repeated fields will be empty. New code will also transparently read old messages. However, keep in mind that new optional fields will not be present in old messages, so you will need to either check explicitly whether they're set with has_, or provide a reasonable default value in your .proto file with [default = value] after the tag number. If the default value is not specified for an optional element, a type-specific default value is used instead: for strings, the default value is the empty string. For booleans, the default value is false. For numeric types, the default value is zero. Note also that if you added a new repeated field, your new code will not be able to tell whether it was left empty (by new code) or never set at all (by old code) since there is no has_ flag for it.
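Why old code can "simply ignore any new fields" becomes clearer with a toy reader. The sketch below assumes the key layout from the Encoding Reference (a varint holding the tag number shifted left by three bits plus the wire type) and, as a deliberate simplification, handles only varint-typed fields. The function names are made up, and this is not how the library's parser is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Reads one base-128 varint starting at *pos and advances *pos past it.
// Returns false on truncated or over-long input.
bool ReadVarint(const std::string& data, std::size_t* pos, std::uint64_t* value) {
  assert(pos != nullptr && value != nullptr);
  *value = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (*pos >= data.size()) return false;
    const std::uint8_t byte = static_cast<std::uint8_t>(data[(*pos)++]);
    *value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return true;
  }
  return false;
}

// "Old" reader that only knows tag 2 (id). Any other varint field is read and
// discarded, which is how old code tolerates fields that were added later.
// Simplification: only wire type 0 (varint) is supported.
bool ReadIdIgnoringUnknown(const std::string& data, std::uint64_t* id, bool* has_id) {
  assert(id != nullptr && has_id != nullptr);
  *has_id = false;
  std::size_t pos = 0;
  while (pos < data.size()) {
    std::uint64_t key = 0;
    std::uint64_t value = 0;
    if (!ReadVarint(data, &pos, &key)) return false;
    if ((key & 0x7) != 0) return false;  // non-varint field: out of scope here
    if (!ReadVarint(data, &pos, &value)) return false;
    if ((key >> 3) == 2) {
      *id = value;
      *has_id = true;
    }
  }
  return true;
}
```

Given the bytes 0x10 0x96 0x01 0x48 0x01 (tag 2 with value 150, followed by a hypothetical newer tag 9 with value 1), the reader reports id 150 and skips the field it does not recognise. This also shows why tag numbers must never be changed or reused: the number is the only thing identifying a field on the wire.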




Optimization Tips


The C++ Protocol Buffers library is extremely heavily optimized. However, proper usage can improve performance even more. Here are some tips for squeezing every last drop of speed out of the library:


·         Reuse message objects when possible. Messages try to keep around any memory they allocate for reuse, even when they are cleared. Thus, if you are handling many messages with the same type and similar structure in succession, it is a good idea to reuse the same message object each time to take load off the memory allocator. However, objects can become bloated over time, especially if your messages vary in "shape" or if you occasionally construct a message that is much larger than usual. You should monitor the sizes of your message objects by calling the SpaceUsed method and delete them once they get too big.


·         Your system's memory allocator may not be well-optimized for allocating lots of small objects from multiple threads. Try using Google's tcmalloc instead.
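The reuse advice in the first tip can be pictured with an analogy from the standard library. The sketch below uses std::vector rather than a message object, so it is only an analogy: like a cleared message, a cleared vector keeps its allocated storage in practice. The function names ProcessBatches and ReleaseIfTooBig are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Analogy only: one buffer is created once and reused for every batch.
// Returns the buffer's capacity after processing all batches.
std::size_t ProcessBatches(int batches, std::size_t items_per_batch) {
  assert(batches >= 0);
  std::vector<int> buffer;
  for (int b = 0; b < batches; ++b) {
    buffer.clear();  // size drops to 0, allocated storage is kept
    for (std::size_t i = 0; i < items_per_batch; ++i) {
      buffer.push_back(static_cast<int>(i));
    }
  }
  return buffer.capacity();
}

// Counterpart to deleting an object once it gets too big: swapping with an
// empty vector gives the storage back, which clear() alone does not do.
void ReleaseIfTooBig(std::vector<int>* buffer, std::size_t max_capacity) {
  assert(buffer != nullptr);
  if (buffer->capacity() > max_capacity) {
    std::vector<int>().swap(*buffer);
  }
}
```

After the first batch, later batches of the same size refill the existing storage without going back to the allocator. The second function mirrors the caveat in the tip: a single unusually large batch leaves the buffer oversized until it is explicitly released.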




Advanced Usage


Protocol buffers have uses that go beyond simple accessors and serialization. Be sure to explore the C++ API reference to see what else you can do with them.


One key feature provided by protocol message classes is reflection. You can iterate over the fields of a message and manipulate their values without writing your code against any specific message type. One very useful way to use reflection is for converting protocol messages to and from other encodings, such as XML or JSON. A more advanced use of reflection might be to find differences between two messages of the same type, or to develop a sort of "regular expressions for protocol messages" in which you can write expressions that match certain message contents. If you use your imagination, it's possible to apply Protocol Buffers to a much wider range of problems than you might initially expect!


Reflection is provided by the Message::Reflection interface.




