[cereal-io Python module] Converting Proto, Avro, and Thrift files with Python
The purpose of this module is to convert Google Protocol Buffer files, Apache Avro files, and Apache Thrift files to their counterparts.
The following example demonstrates how to convert a Google Protocol Buffer file to an Apache Avro file:
Given that the input file is helloworld.proto
:
message HelloRequest {
string name = 1;
}
message HelloReply {
string message = 1;
}
>>> from cereal import build
>>> svc = build('./examples/helloworld.proto')
>>> print(svc.to_avro())
[
{
"type": "record",
"name": "HelloRequest",
"fields": [
{
"type": "string",
"name": "name"
}
]
},
{
"type": "record",
"name": "HelloReply",
"fields": [
{
"type": "string",
"name": "message"
}
]
}
]
The svc
object is an instance of the Protobuf
class that contains a method called .to_avro()
. This returns a serialized JSON string that serves as the contents for a .avsc
file.
Converting a .avsc
file to a .proto
file uses a similar process:
>>> from cereal import build
>>> svc = build('./examples/helloworld.avsc')
>>> print(svc.to_protobuf())
message HelloRequest {
string name = 1;
}
message HelloReply {
string message = 1;
}
Given the following protocol buffer message:
// search.proto
message SearchRequest {
string query = 1;
int32 page_number = 2;
int32 result_per_page = 3;
enum Corpus {
UNIVERSAL = 0;
WEB = 1;
IMAGES = 2;
LOCAL = 3;
NEWS = 4;
PRODUCTS = 5;
VIDEO = 6;
}
Corpus corpus = 4;
}
cereal
converts enumerated types into the format defined in the Avro specification:
>>> from cereal import build
>>> svc = build('./examples/search.proto')
>>> print(svc.to_avro())
[
{
"type": "record",
"name": "SearchRequest",
"fields": [
{
"type": "string",
"name": "query"
},
{
"type": "int",
"name": "page_number"
},
{
"type": "int",
"name": "result_per_page"
},
{
"type": "enum",
"name": "Corpus",
"symbols": [
"UNIVERSAL",
"WEB",
"IMAGES",
"LOCAL",
"NEWS",
"PRODUCTS",
"VIDEO"
]
}
]
}
]