---
alias: []
description: 'Documentation for the Protobuf format'
input_format: true
keywords: ['Protobuf']
output_format: true
slug: /interfaces/formats/Protobuf
title: 'Protobuf'
---

import CloudNotSupportedBadge from '@theme/badges/CloudNotSupportedBadge';

<CloudNotSupportedBadge/>

| Input | Output | Alias |
|-------|--------|-------|
| ✔     | ✔      |       |

## Description {#description}

The `Protobuf` format is the [Protocol Buffers](https://protobuf.dev/) format.

This format requires an external format schema, which is cached between queries.

ClickHouse supports:
- both `proto2` and `proto3` syntaxes.
- `repeated`/`optional`/`required` fields.

## Example usage {#example-usage}

### Basic examples {#basic-examples}

Usage examples:

```sql
SELECT * FROM test.table FORMAT Protobuf SETTINGS format_schema = 'schemafile:MessageType'
```

```bash
cat protobuf_messages.bin | clickhouse-client --query "INSERT INTO test.table SETTINGS format_schema='schemafile:MessageType' FORMAT Protobuf"
```

Where the file `schemafile.proto` looks like this:

```protobuf
syntax = "proto3";

message MessageType {
  string name = 1;
  string surname = 2;
  uint32 birthDate = 3;
  repeated string phoneNumbers = 4;
}
```

To match table columns to the fields of a Protocol Buffers message type, ClickHouse compares their names.
The comparison is case-insensitive, and the characters `_` (underscore) and `.` (dot) are treated as equal.
If the type of a column differs from the type of the corresponding message field, ClickHouse applies the necessary conversion.

Nested messages are supported. For example, for the field `z` in the following message type:

```protobuf
syntax = "proto3";

message MessageType {
  message XType {
    message YType {
      int32 z = 1;
    }
    repeated YType y = 1;
  }
  XType x = 1;
}
```

ClickHouse tries to find a column named `x.y.z` (or `x_y_z` or `X.y_Z` and so on).
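
The matching rule can be illustrated with a minimal Python sketch. It only restates the rule described above (case-insensitive comparison, with `_` and `.` treated as equal) and is not ClickHouse's actual implementation; the helper names `normalize` and `match_columns` are hypothetical.

```python
def normalize(name: str) -> str:
    """Lower-case the name and treat '.' and '_' as the same character."""
    return name.lower().replace(".", "_")


def match_columns(columns, field_paths):
    """Map each column name to the field path with the same normalized form.

    Columns without a matching field path map to None.
    """
    by_key = {normalize(path): path for path in field_paths}
    return {column: by_key.get(normalize(column)) for column in columns}


# Dotted path of the field `z` in the nested message type above.
field_paths = ["x.y.z"]

for column, path in match_columns(["x.y.z", "x_y_z", "X.y_Z", "xyz"], field_paths).items():
    print(column, "->", path)
```

Here the columns `x.y.z`, `x_y_z` and `X.y_Z` all resolve to the field path `x.y.z`, while `xyz` has no match because removing the separator changes the name.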

Nested messages are suitable for input or output of [nested data structures](/sql-reference/data-types/nested-data-structures/index.md).

Default values defined in a protobuf schema, as in the example below, are not applied; the [table defaults](/sql-reference/statements/create/table#default_values) are used instead:

```protobuf
syntax = "proto2";

message MessageType {
  optional int32 result_per_page = 3 [default = 10];
}
```

ClickHouse reads and writes protobuf messages in the `length-delimited` format.
This means that each message must be preceded by its length, written as a [variable width integer (varint)](https://developers.google.com/protocol-buffers/docs/encoding#varints).

See also: [how to read/write length-delimited protobuf messages in popular languages](https://cwiki.apache.org/confluence/display/GEODE/Delimiting+Protobuf+Messages).
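
As an illustration of the framing, the following Python sketch writes and reads length-delimited payloads using only the standard library. It assumes each payload is an already serialized protobuf message (for example, the bytes returned by `SerializeToString()` on a generated message class); the helper names `encode_varint`, `write_delimited` and `read_delimited` are hypothetical and not part of any ClickHouse or protobuf API.

```python
import io


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def write_delimited(stream, payload: bytes) -> None:
    """Write the varint-encoded length, then the serialized message."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_delimited(stream):
    """Yield serialized messages until the stream ends."""
    while True:
        length, shift = 0, 0
        while True:
            chunk = stream.read(1)
            if not chunk:
                if shift == 0:
                    return  # clean end of stream between messages
                raise EOFError("truncated varint")
            length |= (chunk[0] & 0x7F) << shift
            shift += 7
            if not chunk[0] & 0x80:
                break
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("truncated message")
        yield payload


# Hand-encoded payloads for `MessageType` with only `name` set:
# tag 0x0a (field 1, length-delimited), string length, UTF-8 bytes.
buffer = io.BytesIO()
for payload in (b"\x0a\x03Ann", b"\x0a\x03Bob"):
    write_delimited(buffer, payload)

buffer.seek(0)
messages = list(read_delimited(buffer))
```

A file produced this way has the same framing as the `protobuf_messages.bin` input used in the `INSERT` example above; parsing each yielded payload back into a message object is left to the protobuf library of your language.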

### Using autogenerated schema {#using-autogenerated-protobuf-schema}

If you don't have an external Protobuf schema for your data, you can still output/input data in the Protobuf format using an autogenerated schema.

For example:

```sql
SELECT * FROM test.hits FORMAT Protobuf SETTINGS format_protobuf_use_autogenerated_schema=1
```

In this case, ClickHouse autogenerates the Protobuf schema from the table structure using the function [`structureToProtobufSchema`](/sql-reference/functions/other-functions.md#structure_to_protobuf_schema).
It then uses this schema to serialize data in the Protobuf format.

You can also read a Protobuf file with the autogenerated schema. In this case, the file must have been created using the same schema:

```bash
cat hits.bin | clickhouse-client --query "INSERT INTO test.hits SETTINGS format_protobuf_use_autogenerated_schema=1 FORMAT Protobuf"
```

The setting [`format_protobuf_use_autogenerated_schema`](/operations/settings/settings-formats.md#format_protobuf_use_autogenerated_schema) is enabled by default and applies if [`format_schema`](/operations/settings/formats#format_schema) is not set.

You can also save the autogenerated schema to a file during input/output using the [`output_format_schema`](/operations/settings/formats#output_format_schema) setting. For example:

```sql
SELECT * FROM test.hits FORMAT Protobuf SETTINGS format_protobuf_use_autogenerated_schema=1, output_format_schema='path/to/schema/schema.proto'
```
In this case, the autogenerated Protobuf schema is saved in the file `path/to/schema/schema.proto`.

### Drop protobuf cache {#drop-protobuf-cache}

To reload a Protobuf schema loaded from [`format_schema_path`](/operations/server-configuration-parameters/settings.md/#format_schema_path), use the [`SYSTEM DROP FORMAT SCHEMA CACHE`](/sql-reference/statements/system.md/#system-drop-schema-format) statement:

```sql
SYSTEM DROP FORMAT SCHEMA CACHE FOR Protobuf
```

## Format settings {#format-settings}