Using Apicurio Registry as an Apache Iceberg REST Catalog
This chapter explains how to use Apicurio Registry as an Apache Iceberg REST Catalog to manage table metadata for your data lakehouse applications. You need the following:
- A running Apicurio Registry instance with apicurio.features.experimental.enabled=true and apicurio.iceberg.enabled=true
- Query engines or tools that support the Iceberg REST Catalog specification (Apache Spark, Trino, ClickHouse, DuckDB, Flink)
Overview of the Apicurio Registry Iceberg REST Catalog
Apicurio Registry implements the Apache Iceberg REST Catalog API specification, enabling it to serve as a metadata catalog for Iceberg tables. This allows query engines like Apache Spark, Trino, ClickHouse, DuckDB, and Flink to use Apicurio Registry as their Iceberg catalog for managing table metadata.
The catalog maps Apicurio Registry concepts to Iceberg concepts as follows:
| Apicurio Registry Concept | Iceberg Concept | Description |
|---|---|---|
| Group | Namespace | A logical namespace for organizing tables. Each group in Apicurio Registry appears as a namespace in Iceberg. |
| Artifact (type: | Table | An Iceberg table registered in Apicurio Registry. The artifact content contains the full TableMetadata JSON. |
| Artifact Content | TableMetadata / ViewMetadata | The metadata JSON stored as artifact content, containing schema, partition spec, sort order, snapshots, and other Iceberg metadata. |
| Group Labels | Namespace Properties | Key-value properties associated with a namespace, stored as group labels. |
| Artifact Version | Table Commit | Each commit to an Iceberg table creates a new artifact version in Apicurio Registry. The version content contains the updated TableMetadata JSON. |
| Artifact Labels | Table Properties | Key-value properties associated with a table, stored as artifact labels. |
| Prefix (path parameter) | Catalog Identifier | A configurable prefix that identifies the catalog instance (default: |
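As a rough illustration of the group-to-namespace row of this mapping, the sketch below presents a group-like record in the shape of an Iceberg namespace. The `Group` dataclass and the `group_to_namespace` function are hypothetical names invented for this example; they are not part of Apicurio Registry.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical stand-in for an Apicurio Registry group."""
    group_id: str
    labels: dict = field(default_factory=dict)


def group_to_namespace(group: Group) -> dict:
    """Present a group as an Iceberg namespace: ID -> name, labels -> properties."""
    return {"namespace": [group.group_id], "properties": dict(group.labels)}


ns = group_to_namespace(Group("my_database", {"owner": "data-team"}))
print(ns)
```

The result has the same shape as the namespace responses shown in the API examples in this chapter.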
Configuring the Apicurio Registry Iceberg REST Catalog
This section describes the configuration options available for the Apicurio Registry Iceberg REST Catalog.
| Property | Environment Variable | Default | Description |
|---|---|---|---|
|  |  |  | Enables or disables the Iceberg REST Catalog API. This is an experimental feature and also requires |
|  |  | Empty string | The default warehouse location for Iceberg tables. This is returned in the catalog configuration. |
|  |  |  | The default prefix (catalog identifier) used when none is specified. |
apicurio.features.experimental.enabled=true
apicurio.iceberg.enabled=true
apicurio.iceberg.warehouse=s3://my-bucket/warehouse
apicurio.iceberg.default-prefix=production
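A deployment script might sanity-check such a properties file before starting the registry. The following is a minimal sketch, assuming a simple `key=value` format and assuming that a missing flag counts as disabled; `parse_properties` and `iceberg_catalog_enabled` are hypothetical helper names, not Apicurio Registry APIs.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props


def iceberg_catalog_enabled(props: dict) -> bool:
    """Return True only when both required flags are set to true."""
    required = ("apicurio.features.experimental.enabled", "apicurio.iceberg.enabled")
    # Assumption: a flag that is absent is treated as "false".
    return all(props.get(key, "false").lower() == "true" for key in required)


sample = """
apicurio.features.experimental.enabled=true
apicurio.iceberg.enabled=true
apicurio.iceberg.warehouse=s3://my-bucket/warehouse
"""
props = parse_properties(sample)
print(iceberg_catalog_enabled(props))
```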
Using the Iceberg REST API
Apicurio Registry exposes the Iceberg REST Catalog API at the /apis/iceberg/v1 endpoint. The following examples demonstrate how to interact with the API using curl.
Getting catalog configuration
curl http://localhost:8080/apis/iceberg/v1/config
{
"defaults": {},
"overrides": {}
}
Creating a namespace
curl -X POST http://localhost:8080/apis/iceberg/v1/default/namespaces \
-H "Content-Type: application/json" \
-d '{
"namespace": ["my_database"],
"properties": {
"owner": "data-team",
"description": "Production database"
}
}'
{
"namespace": ["my_database"],
"properties": {
"owner": "data-team",
"description": "Production database"
}
}
Listing namespaces
curl http://localhost:8080/apis/iceberg/v1/default/namespaces
{
"namespaces": [
["my_database"],
["another_database"]
]
}
Creating a table
curl -X POST http://localhost:8080/apis/iceberg/v1/default/namespaces/my_database/tables \
-H "Content-Type: application/json" \
-d '{
"name": "users",
"schema": {
"type": "struct",
"schema-id": 0,
"fields": [
{"id": 1, "name": "id", "required": true, "type": "long"},
{"id": 2, "name": "name", "required": true, "type": "string"},
{"id": 3, "name": "email", "required": false, "type": "string"},
{"id": 4, "name": "created_at", "required": true, "type": "timestamp"}
]
},
"properties": {
"write.format.default": "parquet"
}
}'
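When the request body is generated by a program rather than written by hand, it can help to build it from a column list. The sketch below reproduces the body of the curl example above; `build_create_table_request` is a hypothetical helper, and numbering field IDs sequentially from 1 is an assumption that matches this example.

```python
import json


def build_create_table_request(name, columns, properties=None):
    """Build a create-table body; columns are (name, type, required) tuples."""
    fields = [
        {"id": field_id, "name": col_name, "required": required, "type": col_type}
        for field_id, (col_name, col_type, required) in enumerate(columns, start=1)
    ]
    return {
        "name": name,
        "schema": {"type": "struct", "schema-id": 0, "fields": fields},
        "properties": dict(properties or {}),
    }


request = build_create_table_request(
    "users",
    [
        ("id", "long", True),
        ("name", "string", True),
        ("email", "string", False),
        ("created_at", "timestamp", True),
    ],
    {"write.format.default": "parquet"},
)
print(json.dumps(request, indent=2))
```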
Loading a table
curl http://localhost:8080/apis/iceberg/v1/default/namespaces/my_database/tables/users
Listing tables in a namespace
curl http://localhost:8080/apis/iceberg/v1/default/namespaces/my_database/tables
{
"identifiers": [
{
"namespace": ["my_database"],
"name": "users"
}
]
}
Renaming a table
curl -X POST http://localhost:8080/apis/iceberg/v1/default/tables/rename \
-H "Content-Type: application/json" \
-d '{
"source": {
"namespace": ["my_database"],
"name": "users"
},
"destination": {
"namespace": ["my_database"],
"name": "customers"
}
}'
Committing updates to a table
The CommitTable operation is the core mechanism for updating Iceberg table metadata. It enables atomic metadata updates with optimistic concurrency control, which is the foundation of Iceberg’s snapshot isolation model.
A commit request contains two parts:
- Requirements: preconditions that must be satisfied for the commit to proceed. If any requirement is not met, the commit fails with a 409 Conflict response and no changes are applied.
- Updates: mutations to apply to the table metadata. All updates within a single commit are applied atomically to produce one new metadata version.
Concurrency control
Apicurio Registry uses optimistic concurrency to detect conflicting commits. When a commit is processed:
1. The current table metadata is loaded and its version is recorded.
2. All requirements are validated against the current metadata.
3. All updates are applied in-memory to produce new metadata.
4. The new metadata is stored as a new artifact version.
5. If another commit was interleaved between steps 1 and 4, the conflict is detected and the commit fails with a 409 Conflict response.
This ensures that concurrent writers cannot silently overwrite each other’s changes. Clients that receive a 409 response should reload the current table metadata, revalidate their changes, and retry the commit.
All changes within a single commit — including the new version and any artifact-level label updates — are applied atomically within a single storage transaction. If any part of the commit fails, the entire operation is rolled back and no partial state is persisted.
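The version check described above can be modeled in a few lines. This is a toy in-memory sketch of optimistic concurrency, not the registry's actual implementation; `InMemoryTable` and `CommitConflict` are names invented for the example.

```python
class CommitConflict(Exception):
    """Stands in for a 409 Conflict response."""


class InMemoryTable:
    """Toy model of a table whose metadata versions are appended in order."""

    def __init__(self, metadata):
        self.versions = [metadata]

    def load(self):
        """Return (version number, metadata copy) for the latest version."""
        return len(self.versions), dict(self.versions[-1])

    def commit(self, base_version, new_metadata):
        """Store new metadata only if nobody committed since base_version."""
        if base_version != len(self.versions):
            raise CommitConflict(
                f"expected version {base_version}, found {len(self.versions)}"
            )
        self.versions.append(new_metadata)
        return len(self.versions)


table = InMemoryTable({"current-schema-id": 0})
v_a, meta_a = table.load()    # writer A reads version 1
v_b, meta_b = table.load()    # writer B also reads version 1
meta_a["current-schema-id"] = 1
table.commit(v_a, meta_a)     # A commits first and creates version 2
meta_b["current-schema-id"] = 2
try:
    table.commit(v_b, meta_b)  # B's base version is now stale
    outcome = "committed"
except CommitConflict:
    outcome = "conflict"
print(outcome)
```

Writer B's change is rejected rather than silently overwriting writer A's, which is the property the concurrency check is meant to guarantee.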
Conflict handling and retries
When a commit fails with a 409 Conflict response, the appropriate client behavior is:
1. Reload the current table metadata using the LoadTable endpoint.
2. Revalidate the intended changes against the new metadata.
3. Rebuild the commit request with updated requirements.
4. Retry the commit.
Most Iceberg client libraries (such as the Apache Iceberg Java SDK) handle this retry loop automatically.
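For clients that do not use such a library, the retry loop might look like the sketch below. The three callables are injected so that the loop stays independent of any HTTP client; `commit_with_retry`, `CommitConflict`, and the fake functions in the usage part are all hypothetical.

```python
class CommitConflict(Exception):
    """Stands in for a 409 Conflict response."""


def commit_with_retry(load_table, build_commit, send_commit, max_attempts=3):
    """Reload, rebuild, and retry a commit until it succeeds or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        metadata = load_table()            # reload current metadata
        request = build_commit(metadata)   # revalidate and rebuild the request
        try:
            return send_commit(request), attempt
        except CommitConflict:
            if attempt == max_attempts:
                raise


# Usage with fakes: the first commit loses a race, the second succeeds.
state = {"schema_id": 0, "calls": 0}


def load_table():
    return {"current-schema-id": state["schema_id"]}


def build_commit(metadata):
    current = metadata["current-schema-id"]
    return {
        "requirements": [{"type": "assert-current-schema-id", "current-schema-id": current}],
        "updates": [{"action": "set-current-schema", "schema-id": current + 1}],
    }


def send_commit(request):
    state["calls"] += 1
    if state["calls"] == 1:
        state["schema_id"] = 5   # simulate a concurrent writer winning first
        raise CommitConflict("requirement not met")
    return request


result, attempts = commit_with_retry(load_table, build_commit, send_commit)
print(attempts, result["requirements"])
```

A production client would typically also add a backoff delay between attempts; that is omitted here to keep the sketch short.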
KafkaSQL storage considerations
When Apicurio Registry is deployed with the KafkaSQL storage backend, all write operations — including commits — are serialized through a Kafka journal topic. This has the following implications:
- Ordering: Commits to the same table are totally ordered through Kafka partitioning. This means the optimistic concurrency check is always evaluated against the true latest state.
- Failed commits: When a commit fails (for example, due to a concurrent write detected by the version-order check), the Kafka message remains in the journal. This is expected and harmless: during journal replay (such as when a new replica starts up), the message is re-consumed, the same version-order check fails again, and the message is silently discarded since no HTTP thread is waiting for a response.
- Atomicity: The version creation and any artifact-level metadata updates are performed in a single SQL transaction. If the transaction fails, no data is written and the Kafka message becomes a no-op on replay.
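The replay behavior for failed commits can be pictured with a small simulation. This is only a conceptual sketch: the message layout (`base_version`, `metadata`) is invented for the example and does not reflect the actual KafkaSQL journal format.

```python
def replay(journal):
    """Apply journal messages in order; skip any whose base version is stale."""
    versions = []   # accepted table metadata versions, oldest first
    for message in journal:
        if message["base_version"] == len(versions):
            versions.append(message["metadata"])
        # Otherwise the version-order check fails and the message is a no-op.
    return versions


journal = [
    {"base_version": 0, "metadata": "v1"},
    {"base_version": 1, "metadata": "v2 from writer A"},
    {"base_version": 1, "metadata": "v2 from writer B"},   # lost the race
    {"base_version": 2, "metadata": "v3"},
]
first = replay(journal)
second = replay(journal)   # for example, a new replica starting up
print(first)
```

Because the check is deterministic, every replay of the same journal discards the same message and arrives at the same state.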
Requirements
Requirements are assertions about the current state of the table. Each requirement has a type field that specifies the kind of assertion.
| Type | Fields | Description |
|---|---|---|
|  | (none) | Asserts that the table does not exist. Fails if the table already has metadata. |
|  |  | Asserts that the table UUID matches the expected value. |
|  |  | Asserts that a named reference points to the expected snapshot. For the |
|  |  | Asserts that the current schema ID matches the expected value. |
|  |  | Asserts that the last assigned column ID matches the expected value. |
|  |  | Asserts that the last assigned partition field ID matches the expected value. |
|  |  | Asserts that the default partition spec ID matches the expected value. |
|  |  | Asserts that the default sort order ID matches the expected value. |
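To make the idea of requirements concrete, the sketch below validates a few of them against a metadata dictionary. It is a simplified illustration, not the registry's code: `check_requirements` and `RequirementFailed` are hypothetical, and the `assert-create` and `assert-table-uuid` type names and the `uuid` field are taken from the Iceberg REST Catalog specification rather than from this chapter's examples.

```python
class RequirementFailed(Exception):
    """Stands in for a 409 Conflict caused by an unmet requirement."""


def check_requirements(metadata, requirements):
    """Raise RequirementFailed on the first assertion that does not hold.

    metadata is None when the table does not exist yet.
    """
    for req in requirements:
        kind = req["type"]
        if kind == "assert-create":
            ok = metadata is None
        elif kind == "assert-table-uuid":
            ok = metadata is not None and metadata.get("table-uuid") == req["uuid"]
        elif kind == "assert-current-schema-id":
            ok = (
                metadata is not None
                and metadata.get("current-schema-id") == req["current-schema-id"]
            )
        else:
            raise ValueError(f"unknown requirement type: {kind}")
        if not ok:
            raise RequirementFailed(kind)


current = {"table-uuid": "abc-123", "current-schema-id": 0}
check_requirements(current, [{"type": "assert-current-schema-id", "current-schema-id": 0}])

try:
    check_requirements(current, [{"type": "assert-create"}])
    failed = None
except RequirementFailed as exc:
    failed = str(exc)
print(failed)
```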
Updates
Updates are mutations applied to the table metadata. Each update has an action field that specifies the type of mutation.
| Action | Fields | Description |
|---|---|---|
|  |  | Sets the table UUID. |
|  |  | Upgrades the table format version (cannot downgrade). |
|  |  | Adds a new schema to the table. The |
|  |  | Sets the current schema ID. |
|  |  | Adds a new partition spec. The |
|  |  | Sets the default partition spec ID. |
|  |  | Adds a new sort order to the table. |
|  |  | Sets the default sort order ID. |
|  |  | Adds a new snapshot. Updates the snapshot log and increments |
|  |  | Sets a named snapshot reference. If the ref name is |
|  |  | Removes snapshots by their IDs. |
|  |  | Removes a named snapshot reference. |
|  |  | Sets the table location. |
|  |  | Merges key-value pairs into the table properties. |
|  |  | Removes keys from the table properties. |
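The following sketch shows one way a handful of these updates could be applied to a metadata dictionary. It covers only the four actions used in this chapter's curl examples and works on a deep copy, so a failure part-way through leaves the original metadata untouched. `apply_updates` is a hypothetical function, not the registry's implementation.

```python
import copy


def apply_updates(metadata, updates):
    """Return new metadata with every update applied; the input is left untouched."""
    new = copy.deepcopy(metadata)
    for update in updates:
        action = update["action"]
        if action == "add-schema":
            new.setdefault("schemas", []).append(update["schema"])
        elif action == "set-current-schema":
            new["current-schema-id"] = update["schema-id"]
        elif action == "set-properties":
            new.setdefault("properties", {}).update(update["updates"])
        elif action == "remove-properties":
            for key in update["removals"]:
                new.setdefault("properties", {}).pop(key, None)
        else:
            raise ValueError(f"unknown update action: {action}")
    return new


before = {"current-schema-id": 0, "properties": {"deprecated-key": "x"}}
after = apply_updates(
    before,
    [
        {"action": "set-current-schema", "schema-id": 1},
        {"action": "set-properties", "updates": {"write.format.default": "parquet"}},
        {"action": "remove-properties", "removals": ["deprecated-key"]},
    ],
)
print(after)
```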
Example: Adding a schema and setting it as current
curl -X POST http://localhost:8080/apis/iceberg/v1/default/namespaces/my_database/tables/users \
-H "Content-Type: application/json" \
-d '{
"requirements": [
{"type": "assert-current-schema-id", "current-schema-id": 0}
],
"updates": [
{
"action": "add-schema",
"schema": {
"type": "struct",
"schema-id": 1,
"fields": [
{"id": 1, "name": "id", "required": true, "type": "long"},
{"id": 2, "name": "name", "required": true, "type": "string"},
{"id": 3, "name": "email", "required": false, "type": "string"},
{"id": 4, "name": "created_at", "required": true, "type": "timestamp"},
{"id": 5, "name": "updated_at", "required": false, "type": "timestamp"}
]
}
},
{"action": "set-current-schema", "schema-id": 1}
]
}'
Example: Adding a snapshot and updating the main branch
curl -X POST http://localhost:8080/apis/iceberg/v1/default/namespaces/my_database/tables/users \
-H "Content-Type: application/json" \
-d '{
"requirements": [
{"type": "assert-ref-snapshot-id", "ref": "main", "snapshot-id": -1}
],
"updates": [
{
"action": "add-snapshot",
"snapshot": {
"snapshot-id": 3051729675574597004,
"timestamp-ms": 1709550000000,
"summary": {"operation": "append"},
"manifest-list": "s3://my-bucket/warehouse/my_database/users/metadata/snap-3051729675574597004.avro"
}
},
{
"action": "set-snapshot-ref",
"ref-name": "main",
"snapshot-id": 3051729675574597004,
"type": "branch"
}
]
}'
Example: Setting and removing table properties
curl -X POST http://localhost:8080/apis/iceberg/v1/default/namespaces/my_database/tables/users \
-H "Content-Type: application/json" \
-d '{
"requirements": [],
"updates": [
{
"action": "set-properties",
"updates": {
"write.format.default": "parquet",
"write.parquet.compression-codec": "zstd"
}
},
{
"action": "remove-properties",
"removals": ["deprecated-key"]
}
]
}'
Error responses
| HTTP Status | Error Type | Description |
|---|---|---|
|  | (none) | Commit succeeded. Returns the updated |
|  |  | The request contains an unknown requirement type or update action. |
|  |  | The table does not exist. |
|  |  | A requirement was not met, or a concurrent commit was detected. |
Error handling
All Iceberg REST Catalog API endpoints return errors in the Iceberg error format. The following table lists the error types that can be returned by the API.
| HTTP Status | Error Type | Description |
|---|---|---|
|  |  | The request is malformed or contains invalid parameters. |
|  |  | A generic resource was not found. |
|  |  | The specified namespace does not exist. |
|  |  | The specified table does not exist. |
|  |  | A namespace or table with the same name already exists. |
|  |  | The namespace cannot be dropped because it still contains tables. |
|  |  | A commit requirement was not met, or a concurrent commit was detected. |
|  |  | An unexpected internal error occurred. |
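A client can turn such responses into exceptions. The sketch below assumes the error envelope defined by the Iceberg REST Catalog specification, an `error` object with `message`, `type`, and `code` fields. The `CatalogError` class and `parse_error` function are hypothetical, and the `NoSuchTableException` type string in the sample body is an illustrative value, not a confirmed Apicurio Registry response.

```python
import json


class CatalogError(Exception):
    """Client-side view of an Iceberg error response."""

    def __init__(self, code, error_type, message):
        super().__init__(f"{code} {error_type}: {message}")
        self.code, self.error_type, self.message = code, error_type, message

    @property
    def retryable(self):
        """Only a 409 is worth a reload-and-retry cycle."""
        return self.code == 409


def parse_error(status, body):
    """Build a CatalogError from a response body, tolerating non-JSON bodies."""
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        error = {}
    return CatalogError(
        error.get("code", status),
        error.get("type", "Unknown"),
        error.get("message", body),
    )


body = '{"error": {"message": "Table does not exist", "type": "NoSuchTableException", "code": 404}}'
err = parse_error(404, body)
print(err, err.retryable)
```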
Authentication and authorization
When Apicurio Registry is configured with authentication enabled, the Iceberg REST Catalog API enforces the same role-based access control as the core Apicurio Registry API. Each endpoint requires a minimum authorization level.
| Operation | Minimum Level | Notes |
|---|---|---|
| Get catalog configuration | Read |  |
| List namespaces | Read |  |
| Load namespace metadata | Read |  |
| Check if namespace exists | Read |  |
| Create a namespace | Write |  |
| Update namespace properties | Write |  |
| Drop a namespace | Admin | Namespace must be empty. |
| List tables in a namespace | Read |  |
| Create a table | Write |  |
| Load a table | Read |  |
| Check if table exists | Read |  |
| Commit updates to a table | Write |  |
| Drop a table | Admin |  |
| Rename a table | Write |  |
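The minimum-level rule can be expressed as a simple comparison. The sketch below assumes the levels are ordered Read < Write < Admin, so that a higher level includes the lower ones; `is_allowed` and both lookup tables are hypothetical and cover only a subset of the operations listed above.

```python
# Assumption: levels are ordered, and a higher level implies the lower ones.
LEVELS = {"Read": 1, "Write": 2, "Admin": 3}

MINIMUM_LEVEL = {
    "List namespaces": "Read",
    "Create a namespace": "Write",
    "Drop a namespace": "Admin",
    "Load a table": "Read",
    "Commit updates to a table": "Write",
    "Drop a table": "Admin",
    "Rename a table": "Write",
}


def is_allowed(operation, caller_level):
    """True when the caller's level meets the operation's minimum level."""
    return LEVELS[caller_level] >= LEVELS[MINIMUM_LEVEL[operation]]


print(is_allowed("Commit updates to a table", "Write"))
print(is_allowed("Drop a table", "Write"))
```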
Using Apicurio Registry with query engines
This section provides configuration examples for popular query engines that support the Iceberg REST Catalog specification.
Apache Spark
Configure Spark to use Apicurio Registry as the Iceberg catalog:
spark.conf.set("spark.sql.catalog.apicurio", "org.apache.iceberg.spark.SparkCatalog")
spark.conf.set("spark.sql.catalog.apicurio.type", "rest")
spark.conf.set("spark.sql.catalog.apicurio.uri", "http://localhost:8080/apis/iceberg/v1")
spark.conf.set("spark.sql.catalog.apicurio.prefix", "default")
-- Switch to the Apicurio catalog
USE apicurio;
-- Create a namespace
CREATE NAMESPACE my_database;
-- Create a table
CREATE TABLE apicurio.my_database.events (
id BIGINT,
event_type STRING,
event_time TIMESTAMP
) USING iceberg;
-- Query the table
SELECT * FROM apicurio.my_database.events;
Trino
Configure Trino to use Apicurio Registry as the Iceberg catalog by creating a catalog properties file:
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=http://localhost:8080/apis/iceberg/v1
iceberg.rest-catalog.prefix=default
-- Create a schema (namespace)
CREATE SCHEMA apicurio.my_database;
-- Create a table
CREATE TABLE apicurio.my_database.products (
id BIGINT,
name VARCHAR,
price DECIMAL(10, 2)
) WITH (format = 'PARQUET');
-- Query the table
SELECT * FROM apicurio.my_database.products;
DuckDB
Configure DuckDB to use Apicurio Registry as the Iceberg catalog:
-- Install and load the Iceberg extension
INSTALL iceberg;
LOAD iceberg;
-- Attach the Apicurio Registry catalog
ATTACH 'http://localhost:8080/apis/iceberg/v1' AS apicurio (TYPE ICEBERG);
-- Query tables from the catalog
SELECT * FROM apicurio.my_database.users;
ClickHouse
Configure ClickHouse to use Apicurio Registry as the Iceberg catalog:
-- Create a database using the Iceberg engine
CREATE DATABASE apicurio_db ENGINE = Iceberg(
'http://localhost:8080/apis/iceberg/v1',
'default',
'my_database'
);
-- Query tables from the catalog
SELECT * FROM apicurio_db.users;
API reference
The following table lists all Iceberg REST Catalog API endpoints supported by Apicurio Registry.
| Method | Endpoint | Description |
|---|---|---|
| GET |  | Get catalog configuration |
| GET |  | List all namespaces |
| POST |  | Create a namespace |
| GET |  | Load namespace metadata |
| HEAD |  | Check if namespace exists |
| DELETE |  | Drop a namespace (must be empty) |
| POST |  | Update namespace properties |
| GET |  | List tables in a namespace |
| POST |  | Create a table |
| GET |  | Load a table |
| POST |  | Commit updates to a table (atomic metadata update with optimistic concurrency) |
| HEAD |  | Check if table exists |
| DELETE |  | Drop a table |
| POST |  | Rename a table |
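Client code often needs to assemble these endpoint URLs. The sketch below builds the paths used by the curl examples earlier in this chapter; `catalog_url` and `table_url` are hypothetical helpers, the base URL is the local one from those examples, and multi-level namespaces are not handled.

```python
from urllib.parse import quote

BASE = "http://localhost:8080/apis/iceberg/v1"


def catalog_url(*segments, base=BASE):
    """Join percent-encoded path segments onto the catalog base URL."""
    return "/".join([base.rstrip("/")] + [quote(s, safe="") for s in segments])


def table_url(namespace, table, prefix="default"):
    """URL of a single table, as used for loading it or committing to it."""
    return catalog_url(prefix, "namespaces", namespace, "tables", table)


print(catalog_url("config"))
print(catalog_url("default", "tables", "rename"))
print(table_url("my_database", "users"))
```

Percent-encoding each segment keeps names that contain characters such as spaces or slashes from changing the path structure.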
Limitations
The Apicurio Registry Iceberg REST Catalog implementation has the following limitations:
- Views are not supported. Only Iceberg tables are currently supported. The ICEBERG_VIEW artifact type is reserved for future use.
- Multi-table transactions are not supported. Each CommitTable operation is scoped to a single table. Cross-table atomic commits are not available.
- Server-side content management. Apicurio Registry stores table metadata but does not manage data files. Clients are responsible for writing data files to the configured warehouse location.
- Rename is not atomic. Renaming a table creates the destination and then deletes the source as two separate operations. If the delete fails, both the source and destination may temporarily exist.
- Experimental feature. The Iceberg REST Catalog API requires apicurio.features.experimental.enabled=true and may change in future releases.

- For more information about the Iceberg REST Catalog specification, see the Apache Iceberg REST Catalog OpenAPI specification.
- For information on Apicurio Registry groups and artifacts, see Introduction to Apicurio Registry.
- For information about Apache Iceberg, see the Apache Iceberg documentation.
