Configuring your Apicurio Registry deployment

This chapter explains how to set important configuration options for your Apicurio Registry deployment. This includes features such as the Apicurio Registry web console, logging, health checks, and observability.

For a list of all available configuration options, see Apicurio Registry configuration reference.

Configuring the Apicurio Registry web console

You can set optional environment variables to configure the Apicurio Registry web console specifically for your deployment environment or to customize its behavior.

Prerequisites
  • You have already installed Apicurio Registry.

Configuring the web console deployment environment

When you access the Apicurio Registry web console in your browser, some initial configuration settings are loaded. The following configuration settings are required:

  • URL for core Apicurio Registry server REST API v3

Typically, the Apicurio Registry Operator automatically configures the UI component with the REST API v3 URL. However, you can override this value by setting the appropriate environment variable in the UI component deployment configuration.

Procedure

Configure the following environment variables to override the default URL:

  • REGISTRY_API_URL: Specifies the URL for the core Apicurio Registry server REST API v3. For example, https://registry-api.my-domain.com/apis/registry/v3
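For example, in the same environment style used by the other examples in this chapter (the URL shown is illustrative):

Example: Overriding the REST API URL
environment:
  REGISTRY_API_URL: "https://registry-api.my-domain.com/apis/registry/v3"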

Configuring the web console in read-only mode

You can configure the Apicurio Registry web console in read-only mode as an optional feature. This mode disables all features in the Apicurio Registry web console that allow users to make changes to registered artifacts, including the following:

  • Creating a group

  • Creating an artifact

  • Uploading a new artifact version

  • Updating artifact metadata

  • Deleting an artifact

Procedure

Configure the following environment variable:

  • REGISTRY_FEATURE_READ_ONLY: Set to true to enable read-only mode. Defaults to false.
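The following example shows this setting in the same environment style used by the other examples in this chapter:

Example: Enabling read-only mode
environment:
  REGISTRY_FEATURE_READ_ONLY: "true"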

Configuring Apicurio Registry observability with OpenTelemetry

You can configure Apicurio Registry to export telemetry data using OpenTelemetry (OTel) for comprehensive observability. This includes distributed tracing, metrics export over the OTLP protocol, and log correlation with trace context.

Apicurio Registry is built with OpenTelemetry support and all telemetry signals (traces, metrics, logs) are enabled at build time. However, the OpenTelemetry SDK is disabled by default at runtime. When enabled, Apicurio Registry exports telemetry data to an OpenTelemetry-compatible collector such as Jaeger, Grafana Tempo, or the OpenTelemetry Collector.

The individual signal properties (QUARKUS_OTEL_TRACES_ENABLED, QUARKUS_OTEL_METRICS_ENABLED, QUARKUS_OTEL_LOGS_ENABLED) are build-time properties and cannot be changed at runtime. All signals are already enabled in the Apicurio Registry build. Use QUARKUS_OTEL_SDK_DISABLED=false to enable telemetry at runtime.

Prerequisites
  • You have already installed Apicurio Registry.

  • You have an OpenTelemetry-compatible backend available (for example, Jaeger, Grafana Tempo, or OpenTelemetry Collector).

Enabling OpenTelemetry

To enable OpenTelemetry observability, configure the following environment variables:

Table 1. Environment variables for enabling OpenTelemetry

  • QUARKUS_OTEL_SDK_DISABLED: Set to false to enable the OpenTelemetry SDK and all telemetry signals. Default is true (disabled).

  • QUARKUS_OTEL_EXPORTER_OTLP_ENDPOINT: The endpoint URL of your OpenTelemetry collector. For example, http://jaeger:4317 for gRPC or http://jaeger:4318 for HTTP.

Example: Enabling OpenTelemetry with Jaeger
environment:
  QUARKUS_OTEL_SDK_DISABLED: "false"
  QUARKUS_OTEL_EXPORTER_OTLP_ENDPOINT: "http://jaeger:4317"

Configuring trace sampling for production

In production environments, you should configure trace sampling to reduce overhead and control the volume of trace data:

Table 2. Environment variables for trace sampling

  • QUARKUS_OTEL_TRACES_SAMPLER: The sampling strategy. Use parentbased_traceidratio for production.

  • QUARKUS_OTEL_TRACES_SAMPLER_ARG: The sampling ratio (0.0 to 1.0). A value of 0.1 samples 10% of traces.

Example: Production sampling configuration (10% of traces)
environment:
  QUARKUS_OTEL_SDK_DISABLED: "false"
  QUARKUS_OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector:4317"
  QUARKUS_OTEL_TRACES_SAMPLER: "parentbased_traceidratio"
  QUARKUS_OTEL_TRACES_SAMPLER_ARG: "0.1"

Configuring structured logging with trace context

When using JSON logging format, Apicurio Registry automatically includes trace context (trace ID and span ID) in log entries. This enables correlation between logs and traces.

Example: Enabling structured logging with trace context
environment:
  QUARKUS_OTEL_SDK_DISABLED: "false"
  QUARKUS_LOG_CONSOLE_JSON: "true"
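An illustrative log entry with injected trace context might look like the following (field names and values are examples only; the exact format depends on your Quarkus JSON logging configuration):

{"timestamp":"2024-01-01T12:00:00.000Z","level":"INFO","message":"Artifact created","traceId":"4bf92f3577b34da6a3ce929d0e0e4736","spanId":"00f067aa0ba902b7"}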

OpenTelemetry features in Apicurio Registry

When OpenTelemetry is enabled, Apicurio Registry provides the following observability features:

  • Distributed tracing: All REST API requests are automatically traced with spans containing request details, path parameters, and Apicurio-specific attributes such as groupId, artifactId, and version.

  • Storage layer tracing: All storage operations create child spans, enabling you to trace the complete request flow from REST API to database.

  • Kafka tracing: When using KafkaSQL storage, Kafka operations are automatically traced with context propagation.

  • Custom metrics: OpenTelemetry metrics for artifact operations, schema validations, and search requests are exported alongside existing Prometheus metrics.

  • Log correlation: When JSON logging is enabled, trace context is automatically injected into log entries for easy correlation.

Performance considerations

OpenTelemetry instrumentation adds a small performance overhead. The following table shows the measured impact when all telemetry signals are enabled with 100% sampling:

Table 3. Performance impact with OpenTelemetry enabled (100% sampling)

Operation          Latency Increase   Throughput Decrease   Impact Level
System Info        +9% (+0.35ms)      -7%                   Low
Create Artifact    +4% (+0.13ms)      -1%                   Minimal
Get Artifact       +5% (+0.05ms)      -6%                   Minimal
Search Artifacts   +1% (+0.04ms)      -1%                   Minimal
List Groups        +6% (+0.16ms)      -5%                   Low

Key findings:

  • Average overhead is approximately 4-6% in throughput with 100% trace sampling.

  • With the recommended 10% sampling ratio (QUARKUS_OTEL_TRACES_SAMPLER_ARG=0.1), the overhead is reduced to less than 1%.

  • OpenTelemetry signals are disabled by default to avoid any overhead for users who do not require observability features.

To reproduce these benchmarks, run the following commands from the project root:

# Build the project
./mvnw clean install -DskipTests

# Run benchmark with OTEL disabled (baseline)
./mvnw test -pl app -Dtest=OpenTelemetryPerformanceTest \
    -DOpenTelemetryPerformanceTest=enabled

# Run benchmark with OTEL enabled
./mvnw test -pl app -Dtest=OpenTelemetryPerformanceEnabledTest \
    -DOpenTelemetryPerformanceTest=enabled

Backwards compatibility

OpenTelemetry support is fully backwards compatible:

  • The existing Prometheus metrics endpoint (/q/metrics) remains available and unchanged.

  • Health check endpoints (/q/health/*) continue to work as before.

  • All existing Micrometer-based metrics continue to function.
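You can verify this after enabling OpenTelemetry by querying the existing endpoints, for example (assuming Apicurio Registry is reachable at localhost:8080):

curl -s http://localhost:8080/q/metrics
curl -s http://localhost:8080/q/health/ready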

Configuring Apicurio Registry health checks on OpenShift

You can configure optional environment variables for liveness and readiness probes to monitor the health of the Apicurio Registry server on OpenShift:

  • Liveness probes test if the application can make progress. If the application cannot make progress, OpenShift automatically restarts the failing Pod.

  • Readiness probes test if the application is ready to process requests. If the application is not ready, it can become overwhelmed by requests, and OpenShift stops sending requests for the time that the probe fails. If other Pods are OK, they continue to receive requests.

The default values of the liveness and readiness environment variables are designed for most cases and should only be changed if required by your environment. Any changes to the defaults depend on your hardware, network, and amount of data stored. These values should be kept as low as possible to avoid unnecessary overhead.

Prerequisites
  • You must have an OpenShift cluster with cluster administrator access.

  • You must have already installed Apicurio Registry on OpenShift.

  • You must have already installed and configured your chosen Apicurio Registry storage in either Strimzi or PostgreSQL.

Procedure
  1. In the OpenShift Container Platform web console, log in using an account with cluster administrator privileges.

  2. Click Installed Operators > Apicurio Registry.

  3. On the ApicurioRegistry tab, click the ApicurioRegistry custom resource for your deployment, for example, example-apicurioregistry.

  4. In the main overview page, find the Deployment Name section and the corresponding DeploymentConfig name for your Apicurio Registry deployment, for example, example-apicurioregistry.

  5. In the left navigation menu, click Workloads > Deployment Configs, and select your DeploymentConfig name.

  6. Click the Environment tab, and enter your environment variables in the Single values env section, for example:

    • NAME: LIVENESS_STATUS_RESET

    • VALUE: 350

  7. Click Save at the bottom.

    Alternatively, you can perform these steps using the OpenShift oc command. For more details, see the OpenShift CLI documentation.
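    For example, the environment variable from the previous step can be set with a single command (the DeploymentConfig name example-apicurioregistry is illustrative):

    oc set env dc/example-apicurioregistry LIVENESS_STATUS_RESET=350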

Environment variables for Apicurio Registry health checks

This section describes the available environment variables for Apicurio Registry health checks on OpenShift. These include liveness and readiness probes to monitor the health of the Apicurio Registry server on OpenShift. For an example procedure, see Configuring Apicurio Registry health checks on OpenShift.

The following environment variables are provided for reference only. The default values are designed for most cases and should only be changed if required by your environment. Any changes to the defaults depend on your hardware, network, and amount of data stored. These values should be kept as low as possible to avoid unnecessary overhead.

Liveness environment variables

Table 4. Environment variables for Apicurio Registry liveness probes

  • LIVENESS_ERROR_THRESHOLD (Integer, default: 1): Number of liveness issues or errors that can occur before the liveness probe fails.

  • LIVENESS_COUNTER_RESET (Seconds, default: 60): Period in which the threshold number of errors must occur. For example, if this value is 60 and the threshold is 1, the check fails after two errors occur in 1 minute.

  • LIVENESS_STATUS_RESET (Seconds, default: 300): Number of seconds that must elapse without any more errors for the liveness probe to reset to OK status.

  • LIVENESS_ERRORS_IGNORED (String, default: io.grpc.StatusRuntimeException,org.apache.kafka.streams.errors.InvalidStateStoreException): Comma-separated list of ignored liveness exceptions.

Because OpenShift automatically restarts a Pod that fails a liveness check, the liveness settings, unlike readiness settings, do not directly affect behavior of Apicurio Registry on OpenShift.
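For example, you can set several liveness variables at once with the oc command (the DeploymentConfig name and values are illustrative, not recommendations):

oc set env dc/example-apicurioregistry LIVENESS_ERROR_THRESHOLD=3 LIVENESS_COUNTER_RESET=120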

Readiness environment variables

Table 5. Environment variables for Apicurio Registry readiness probes

  • READINESS_ERROR_THRESHOLD (Integer, default: 1): Number of readiness issues or errors that can occur before the readiness probe fails.

  • READINESS_COUNTER_RESET (Seconds, default: 60): Period in which the threshold number of errors must occur. For example, if this value is 60 and the threshold is 1, the check fails after two errors occur in 1 minute.

  • READINESS_STATUS_RESET (Seconds, default: 300): Number of seconds that must elapse without any more errors for the readiness probe to reset to OK status. In this case, this means how long the Pod stays not ready until it returns to normal operation.

  • READINESS_TIMEOUT (Seconds, default: 5): Readiness tracks the timeout of two operations: how long it takes for storage requests to complete, and how long it takes for HTTP REST API requests to return a response. If these operations take longer than the configured timeout, this counts as a readiness issue or error. This value controls the timeout for both operations.
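For example, you can set readiness variables with the oc command (the DeploymentConfig name and values are illustrative, not recommendations):

oc set env dc/example-apicurioregistry READINESS_ERROR_THRESHOLD=3 READINESS_TIMEOUT=10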