Using EJBCA as a Large-scale Enterprise PKI

Are your certificates counted not by the thousands but by the millions? One little-known but important fact about EJBCA is that it is built for scale. With the right tuning and configuration, EJBCA can sustain the issuance and OCSP response volumes of a PKI operating on a global scale. If your PKI requires world-class performance, then EJBCA is your choice.

The following assumes that your PKI conforms to one of the other solution areas, but that your problem is scale. Your current PKI solution might not be able to handle the volumes and throughput you need: your CRLs are growing out of proportion, clients are complaining about timeouts, and your VA is on its knees. Read on to find out how EJBCA can be tuned to handle even the world's biggest PKIs.

Clustering and Load Balancing

A large part of scaling is about scaling up the architecture of your PKI to meet your requirements. The first step toward handling more issuances and traffic is clustering: splitting the load between several instances of EJBCA working in concert. Clustering also adds a layer of reliability to your PKI, as a cluster with an odd number of nodes can survive the failure of one or more of its nodes as long as a majority of the nodes remains available. EJBCA has been designed to allow hot upgrades: your PKI stays active and running while the nodes in your cluster run different versions of EJBCA, resulting in zero downtime.
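The reliability claim can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the cluster (or its underlying database) uses majority-quorum semantics. The function name is hypothetical and not part of EJBCA.

```python
def tolerated_failures(nodes: int) -> int:
    """Number of nodes that may fail while a strict majority stays up.

    Assumption: the cluster stays operational only while more than
    half of its nodes are reachable (majority quorum).
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return (nodes - 1) // 2


if __name__ == "__main__":
    for size in (2, 3, 4, 5):
        print(f"{size} nodes tolerate {tolerated_failures(size)} failure(s)")
```

Under this assumption, going from three to four nodes does not increase the number of tolerated failures, which is why odd cluster sizes are usually preferred.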

images/inline/20e746007ff0c13e4efd3f7e95345b8d841bc7fa7db20eeee8d86dec4c40ff00.png

Likewise, clustering can be performed on VAs or RAs to ease the load on your PKI, depending on where your PKI is having performance issues:

  • If you are experiencing long response times or timeouts in your VA infrastructure, then either the VA's HSM or the database is overloaded by queries. This can be solved by adding more VAs or by clustering the VA instances.

  • If you are issuing/revoking certificates in large volumes, clustering the CAs will allow more nodes to do the work of revoking and publishing. Each revocation sent out to the VAs is just a single write per cluster.

If you have multiple VAs or VA clusters, it is essential to place them behind a load balancer that distributes the load across them.
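To illustrate what balancing the load across VAs means in practice, here is a minimal round-robin sketch that skips nodes marked as unhealthy. It is an illustration only and not how any particular load balancer product works. The class and the node names (`va1`, `va2`, `va3`) are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out VA nodes in turn, skipping nodes marked as down."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        if not self._nodes:
            raise ValueError("at least one node is required")
        self._healthy = set(self._nodes)
        self._ring = cycle(self._nodes)

    def mark_down(self, node):
        """Take a node out of rotation, e.g. after a failed health check."""
        self._healthy.discard(node)

    def mark_up(self, node):
        """Put a known node back into rotation."""
        if node in self._nodes:
            self._healthy.add(node)

    def next_node(self):
        """Return the next healthy node in round-robin order."""
        # One full lap around the ring is enough to find a healthy node.
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if node in self._healthy:
                return node
        raise RuntimeError("no healthy VA nodes available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["va1", "va2", "va3"])
    balancer.mark_down("va2")
    print([balancer.next_node() for _ in range(4)])
```

In production this role is normally filled by a dedicated load balancer with real health checks. The sketch only shows the routing idea.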

Learn more about running EJBCA in a cluster

Database Sharding

For extremely large databases, it may be desirable to shard the database over several database instances so that no single volume has to hold all the data.

To avoid storing everything on the same physical volume, configure the database.properties configuration file so that certificate bodies are stored in a separate table. Enable the property database.useSeparateCertificateTable to store the certificate body in the table Base64CertificateData instead of CertificateData.

images/inline/66f6a1cb431a438b61b4ad4f3c2469a0c6df10e377ef8e4561ab3222b83b63e2.png

Base64CertificateData can then be sharded and placed on a different database volume. For more information on EJBCA configurations, see Managing EJBCA Configurations.
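The effect of separating certificate bodies from certificate metadata can be sketched with an in-memory SQLite database. This is a simplified illustration, not EJBCA's actual schema: the column layout and the helper functions are assumptions, and only the two table names come from the text above.

```python
import sqlite3

# Assumption: a heavily simplified schema. The real tables hold many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CertificateData (
        fingerprint  TEXT PRIMARY KEY,
        serialNumber TEXT NOT NULL,
        status       INTEGER NOT NULL
    );
    CREATE TABLE Base64CertificateData (
        fingerprint TEXT PRIMARY KEY,
        base64Cert  TEXT NOT NULL
    );
    """
)


def store_certificate(fingerprint, serial, status, base64_cert):
    """Write the small metadata row and the large body row separately."""
    with conn:  # commits both inserts as one transaction
        conn.execute(
            "INSERT INTO CertificateData VALUES (?, ?, ?)",
            (fingerprint, serial, status),
        )
        conn.execute(
            "INSERT INTO Base64CertificateData VALUES (?, ?)",
            (fingerprint, base64_cert),
        )


def load_certificate_body(fingerprint):
    """Fetch the body only when it is actually needed."""
    row = conn.execute(
        "SELECT base64Cert FROM Base64CertificateData WHERE fingerprint = ?",
        (fingerprint,),
    ).fetchone()
    return row[0] if row else None
```

Because status lookups touch only the narrow metadata table, the table holding the bulky bodies can live on different storage without affecting those queries.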

CRL Partitioning

If your population of unexpired certificates is large and you rely on CRLs, you might find that CRL generation times are spiraling out of control and that CRL sizes are becoming unmanageable. EJBCA supports CRL partitioning in accordance with RFC 5280, allowing each certificate to be assigned to a specific CRL shard.

images/download/attachments/141984841/Screenshot_2021-01-18_at_14.32.31-version-1-modificationdate-1639387990000-api-v2-effects-drop-shadow.png

CRL partitioning means that the CRL is split into several shards instead of being issued as a single CRL. As the shards themselves grow in size, EJBCA allows you to suspend shards, with new ones created automatically.
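The assignment of new certificates to shards can be sketched as follows. This is a minimal illustration under stated assumptions, not EJBCA's implementation: it assumes that partitions are numbered from 1, that the lowest-numbered partitions are the suspended ones, and that a new certificate is placed at random in one of the remaining active partitions. The function name is hypothetical.

```python
import random


def assign_crl_partition(total_partitions, suspended_partitions, rng=random):
    """Pick an active CRL partition for a newly issued certificate.

    Assumption: partitions 1..suspended_partitions are suspended and
    receive no new certificates. The rest are active.
    """
    if total_partitions < 1:
        raise ValueError("at least one partition is required")
    if not 0 <= suspended_partitions < total_partitions:
        raise ValueError("at least one partition must remain active")
    # randint is inclusive on both ends.
    return rng.randint(suspended_partitions + 1, total_partitions)


if __name__ == "__main__":
    # Five partitions, the first two suspended: only 3, 4 and 5 are used.
    print(sorted({assign_crl_partition(5, 2) for _ in range(1000)}))
```

A suspended partition still has to be published until the certificates assigned to it have expired. It simply stops growing.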

Learn more about partitioned CRLs

Service Pinning

In a clustered EJBCA instance, services are executed semi-randomly: each service is run by the first node to activate within the granted service interval. If some services, for example CRL generation, take an excessive amount of time, the cluster node executing the service may suffer from latency, causing intermittent delays while the service is running. The easiest solution is to pin the service to a single node and remove that node from the load balancer's roster. All service executions then happen on that node only, while enrollment, issuance, and revocation operations are processed on the remaining nodes.
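The difference between unpinned and pinned execution can be sketched with a small model. This is an illustration of the idea only, assuming that the first eligible node to wake up within the interval runs the service. The function and node names are hypothetical and do not reflect EJBCA's internal API.

```python
def pick_runner(wake_order, cluster_nodes, pinned_nodes=()):
    """Return the node that runs the service in this interval.

    wake_order:    nodes in the order they happen to wake up.
    cluster_nodes: all nodes in the cluster.
    pinned_nodes:  if non-empty, only these nodes may run the service.
    """
    allowed = set(pinned_nodes) if pinned_nodes else set(cluster_nodes)
    allowed &= set(cluster_nodes)  # ignore pins that name unknown nodes
    for node in wake_order:
        if node in allowed:
            return node
    return None  # no eligible node woke up in this interval


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3"]
    # Unpinned: whichever node wakes first runs the service.
    print(pick_runner(["node2", "node1", "node3"], nodes))
    # Pinned: the service always lands on node3, regardless of wake order.
    print(pick_runner(["node2", "node1", "node3"], nodes, pinned_nodes=["node3"]))
```

With the service pinned to a node that no longer receives client traffic, long-running jobs such as CRL generation no longer compete with enrollment and revocation requests.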

images/inline/75d0594855df2b9509936121345be503f1f535a425ef097801711272f40ba1e7.png

images/download/attachments/141984841/Screenshot_2021-01-26_at_09.44.41-version-1-modificationdate-1639387989000-api-v2-effects-drop-shadow.png

Learn more about services in EJBCA

Precompiled OCSP Responses

Each OCSP reply requires an individual signature by the crypto token on the VA. While generated responses are cached by the EJBCA VA, validity times of OCSP replies are commonly short (less than one day) and caches are not shared between nodes in a cluster, so responses still need to be generated anew frequently. The traditional solution to this has been OCSP Stapling, caching the first reply encountered in the HTTP proxy. While this may solve the problem to some extent, it moves the burden of administering the reply cache over to you.

Instead, EJBCA offers Precompiled OCSP Responses. Also known as Canned OCSP, this functionality allows a VA to generate the full set of expected OCSP responses on a regular schedule, within a set time frame when lulls in traffic are expected. This can dramatically decrease the latency of the VA infrastructure.
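The idea behind precompiled responses can be sketched as a batch job that signs every expected response ahead of time, so that serving a request becomes a plain lookup. This is a conceptual sketch only: real OCSP responses are DER-encoded and signed with the responder's private key, whereas an HMAC stands in for the crypto token here. All names and the 12-hour validity are assumptions.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

SIGNING_KEY = b"demo-key"  # stand-in for the VA's crypto token (assumption)


def sign_response(serial, status, next_update):
    """Placeholder signature: an HMAC instead of a real OCSP signature."""
    payload = f"{serial}|{status}|{next_update.isoformat()}".encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def precompile_responses(cert_statuses, now, validity=timedelta(hours=12)):
    """Sign one response per certificate in a single batch run."""
    next_update = now + validity
    return {
        serial: {
            "status": status,
            "next_update": next_update,
            "signature": sign_response(serial, status, next_update),
        }
        for serial, status in cert_statuses.items()
    }


def lookup(cache, serial, now):
    """Serve a precompiled response. Returns None if it is missing or stale."""
    entry = cache.get(serial)
    if entry is None or entry["next_update"] <= now:
        return None
    return entry


if __name__ == "__main__":
    start = datetime(2021, 1, 26, 2, 0, tzinfo=timezone.utc)
    cache = precompile_responses({"0A": "good", "0B": "revoked"}, start)
    print(lookup(cache, "0B", start + timedelta(hours=1))["status"])
```

The signing cost is paid once per certificate during the quiet period, and no signature operation is needed when a client asks for the status.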

images/download/attachments/141984841/Screenshot_2021-01-26_at_10.35.08-version-1-modificationdate-1639387989000-api-v2-effects-drop-shadow.png

Learn more about Precompiled OCSP Responses

Ephemeral Certificates

EJBCA can be configured to function as an Ephemeral Certificate CA. In this mode, EJBCA functions as a high-speed certificate factory, issuing certificates but not storing any trace of them in the local database.

images/inline/06c82c2db337b633ac03c90a96c9564305f8eb7b02a65ae88871b2d97afda46f.png

While this mode still allows certificates to be revoked, it does not allow certificates to be searched for in the database, nor does it allow any constraints based on existing certificates to be enforced. For more information, see Ephemeral Certificates.