Cryptography

QuantumLock

Post-quantum cryptographic infrastructure benchmarked under CPU-only constraints.

Rust, ring, pqcrypto, serde, Docker

<50ms

Keygen Latency

27%

Keygen Reduction

1M+

Operations Benchmarked

Kyber-768

Security Level

Problem

Problem statement, constraint shape, and the gap this project explores.

Problem Statement

Can post-quantum algorithms meet latency requirements on commodity CPU-only hardware?

Challenge

QuantumLock is cryptographic infrastructure that balances security guarantees against operational constraints. It is designed for environments where performance, memory usage, and algorithm choice directly affect system viability.

Why Existing Approaches Failed

Most post-quantum cryptography research is benchmarked on high-end hardware with AVX-512 extensions. The real challenge is deploying PQ algorithms on constrained infrastructure: cloud VMs without AVX-512, edge devices, and environments where GPU acceleration is unavailable. QuantumLock explored whether Kyber-based key encapsulation and lattice-based operations could meet sub-50ms latency targets under these constraints.

Constraints

Non-negotiable boundaries that shaped the implementation.

Hardware

CPU-only, no AVX-512

Latency Target

<50ms per operation

Scale

1M+ operations benchmarked

Threads

8-core parallel execution

Memory

Bounded working set

Architecture

The primary design surface: flow, subsystem roles, and state boundaries.

Architecture Brief

QuantumLock implements a Kyber-768 key encapsulation pipeline in Rust using the pqcrypto crate family. The benchmark harness runs keygen, encapsulation, and decapsulation across thread pools of varying sizes to expose contention points.

Execution Flow

Keygen

Public/Private keypair

Encapsulate (shared secret + ciphertext)

Decapsulate (verify shared secret)

Benchmark record

KEM Pipeline

Kyber-768 keygen → encapsulation → decapsulation chain.
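The shape of that chain can be sketched with a small trait and a round-trip check. Everything named here (Kem, ToyKem, round_trip) is hypothetical and is not the pqcrypto API. ToyKem is a deliberately insecure placeholder so the sketch compiles with the standard library only; a real harness would implement the trait on top of a Kyber-768 binding instead.

```rust
/// Hypothetical KEM interface; names are illustrative, not the pqcrypto API.
trait Kem {
    type PublicKey;
    type SecretKey;
    type Ciphertext;

    fn keygen(&mut self) -> (Self::PublicKey, Self::SecretKey);
    fn encapsulate(&mut self, pk: &Self::PublicKey) -> ([u8; 32], Self::Ciphertext);
    fn decapsulate(&self, ct: &Self::Ciphertext, sk: &Self::SecretKey) -> [u8; 32];
}

/// Runs keygen -> encapsulate -> decapsulate once and reports whether
/// both sides ended up with the same shared secret.
fn round_trip<K: Kem>(kem: &mut K) -> bool {
    let (pk, sk) = kem.keygen();
    let (sender_secret, ct) = kem.encapsulate(&pk);
    let receiver_secret = kem.decapsulate(&ct, &sk);
    sender_secret == receiver_secret
}

/// INSECURE placeholder, NOT Kyber: the "public" key equals the secret key
/// and the randomness comes from a non-cryptographic xorshift generator.
struct ToyKem {
    state: u64,
}

impl ToyKem {
    fn new(seed: u64) -> Self {
        // xorshift must not start from zero.
        ToyKem { state: seed.max(1) }
    }

    fn next_bytes(&mut self) -> [u8; 32] {
        let mut out = [0u8; 32];
        for chunk in out.chunks_mut(8) {
            self.state ^= self.state << 13;
            self.state ^= self.state >> 7;
            self.state ^= self.state << 17;
            chunk.copy_from_slice(&self.state.to_le_bytes());
        }
        out
    }
}

fn xor32(a: &[u8; 32], b: &[u8; 32]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (o, (x, y)) in out.iter_mut().zip(a.iter().zip(b.iter())) {
        *o = x ^ y;
    }
    out
}

impl Kem for ToyKem {
    type PublicKey = [u8; 32];
    type SecretKey = [u8; 32];
    type Ciphertext = [u8; 32];

    fn keygen(&mut self) -> ([u8; 32], [u8; 32]) {
        let key = self.next_bytes();
        (key, key)
    }

    fn encapsulate(&mut self, pk: &[u8; 32]) -> ([u8; 32], [u8; 32]) {
        let secret = self.next_bytes();
        (secret, xor32(&secret, pk))
    }

    fn decapsulate(&self, ct: &[u8; 32], sk: &[u8; 32]) -> [u8; 32] {
        xor32(ct, sk)
    }
}
```

Keeping the chain behind a trait lets the same round-trip check and timing code run against any backend, which is one way a harness could compare implementations.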

Benchmark Harness

Criterion-based microbenchmarks with statistical analysis.

Thread Pool Manager

Rayon-based parallelism with configurable worker counts.
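A configurable worker count can be illustrated without external crates. The sketch below is a std-only stand-in for the Rayon-based pool, not the project's code: split_work and run_with_workers are hypothetical names, and scoped threads replace Rayon's work-stealing scheduler.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Splits `total_ops` as evenly as possible across `workers` threads.
fn split_work(total_ops: usize, workers: usize) -> Vec<usize> {
    let workers = workers.max(1);
    let base = total_ops / workers;
    let extra = total_ops % workers;
    // The first `extra` workers take one additional operation each.
    (0..workers).map(|i| base + usize::from(i < extra)).collect()
}

/// Runs `op` a total of `total_ops` times on `workers` scoped threads.
/// Returns the number of completed operations and the wall-clock time.
fn run_with_workers<F>(total_ops: usize, workers: usize, op: F) -> (usize, Duration)
where
    F: Fn() + Sync,
{
    let shares = split_work(total_ops, workers);
    let start = Instant::now();
    let completed = thread::scope(|s| {
        let handles: Vec<_> = shares
            .iter()
            .map(|&n| {
                let op = &op;
                s.spawn(move || {
                    for _ in 0..n {
                        op();
                    }
                    n
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<usize>()
    });
    (completed, start.elapsed())
}
```

Calling run_with_workers with the same workload and worker counts of 1, 2, 4, and 8 gives a rough scaling curve; a flat or falling curve points at shared state such as a contended RNG.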

SIMD Dispatcher

Runtime detection of SIMD capabilities with scalar fallback.
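One common way to do this in Rust is to check CPU features at runtime and call a function compiled with those features enabled, falling back to portable code otherwise. The sketch below assumes coefficient-wise addition modulo Kyber's q = 3329 as the workload. The function names are hypothetical, and the AVX2 path relies on compiler auto-vectorization rather than hand-written intrinsics.

```rust
/// Kyber's modulus. Inputs are assumed to be already reduced to [0, Q).
const Q: u16 = 3329;

/// Portable scalar path: coefficient-wise addition modulo Q.
#[inline(always)]
fn add_mod_q_scalar(a: &[u16], b: &[u16], out: &mut [u16]) {
    for ((o, &x), &y) in out.iter_mut().zip(a).zip(b) {
        debug_assert!(x < Q && y < Q);
        let sum = x + y; // below 2 * Q, so it fits in u16
        *o = if sum >= Q { sum - Q } else { sum };
    }
}

/// Same loop, compiled with AVX2 enabled so the compiler may vectorize it.
/// Must only be called after AVX2 support has been confirmed.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn add_mod_q_avx2(a: &[u16], b: &[u16], out: &mut [u16]) {
    add_mod_q_scalar(a, b, out)
}

/// Picks the best available path at runtime and reports which one ran.
fn add_mod_q(a: &[u16], b: &[u16], out: &mut [u16]) -> &'static str {
    assert!(a.len() == b.len() && a.len() == out.len(), "length mismatch");
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was confirmed on the line above.
            unsafe { add_mod_q_avx2(a, b, out) };
            return "avx2";
        }
    }
    add_mod_q_scalar(a, b, out);
    "scalar"
}
```

Returning the name of the chosen path makes it easy to record in benchmark output which code actually ran on a given machine.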

Serialization Layer

Environment first, numbers second. Metrics should be inspectable, not ornamental.

Test Environment

Runtime

Rust / ring

Workload

<50ms CPU-only latency

Stack

Rust, ring, pqcrypto, serde

Scope

Cryptography

Evidence

Project-level benchmark notes

Performance Results

Keygen Latency

<50ms

Keygen Reduction

27%

Operations Benchmarked

1M+ ops

Security Level

Kyber-768 (NIST Level 3)

Threads

8 cores

Lessons Learned

Engineering takeaways from the implementation, including remaining work.

RNG contention is the silent killer of parallel cryptographic benchmarks: always use a per-thread RNG.
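A minimal sketch of the per-thread pattern follows. It assumes a SplitMix64 generator as a stand-in: SplitMix64 is NOT cryptographically secure and is used only so the example needs no external crates, and draw_per_thread is a hypothetical name. A real benchmark would give each worker its own CSPRNG instance.

```rust
use std::thread;

/// SplitMix64: a small NON-cryptographic generator, standing in for a
/// real per-thread CSPRNG.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Each worker owns its generator, so drawing randomness never takes a
/// shared lock. Returns the values drawn by each worker.
fn draw_per_thread(workers: usize, draws_per_worker: usize, base_seed: u64) -> Vec<Vec<u64>> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|id| {
                s.spawn(move || {
                    // Derive a distinct seed per worker from its index.
                    let seed = base_seed ^ (id as u64).wrapping_mul(0xD6E8_FEB8_6659_FD93);
                    let mut rng = SplitMix64(seed);
                    (0..draws_per_worker)
                        .map(|_| rng.next_u64())
                        .collect::<Vec<u64>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect::<Vec<_>>()
    })
}
```

The contended alternative is a single generator behind a Mutex shared by all workers; in that layout the time spent waiting for the lock gets measured as if it were cryptographic work.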

SIMD gains in polynomial arithmetic are more significant than thread-level parallelism for this workload.

Benchmarking crypto correctly requires careful use of black_box() and warm-up iterations to avoid compiler artifacts.
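The two safeguards can be shown with a std-only timing loop. This is a sketch, not the project's Criterion setup: measure and percentile are hypothetical helpers, and std::hint::black_box is only a best-effort hint to the optimizer.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `op` for `warmup` untimed iterations, then times `samples` calls.
/// Returns the per-call durations sorted in ascending order.
fn measure<T, F: FnMut() -> T>(mut op: F, warmup: usize, samples: usize) -> Vec<Duration> {
    // Warm-up: fill caches and let the CPU settle; results are discarded.
    for _ in 0..warmup {
        black_box(op());
    }
    let mut timings = Vec::with_capacity(samples);
    for _ in 0..samples {
        let start = Instant::now();
        // black_box keeps the result observable so the call is not removed.
        let _result = black_box(op());
        timings.push(start.elapsed());
    }
    timings.sort();
    timings
}

/// Nearest-rank percentile over an ascending slice; `p` is clamped to 0..=100.
fn percentile(sorted: &[Duration], p: usize) -> Option<Duration> {
    if sorted.is_empty() {
        return None;
    }
    let rank = ((p.min(100) * sorted.len() + 99) / 100).max(1);
    Some(sorted[rank - 1])
}
```

Inputs deserve the same treatment as outputs: wrapping an argument as black_box(input) inside the closure discourages constant folding. A latency target such as <50ms can then be checked against a high percentile, for example by comparing percentile(&timings, 99) with Duration::from_millis(50), rather than against the mean.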

Future: explore hardware acceleration via dedicated crypto coprocessors and measure the gap vs pure-Rust implementations.
