Benchmarks

JMH numbers from benchmarks/, run side-by-side against Monocle where a direct equivalent exists and against a hand-rolled decode → modify → re-encode baseline for the EO-only JSON and PowerSeries optics.

Reading the numbers

All tables show average time per operation (lower is better). Where a ± figure appears, it is JMH's 99.9 % confidence interval. eo* / m* are the EO and Monocle methods; naive* is a hand-written baseline without any optic machinery.

Sample output, run on a Linux 6.19 x86-64 box, JDK 25, JMH 1.37, with -f 1 -i 5 -wi 3 -t 1 (one fork, five measurement iterations, three warmups, single thread). Total wall time: ~10 min. The absolute numbers will vary by hardware; the ratios reproduce across machines.

JMH caveats. Trustworthy numbers need a quiet machine, fork count ≥ 3, CPU-frequency scaling locked, and no background builds. The tables below are indicative, not publication-grade. See benchmarks/README.md for repeatable invocations.

Lens (Tuple2 carrier)

Person.age focused via a Lens. Same fixture on both sides.

| Operation | eo | Monocle | Monocle / eo |
|---|---|---|---|
| get | 0.45 ns | 0.52 ns | 1.16× |
| replace | 1.18 ns | 1.29 ns | 1.09× |
| modify | 1.37 ns | 1.52 ns | 1.11× |

The EO GetReplaceLens stores get / enplace as plain fields and specialises its fused modify on the class, so the hot path is a straight two-function composition — no Tuple2 allocation for the (X, A) intermediate that the generic extension would materialise.
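
As a rough illustration of the storage shape described above, here is a minimal sketch of a lens that keeps its two functions as plain fields and fuses modify into one read, one user function, and one write. SimpleLens, ageLens, and birthday are hypothetical names for this sketch only; this is not EO's GetReplaceLens, and the replace field is an assumed stand-in for whatever the real class stores.

```scala
final case class Person(name: String, age: Int)

// Hypothetical, simplified lens: NOT EO's GetReplaceLens, only the same storage idea.
final class SimpleLens[S, A](val get: S => A, val replace: (S, A) => S) {
  // Fused modify: read the focus, apply f, write it back.
  // No intermediate (S, A) tuple is allocated along the way.
  def modify(f: A => A): S => S = s => replace(s, f(get(s)))
}

val ageLens = new SimpleLens[Person, Int](_.age, (p, a) => p.copy(age = a))
val birthday: Person => Person = ageLens.modify(_ + 1)
```

Calling birthday on a Person returns a copy with age incremented and every other field untouched.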

Prism (Either carrier)

Option[Int] prism plus an Either[String, Int] Right-prism:

| Operation | eo | Monocle |
|---|---|---|
| getOption (Some) | 0.42 ns | 0.46 ns |
| getOption (None) | 0.42 ns | 0.47 ns |
| reverseGet | 1.06 ns | 1.10 ns |
| Right-getOption (Right) | 1.17 ns | 1.29 ns |
| Right-getOption (Left) | 0.46 ns | 0.80 ns |
| Right-reverseGet | 1.06 ns | 1.11 ns |

Iso (Forgetful carrier)

(Int, String) ↔ Person(age, name) bijection.

| Operation | eo | Monocle |
|---|---|---|
| get | 1.63 ns | 1.67 ns |
| reverseGet | 1.22 ns | 1.25 ns |

BijectionIso stores get / reverseGet as plain fields — same storage shape as Monocle's case class Iso, same direct-call hot path.
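
For readers who want to picture the fixture, the following is a minimal sketch of a bijection held as two plain function fields. SimpleIso and personIso are hypothetical names; neither EO's BijectionIso nor Monocle's Iso is reproduced here.

```scala
final case class Person(age: Int, name: String)

// Hypothetical stand-in for a bijection-style Iso: two plain function fields.
final case class SimpleIso[S, A](get: S => A, reverseGet: A => S)

// (Int, String) <-> Person(age, name), the same shape as the benchmarked fixture.
val personIso: SimpleIso[(Int, String), Person] =
  SimpleIso[(Int, String), Person](t => Person(t._1, t._2), p => (p.age, p.name))
```

Both directions are single direct calls, which is why the two libraries land so close together in the table.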

Optional (Affine carrier)

Composed through a Nested0..6 chain. The depth-3 / depth-6 EO variants compose the Lens chain directly onto the leaf Optional via cross-carrier .andThen: the Morph[Tuple2, Affine] instance auto-lifts each Lens hop into the Affine carrier, so no explicit morph step is required. This was made possible by dropping the <: Tuple bound on Affine.assoc (see Concepts → Cross-family composition).

| Operation | eo | Monocle |
|---|---|---|
| modify_0 (Some leaf) | 15.11 ns | 12.52 ns |
| modify_0_empty (None) | 0.72 ns | 0.65 ns |
| replace_0 | 7.96 ns | 1.74 ns |
| modify_3 | 79.02 ns | 33.45 ns |
| modify_6 | 121.97 ns | 52.49 ns |

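Monocle is faster on every row here: about 1.1× to 1.2× on the depth-0 modify rows, 2.3× to 2.4× on the depth-3 and depth-6 chains, and about 4.6× on replace_0. The EO path pays Affine's branching overhead relative to Monocle's Option-specialised internals.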

Getter (Forgetful carrier, no write)

| Depth | eo | Monocle |
|---|---|---|
| get_0 | 0.54 ns | 0.60 ns |
| get_3 | 1.50 ns | 7.88 ns |
| get_6 | 2.68 ns | 16.12 ns |

Monocle's composed Getter.andThen chain pays a per-hop typeclass dispatch that is not optimised away at call time. EO resolves the .get extension against each carrier's Accessor statically, so the chain inlines to direct function calls.

Getter composition isn't expressible through Optic.andThen in EO today (see Optics → Getter); the _3 / _6 EO numbers are from nested .get calls. Monocle's first-class Getter.andThen is the surface for its side.
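
To make "nested .get calls" concrete, here is a minimal sketch of a depth-3 read written that way. SimpleGetter and the Root / Mid / Leaf fixture are hypothetical and only illustrate the call shape; they are not the types used in the harness.

```scala
final case class Leaf(value: Int)
final case class Mid(leaf: Leaf)
final case class Root(mid: Mid)

// Hypothetical read-only optic: a thin wrapper around a function.
final case class SimpleGetter[S, A](get: S => A)

val rootMid   = SimpleGetter[Root, Mid](_.mid)
val midLeaf   = SimpleGetter[Mid, Leaf](_.leaf)
val leafValue = SimpleGetter[Leaf, Int](_.value)

// Depth-3 read expressed as nested .get calls instead of one composed optic.
def get3(r: Root): Int = leafValue.get(midLeaf.get(rootMid.get(r)))
```

Each hop is a plain function application, so there is no composed optic object to build or dispatch through.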

Setter (SetterF carrier, write-only)

| Depth | eo | Monocle |
|---|---|---|
| modify_0 | 1.45 ns | 1.27 ns |
| modify_3 | 25.37 ns | 13.18 ns |
| modify_6 | 50.27 ns | 27.32 ns |

Same composition caveat as Getter — EO's deep-modify benches nest modify calls where Monocle composes natively.
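
The nested-modify shape mentioned above might look like the following sketch. SimpleSetter and the fixture types are hypothetical; the real benches use EO's own Setter and may nest the calls differently.

```scala
final case class Leaf(value: Int)
final case class Mid(leaf: Leaf)
final case class Root(mid: Mid)

// Hypothetical write-only optic: lifts an update on the focus to an update on the whole.
final case class SimpleSetter[S, A](modify: (A => A) => S => S)

val rootMid   = SimpleSetter[Root, Mid](f => r => r.copy(mid = f(r.mid)))
val midLeaf   = SimpleSetter[Mid, Leaf](f => m => m.copy(leaf = f(m.leaf)))
val leafValue = SimpleSetter[Leaf, Int](f => l => l.copy(value = f(l.value)))

// Depth-3 update: each outer modify receives the function built by the inner one.
def modify3(f: Int => Int): Root => Root =
  rootMid.modify(midLeaf.modify(leafValue.modify(f)))
```

Every level allocates one closure and one copied record, which is consistent with the cost growing with depth in the table.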

Fold (Forget[F] carrier)

foldMap(identity) over List[Int], sweeping size.

| Size | eo | Monocle |
|---|---|---|
| 8 | 50.8 ns | 11.4 ns |
| 64 | 458.4 ns | 165.1 ns |
| 512 | 3 868.7 ns | 2 179.5 ns |

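Monocle wins here because its Fold.foldMap reduces to a direct Foldable[F].foldMap call; EO's Forget[F] carrier adds a per-element dispatch layer through ForgetfulFold. On this run that layer costs roughly 3 to 5 ns per element, which puts EO at about 4.5× Monocle's time at size 8 and about 1.8× at size 512.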
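
For reference, a direct foldMap with nothing between the container and the monoid looks roughly like the sketch below. The Monoid trait here is a local stand-in defined only to keep the example self-contained; it is not the cats typeclass, and the function is not Monocle's implementation.

```scala
// Minimal monoid typeclass, defined locally so the sketch needs no libraries.
trait Monoid[M] {
  def empty: M
  def combine(x: M, y: M): M
}

implicit val intSum: Monoid[Int] = new Monoid[Int] {
  def empty: Int = 0
  def combine(x: Int, y: Int): Int = x + y
}

// Direct foldMap: one pass over the list, one combine per element.
def foldMap[A, M](xs: List[A])(f: A => M)(implicit M: Monoid[M]): M =
  xs.foldLeft(M.empty)((acc, a) => M.combine(acc, f(a)))
```

Any extra layer that sits between the element and combine is paid once per element, which is the kind of overhead the table exposes.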

Traversal

each on List[Int], plus a modify(_ + 1) sweep:

| Size | eo (each) | Monocle (fromTraverse) | speedup |
|---|---|---|---|
| 8 | 17.8 ns | 119.4 ns | 6.71× |
| 64 | 145.7 ns | 1 352.5 ns | 9.28× |
| 512 | 1 939.5 ns | 16 214.0 ns | 8.36× |

A surprisingly large win — EO's Traversal.each (carrier MultiFocus[PSVec]) collects element references into a flat focus vector and rebuilds via Functor[PSVec].map, while Monocle's Traversal wraps each element in an Applicative[Id] traversal and pays the per-element wrapping cost.

JsonPrism — cursor-backed JSON edit

No Monocle equivalent at this layer. Two EO surfaces side by side: .modifyUnsafe (silent, pre-v0.2 shape) and .modify (default, returns Ior[Chain[JsonFailure], Json]). Baseline is the classical decode → modify → re-encode.

| Depth | .modifyUnsafe (ns) | .modify (Ior) (ns) | naive | Unsafe vs naive | Ior vs naive | Ior tax |
|---|---|---|---|---|---|---|
| 1 | 68.8 ± 2.7 | 67.4 ± 1.4 | 151 ns | 2.20× | 2.24× | ~0 ns |
| 2 | 114.6 ± 0.7 | 117.3 ± 2.8 | 151 ns | 1.32× | 1.29× | +2.7 ns |
| 3 | 129.3 ± 5.3 | 131.9 ± 4.9 | 234 ns | 1.81× | 1.77× | +2.5 ns |

The "Ior tax" column isolates the cost of opting into the default diagnostic-bearing surface: a single Ior.Right(json) case-class allocation per call, flat per path depth (not per step). At every depth the difference falls inside the ± intervals of the two measurements; at d1, Ior actually comes out slightly faster on this run.

A fourth "wide" fixture (28-field record, measured separately) showed the ratio narrow to ~1.04× because the naive decoder touches every field regardless of arity, while EO still walks only the focused path. That wide-record bench stayed on the pre-rename harness; the numbers in the table above are for the standard Person / Deep3 fixtures.

When to reach for .modifyUnsafe. If you've measured and want pre-v0.2 throughput exactly, or you have a hot path where the Ior.Right allocation matters (heap-sensitive loops, very short overall work units). Otherwise the default is the recommended entry: you get structured JsonFailure diagnostics at essentially the same speedup over naive (1.3× to 2.2× in the table above).
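
The behavioural difference between the two surfaces can be sketched as follows. Everything here is an assumption made for illustration: the Js model, the string-path API, and the use of Either in place of Ior[Chain[JsonFailure], Json] are all simplifications, and the silent variant is built on top of the diagnostic one purely for brevity (which says nothing about how the two relate in EO or how they perform).

```scala
// Tiny JSON model, defined locally only to keep the sketch self-contained.
sealed trait Js
final case class JObj(fields: Map[String, Js]) extends Js
final case class JNum(value: Int) extends Js

// Diagnostic surface: Left describes where the path broke.
// Only objects along the focused path are inspected and rebuilt.
def modify(json: Js, path: List[String])(f: Js => Js): Either[String, Js] =
  path match {
    case Nil => Right(f(json))
    case key :: rest =>
      json match {
        case JObj(fields) =>
          fields.get(key) match {
            case Some(child) =>
              modify(child, rest)(f).map(c => JObj(fields.updated(key, c)))
            case None => Left(s"missing field '$key'")
          }
        case _ => Left(s"expected an object at '$key'")
      }
  }

// Silent surface: on any failure the input comes back unchanged.
def modifyUnsafe(json: Js, path: List[String])(f: Js => Js): Js =
  modify(json, path)(f).getOrElse(json)
```

With the silent variant a mistyped path is indistinguishable from a successful no-op edit, which is the main reason to prefer the diagnostic default unless the allocation has been measured to matter.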

AvroPrism — direct-walk over IndexedRecord

EO-only — no Monocle equivalent at this layer. AvroPrism.modify walks the IndexedRecord tree along its precomputed PathStep array, decodes only at the focused leaf, and stitches the parents back together — the classical alternative is to decode the whole record into its case-class tree, mutate the leaf, and re-encode end-to-end.
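
The direct-walk idea can be sketched without Avro at all. In the block below, Rec is a hypothetical positional record that only mimics the get(i) / put(i, v) shape of IndexedRecord, and path is an assumed stand-in for a precomputed PathStep array; the real AvroPrism handles schemas, unions, and decoding at the leaf, none of which is modelled here.

```scala
// Hypothetical positional record: fields are addressed by index, as in IndexedRecord.
final class Rec(val fields: Array[Any]) {
  def get(i: Int): Any = fields(i)
  def put(i: Int, v: Any): Unit = { fields(i) = v }
}

// Walk `path` (one field index per level), apply f at the leaf,
// and stitch a fresh copy of each parent back together on the way up.
def modifyAt(rec: Rec, path: Array[Int])(f: Any => Any): Rec = {
  def go(r: Rec, depth: Int): Rec = {
    val i = path(depth)
    val updated: Any =
      if (depth == path.length - 1) f(r.get(i))
      else go(r.get(i).asInstanceOf[Rec], depth + 1)
    val copy = new Rec(r.fields.clone()) // shallow copy: siblings are shared, not decoded
    copy.put(i, updated)
    copy
  }
  go(rec, 0)
}
```

Only the records on the focused path are copied; sibling fields are carried over by reference and never inspected.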

Two depths benched in AvroOpticsBench: depth 1 (Person.name, shallow) and depth 3 (Deep3.d2.d1.atom.value, deep). Each depth has paired eo* / native* rows so JMH reports them side-by-side. The eo* side is the silent *Unsafe hot path; the native* side is the raw kindlings-avro-derivation codec round-trip (AvroCodec.decodeEither → case-class .copy chain → AvroCodec.encode). Same shape as JsonPrismBench for eo-circe; the codec library swap is the only difference.

Run them with the standard JMH invocation, filtering by class:

sbt "benchmarks/Jmh/run -i 5 -wi 3 -f 3 -t 1 .*AvroOpticsBench.*"

A third pair of rows (eoModifyIor*) reports the default Ior-bearing surface for the same fixtures — the additional cost is a single Ior.Right(record) allocation per call, mirroring the "Ior tax" datapoint reported for JsonPrism.

The plan target is "≤2× the unwrapped baseline at deep paths" — the direct-walk speedup absorbs Avro's per-step IndexedRecord.get(i) / put(i, v) cost without ever materialising the case-class tree at the intermediate depths.

JsonTraversal — items.each.name edits

Uppercasing every items[*].name inside a Basket record, at three array sizes. Same two-surface story as JsonPrism above:

| Items | .modifyUnsafe (ns) | .modify (Ior) (ns) | naive | Unsafe vs naive | Ior vs naive | Ior tax |
|---|---|---|---|---|---|---|
| 8 | 797 ± 18 | 819 ± 22 | 1 659 ns | 2.08× | 2.03× | +22 ns / +2.8 % |
| 64 | 5 991 ± 188 | 6 128 ± 83 | 12 196 ns | 2.04× | 1.99× | +137 ns / +2.3 % |
| 512 | 47 046 ± 601 | 48 928 ± 1 428 | 92 710 ns | 1.97× | 1.89× | +1 882 ns / +4.0 % |

Traversal's Ior tax scales with element count (~3.7 ns per element at size 512), consistent with per-element Chain bookkeeping on the success path. The gap to naive is ~2× at every size and stays ~2× when switching to the default Ior-bearing surface — the cursor-walk speedup absorbs the accumulator cost comfortably.

The ratio is roughly constant across sizes — the naive path pays a full decode / re-encode for every element, so both sides scale linearly and EO wins by a constant factor from avoiding the per-element codec round-trip.
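
The per-element Ior tax quoted above can be re-derived from the table with a few lines of arithmetic. The values are copied from the table; the names rows and taxPerElement are introduced only for this sketch.

```scala
// (items, .modifyUnsafe ns, .modify ns), copied from the JsonTraversal table.
val rows: List[(Int, Double, Double)] =
  List((8, 797.0, 819.0), (64, 5991.0, 6128.0), (512, 47046.0, 48928.0))

// Ior tax per element = (Ior time - Unsafe time) / number of items.
val taxPerElement: List[(Int, Double)] =
  rows.map { case (n, unsafe, ior) => (n, (ior - unsafe) / n) }
```

This gives about 2.8, 2.1, and 3.7 ns per element at sizes 8, 64, and 512, so the tax grows roughly in proportion to the element count.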

PowerSeries traversal with downstream composition

EO-only — no Monocle equivalent. Three variants in the harness, one per common chain shape, all sharing the PowerSeries-backed Traversal.each as the composition vehicle.

PowerSeriesBench.eoModify_powerEach: Lens → Traversal.each → Lens

Toggles isMobile on every Phone inside a Person.phones: ArraySeq[Phone]. The two-hop chain exercises the flat "dense singleton outer + multi-focus inner + dense singleton inner" pattern.

| Size | eo | naive | eo / naive |
|---|---|---|---|
| 4 | 66 ns | 13 ns | 5.1× |
| 32 | 275 ns | 80 ns | 3.4× |
| 256 | 1 890 ns | 726 ns | 2.6× |
| 1024 | 6 927 ns | 3 633 ns | 1.9× |
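
To give a sense of what the naive column competes against, a hand-written version of this edit is likely close to the sketch below. The field names number and name and the function toggleAll are hypothetical; only Person.phones: ArraySeq[Phone] and isMobile come from the description above, and the harness's actual baseline may differ.

```scala
import scala.collection.immutable.ArraySeq

final case class Phone(number: String, isMobile: Boolean)
final case class Person(name: String, phones: ArraySeq[Phone])

// Hand-written equivalent of Lens -> each -> Lens: copy the record, map the phones.
def toggleAll(p: Person): Person =
  p.copy(phones = p.phones.map(ph => ph.copy(isMobile = !ph.isMobile)))
```

A baseline this direct leaves very little fixed overhead, which is why the ratio is largest at small sizes and shrinks as per-element work dominates.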

PowerSeriesNestedBench.eoModify_nested — 5-hop tree of traversals

Company → List[Department] → ArraySeq[Employee] → Boolean. Two traversal fan-outs with Lens hops between them — the worst shape for flat-carrier composition.

| Size | eo | naive | eo / naive |
|---|---|---|---|
| 4 | 348 ns | 65 ns | 5.4× |
| 32 | 1 175 ns | 497 ns | 2.4× |
| 256 | 8 085 ns | 3 376 ns | 2.4× |

PowerSeriesPrismBench.eoModify_sparse: Traversal.each → Prism

Increments every Ok.value: Int inside an ArraySeq[Result] where Result = Ok(Int) | Err(String) is a 50/50 split. The Prism miss branch is the slow part of the composition machinery — this bench is the regression oracle for it.

| Size | eo | naive | eo / naive |
|---|---|---|---|
| 8 | 59 ns | 12 ns | 5.0× |
| 64 | 408 ns | 73 ns | 5.6× |
| 512 | 3 610 ns | 660 ns | 5.5× |
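
For comparison, a hand-written version of the sparse edit is probably little more than a pattern match inside a map, as in this sketch. The Err field name message and the function incrementOks are hypothetical, and the harness's real baseline may be written differently.

```scala
import scala.collection.immutable.ArraySeq

sealed trait Result
final case class Ok(value: Int) extends Result
final case class Err(message: String) extends Result

// Hand-written equivalent of each -> Prism: bump the hits, pass the misses through.
def incrementOks(rs: ArraySeq[Result]): ArraySeq[Result] =
  rs.map {
    case Ok(v) => Ok(v + 1)
    case err   => err // miss branch: the element is returned as-is
  }
```

In this form the miss branch costs a single type test, so any bookkeeping the optic does per miss shows up directly in the ratio.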

What makes these numbers possible

The PowerSeries carrier pairs an existential leftover xo: Snd[A] with a PSVec[B] focus vector — an Array[AnyRef] plus an (offset, length) window, so per-element slices during reassembly are pointer updates rather than arraycopies.
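
The windowing idea can be illustrated with a small sketch: a view over a shared Array[AnyRef] whose slices only adjust two integers. Window is a hypothetical class written for this illustration and is not the real PSVec, which has more cases and operations than shown here.

```scala
// Hypothetical windowed view over a shared backing array (not the real PSVec).
final class Window(val arr: Array[AnyRef], val offset: Int, val length: Int) {
  def apply(i: Int): AnyRef = {
    require(i >= 0 && i < length, s"index $i out of bounds for length $length")
    arr(offset + i)
  }

  // A sub-window shares the backing array: only offset and length change,
  // so no element is copied.
  def slice(from: Int, until: Int): Window = {
    require(0 <= from && from <= until && until <= length, "invalid slice bounds")
    new Window(arr, offset + from, until - from)
  }
}
```

Taking a slice allocates one small object regardless of how many elements the window covers.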

Most of the machinery is hidden behind the PSSingleton protocol — an internal trait implemented by the morphed Lens / Prism / Optional optics (their .to would otherwise build a throwaway PowerSeries + PSVec.Single per element). assoc.composeTo / composeFrom detect the protocol and call collectTo / reconstructSingleton directly, skipping the per-element wrapper allocations. The PSSingletonAlwaysHit refinement covers the "every call hits" case (Lens morphs) — it also skips the per-element length Array[Int] since every slot is implicitly 1.

Traversal.pEach's from rebuilds the container via Functor.map with a captured var counter — the State[Int, _] chain that Traverse.mapAccumulate would build shows up as 25 % CPU on State-thunk bookkeeping when used literally. For ArraySeq specifically both the .to (zero-copy from ArraySeq.ofRef.unsafeArray) and .from (direct ArraySeq.unsafeWrapArray of the shared PSVec.Slice backing array) go through an Array[AnyRef] end-to-end with a single System.arraycopy at most — Traverse[ArraySeq].map's builder-sizing path through SeqOps.size$ was 18 % CPU before this bypass.
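
The "map with a captured var counter" technique can be sketched on a plain List. The function rebuild is hypothetical and uses List.map in place of Functor.map; it relies on map visiting elements strictly and in order, which holds for List but is an assumption for an arbitrary Functor.

```scala
// Rebuild a container of the same shape from a flat array of results.
// A captured counter replaces the State[Int, _] threading that mapAccumulate would build.
def rebuild[A](shape: List[A], results: Array[AnyRef]): List[A] = {
  var i = 0
  shape.map { _ =>
    val r = results(i).asInstanceOf[A] // unchecked: caller guarantees element types
    i += 1
    r
  }
}
```

The original elements are ignored; only the container's shape is reused, and each position is filled from the next slot of the flat array.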

IntArrBuilder.unsafeAppend / ObjArrBuilder.unsafeAppend skip the per-call grow-check on hot paths where the total is known upfront (composeTo pre-sizes all three builders to n in PSSingleton paths). PSVec.Slice.unsafeShareableArray completes the end-to-end zero-copy story for freshly-built result arrays.
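
The difference between a checked and an unchecked append can be sketched as below. IntBuf is a hypothetical builder written for this illustration; it is not EO's IntArrBuilder, and the doubling growth policy is an assumption.

```scala
// Hypothetical growable Int buffer with a checked and an unchecked append.
final class IntBuf(initialCapacity: Int) {
  private var arr = new Array[Int](math.max(initialCapacity, 1))
  private var len = 0

  // Checked append: grows the backing array when it is full.
  def append(x: Int): Unit = {
    if (len == arr.length) arr = java.util.Arrays.copyOf(arr, arr.length * 2)
    arr(len) = x
    len += 1
  }

  // Unchecked append: skips the capacity test. The caller must have pre-sized
  // the buffer to the final element count, otherwise this throws.
  def unsafeAppend(x: Int): Unit = {
    arr(len) = x
    len += 1
  }

  def result(): Array[Int] = java.util.Arrays.copyOf(arr, len)
}
```

When the total is known upfront, pre-sizing once and calling unsafeAppend removes one branch per element from the hot loop.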

The cumulative effect vs the pre-optimisation baseline on this same harness is −59 % to −67 % ns and −58 % to −67 % alloc across the three benches. Ratios to naive now sit at 1.9× on the dense Lens chain (powerEach @ 1024), 2.4× on nested (sizes 32 and 256), and 5.0× to 5.6× on sparse-Prism. The remaining gap on sparse is inherent Prism miss-branch plumbing, and it sits at the low end of the 5-10× ratios other optic libraries publish for the same shape.

Traversal.each / pEach (carrier MultiFocus[PSVec]) covers both single-pass modifies and chains that continue past the traversal — same optic, same .modify, with .foldMap for read-only aggregation and .andThen for downstream composition.

See the composition notes for the full tradeoff matrix.

Reproducing

From the repo root:

# Trustworthy numbers — three forks, five iterations, three warmups.
sbt "benchmarks/Jmh/run -i 5 -wi 3 -f 3 -t 1"

# Smoke check — one fork, faster but noisier.
sbt "benchmarks/Jmh/run -i 3 -wi 2 -f 1 -t 1"

# Filter by class (JMH regex):
sbt "benchmarks/Jmh/run -i 5 -wi 3 -f 3 -t 1 .*JsonTraversalBench.*"

JMH's GC and stack profilers are useful when a number is surprising:

sbt "benchmarks/Jmh/run -i 5 -wi 3 -f 3 -prof gc .*LensBench.*"
sbt "benchmarks/Jmh/run -i 5 -wi 3 -f 3 -prof stack .*PowerSeries.*"