1. Side-by-side
| Property | SHA-256d (Bitcoin) | B3PoW-Scratch v1.1.1 (B3Chain) |
|---|---|---|
| Inner round function | SHA-256 (Merkle–Damgård) | BLAKE3 (Bao tree) |
| State per attempt | 32 bytes | 1 048 576 bytes (1 MiB scratchpad) |
| Lanes | 1 | 8 lanes × 128 KiB; cross-lane diffusion via LANE_SHUFFLE permutation L' = (5L+1) mod 8 = [1,6,3,0,5,2,7,4] |
| Inner rounds per nonce | 2 | 2048 outer × 2 inner = 4096 |
| Memory bandwidth required | trivial (fits in L1) | 8 lanes × 64 B = 512 B of working set per attempt, dominated by re-reads of the ~1 MiB pad |
| Time per attempt on CPU (single thread, median) | ~1.5 µs | ~700 µs |
| ASIC speed-up upper bound vs 1-thread CPU | ~10^7× (mature market, custom ALUs) | bounded by on-die SRAM / HBM bandwidth; commercially typical 1-3 GB/s per chip ≈ 10^3-10^3.5×, not 10^7× |
| Verifier wall-clock budget | none (verification is microseconds) | 50 ms per header (consensus.b3pow_verify_budget_ms) |
| Verifier cache | not needed | per-prev_block_hash LRU (b3pow::Cache), 4 entries by default (≈ 4 MB resident) |
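
The arithmetic in the table can be checked directly. The sketch below is illustrative only: the constant and function names are hypothetical and not taken from SPEC.md, and `sha256d` is plain double SHA-256 over arbitrary bytes, not a full Bitcoin header hash.

```python
import hashlib

# Hypothetical names for the table's parameters (not taken from SPEC.md).
LANES = 8
LANE_BYTES = 128 * 1024          # 128 KiB per lane
OUTER_ROUNDS = 2048
INNER_ROUNDS = 2


def lane_shuffle(lane: int) -> int:
    """Cross-lane permutation from the table: L' = (5L + 1) mod 8."""
    return (5 * lane + 1) % LANES


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the SHA-256d construction in the left column."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


shuffle_table = [lane_shuffle(lane) for lane in range(LANES)]
scratchpad_bytes = LANES * LANE_BYTES
rounds_per_nonce = OUTER_ROUNDS * INNER_ROUNDS

print(shuffle_table)       # [1, 6, 3, 0, 5, 2, 7, 4]
print(scratchpad_bytes)    # 1048576
print(rounds_per_nonce)    # 4096
print(len(sha256d(b"")))   # 32
```

Because 5 is coprime to 8, the map is a bijection on the lanes, and it has no fixed point, so no lane is shuffled onto itself.
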
2. Why this matters
The headline number is the ASIC speed-up upper bound. A SHA-256d ASIC achieves enormous gains because the algorithm's state fits inside a single 32-byte register file; the entire datapath can be built as pipelined combinational logic clocked at 1-3 GHz, with O(10^4) parallel pipelines on one die.
A B3PoW-Scratch ASIC cannot do that: every nonce needs to read megabytes of memory in a data-dependent pattern, so the die area that would otherwise go to more pipelines has to be spent on more SRAM instead. The ASIC advantage is therefore capped near the memory-bandwidth-per-watt frontier, where commodity GPUs and CPUs already operate.
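
To put the two ceilings from the table side by side, a back-of-envelope calculation helps. The sketch below only restates the table's figures; the variable names are illustrative, and the results inherit all the uncertainty of those estimates.

```python
import math

# Figures quoted in the comparison table (estimates, not measurements made here).
SHA256D_ASIC_SPEEDUP = 1e7            # ~10^7x vs one CPU thread
B3POW_ASIC_SPEEDUP_LOW = 1e3          # ~10^3x
B3POW_ASIC_SPEEDUP_HIGH = 10 ** 3.5   # ~10^3.5x

CPU_US_SHA256D = 1.5    # median single-thread time per attempt, microseconds
CPU_US_B3POW = 700.0

# How much smaller the estimated ASIC advantage is, in orders of magnitude.
gap_low = math.log10(SHA256D_ASIC_SPEEDUP / B3POW_ASIC_SPEEDUP_HIGH)
gap_high = math.log10(SHA256D_ASIC_SPEEDUP / B3POW_ASIC_SPEEDUP_LOW)

# How much more CPU time one B3PoW-Scratch attempt costs than one SHA-256d attempt.
cpu_cost_ratio = CPU_US_B3POW / CPU_US_SHA256D

print(f"upper ceiling: ~{B3POW_ASIC_SPEEDUP_HIGH:.0f}x")
print(f"ASIC advantage shrinks by {gap_low:.1f} to {gap_high:.1f} orders of magnitude")
print(f"CPU cost per attempt: ~{cpu_cost_ratio:.0f}x higher")
```

Under these figures the estimated ASIC advantage drops by roughly three and a half to four orders of magnitude, while a single attempt costs a CPU a few hundred times more work.
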
3. Honesty caveats
- The "≈ 1 000-3 000×" ceiling is an upper-bound estimate from analogous memory-bound PoW deployments (RandomX, Equihash 144/5). Real B3PoW-Scratch ASICs are not deployed at scale at the time of writing, so this number will be refined as data appears.
- Memory-hardness is a defense-in-depth layer on top of the identity-hash isolation audited at H-1. It does not make a weak PoW algorithm strong; it makes a strong PoW algorithm harder to accelerate with an ASIC.
- Wall-clock budget enforcement (b3pow_verify_budget_ms) and the HEADERS verification cap (MAX_B3POW_VERIFY_PER_BATCH) are audited separately at H-1.1 and H-1.3; this page only covers the algorithmic comparison.
- The primitive throughput cards (pow-throughput, block-validation) still measure the BLAKE3 round function in isolation. They are useful because BLAKE3 is the inner function of B3PoW-Scratch, but they are not the chain-level number.
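
The verifier cache and wall-clock budget listed in the table can be pictured with a small sketch. This is not the code in src/crypto/b3pow_cache.cpp: the class, its methods, and the `build_state` callback are hypothetical, and only the 4-entry capacity and the 50 ms figure come from the table. For simplicity the budget is checked after the fact rather than by interrupting a slow verification.

```python
import time
from collections import OrderedDict
from typing import Callable

VERIFY_BUDGET_MS = 50   # from the table (consensus.b3pow_verify_budget_ms)
CACHE_ENTRIES = 4       # from the table (default LRU size)


class PrevHashLru:
    """Hypothetical LRU keyed by prev_block_hash; not the real b3pow::Cache."""

    def __init__(self, capacity: int = CACHE_ENTRIES) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[bytes, bytes]" = OrderedDict()

    def get_or_build(self, prev_hash: bytes,
                     build_state: Callable[[bytes], bytes]) -> bytes:
        if prev_hash in self._entries:
            self._entries.move_to_end(prev_hash)   # mark as most recently used
            return self._entries[prev_hash]
        state = build_state(prev_hash)             # expensive path on a miss
        self._entries[prev_hash] = state
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)      # evict least recently used
        return state

    def __len__(self) -> int:
        return len(self._entries)


def within_budget(verify: Callable[[], bool],
                  budget_ms: float = VERIFY_BUDGET_MS) -> bool:
    """Run verify() and accept only if it passes and finishes inside the budget."""
    start = time.perf_counter()
    ok = verify()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ok and elapsed_ms <= budget_ms
```

With the default capacity, a fifth distinct prev_block_hash evicts the least recently used entry, so a verifier following a single chain tip rebuilds state only when the tip changes.
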
4. Source files
- contrib/miner/b3miner-rtl/SPEC.md — normative algorithm spec
- contrib/testing/compare/compare-b3pow-vs-sha256d.md — this page's source
- src/crypto/b3pow_scratch.cpp — the C++ port
- src/crypto/b3pow_cache.cpp — b3pow::Cache