FHE/TEE Notes (NYC 2025 Jam)

A caveat up front: this post is based on estimates, and as Fabric goes through the process of further nailing down their design, things may change.

Drilling further into this claim: is it referring to latency per operation, or to throughput with some kind of batching? Given the figures and the described use cases, I assume it's the latter.

Throughput. I believe more work needs to be done on latency numbers.
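To make the latency-vs-throughput distinction concrete, here is a minimal back-of-the-envelope sketch in Python. The numbers are purely illustrative assumptions on my part (not Fabric's figures); the point is just that SIMD-style batching, where one ciphertext operation processes thousands of packed plaintext slots, can make amortized throughput look far better than single-operation latency.

```python
# Illustrative only: the latency and slot count below are assumed, not Fabric's numbers.

def amortized_throughput(op_latency_s: float, slots_per_ciphertext: int) -> float:
    """Plaintext-slot operations per second when one ciphertext op covers many packed slots."""
    ciphertext_ops_per_second = 1.0 / op_latency_s
    return ciphertext_ops_per_second * slots_per_ciphertext

latency = 0.050   # hypothetical: 50 ms per homomorphic ciphertext operation
slots = 8192      # hypothetical: plaintext slots packed into one ciphertext

print(f"latency per ciphertext op : {latency * 1e3:.0f} ms")
print(f"amortized throughput      : {amortized_throughput(latency, slots):,.0f} slot-ops/sec")
```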

FHE Schemes: By extension, I’m curious what FHE schemes are being considered in these projections.

Bootstrapping is definitely being accounted for.

“Encrypted GPU” & Parallel Workloads: Workloads like LLM training already demand substantial memory. Adding FHE ciphertext expansion on top of that, how large would an encrypted GPU need to be to offer enough memory and speed while remaining practical?

As far as I understand, this is mostly a question of ciphertext/plaintext size overhead, which I’ll let Michael comment on.
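One rough way to reason about the memory question while we wait for Michael: treat the ciphertext expansion factor as a free parameter and scale the plaintext working set by it. The sketch below does exactly that; the 80 GiB working set and the candidate expansion factors are assumptions for illustration only, since the real factor depends heavily on the scheme, parameters, and encoding.

```python
def encrypted_memory_gib(plaintext_gib: float, expansion_factor: float) -> float:
    """Memory needed to hold the same working set as ciphertexts (keys and scratch space ignored)."""
    return plaintext_gib * expansion_factor

plaintext_working_set_gib = 80          # assumed: roughly one large GPU's worth of HBM
for factor in (8, 32, 128):             # assumed expansion factors, purely illustrative
    needed = encrypted_memory_gib(plaintext_working_set_gib, factor)
    print(f"expansion x{factor:>3}: {needed:>7,.0f} GiB of encrypted memory")
```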

On the need for RAM-FHE: You mention that all known FHE schemes are memory-intensive, particularly requiring linear scans per read. Could you clarify? Are you specifically referring to applications requiring a RAM-FHE model? While true for some workloads, RAM-FHE isn’t a necessity in all cases.

It’s my understanding that in today’s general-purpose schemes, the whole computation gets converted into a circuit. For any dependent read (i.e. where the memory location being read is a function of prior encrypted computation), all possible locations are fed into the circuit, which constitutes a “linear scan”. I’m not saying this is fundamental; Mark already made suggestions for how to improve this in some cases. I’m also not very familiar with all the current schemes, so please let me know if I’ve missed some optimisations!
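To illustrate what such a linear scan looks like, here is a toy Python model. Plaintext integers stand in for ciphertexts, and the read is expressed using only additions and multiplications (the operations a circuit-model scheme provides), so every memory slot is touched regardless of which index is actually being read. This is my own sketch of the general idea, not any particular scheme’s implementation.

```python
def equality_indicator(index_bits, i, n_bits):
    """Return 1 if the encoded address equals i, else 0, using only +, -, *.
    index_bits stands in for the encrypted bits of the read address (LSB first)."""
    result = 1
    for b in range(n_bits):
        target_bit = (i >> b) & 1
        bit = index_bits[b]
        # XNOR(bit, target_bit) written arithmetically: 1 - (bit + target - 2*bit*target)
        result *= 1 - (bit + target_bit - 2 * bit * target_bit)
    return result

def dependent_read(memory, index_bits, n_bits):
    """A data-dependent read as a linear scan: every slot is multiplied by an
    indicator and summed, so the cost is linear in the size of memory."""
    return sum(equality_indicator(index_bits, i, n_bits) * memory[i]
               for i in range(len(memory)))

# Toy usage: read memory[5] without ever branching on the (notionally encrypted) index.
memory = [10, 20, 30, 40, 50, 60, 70, 80]
n_bits = 3
index_bits = [(5 >> b) & 1 for b in range(n_bits)]  # would be ciphertext bits in real FHE
print(dependent_read(memory, index_bits, n_bits))   # -> 60
```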

I’ll let Michael fill in the blanks I left!