Revisiting ECM on GPUs

Jonas Wloka, Jan Richter-Brockmann, Colin Stahlke, Thorsten Kleinjung, Christine Priplata, Tim Güneysu

19th International Conference on Cryptology and Network Security (CANS 2020), December 14-16, 2020,


Abstract

Modern public-key cryptography is a crucial part of our contemporary life where a secure communication channel with another party is needed. With the advance of more powerful computing architectures – especially Graphics Processing Units (GPUs) – traditional approaches like RSA and Diffie-Hellman schemes are more and more in danger of being broken.

We present a highly optimized implementation of Lenstra’s ECM algorithm customized for GPUs. Our implementation uses state-of-the-art elliptic curve arithmetic and optimized integer arithmetic while providing the possibility of arbitrarily scaling ECM’s parameters allowing an application even for larger discrete logarithm problems. Furthermore, the proposed software is not limited to any specific GPU generation and is to the best of our knowledge the first implementation supporting multiple device computation. To this end, for a bound of B1=8,192 and a modulus size of 192 bit, we achieve a throughput of 214 thousand ECM trials per second on a modern RTX 2080 Ti GPU considering only the first stage of ECM. To solve the discrete logarithm problem for larger bit sizes, our software can easily support larger parameter sets such that a throughput of 2,781 ECM trials per second is achieved using B1=50,000, B2=5,000,000, and a modulus size of 448 bit.

[pdf] [code]

tags: Cryptanalysis, ECM, GPUs, Prime Factorization