About
I am a second-year PhD student at Georgia Tech, advised by Prof. Moinuddin Qureshi. My research currently focuses on efficient serving of large language models through memory-system optimizations and minimal model modifications.
I received my B.S. from Seoul National University, where I worked with Prof. Jung Ho Ahn on accelerating machine learning under fully homomorphic encryption (FHE) and with Prof. Jinho Lee on a function-in-memory DRAM architecture for efficient graph processing.
You can view my CV here.
Education
- 2024–2029 (Expected): Georgia Institute of Technology, Atlanta, United States
  PhD in Electrical and Computer Engineering
- 2018–2024: Seoul National University (SNU), Seoul, South Korea
  BS in Electrical and Computer Engineering
Work Experience
Qualcomm
Engineering Intern
May 2025 – Aug 2025
- Improved traceability of the ExecuTorch AI compiler for edge devices.
- Designed and implemented automatic tracing of torch.fx graph modifications with less than 3% overhead.
- Designed and implemented the initial structure of a new Fully Homomorphic Encryption (FHE) library, with a focus on enabling easy integration of existing and future optimizations. (C++, CUDA)
- Optimized the multi-GPU inference latency of Llama2 and ResNet18 within FHE.
Publications
- PROWL: Efficient MoE Inference Through Dynamic, Workload-Agnostic Expert Reduction
  Vima Gupta, Jae Hyung Ju, Kartik Sinha, Ada Gavrilovska, and Anand Iyer
  In submission
- SplitStream: Maximizing System Bandwidth Utilization via Read-Write Buffer Management
  Hritvik Taneja, Jae Hyung Ju, Anish Saxena, and Moinuddin Qureshi
  In submission
- Piccolo: Large-Scale Graph Processing with Fine-Grained In-Memory Scatter-Gather
  Changmin Shin, Jaeyong Song, Hongsun Jang, Dogeun Kim, Jun Sung, Taehee Kwon, Jae Hyung Ju, Frank Liu, Yeonkyu Choi, and Jinho Lee
  IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2025
- NeuJeans: Private Neural Network Inference with Joint Optimization of Convolution and FHE Bootstrapping
  *Jae Hyung Ju, *Jaiyoung Park, Jongmin Kim, Minsik Kang, Donghwan Kim, Jung Hee Cheon, and Jung Ho Ahn (*equal contribution)
  ACM Conference on Computer and Communications Security (CCS), 2024
- A Case for In-Memory Random Scatter-Gather for Fast Graph Processing
  Changmin Shin, Taehee Kwon, Jaeyong Song, Jae Hyung Ju, Frank Liu, Yeonkyu Choi, and Jinho Lee
  IEEE Computer Architecture Letters (CAL), 2024
- NeuralEQ: Neural-Network-Based Equalizer for High-Speed Wireline Communication
  Hanseok Kim, Jae Hyung Ju, Hyun Seok Choi, Hyeri Roh, and Woo-Seok Choi
  arXiv preprint arXiv:2308.02133, 2023