Program

The ASPLOS + EuroSys Joint schedule is available here.

Monday, 6:00 PM CEST – 7:30 PM CEST: Welcome Reception + EuroSys poster session

Location: Catering area


Day 1: Tuesday, April 1

8:00 AM CEST – 8:45 AM CEST: Registration and Welcome Coffee

Location: Catering area

8:45 AM CEST – 9:00 AM CEST: Opening

Location: Rotterdam Hall 1

9:00 AM CEST – 10:00 AM CEST: ASPLOS + EuroSys 2025 Joint Keynote 1 by Martin Maas (Google)

Session Chair: Christopher Rossbach (UT Austin)

Abstract
A wide range of research areas – from natural language processing to computer vision and software engineering – have been (or are being) revolutionized by machine learning and artificial intelligence. Each of these areas went through an inflection point where they transitioned from ML as one of many approaches to ML becoming a predominant approach of the field. No example symbolizes this better than the AlexNet paper from 2012, which fundamentally transformed the field of computer vision.

Computer systems remain a notable exception. In this talk, I will discuss emerging trends in the ML for Systems domain, how systems differ from these other areas, and what an "AlexNet Moment" for systems might look like. Along the way, I will describe a framework for categorizing work in the field and discuss emerging research problems and opportunities.


Bio
Martin Maas is a research scientist at Google DeepMind, where he is working on new approaches to leverage artificial intelligence for solving computer systems problems. His research has been deployed in a range of Google systems and products, including Google Compute Engine, TCMalloc, Pixel phones and the TPU compiler. His work has received multiple recognitions, including an ASPLOS Best Paper Award, an IEEE Micro Top Pick, a SIGPLAN Research Highlight, and a CACM Research Highlight. He has been active in several leadership roles in the community, including as General Chair of ISMM 2025, WACI Chair and Vice Program Chair at ASPLOS 2025, Program Chair of ISMM 2020, and as one of the co-organizers of the ML for Systems workshop at NeurIPS. He also co-leads Google’s involvement in the free and open RISC-V instruction set architecture. Martin holds a Ph.D. from the University of California at Berkeley and a B.A. from the University of Cambridge, both in Computer Science.

10:00 AM CEST – 10:30 AM CEST: Coffee Break

Location: Catering area

10:30 AM CEST – 11:00 AM CEST: Award ceremony

Location: Rotterdam Hall 1

11:00 AM CEST – 12:00 PM CEST: ASPLOS + EuroSys 2025 Joint Keynote 2 by Gernot Heiser (Univ. of New South Wales)

Session Chair: Haibo Chen (Shanghai Jiao Tong University)

Abstract
Half a century after PSOS, the first attempts to prove an operating system (OS) secure, OS faults remain a major threat to computer systems security. A major step forward was the verification of the seL4 microkernel, the first proof of implementation correctness of an OS kernel. Over the next 4 years this proof was extended to the binary code, proofs of security enforcement, and sound and complete worst-case execution-time analysis. The proofs now cover 4 ISAs.

Yet, 15 years later, there is still no provably secure OS. While seL4 has been successfully deployed in defence and civilian security- and safety-critical systems, it is a microkernel that mostly guarantees process isolation without providing the application-oriented services expected from an OS. This not only makes seL4 difficult to deploy, but means that there is limited assurance that a system built on top is secure in any real sense.

Why has seL4 not been leveraged into a secure OS? In this talk I will explore some of the reasons behind this disappointing state of affairs, and what can be done about it. Specifically I will discuss our current work on LionsOS, a new seL4-based OS targeting the embedded/cyberphysical domain, and designed to be verifiable. I will also discuss more speculative, early-stage work towards a provably secure, general-purpose OS.


Bio
Gernot Heiser is Scientia (distinguished) Professor and John Lions Chair of Operating Systems at UNSW Sydney, where he leads the Trustworthy Systems research group. His research interests are in operating systems, real-time systems, security and safety. His research vision is to completely change the cybersecurity game, from playing catch-up with attackers, to making computer systems provably secure and safe. With his team he pioneered the large-scale formal verification of systems code, specifically the design, implementation and formal verification of the seL4 microkernel; this work was recognised with an ACM SIGOPS Hall of Fame Award and the ACM Software System Award.

Heiser's former company Open Kernel Labs, acquired by General Dynamics in 2012, marketed the OKL4 microkernel, which shipped on billions of mobile wireless chips and has been deployed on the secure enclave of iOS devices. He presently serves as Chief Scientist of Neutrality, and Chairman of the seL4 Foundation. Gernot is a Fellow of the ACM, the IEEE, Engineers Australia, the Australian Academy of Technology and Engineering (ATSE) and the Royal Society of New South Wales (RSN) and a Member of the German Academy of Sciences Leopoldina. He is also an ACM Distinguished Lecturer and an IEEE Distinguished Visitor.

12:00 PM CEST – 1:30 PM CEST: Lunch

Location: Catering area

1:30 PM CEST – 3:10 PM CEST

Session Chair: Lisa Wu Wills (Duke Univ.)
Mosaic: Exploiting Instruction-Level Parallelism on Deep Learning Accelerators with iTex Tessellation
Jianxing Xu (University of Science and Technology of China,SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Yuanbo Wen (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Zikang Liu (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Ruibai Xu (University of Science and Technology of China,SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Tingfeng Ruan (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Jun Bi (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Rui Zhang (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Di Huang (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Xinkai Song (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Yifan Hao (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Xing Hu (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Zidong Du (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Chongqing Zhao (Tencent), Jie Jiang (Tencent), Qi Guo (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences)

DynaX: Sparse Attention Acceleration with Dynamic X:M Fine-Grained Structured Pruning
Xiao Xiong (College of Computer Science, Chongqing University), Zhaorui Chen (College of Computer Science, Chongqing University), Yue Liang (College of Computer Science, Chongqing University), Minghao Tian (College of Computer Science, Chongqing University), Jiaxing Shang (College of Computer Science, Chongqing University), Jiang Zhong (College of Computer Science, Chongqing University), Dajiang Liu (College of Computer Science, Chongqing University)

Accelerating Retrieval-Augmented Generation
Derrick Quinn (Cornell University), Mohammad Nouri (Cornell University), Neel Patel (Cornell University), John Salihu (University of Kansas), Alireza Salemi (UMass Amherst), Sukhan Lee (Samsung Electronics), Hamed Zamani (UMass Amherst), Mohammad Alian (Cornell University)

GUST: Graph Edge-Coloring Utilization for Accelerating Sparse Matrix Vector Multiplication
Armin Gerami (Computer Science, University of Maryland), Bahar Asgari (Computer Science, University of Maryland)

RASSM: Residue-based Acceleration of Single Sparse Matrix Computation via Adaptive Tiling
Anirudh Jain (Georgia Institute of Technology), Pulkit Gupta (Georgia Institute of Technology), Thomas M. Conte (Georgia Institute of Technology)
Session Chair: Jung Ho Ahn (Seoul National Univ.)
Orion: A Fully Homomorphic Encryption Framework for Deep Learning
Austin Ebel (Tandon School of Engineering, New York University), Karthik Garimella (Tandon School of Engineering, New York University), Brandon Reagen (Tandon School of Engineering, New York University)

CIPHERMATCH: Accelerating Homomorphic Encryption-Based String Matching via Memory-Efficient Data Packing and In-Flash Processing
Mayank Kabra (ETH Zurich), Rakesh Nadig (ETH Zurich), Harshita Gupta (ETH Zurich), Manos Frouzakis (ETH Zurich), Rahul Bera (ETH Zurich), Vamanan Arulchelvan (ETH Zurich), Yu Liang (ETH Zurich), Haiyu Mao (ETH Zurich), Mohammad Sadrosadati (ETH Zurich), Onur Mutlu (ETH Zurich)

ReSBM: Region-based Scale and Minimal-Level Bootstrapping Management for FHE via Min-Cut
Yan Liu (Ant Group), Jianxin Lai (Ant Group), Long Li (Ant Group), Tianxiang Sui (Ant Group), Linjie Xiao (Ant Group), Peng Yuan (Ant Group), Xiaojing Zhang (Ant Group), Qing Zhu (Ant Group), Wenguang Chen (Tsinghua University,Ant Group), Jingling Xue (UNSW)

HALO: Loop-aware Bootstrapping Management for Fully Homomorphic Encryption
Seonyoung Cheon (Yonsei University), Yongwoo Lee (Yonsei University), Hoyun Youm (Yonsei University), Dongkwan Kim (Yonsei University), Sungwoo Yun (Yonsei University), Kunmo Jeong (Yonsei University), Dongyoon Lee (Stony Brook University), Hanjun Kim (Yonsei University)

Affinity-based Optimizations for TFHE on Processing-in-DRAM
Kevin Nam (Department of Electrical and Computer Engineering (ECE), Seoul National University,Inter-university Semiconductor Research Center (ISRC), Seoul National University), Heonhui Jung (Department of Electrical and Computer Engineering (ECE), Seoul National University,Inter-university Semiconductor Research Center (ISRC), Seoul National University), Hyunyoung Oh (Department of AI·Software, Gachon University), Yunheung Paek (Department of Electrical and Computer Engineering (ECE), Seoul National University,Inter-university Semiconductor Research Center (ISRC), Seoul National University)
Session Chair: Pramod Bhatotia (Technische Univ. München)
FMCC: Flexible Measurement-based Quantum Computation over Cluster State
Yingheng Li (University of Pittsburgh), Aditya Pawar (University of Pittsburgh), Zewei Mo (University of Pittsburgh), Youtao Zhang (University of Pittsburgh), Jun Yang (University of Pittsburgh), Xulong Tang (University of Pittsburgh)

QRCC: Evaluating Large Quantum Circuits on Small Quantum Computers through Integrated Qubit Reuse and Circuit Cutting
Aditya Pawar (Electrical and Computer Engineering Department, University of Pittsburgh), Yingheng Li (Computer Science Department, University of Pittsburgh), Zewei Mo (Computer Science Department, University of Pittsburgh), Yanan Guo (Electrical and Computer Engineering Department, University of Pittsburgh), Xulong Tang (Computer Science Department, University of Pittsburgh), Youtao Zhang (Computer Science Department, University of Pittsburgh), Jun Yang (Electrical and Computer Engineering Department, University of Pittsburgh)

Optimizing Quantum Circuits, Fast and Slow
Amanda Xu (University of Wisconsin-Madison), Abtin Molavi (University of Wisconsin-Madison), Swamit Tannu (University of Wisconsin-Madison), Aws Albarghouthi (University of Wisconsin-Madison)

BQSim: GPU-accelerated Batch Quantum Circuit Simulation using Decision Diagram
Shui Jiang (The Chinese University of Hong Kong,University of Wisconsin-Madison), Yi-Hua Chung (University of Wisconsin-Madison), Chih-Chun Chang (University of Wisconsin-Madison), Tsung-Yi Ho (The Chinese University of Hong Kong), Tsung-Wei Huang (University of Wisconsin-Madison)

Fat-Tree QRAM: A High-Bandwidth Shared Quantum Random Access Memory for Parallel Queries
Shifan Xu (Yale Quantum Institute, Yale University), Alvin Lu (Yale Quantum Institute, Yale University), Yongshan Ding (Yale Quantum Institute, Yale University)
Session Chair: Arrvindh Shriraman (Simon Fraser Univ.)
Energy-aware Scheduling and Input Buffer Overflow Prevention for Energy-harvesting Systems
Harsh Desai (Carnegie Mellon University), Xinye Wang (Carnegie Mellon University), Brandon Lucia (Carnegie Mellon University)

Generalizing Reuse Patterns for Efficient DNN on Microcontrollers
Jiesong Liu (North Carolina State University), Bin Ren (College of William and Mary), Xipeng Shen (North Carolina State University)

Earth+: On-Board Satellite Imagery Compression Leveraging Historical Earth Observations
Kuntai Du (University of Chicago), Yihua Cheng (University of Chicago), Peder Olsen (Microsoft Research), Shadi Noghabi (Microsoft Research), Junchen Jiang (University of Chicago)

Pirate: No Compromise Low-Bandwidth VR Streaming for Edge Devices
Yingtian Zhang (The Pennsylvania State University), Yan Kang (The Pennsylvania State University), Ziyu Ying (The Pennsylvania State University), Wanhang Lu (The Pennsylvania State University), Sijie Lan (The Pennsylvania State University), Huijuan Xu (The Pennsylvania State University), Kiwan Maeng (The Pennsylvania State University), Anand Sivasubramaniam (The Pennsylvania State University), Mahmut T. Kandemir (The Pennsylvania State University), Chita R. Das (The Pennsylvania State University)

Nazar: Monitoring and Adapting ML Models on Mobile Devices
Wei Hao (Columbia University), Zixi Wang (Columbia University), Lauren Hong (Columbia University), Lingxiao Li (Columbia University), Nader Karayanni (Columbia University), AnMei Dasbach-Prisk (University of California San Diego), Chengzhi Mao (Columbia University), Junfeng Yang (Columbia University), Asaf Cidon (Columbia University)

3:10 PM CEST – 3:40 PM CEST: Coffee Break

Location: Catering area

3:40 PM CEST – 5:00 PM CEST

Session Chair: Yu Hua (Huazhong Univ. of Science and Technology)
Composing Distributed Computations Through Task and Kernel Fusion
Rohan Yadav (Stanford University), Shiv Sundram (Stanford University), Wonchan Lee (NVIDIA), Michael Garland (NVIDIA), Michael Bauer (NVIDIA), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University)

CXLfork: Fast Remote Fork over CXL Fabrics
Chloe Alverti (University of Illinois Urbana-Champaign), Stratos Psomadakis (National Technical University of Athens), Burak Ocalan (University of Illinois Urbana-Champaign), Shashwat Jaiswal (University of Illinois Urbana-Champaign), Tianyin Xu (University of Illinois Urbana-Champaign), Josep Torrellas (University of Illinois Urbana-Champaign)

OS2G: A High-Performance DPU Offloading Architecture for GPU-based Deep Learning with Object Storage
Zhen Jin (Zhejiang University,Alibaba Group), Yiquan Chen (Alibaba Group), Mingxu Liang (Alibaba Group), Yijing Wang (Alibaba Group), Guoju Fang (Alibaba Group), Ao Zhou (Alibaba Group), Keyao Zhang (Zhejiang University), Jiexiong Xu (Zhejiang University), Wenhai Lin (Zhejiang University), Yiquan Lin (Zhejiang University), Shushu Zhao (Alibaba Group), Wenkai Shi (Alibaba Group), Zhenhua He (Alibaba Group), Shishun Cai (Alibaba Group), Wenzhi Chen (Zhejiang University)

pulse: Accelerating Distributed Pointer-Traversals on Disaggregated Memory
Yupeng Tang (Yale University), Seung-seob Lee (Yale University), Abhishek Bhattacharjee (Yale University), Anurag Khandelwal (Yale University)
Session Chair: Mark Silberstein (Technion & NVIDIA)
RANGE-BLOCKS: A Synchronization Facility for Domain-Specific Architectures
Anagha Molakalmur Anil Kumar (Simon Fraser University), Aditya Prasanna (Simon Fraser University), Arrvindh Shriraman (Simon Fraser University)

Enhancing CGRA Efficiency Through Aligned Compute and Communication Provisioning
Zhaoying Li (National University of Singapore), Pranav Dangi (National University of Singapore), Chenyang Yin (Peking University), Thilini Kaushalya Bandara (National University of Singapore), Rohan Juneja (National University of Singapore), Cheng Tan (Google), Zhenyu Bai (National University of Singapore), Tulika Mitra (National University of Singapore)

Squeezing Operator Performance Potential for the Ascend Architecture
Yuhang Zhou (State Key Laboratory for Novel Software Technology, Nanjing University), Zhibin Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Guyue Liu (Peking University), Shipeng Li (State Key Laboratory for Novel Software Technology, Nanjing University), Xi Lin (State Key Laboratory for Novel Software Technology, Nanjing University), Zibo Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Yongzhong Wang (Huawei Technologies Co., Ltd.), Fuchun Wei (Huawei Technologies Co., Ltd.), Jingyi Zhang (Huawei Technologies Co., Ltd.), Zhiheng Hu (Huawei Technologies Co., Ltd.), Yanlin Liu (Huawei Technologies Co., Ltd.), Chunsheng Li (Huawei Technologies Co., Ltd.), Ziyang Zhang (Huawei Technologies Co., Ltd.), Yaoyuan Wang (Huawei Technologies Co., Ltd.), Bin Zhou (Shandong University), Wanchun Dou (Nanjing University, State Key Laboratory for Novel Software Technology), Guihai Chen (Nanjing University, State Key Laboratory for Novel Software Technology), Chen Tian (Nanjing University, State Key Laboratory for Novel Software Technology)

PICACHU: Plug-In CGRA Handling Upcoming Nonlinear Operations in LLMs
Jiajun Qin (New York University,Zhejiang University), Tianhua Xia (New York University), Cheng Tan (Google,Arizona State University), Jeff Zhang (Arizona State University), Sai Qian Zhang (New York University)
Session Chair: John Wickerson (Imperial College London)
Salus: A Practical Trusted Execution Environment for CPU-FPGA Heterogeneous Cloud Platforms
Yu Zou (Alibaba Group), Yiran Li (Alibaba Group), Sheng Wang (Alibaba Group), Le Su (Alibaba Group), Zhen Gu (DAMO Academy, Alibaba Group,Hupan Lab), Yanheng Lu (DAMO Academy, Alibaba Group,Hupan Lab), Yijin Guan (DAMO Academy, Alibaba Group,Hupan Lab), Dimin Niu (DAMO Academy, Alibaba Group,Hupan Lab), Mingyu Gao (Tsinghua University,Shanghai AI Laboratory), Yuan Xie (DAMO Academy, Alibaba Group,Hupan Lab), Feifei Li (Alibaba Group)

Harmonia: A Unified Framework for Heterogeneous FPGA Acceleration in the Cloud
Luyang Li (Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Heng Pan (Computer Network Information Center, Chinese Academy of Sciences), Xinchen Wan (Hong Kong University of Science and Technology), Kai Lv (Institute of Computing Technology, Chinese Academy of Sciences), Zilong Wang (Hong Kong University of Science and Technology), Qian Zhao (Douyin Co., Ltd.), Feng Ning (Douyin Co., Ltd.), Qingsong Ning (Douyin Co., Ltd.), Shideng Zhang (Douyin Co., Ltd.), Zhenyu Li (Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Layong Luo (Researcher), Gaogang Xie (Computer Network Information Center, Chinese Academy of Sciences,University of Chinese Academy of Sciences)

PhasePrint: Exposing Cloud FPGA Fingerprints by Inducing Timing Faults at Runtime
Jubayer Mahmod (Virginia Tech), Matthew Hicks (Virginia Tech)

Hassert: Hardware Assertion-Based Verification Framework with FPGA Acceleration
Ziqing Zhang (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Weijie Weng (Xiamen University of Technology), Yaning Li (University College Dublin), Lijia Cai (Hong Kong University of Science and Technology), Haoyu Wang (Zhejiang University), David Boland (The University of Sydney), Yungang Bao (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Kan Shi (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences)
Session Chair: Marios Kogias (Imperial College London)
Coach: Exploiting Temporal Patterns for All-Resource Oversubscription in Cloud Platforms
Benjamin Reidys (University of Illinois Urbana-Champaign), Pantea Zardoshti (Microsoft), Íñigo Goiri (Microsoft), Celine Irvene (Microsoft), Daniel S. Berger (Microsoft,University of Washington), Haoran Ma (University of California-Los Angeles), Kapil Arya (Microsoft), Eli Cortez (Microsoft), Taylor Stark (Microsoft), Eugene Bak (Microsoft), Mehmet Iyigun (Microsoft), Stanko Novaković (Google), Lisa Hsu (Meta), Karel Trueba (Microsoft), Abhisek Pan (Microsoft), Chetan Bansal (Microsoft), Saravan Rajmohan (Microsoft), Jian Huang (University of Illinois Urbana-Champaign), Ricardo Bianchini (Microsoft)

Cooperative Graceful Degradation in Containerized Clouds
Kapil Agrawal (University of California, Irvine), Sangeetha Abdu Jyothi (University of California, Irvine and VMware Research)

DarwinGame: Playing Tournaments for Tuning Applications in Noisy Cloud Environments
Rohan Basu Roy (University of Utah), Vijay Gadepally (Massachusetts Institute of Technology), Devesh Tiwari (Northeastern University)

Copper and Wire: Bridging Expressiveness and Performance for Service Mesh Policies
Divyanshu Saxena (The University of Texas at Austin), William Zhang (The University of Texas at Austin), Shankara Pailoor (The University of Texas at Austin), Isil Dillig (The University of Texas at Austin), Aditya Akella (The University of Texas at Austin)

5:00 PM CEST – 5:30 PM CEST: Coffee break & ASPLOS poster session (Tuesday afternoon presentations)

Location: Catering area

5:30 PM CEST – 6:30 PM CEST: Wild and Crazy Ideas (WACI)

Location: Rotterdam Hall 1A

7:00 PM CEST – 8:00 PM CEST: Business meeting

Location: Rotterdam Hall 1A


Day 2: Wednesday, April 2

8:30 AM CEST – 9:00 AM CEST: Registration and Welcome Coffee

Location: Catering area

9:00 AM CEST – 10:40 AM CEST

Session Chair: Jonathan Ragan-Kelley (Massachusetts Inst. of Technology)
MetaSapiens: Real-Time Neural Rendering with Efficiency-Aware Pruning and Accelerated Foveated Rendering
Weikai Lin (University of Rochester), Yu Feng (Shanghai Jiao Tong University), Yuhao Zhu (University of Rochester)

D-VSync: Decoupled Rendering and Displaying for Smartphone Graphics
Yuanpei Wu (IPADS, Shanghai Jiao Tong University,Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Dong Du (IPADS, Shanghai Jiao Tong University,Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Chao Xu (Fields Lab, Huawei Central Software Institute), Yubin Xia (IPADS, Shanghai Jiao Tong University,Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Ming Fu (Fields Lab, Huawei Central Software Institute), Binyu Zang (IPADS, Shanghai Jiao Tong University,Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Haibo Chen (IPADS, Shanghai Jiao Tong University,Key Laboratory of System Software (Chinese Academy of Science))

StreamGrid: Streaming Point Cloud Analytics via Compulsory Splitting and Deterministic Termination
Yu Feng (Shanghai Jiao Tong University,Shanghai Qi Zhi Institute), Zheng Liu (Shanghai Jiao Tong University), Weikai Lin (University of Rochester), Zihan Liu (Shanghai Jiao Tong University,Shanghai Qi Zhi Institute), Jingwen Leng (Shanghai Jiao Tong University,Shanghai Qi Zhi Institute), Minyi Guo (Shanghai Jiao Tong University,Shanghai Qi Zhi Institute), Zhezhi He (Shanghai Jiao Tong University), Jieru Zhao (Shanghai Jiao Tong University), Yuhao Zhu (University of Rochester)

ARC: Warp-level Adaptive Atomic Reduction in GPUs to Accelerate Differentiable Rendering
Sankeerth Durvasula (Vector Institute, University of Toronto), Adrian Zhao (Vector Institute, University of Toronto), Fan Chen (University of Toronto), Ruofan Liang (Vector Institute, University of Toronto), Pawan Kumar Sanjaya (Vector Institute, University of Toronto), Yushi Guan (Vector Institute, University of Toronto), Christina Giannoula (Vector Institute, University of Toronto), Nandita Vijaykumar (Vector Institute, University of Toronto)

Treelet Accelerated Ray Tracing on GPUs
Yuan Hsi Chou (University of British Columbia), Tor M. Aamodt (University of British Columbia)
Session Chair: Sotiris Apostolakis (Google)
EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters
Xinkai Wang (Shanghai Jiao Tong University), Xiaofeng Hou (Shanghai Jiao Tong University), Chao Li (Shanghai Jiao Tong University), Yuancheng Li (Shanghai Jiao Tong University), Du Liu (Shanghai Jiao Tong University), Guoyao Xu (Alibaba Group), Guodong Yang (Alibaba Group), Liping Zhang (Alibaba Group), Yuemin Wu (Alibaba Cloud), Xiaopeng Yuan (Alibaba Cloud), Quan Chen (Shanghai Jiao Tong University), Minyi Guo (Shanghai Jiao Tong University)

Mint: Cost-Efficient Tracing with All Requests Collection via Commonality and Variability Analysis
Haiyu Huang (Sun Yat-sen University), Cheng Chen (Alibaba Group), Kunyi Chen (Alibaba Group), Pengfei Chen (Sun Yat-sen University), Guangba Yu (Sun Yat-sen University), Zilong He (Sun Yat-sen University), Yilun Wang (Sun Yat-sen University), Huxing Zhang (Alibaba Group), Qi Zhou (Alibaba Group)

Automatic Tracing in Task-Based Runtime Systems
Rohan Yadav (Stanford University), Michael Bauer (NVIDIA), David Broman (KTH Royal Institute of Technology), Michael Garland (NVIDIA), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University)

Enabling Efficient Mobile Tracing with BTrace
Jiawei Wang (Huawei Dresden Research Center,Huawei Central Software Institute), Nian Liu (Huawei Central Software Institute), Arnau Casadevall-Saiz (Huawei Dresden Research Center,Huawei Central Software Institute), Yutao Liu (Huawei Dresden Research Center,Huawei Central Software Institute), Diogo Behrens (Huawei Dresden Research Center,Huawei Central Software Institute), Ming Fu (Huawei Central Software Institute), Ning Jia (Huawei Central Software Institute), Hermann Härtig (Technische Universität Dresden), Haibo Chen (Huawei Central Software Institute,Shanghai Jiao Tong University)

Rethinking Java Performance Analysis
Stephen M. Blackburn (Google,Australian National University), Zixian Cai (Australian National University), Rui Chen (Unaffiliated-Independent), Xi Yang (IOP Systems), John Zhang (Canva), John Zigman (The University of Sydney)
Session Chair: Mingyu Gao (Tsinghua Univ.)
MPC-Pipe: an Efficient Pipeline Scheme for Semi-honest MPC Machine Learning
Yongqin Wang (Department of Electrical & Computer Engineering, University of Southern California), Rachit Rajat (Department of Electrical & Computer Engineering, University of Southern California), Murali Annavaram (Department of Electrical & Computer Engineering, University of Southern California)

Cinnamon: A Framework for Scale-Out Encrypted AI
Siddharth Jayashankar (Carnegie Mellon University), Edward Chen (Carnegie Mellon University), Tom Tang (Carnegie Mellon University), Wenting Zheng (Carnegie Mellon University), Dimitrios Skarlatos (Carnegie Mellon University)

PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption
Yifan Tan (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University), Cheng Tan (Northeastern University), Zeyu Mi (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University), Haibo Chen (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University)

Practical Federated Recommendation Model Learning Using ORAM with Controlled Privacy
Jinyu Liu (The Pennsylvania State University), Wenjie Xiong (Virginia Tech), G. Edward Suh (NVIDIA,Cornell University), Kiwan Maeng (The Pennsylvania State University)

Tackling ML-based Dynamic Mispredictions using Statically Computed Invariants for Attack Surface Reduction
Chris Porter (IBM Research), Sharjeel Khan (Georgia Institute of Technology), Kangqi Ni (Georgia Institute of Technology), Santosh Pande (Georgia Institute of Technology)
Session Chair: Mengjia Yan (Massachusetts Inst. of Technology)
Control Logic Synthesis: Drawing the Rest of the OWL
Zachary D. Sisco (University of California, Santa Barbara), Andrew David Alex (University of California, Santa Barbara), Zechen Ma (University of California, Santa Barbara), Yeganeh Aghamohammadi (University of California, Santa Barbara), Boming Kong (University of California, Santa Barbara), Benjamin Darnell (University of Illinois Urbana-Champaign), Timothy Sherwood (University of California, Santa Barbara), Ben Hardekopf (University of California, Santa Barbara), Jonathan Balkind (University of California, Santa Barbara)

CRUSH: A Credit-Based Approach for Functional Unit Sharing in Dynamically Scheduled HLS
Jiahui Xu (ETH Zurich), Lana Josipović (ETH Zurich)

AMuLeT: Automated Design-Time Testing of Secure Speculation Countermeasures
Bo Fu (University of Toronto), Leo Tenenbaum (University of Toronto), David Adler (University of Toronto), Assaf Klein (Technion – Israel Institute of Technology), Arpit Gogia (IMDEA Software Institute), Alaa R. Alameldeen (Simon Fraser University), Marco Guarnieri (IMDEA Software Institute), Mark Silberstein (Technion – Israel Institute of Technology), Oleksii Oleksenko (Azure Research, Microsoft), Gururaj Saileshwar (University of Toronto)

Don't Repeat Yourself! Coarse-Grained Circuit Deduplication to Accelerate RTL Simulation
Haoyuan Wang (UC Santa Cruz), Thomas Nijssen (UC Santa Cruz), Scott Beamer (UC Santa Cruz)

Parendi: Thousand-Way Parallel RTL Simulation
Mahyar Emami (EPFL), Thomas Bourgeat (EPFL), James R. Larus (EPFL)

10:40 AM CEST – 11:10 AM CEST: Coffee break & ASPLOS poster session (Wednesday afternoon presentations)

Location: Catering area

11:10 AM CEST – 12:30 PM CEST

Session Chair: Daniel S. Berger (Microsoft & UW)
Necro-reaper: Pruning away Dead Memory Traffic in Warehouse-Scale Computers
Sotiris Apostolakis (Google), Chris Kennelly (Google), Xinliang David Li (Google), Parthasarathy Ranganathan (Google)

Embracing Imbalance: Dynamic Load Shifting among Microservice Containers in Shared Clusters
Shutian Luo (Yale University), Jianxiong Liao (Sun Yat-sen University,University of Macau), Chenyu Lin (University of Macau), Huanle Xu (University of Macau), Zhi Zhou (Sun Yat-sen University), Chengzhong Xu (University of Macau)

Tela: A Temporal Load-Aware Cloud Virtual Disk Placement Scheme
Difan Tan (Huazhong University of Science and Technology), Jiawei Li (Huazhong University of Science and Technology), Hua Wang (Huazhong University of Science and Technology), Xiaoxiao Li (Huazhong University of Science and Technology), Wenbo Liu (Huazhong University of Science and Technology), Zijin Qin (Huazhong University of Science and Technology), Ke Zhou (Huazhong University of Science and Technology), Ming Xie (Tencent Inc.), Mengling Tao (Tencent Inc.)

FleetIO: Managing Multi-Tenant Cloud Storage with Multi-Agent Reinforcement Learning
Jinghan Sun (UIUC), Benjamin Reidys (UIUC), Daixuan Li (UIUC), Jichuan Chang (Google), Marc Snir (UIUC), Jian Huang (UIUC)
Session Chair: Paul V. Gratz (Texas A&M Univ.)
ZRAID: Leveraging Zone Random Write Area (ZRWA) for Alleviating Partial Parity Tax in ZNS RAID
Minwook Kim (Seoul National University), Seongyeop Jeong (Seoul National University), Jin-Soo Kim (Seoul National University)

Marionette: A RowHammer Attack via Row Coupling
Seungmin Baek (Seoul National University), Minbok Wi (Seoul National University), Seonyong Park (Seoul National University), Hwayong Nam (Seoul National University), Michael Jaemin Kim (Seoul National University), Nam Sung Kim (University of Illinois), Jung Ho Ahn (Seoul National University)

MOAT: Securely Mitigating Rowhammer with Per-Row Activation Counters
Moinuddin Qureshi (Georgia Institute of Technology), Salman Qazi (Google)

HyperHammer: Breaking Free from KVM-Enforced Isolation
Wei Chen (Peking University), Zhi Zhang (University of Western Australia), Xin Zhang (Peking University), Qingni Shen (Peking University), Yuval Yarom (Ruhr University Bochum), Daniel Genkin (Georgia Institute of Technology), Chen Yan (Peking University), Zhe Wang (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,Zhongguancun Laboratory)
Session Chair: Saugata Ghose (Univ. of Illinois Urbana-Champaign)
ReCA: Integrated Acceleration for Real-Time and Efficient Cooperative Embodied Autonomous Agents
Zishen Wan (Georgia Institute of Technology), Yuhang Du (University of Minnesota, Twin Cities), Mohamed Ibrahim (Georgia Institute of Technology), Jiayi Qian (Georgia Institute of Technology), Jason Jabbour (Harvard University), Yang (Katie) Zhao (University of Minnesota, Twin Cities), Tushar Krishna (Georgia Institute of Technology), Arijit Raychowdhury (Georgia Institute of Technology), Vijay Janapa Reddi (Harvard University)

AnA: An Attentive Autonomous Driving System
Wonkyo Choe (University of Virginia), Rongxiang Wang (University of Virginia), Felix Xiaozhu Lin (University of Virginia)

SuperNoVA: Algorithm-Hardware Co-Design for Resource-Aware SLAM
Seah Kim (University of California, Berkeley), Roger Hsiao (University of California, Berkeley), Borivoje Nikolić (University of California, Berkeley), James Demmel (University of California, Berkeley), Yakun Sophia Shao (University of California, Berkeley)

OctoCache: Caching Voxels for Accelerating 3D Occupancy Mapping in Autonomous Systems
Peiqing Chen (University of Maryland), Minghao Li (Harvard University), Zishen Wan (Georgia Institute of Technology), Yushun Hsiao (Harvard University), Minlan Yu (Harvard University), Vijay Janapa Reddi (Harvard University), Zaoxing Liu (University of Maryland)
Session Chair: Sara Achour (Stanford Univ.)
Efficient Lossless Compression of Scientific Floating-Point Data on CPUs and GPUs
Noushin Azami (Department of Computer Science, Texas State University), Alex Fallin (Department of Computer Science, Texas State University), Martin Burtscher (Department of Computer Science, Texas State University)

Data Cache for Intermittent Computing Systems with Non-Volatile Main Memory
Sourav Mohapatra (Department of Computer Science, TU Delft,ARM Limited), Vito Kortbeek (Department of Computer Science, TU Delft,Synopsys), Marco Antonio van Eerden (Department of Computer Science, TU Delft), Jochem Broekhoff (Department of Computer Science, TU Delft,DSP Innovation B.V.), Saad Ahmed (School of Interactive Computing, Georgia Institute of Technology), Przemysław Pawełczak (Department of Computer Science, TU Delft)

Fusion: An Analytics Object Store Optimized for Query Pushdown
Jianan Lu (Princeton University), Ashwini Raina (Princeton University), Asaf Cidon (Columbia University), Michael J. Freedman (Princeton University)

VertexSurge: Variable Length Graph Pattern Match on Billion-edge Graphs
Weiyu Xie (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), MingXing Zhang (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), Xia Liao (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), Kang Chen (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), Jinlei Jiang (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), YongWei Wu (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University)

12:30 PM CEST – 2:00 PM CEST: Lunch

Location: Catering area

2:00 PM CEST – 3:40 PM CEST

Session Chair: Xiaosong Ma (MBZUAI)
Fast On-device LLM Inference with NPUs
Daliang Xu (Key Lab of HCST (PKU), MOE; SCS, Peking University), Hao Zhang (Beijing University of Posts and Telecommunications), Liming Yang (Key Lab of HCST (PKU), MOE; SCS, Peking University), Ruiqi Liu (Key Lab of HCST (PKU), MOE; SCS, Peking University), Gang Huang (Key Lab of HCST (PKU), MOE; SCS, Peking University,National Key Laboratory of Data Space Technology and System), Mengwei Xu (Beijing University of Posts and Telecommunications), Xuanzhe Liu (Key Lab of HCST (PKU), MOE; SCS, Peking University)

Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow
Yixuan Mei (Carnegie Mellon University), Yonghao Zhuang (Carnegie Mellon University), Xupeng Miao (Carnegie Mellon University), Juncheng Yang (Carnegie Mellon University), Zhihao Jia (Carnegie Mellon University), Rashmi Vinayak (Carnegie Mellon University)

FlexSP: Accelerating Large Language Model Training via Flexible Sequence Parallelism
Yujie Wang (Peking University), Shiju Wang (Beihang University), Shenhan Zhu (Peking University), Fangcheng Fu (Peking University), Xinyi Liu (Peking University), Xuefeng Xiao (ByteDance Inc.), Huixia Li (ByteDance Inc.), Jiashi Li (ByteDance Inc.), Faming Wu (ByteDance Inc.), Bin Cui (Peking University)

Spindle: Efficient Distributed Training of Multi-Task Large Models via Wavefront Scheduling
Yujie Wang (Peking University), Shenhan Zhu (Peking University), Fangcheng Fu (Peking University), Xupeng Miao (Purdue University), Jie Zhang (Alibaba Group), Juan Zhu (Alibaba Group), Fan Hong (Alibaba Group), Yong Li (Alibaba Group), Bin Cui (Peking University)

Vela: A Virtualized LLM Training System with GPU Direct RoCE
Apoorve Mohan (IBM Research), Robert Walkup (IBM Research), Bengi Karacali (IBM Research), Ming-Hung Chen (IBM Research), Abdullah Kayi (IBM Research), Liran Schour (IBM), Shweta Salaria (IBM Research), Sophia Wen (IBM Research), IHsin Chung (IBM Research), Abdul Alim (IBM Research), Constantinos Evangelinos (IBM Research), Lixiang Luo (IBM Research), Marc Dombrowa (IBM Research), Laurent Schares (IBM Research), Ali Sydney (IBM Research), Pavlos Maniotis (IBM Research), Sandhya Koteshwara (IBM Research), Brent Tang (IBM), Joel Belog (IBM), Rei Odaira (IBM), Vasily Tarasov (IBM Research), Eran Gampel (IBM Cloud), Drew Thorstensen (IBM), Talia Gershon (IBM Research), Seetharami Seelam (IBM Research)
Session Chair: Martin Maas (Google DeepMind)
Forecasting GPU Performance for Deep Learning Training and Inference
Seonho Lee (Georgia Institute of Technology), Amar Phanishayee (Meta), Divya Mahajan (Georgia Institute of Technology)

MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization
Shuaiting Li (Zhejiang University), Chengxuan Wang (Zhejiang University), Juncan Deng (Zhejiang University), Zeyu Wang (Zhejiang University), Zewen Ye (Zhejiang University), Zongsheng Wang (Zhejiang University), Haibin Shen (Zhejiang University), Kejie Huang (Zhejiang University)

PartIR: Composing SPMD Partitioning Strategies for Machine Learning
Sami Alabed (Google DeepMind), Daniel Belov (Google DeepMind), Bart Chrzaszcz (Google DeepMind), Juliana Franco (Google DeepMind), Dominik Grewe (Google DeepMind), Dougal Maclaurin (Google DeepMind), James Molloy (Google DeepMind), Tom Natan (Google DeepMind), Tamara Norman (Google DeepMind), Xiaoyue Pan (Google DeepMind), Adam Paszke (Google DeepMind), Norman Alexander Rink (Google DeepMind), Michael Schaarschmidt (Isomorphic Labs), Timur Sitdikov (Google DeepMind), Agnieszka Swietlik (Google DeepMind), Dimitrios Vytiniotis (Google DeepMind), Joel Wee (Google DeepMind)

Using Analytical Performance/Power Model and Fine-Grained DVFS to Enhance AI Accelerator Energy Efficiency
Zibo Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Yijia Zhang (Peng Cheng Laboratory), Fuchun Wei (Huawei Technologies Co., Ltd), Bingqiang Wang (Peng Cheng Laboratory), Yanlin Liu (Huawei Technologies Co., Ltd), Zhiheng Hu (Huawei Technologies Co., Ltd), Jingyi Zhang (Huawei Technologies Co., Ltd), Xiaoxin Xu (Huawei Technologies Co., Ltd), Jian He (Huawei Technologies Co., Ltd), Xiaoliang Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Wanchun Dou (State Key Laboratory for Novel Software Technology, Nanjing University), Guihai Chen (State Key Laboratory for Novel Software Technology, Nanjing University), Chen Tian (State Key Laboratory for Novel Software Technology, Nanjing University)

Early Termination for Hyperdimensional Computing Using Inferential Statistics
Pu (Luke) Yi (Stanford University), Yifan Yang (Stanford University), Chae Young Lee (Stanford University), Sara Achour (Stanford University)
Session Chair: Murali Annavaram (Univ. of Southern California)
Saving Energy with Per-Variable Bitwidth Speculation
Tommy McMichen (Northwestern University), David Dlott (Northwestern University), Panitan Wongse-ammat (Northwestern University), Nathan Greiner (Northwestern University), Hussain Khajanchi (Northwestern University), Russ Joseph (Northwestern University), Simone Campanoni (Northwestern University)

ShadowLoad: Injecting State into Hardware Prefetchers
Lorenz Hetterich (CISPA Helmholtz Center for Information Security), Fabian Thomas (CISPA Helmholtz Center for Information Security), Lukas Gerlach (CISPA Helmholtz Center for Information Security), Ruiyi Zhang (CISPA Helmholtz Center for Information Security), Nils Bernsdorf (CISPA Helmholtz Center for Information Security), Eduard Ebert (CISPA Helmholtz Center for Information Security), Michael Schwarz (CISPA Helmholtz Center for Information Security)

Skia: Exposing Shadow Branches
Chrysanthos Pepi (Texas A&M University, Intel), Bhargav Reddy Godala (Princeton University, Intel), Krishnam Tibrewala (Texas A&M University), Gino A. Chacon (Intel, AheadComputing), Paul V. Gratz (Texas A&M University), Daniel A. Jiménez (Texas A&M University, Barcelona Supercomputing Center), Gilles A. Pokam (Intel), David I. August (Princeton University)

Hierarchical Prefetching, A Software-Hardware Instruction Prefetcher for Server Applications
Tingji Zhang (Tsinghua University), Boris Grot (University of Edinburgh,Huawei Research), Wenjian He (Huawei Technologies Co., Ltd.), Yashuai Lv (Huawei Technologies Co., Ltd.), Peng Qu (Tsinghua University), Fang Su (Huawei Technologies Co., Ltd.), Wenxin Wang (Tsinghua University), Guowei Zhang (Huawei Technologies Co., Ltd.), Xuefeng Zhang (Tsinghua University), Youhui Zhang (Tsinghua University,Zhongguancun National Laboratory)

Bounding Speculative Execution of Atomic Regions to a Single Retry
Eduardo José Gómez-Hernández (Computer Engineering Department, University of Murcia), Juan M. Cebrian (Computer Engineering Department, University of Murcia), Stefanos Kaxiras (Department of Information Technology, Uppsala University), Alberto Ros (Computer Engineering Department, University of Murcia)
Session Chair: Jayneel Gandhi (Meta)
Formalising CXL Cache Coherence
Chengsong Tan (Kaihong), Alastair F. Donaldson (Imperial College London), John Wickerson (Imperial College London)

CtXnL: A Software-Hardware Co-Designed Solution for Efficient CXL-Based Transaction Processing
Zhao Wang (Peking University, School of Integrated Circuits,Peking University, School of Computer Science), Yiqi Chen (Peking University, School of Integrated Circuits), Cong Li (Peking University, School of Integrated Circuits), Yijin Guan (Alibaba Group, DAMO Academy,Hupan Lab), Dimin Niu (Alibaba Group, DAMO Academy,Hupan Lab), Tianchan Guan (Alibaba Group, DAMO Academy,Hupan Lab), Zhaoyang Du (Alibaba Group, DAMO Academy,Hupan Lab), Xingda Wei (Shanghai Jiao Tong University, Institute of Parallel and Distributed Systems, SEIEE), Guangyu Sun (Peking University, School of Integrated Circuits,Beijing Advanced Innovation Center for Integrated Circuits)

ByteFS: System Support for (CXL-based) Memory-Semantic Solid-State Drives
Shaobo Li (University of Illinois Urbana-Champaign), Yirui (Eric) Zhou (University of Illinois Urbana-Champaign), Hao Ren (University of Illinois Urbana-Champaign), Jian Huang (University of Illinois Urbana-Champaign)

M5: Mastering Page Migration and Memory Management for CXL-based Tiered Memory Systems
Yan Sun (University of Illinois Urbana-Champaign), Jongyul Kim (University of Illinois Urbana-Champaign), Zeduo Yu (University of Illinois Urbana-Champaign), Jiyuan Zhang (University of Illinois Urbana-Champaign), Siyuan Chai (University of Illinois Urbana-Champaign), Michael Jaemin Kim (Seoul National University), Hwayong Nam (Seoul National University), Jaehyun Park (Seoul National University), Eojin Na (Seoul National University), Yifan Yuan (Intel Labs), Ren Wang (Intel Labs), Jung Ho Ahn (Seoul National University), Tianyin Xu (University of Illinois Urbana-Champaign), Nam Sung Kim (University of Illinois Urbana-Champaign)

Systematic CXL Memory Characterization and Performance Analysis at Scale
Jinshu Liu (Virginia Tech), Hamid Hadian (Virginia Tech), Yuyue Wang (Virginia Tech), Daniel S. Berger (Microsoft and University of Washington), Marie Nguyen (Samsung), Xun Jian (Virginia Tech), Sam H. Noh (Virginia Tech), Huaicheng Li (Virginia Tech)

3:40 PM CEST – 4:10 PM CEST: Coffee break & ASPLOS poster session (Wednesday morning presentations)

Location: Catering area

4:10 PM CEST – 5:50 PM CEST

Session Chair: Chaojie Zhang (Microsoft)
MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs
Shiyi Cao (UC Berkeley), Shu Liu (UC Berkeley), Tyler Griggs (UC Berkeley), Peter Schafhalter (UC Berkeley), Xiaoxuan Liu (UC Berkeley), Ying Sheng (Stanford University), Joseph E. Gonzalez (UC Berkeley), Matei Zaharia (UC Berkeley), Ion Stoica (UC Berkeley)

FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models
Xinglin Pan (The Hong Kong University of Science and Technology (Guangzhou)), Wenxiang Lin (Harbin Institute of Technology, Shenzhen), Lin Zhang (Hong Kong University of Science and Technology), Shaohuai Shi (Harbin Institute of Technology, Shenzhen), Zhenheng Tang (The Hong Kong University of Science and Technology), Rui Wang (The Hong Kong University of Science and Technology (Guangzhou)), Bo Li (Hong Kong University of Science and Technology), Xiaowen Chu (The Hong Kong University of Science and Technology (Guangzhou),Hong Kong University of Science and Technology)

CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory
Jiashun Suo (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Xiaojian Liao (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Limin Xiao (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Li Ruan (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Jinquan Wang (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Xiao Su (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Zhisheng Huo (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University)

Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline
Zhiyuan Fang (Sun Yat-sen University), Yuegui Huang (Sun Yat-sen University), Zicong Hong (Hong Kong University of Science and Technology), Yufeng Lyu (Huawei Technologies Co. Ltd), Wuhui Chen (Sun Yat-sen University, Peng Cheng Laboratory), Yue Yu (Peng Cheng Laboratory), Fan Yu (Huawei Technologies Co. Ltd), Zibin Zheng (Sun Yat-sen University)

MoC-System: Efficient Fault Tolerance for Sparse Mixture-of-Experts Model Training
Weilin Cai (The Hong Kong University of Science and Technology (Guangzhou)), Le Qin (The Hong Kong University of Science and Technology (Guangzhou)), Jiayi Huang (The Hong Kong University of Science and Technology (Guangzhou))
Session Chair: Christina Giannoula (Univ. of Toronto)
Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning
Shenggan Cheng (National University of Singapore), Shengjie Lin (Georgia Institute of Technology), Lansong Diao (Alibaba Group), Hao Wu (George Mason University), Siyu Wang (Alibaba Group), Chang Si (Alibaba Group), Ziming Liu (National University of Singapore), Xuanlei Zhao (National University of Singapore), Jiangsu Du (Sun Yat-sen University), Wei Lin (Alibaba Group), Yang You (National University of Singapore)

Design and Operation of Shared Machine Learning Clusters on Campus
Kaiqiang Xu (Hong Kong University of Science and Technology), Decang Sun (Hong Kong University of Science and Technology), Hao Wang (Hong Kong University of Science and Technology), Zhenghang Ren (Hong Kong University of Science and Technology), Xinchen Wan (Hong Kong University of Science and Technology), Xudong Liao (Hong Kong University of Science and Technology), Zilong Wang (Hong Kong University of Science and Technology), Junxue Zhang (Hong Kong University of Science and Technology), Kai Chen (Hong Kong University of Science and Technology)

PCcheck: Persistent Concurrent Checkpointing for ML
Foteini Strati (ETH Zurich), Michal Friedman (ETH Zurich), Ana Klimovic (ETH Zurich)

Tally: Non-Intrusive Performance Isolation for Concurrent Deep Learning Workloads
Wei Zhao (Stanford University, CentML), Anand Jayarajan (University of Toronto, Vector Institute), Gennady Pekhimenko (University of Toronto, Vector Institute)

Load and MLP-Aware Thread Orchestration for Recommendation Systems Inference on CPUs
Rishabh Jain (The Pennsylvania State University), Teyuh Chou (Advanced Micro Devices, Inc.), Onur Kayiran (Advanced Micro Devices, Inc.), John Kalamatianos (Advanced Micro Devices, Inc.), Gabriel H. Loh (Advanced Micro Devices, Inc.), Mahmut T. Kandemir (The Pennsylvania State University), Chita R. Das (The Pennsylvania State University)
Session Chair: Steve Blackburn (Google)
Validating JVM Compilers via Maximizing Optimization Interactions
Zifan Xie (Huazhong University of Science and Technology), Ming Wen (Huazhong University of Science and Technology), Shiyu Qiu (Huazhong University of Science and Technology), Hai Jin (Huazhong University of Science and Technology)

Faster Chaitin-like Register Allocation via Grammatical Decompositions of Control-Flow Graphs
Xuran Cai (Hong Kong University of Science and Technology), Amir Kafshdar Goharshady (University of Oxford), S. Hitarth (Hong Kong University of Science and Technology), Chun Kit Lam (Hong Kong University of Science and Technology)

Towards Sound Reassembly of Modern x86-64 Binaries
Hyungseok Kim (The Affiliated Institute of ETRI), Soomin Kim (KAIST), Sang Kil Cha (KAIST)

SmoothE: Differentiable E-Graph Extraction
Yaohui Cai (Cornell University), Kaixin Yang (Cornell University), Chenhui Deng (Cornell University), Cunxi Yu (University of Maryland, College Park), Zhiru Zhang (Cornell University)

Exo 2: Growing a Scheduling Language
Yuka Ikarashi (MIT CSAIL), Kevin Qian (MIT CSAIL), Samir Droubi (MIT CSAIL), Alex Reinking (Adobe), Gilbert Louis Bernstein (University of Washington), Jonathan Ragan-Kelley (MIT CSAIL)
Session Chair: Pedro Fonseca (Purdue Univ.)
Manta: Hybrid-Sensitive Type Inference Toward Type-Assisted Bug Detection for Stripped Binaries
Chengfeng Ye (The Hong Kong University of Science and Technology), Yuandao Cai (The Hong Kong University of Science and Technology), Anshunkang Zhou (The Hong Kong University of Science and Technology), Heqing Huang (City University of Hong Kong), Hao Ling (The Hong Kong University of Science and Technology), Charles Zhang (The Hong Kong University of Science and Technology)

Selectively Uniform Concurrency Testing
Huan Zhao (National University of Singapore), Dylan Wolff (National University of Singapore), Umang Mathur (National University of Singapore), Abhik Roychoudhury (National University of Singapore)

TAOPT: Tool-Agnostic Optimization of Parallelized Automated Mobile UI Testing
Dezhi Ran (Key Lab of HCST (PKU), MOE; SCS, Peking University), Zihe Song (University of Texas at Dallas), Wenyu Wang (University of Illinois at Urbana-Champaign), Wei Yang (University of Texas at Dallas), Tao Xie (Key Lab of HCST (PKU), MOE; SCS, Peking University)

Debugger Toolchain Validation via Cross-Level Debugging
Yibiao Yang (State Key Laboratory for Novel Software Technology, Nanjing University), Maolin Sun (State Key Laboratory for Novel Software Technology, Nanjing University), Jiangchang Wu (State Key Laboratory for Novel Software Technology, Nanjing University), Qingyang Li (State Key Laboratory for Novel Software Technology, Nanjing University), Yuming Zhou (State Key Laboratory for Novel Software Technology, Nanjing University)

Dynamic Partial Deadlock Detection and Recovery via Garbage Collection
Georgian-Vlad Saioc (Aarhus University,Programming Systems Group, Uber Technologies, Inc.), I-Ting Angelina Lee (Washington University in St. Louis), Anders Møller (Aarhus University), Milind Chabbi (Programming Systems Group, Uber Technologies, Inc.)

6:00 PM CEST – 6:15 PM CEST: Shuttle bus to banquet, Postillion Hotel, back side

7:00 PM CEST – 11:30 PM CEST: ASPLOS + EuroSys Banquet

Location: ss Rotterdam


Day 3: Thursday, April 3

8:30 AM CEST – 9:00 AM CEST: Registration and Welcome Coffee

Location: Catering area

9:00 AM CEST – 10:40 AM CEST

Session Chair: Thaleia Doudali (IMDEA Software)
Accelerating LLM Serving for Multi-turn Dialogues with Efficient Resource Management
Jinwoo Jeong (Korea University), Jeongseob Ahn (Korea University)

COMET: Towards Practical W4A4KV4 LLMs Serving
Lian Liu (Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Long Cheng (North China Electric Power University), Haimeng Ren (ShanghaiTech University), Zhaohui Xu (ShanghaiTech University), Yudong Pan (Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Mengdi Wang (Institute of Computing Technology, CAS), Xiaowei Li (Institute of Computing Technology, CAS,Zhongguancun Laboratory), Yinhe Han (Institute of Computing Technology, CAS), Ying Wang (Institute of Computing Technology, CAS)

Past-Future Scheduler for LLM Serving under SLA Guarantees
Ruihao Gong (Beihang University), Shihao Bai (SenseTime), Siyu Wu (Beihang University), Yunqian Fan (SenseTime), Zaijun Wang (SenseTime), Xiuhong Li (Peking University), Hailong Yang (Beihang University), Xianglong Liu (Beihang University)

POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference
Aditya K Kamath (Paul G Allen School of Computer Science and Engineering, University of Washington), Ramya Prabhu (Microsoft Research India), Jayashree Mohan (Microsoft Research India), Simon Peter (Paul G Allen School of Computer Science and Engineering, University of Washington), Ramachandran Ramjee (Microsoft Research India), Ashish Panwar (Microsoft Research India)

TAPAS: Thermal- and Power-Aware Scheduling for LLM Inference in Cloud Platforms
Jovan Stojkovic (University of Illinois at Urbana-Champaign), Chaojie Zhang (Microsoft Azure Research), Íñigo Goiri (Microsoft Azure Research), Esha Choukse (Microsoft Azure Research), Haoran Qiu (Microsoft Azure Research), Rodrigo Fonseca (Microsoft Azure Research), Josep Torrellas (University of Illinois at Urbana-Champaign), Ricardo Bianchini (Microsoft Azure)
Session Chair: Manuel Rigger (National Univ. of Singapore)
The Mutators Reloaded: Fuzzing Compilers with Large Language Model Generated Mutation Operators
Xianfei Ou (Nanjing University), Cong Li (Ant Group,Zhejiang University), Yanyan Jiang (Nanjing University), Chang Xu (Nanjing University)

ClosureX: Compiler Support for Correct Persistent Fuzzing
Rishi Ranjan (Virginia Tech), Ian Paterson (Virginia Tech), Matthew Hicks (Virginia Tech)

Ratte: Fuzzing for Miscompilations in Multi-Level Compilers Using Composable Semantics
Pingshi Yu (Imperial College London), Nicolas Wu (Imperial College London), Alastair F. Donaldson (Imperial College London)

Snowplow: Effective Kernel Fuzzing with a Learned White-box Test Mutator
Sishuai Gong (Purdue University), Wang Rui (Purdue University), Deniz Altinbüken (Google DeepMind), Pedro Fonseca (Purdue University), Petros Maniatis (Google DeepMind)

KernelGPT: Enhanced Kernel Fuzzing via Large Language Models
Chenyuan Yang (University of Illinois at Urbana-Champaign), Zijie Zhao (University of Illinois at Urbana-Champaign), Lingming Zhang (University of Illinois at Urbana-Champaign)
Session Chair: Gernot Heiser (UNSW)
Controlled Preemption: Amplifying Side-Channel Attacks from Userspace
Yongye Zhu (University of California, Berkeley), Boru Chen (University of California, Berkeley), Zirui Neil Zhao (NVIDIA, UT Austin), Christopher W. Fletcher (University of California, Berkeley)

Protecting Cryptographic Code Against Spectre-RSB
Santiago Arranz Olmos (MPI-SP), Gilles Barthe (MPI-SP, IMDEA Software Institute), Chitchanok Chuengsatiansup (University of Melbourne), Benjamin Gregoire (Inria), Vincent Laporte (Université de Lorraine, CNRS, Inria, LORIA), Tiago Oliveira (Sandbox AQ), Peter Schwabe (MPI-SP, Radboud University), Yuval Yarom (Ruhr University Bochum), Zhiyuan Zhang (MPI-SP)

Reload+Reload: Exploiting Cache and Memory Contention Side Channel on AMD SEV
Li-Chung Chiang (National Taiwan University), Shih-Wei Li (National Taiwan University)

SMaCk: Efficient Instruction Cache Attacks via Self-Modifying Code Conflicts
Seonghun Son (Iowa State University), Daniel Moghimi (Google), Berk Gulmezoglu (Iowa State University)

FlexProf: Flexible, Side-Channel-Free Memory Access
Jarrett Minton (University of Utah), Rajeev Balasubramonian (University of Utah)
Session Chair: Pramod Bhatotia (Technische Univ. München)
Clapton: Clifford Assisted Problem Transformation for Error Mitigation in Variational Quantum Algorithms
Lennart Maximilian Seifert (Department of Computer Science, University of Chicago), Siddharth Dangwal (Department of Computer Science, University of Chicago), Frederic T. Chong (Department of Computer Science, University of Chicago), Gokul Subramanian Ravi (Electrical Engineering and Computer Science Department, University of Michigan)

QECC-Synth: A Layout Synthesizer for Quantum Error Correction Codes on Sparse Architectures
Keyi Yin (University of California, San Diego), Hezi Zhang (University of California, San Diego), Xiang Fang (University of California, San Diego), Yunong Shi (AWS Quantum Technologies), Travis S. Humble (Oak Ridge National Laboratory), Ang Li (Pacific Northwest National Laboratory), Yufei Ding (University of California, San Diego)

HetEC: Architectures for Heterogeneous Quantum Error Correction Codes
Samuel Stein (Future Computing Technologies, Pacific Northwest National Laboratory), Shifan Xu (Yale Quantum Institute, Yale University), Andrew W. Cross (IBM Quantum, IBM T. J. Watson Research Center), Theodore J. Yoder (IBM Quantum, IBM T. J. Watson Research Center), Ali Javadi-Abhari (IBM Quantum, IBM T. J. Watson Research Center), Chenxu Liu (Future Computing Technologies, Pacific Northwest National Laboratory), Kun Liu (Yale Quantum Institute, Yale University), Zeyuan Zhou (Yale Quantum Institute, Yale University), Charlie Guinn (Department of Physics, Princeton University), Yufei Ding (Department of Computer Science & Engineering, University of California San Diego), Yongshan Ding (Yale Quantum Institute, Yale University), Ang Li (Future Computing Technologies, Pacific Northwest National Laboratory,Department of Electrical & Computer Engineering, University of Washington)

Micro Blossom: Accelerated Minimum-Weight Perfect Matching Decoding for Quantum Error Correction
Yue Wu (Yale University), Namitha Liyanage (Yale University), Lin Zhong (Yale University)

RESCQ: Realtime Scheduling for Continuous Angle Quantum Error Correction Architectures
Sayam Sethi (Department of Electrical and Computer Engineering, The University of Texas at Austin), Jonathan Mark Baker (Department of Electrical and Computer Engineering, The University of Texas at Austin)

10:40 AM CEST – 11:10 AM CEST: Coffee break & ASPLOS poster session (Thursday afternoon presentations)

Location: Catering area

11:10 AM CEST – 12:30 PM CEST

Session Chair: Gennady Pekhimenko (Toronto/CentML)
GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism
Byungsoo Jeon (NVIDIA), Mengdi Wu (Carnegie Mellon University), Shiyi Cao (UC Berkeley), Sunghyun Kim (Massachusetts Institute of Technology), Sunghyun Park (NVIDIA), Neeraj Aggarwal (Carnegie Mellon University), Colin Unger (Stanford University), Daiyaan Arfeen (Carnegie Mellon University), Peiyuan Liao (Carnegie Mellon University), Xupeng Miao (Carnegie Mellon University), Mohammad Alizadeh (Massachusetts Institute of Technology), Gregory R. Ganger (Carnegie Mellon University), Tianqi Chen (Carnegie Mellon University), Zhihao Jia (Carnegie Mellon University)

Cascade: A Dependency-aware Efficient Training Framework for Temporal Graph Neural Network
Yue Dai (Department of Computer Science, University of Pittsburgh), Xulong Tang (Department of Computer Science, University of Pittsburgh), Youtao Zhang (Department of Computer Science, University of Pittsburgh)

Frugal: Efficient and Economic Embedding Model Training with Commodity GPUs
Minhui Xie (Tsinghua University, Renmin University of China), Shaoxun Zeng (Tsinghua University), Hao Guo (Tsinghua University), Shiwei Gao (Tsinghua University), Youyou Lu (Tsinghua University)

FastGL: A GPU-Efficient Framework for Accelerating Sampling-Based GNN Training at Large Scale
Zeyu Zhu (Institute of Automation, Chinese Academy of Sciences,School of Future Technology, University of Chinese Academy of Sciences), Peisong Wang (Institute of Automation, Chinese Academy of Sciences), Qinghao Hu (Institute of Automation, Chinese Academy of Sciences,AiRiA), Gang Li (Shanghai Jiao Tong University), Xiaoyao Liang (Shanghai Jiao Tong University), Jian Cheng (Institute of Automation, Chinese Academy of Sciences,AiRiA)
Session Chair: Michael Stumm (Univ. of Toronto)
Extended User Interrupts (xUI): Fast and Flexible Notification without Polling
Berk Aydogmus (Purdue University), Linsong Guo (UC San Diego), Danial Zuberi (UC San Diego), Tal Garfinkel (UC San Diego), Dean Tullsen (UC San Diego), Amy Ousterhout (UC San Diego), Kazem Taram (Purdue University)

Stramash: A Fused-kernel Operating System For Cache-Coherent, Heterogeneous-ISA Platforms
Tong Xing (The University of Edinburgh), Cong Xiong (Imperial College London), Tianrui Wei (UC Berkeley), April Sanchez (Google), Binoy Ravindran (Virginia Tech), Jonathan Balkind (UC Santa Barbara), Antonio Barbalace (The University of Edinburgh)

H-Houdini: Scalable Invariant Learning
Sushant Dinesh (University of California, Berkeley), Yongye Zhu (University of California, Berkeley), Christopher W. Fletcher (University of California, Berkeley)

Target-Aware Implementation of Real Expressions
Brett Saiki (University of Washington), Jackson Brough (University of Utah), Jonas Regehr (University of Utah), Jesus Ponce (University of Utah), Varun Pradeep (University of Washington), Aditya Akhileshwaran (University of Washington), Zachary Tatlock (University of Washington), Pavel Panchekha (University of Utah)
Session Chair: Trevor E. Carlson (NUS)
TaintEMU: Decoupling Tracking from Functional Domains for Architecture-Agnostic and Efficient Whole-System Taint Tracking
Lei Cui (Guangxi Normal University), Youquan Xian (Guangxi Normal University), Peng Liu (Guangxi Normal University), Longjin Lu (Independent Researcher)

Segue & ColorGuard: Optimizing SFI Performance and Scalability on Modern Architectures
Shravan Narayan (UT Austin), Tal Garfinkel (UC San Diego), Evan Johnson (UC San Diego), Zachary Yedidia (Stanford University), Yingchen Wang (UC Berkeley), Andrew Brown (Intel), Anjo Vahldiek-Oberwagner (Intel Labs), Michael LeMay (Intel Labs), Wenyong Huang (Intel), Xin Wang (Intel), Mingqiu Sun (Intel), Dean Tullsen (UC San Diego), Deian Stefan (UC San Diego)

Pave: Information Flow Control for Privacy-preserving Online Data Processing Services
Minkyung Park (University of Texas at Dallas), Jaeseung Choi (Sogang University), Hyeonmin Lee (University of Virginia), Taekyoung Kwon (Seoul National University)

Sharing is leaking: blocking transient-execution attacks with core-gapped confidential VMs
Charly Castes (EPFL,Google), Andrew Baumann (Google)
Session Chair: Michael Swift (Univ. of Wisconsin)
EDM: An Ultra-Low Latency Ethernet Fabric for Memory Disaggregation
Weigao Su (Purdue University), Vishal Shrivastav (Purdue University)

Performance Prediction of On-NIC Network Functions with Multi-Resource Contention and Traffic Awareness
Shaofeng Wu (The Chinese University of Hong Kong), Qiang Su (The Chinese University of Hong Kong), Zhixiong Niu (Microsoft Research), Hong Xu (The Chinese University of Hong Kong)

Gigaflow: Pipeline-Aware Sub-Traversal Caching for Modern SmartNICs
Annus Zulfiqar (University of Michigan), Ali Imran (University of Michigan), Venkat Kunaparaju (Purdue University), Ben Pfaff (Feldera Inc.), Gianni Antichi (Politecnico di Milano), Muhammad Shahbaz (University of Michigan)

TNIC: A Trusted NIC Architecture
Dimitra Giantsidi (The University of Edinburgh), Julian Pritzi (Technical University of Munich), Felix Gust (Technical University of Munich), Antonios Katsarakis (Huawei Research), Atsushi Koshiba (Technical University of Munich), Pramod Bhatotia (Technical University of Munich)

12:30 PM CEST – 2:00 PM CEST: Lunch

Location: Catering area

2:00 PM CEST – 3:40 PM CEST

Session Chair: Steven Lyubomirsky (NVIDIA)
Einsum Trees: An Abstraction for Optimizing the Execution of Tensor Expressions
Alexander Breuer (Friedrich Schiller University Jena), Mark Blacher (Friedrich Schiller University Jena), Max Engel (Friedrich Schiller University Jena), Joachim Giesen (Friedrich Schiller University Jena), Alexander Heinecke (Intel Corporation), Julien Klaus (Friedrich Schiller University Jena), Stefan Remke (Friedrich Schiller University Jena)

Optimizing Deep Learning Inference Efficiency through Block Dependency Analysis
Zhanyuan Di (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Leping Wang (SKLP, Institute of Computing Technology, CAS), En Shao (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Zhaojia Ma (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Ziyi Ren (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Feng Hua (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Lixian Ma (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Jie Zhao (Hunan University), Guangming Tan (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Ninghui Sun (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences)

Pruner: A Draft-then-Verify Exploration Mechanism to Accelerate Tensor Program Tuning
Liang Qiao (University of Science and Technology of China), Jun Shi (University of Science and Technology of China), Xiaoyu Hao (University of Science and Technology of China), Xi Fang (University of Science and Technology of China), Sen Zhang (University of Science and Technology of China), Minfan Zhao (University of Science and Technology of China), Ziqi Zhu (University of Science and Technology of China), Junshi Chen (University of Science and Technology of China), Hong An (University of Science and Technology of China), Xulong Tang (University of Pittsburgh), Bing Li (NIO), Honghui Yuan (NIO), Xinyang Wang (NIO)

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
Ruihang Lai (Carnegie Mellon University), Junru Shao (OpenAI), Siyuan Feng (Shanghai Jiao Tong University), Steven Lyubomirsky (NVIDIA), Bohan Hou (Carnegie Mellon University), Wuwei Lin (OpenAI), Zihao Ye (University of Washington), Hongyi Jin (Carnegie Mellon University), Yuchen Jin (Hyperbolic), Jiawei Liu (University of Illinois Urbana-Champaign), Lesheng Jin (Hyperbolic), Yaxing Cai (NVIDIA), Ziheng Jiang (ByteDance), Yong Wu (NVIDIA), Sunghyun Park (NVIDIA), Prakalp Srivastava (Netflix), Jared Roesch (NVIDIA), Todd C. Mowry (Carnegie Mellon University), Tianqi Chen (Carnegie Mellon University,NVIDIA)

Towards End-to-End Optimization of LLM-based Applications with Ayo
Xin Tan (The Chinese University of Hong Kong), Yimin Jiang (Unaffiliated), Yitao Yang (The Chinese University of Hong Kong), Hong Xu (The Chinese University of Hong Kong)
Session Chair: Timothy Pinkston (Univ. of Southern California)
PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System
Yintao He (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Haiyu Mao (King's College London,ETH Zürich), Christina Giannoula (University of Toronto,Vector Institute), Mohammad Sadrosadati (ETH Zürich), Juan Gómez-Luna (NVIDIA), Huawei Li (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Xiaowei Li (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Ying Wang (SKLP, Institute of Computing Technology, CAS), Onur Mutlu (ETH Zürich)

PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference
Yufeng Gu (University of Michigan), Alireza Khadem (University of Michigan), Sumanth Umesh (University of Michigan), Ning Liang (University of Michigan), Xavier Servot (ETH Zurich), Onur Mutlu (ETH Zurich), Ravi Iyer (Google), Reetuparna Das (University of Michigan)

CINM (Cinnamon): A Compilation Infrastructure for Heterogeneous Compute In-Memory and Compute Near-Memory Paradigms
Asif Ali Khan (Technische Universität Dresden), Hamid Farzaneh (Technische Universität Dresden), Karl Friedrich Alexander Friebel (Technische Universität Dresden), Clément Fournier (Technische Universität Dresden), Lorenzo Chelini (Intel), Jeronimo Castrillon (Technische Universität Dresden)

Toleo: Scaling Freshness to Tera-scale Memory Using CXL and PIM
Juechu Dong (University of Michigan), Jonah Rosenblum (University of Michigan), Satish Narayanasamy (University of Michigan)

Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators
Shixin Zhao (Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Yuming Li (Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Bing Li (Institute of Microelectronics, Chinese Academy of Sciences), Yintao He (Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Mengdi Wang (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Yinhe Han (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Ying Wang (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences)
Session Chair: Abdulrahman Mahmoud (MBZUAI)
RTL Verification for Secure Speculation Using Contract Shadow Logic
Qinhan Tan (Princeton University), Yuheng Yang (Massachusetts Institute of Technology), Thomas Bourgeat (École Polytechnique Fédérale de Lausanne), Sharad Malik (Princeton University), Mengjia Yan (Massachusetts Institute of Technology)

ElasticMiter: Formally Verified Dataflow Circuit Rewrites
Ayatallah Elakhras (EPFL), Jiahui Xu (ETH Zurich), Martin Erhart (ETH Zurich), Paolo Ienne (EPFL), Lana Josipovic (ETH Zurich)

Robustness Verification for Checking Crash Consistency of Non-volatile Memory
Zhilei Han (School of Software, Tsinghua University), Fei He (School of Software, Tsinghua University)

Proactive Runtime Detection of Aging-Related Silent Data Corruptions: A Bottom-Up Approach
Jiacheng Ma (University of Michigan), Majd Ganaiem (Technion – Israel Institute of Technology), Madeline Burbage (University of Washington), Theo Gregersen (University of Washington), Rachel McAmis (University of Washington), Freddy Gabbay (The Hebrew University of Jerusalem), Baris Kasikci (University of Washington,Google)

Hardware Sentinel: Protecting Software Applications from Hardware Silent Data Corruptions
Rhea Dutta (Meta Platforms, Inc.), Harish Dattatraya Dixit (Meta Platforms, Inc.), Rik Van Riel (Meta Platforms, Inc.), Gautham Vunnam (Meta Platforms, Inc.), Sriram Sankar (Meta Platforms, Inc.)
Session Chair: Daniel A. Jiménez (TAMU and BSC)
TensorTEE: Unifying Heterogeneous TEE Granularity for Efficient Secure Collaborative Tensor Computing
Husheng Han (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Xinyao Zheng (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Yuanbo Wen (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Yifan Hao (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Erhu Feng (IPADS, Shanghai Jiao Tong University,Engineering Research Center for Domain-specific Operating Systems (MoE)), Ling Liang (Peking University), Jianan Mu (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Xiaqing Li (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Tianyun Ma (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,Cambricon Technologies), Pengwei Jin (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Xinkai Song (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Zidong Du (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,Shanghai Innovation Center for Processor Technologies, SHIC), Qi Guo (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Xing Hu (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,ZGC LAB)

MDPeek: Breaking Balanced Branches in SGX with Memory Disambiguation Unit Side Channels
Chang Liu (Tsinghua University), Shuaihu Feng (Zhongguancun Laboratory), Yuan Li (Zhongguancun Laboratory), Dongsheng Wang (Tsinghua University), Wenjian He (Huawei Technologies Co., Ltd.), Yongqiang Lyu (Tsinghua University), Trevor E. Carlson (National University of Singapore)

Accelerating Number Theoretic Transform with Multi-GPU Systems for Efficient Zero Knowledge Proof
Zhuoran Ji (School of Cyber Science and Technology, Shandong University), Jianyu Zhao (School of Cyber Science and Technology, Shandong University), Peimin Gao (School of Cyber Science and Technology, Shandong University), Xiangkai Yin (School of Cyber Science and Technology, Shandong University), Lei Ju (Quan Cheng Laboratory)

BatchZK: A Fully Pipelined GPU-Accelerated System for Batch Generation of Zero-Knowledge Proofs
Tao Lu (Zhejiang University,National University of Singapore), Yuxun Chen (Zhejiang University), Zonghui Wang (Zhejiang University), Xiaohang Wang (Zhejiang University), Wenzhi Chen (Zhejiang University), Jiaheng Zhang (National University of Singapore)

UniZK: Accelerating Zero-Knowledge Proof with Unified Hardware and Flexible Kernel Mapping
Cheng Wang (Xi'an Jiaotong University,Institute for Interdisciplinary Information Core Technology), Mingyu Gao (Tsinghua University,Shanghai Qi Zhi Institute)

3:40 PM CEST – 4:10 PM CEST: Coffee break & ASPLOS poster session (Thursday morning presentations)

Location: Catering area

4:10 PM CEST – 5:30 PM CEST

Session Chair: Rodrigo Bruno (Univ. of Lisbon)
Litmus: Fair Pricing for Serverless Computing
Qi Pei (Computer Science / Watson School, The State University of New York at Binghamton), Yipeng Wang (Intel Lab., Intel), Seunghee Shin (Computer Science / Watson School, The State University of New York at Binghamton)

Concurrency-Informed Orchestration for Serverless Functions
Qichang Liu (University of Virginia), Yue Cheng (University of Virginia), Haiying Shen (University of Virginia), Ao Wang (Alibaba Group), Bharathan Balaji (Amazon)

Dilu: Enabling GPU Resourcing-on-Demand for Serverless DL Serving via Introspective Elasticity
Cunchi Lv (ICT, CAS,UCAS), Xiao Shi (ICT, CAS,Nanjing Institute of InforSuperBahn), Zhengyu Lei (ICT, CAS,UCAS), Jinyue Huang (ICT, CAS, UCAS), Wenting Tan (ICT, CAS), Xiaohui Zheng (ICT, CAS), Xiaofang Zhao (ICT, CAS,IICT, Suzhou, CAS)

Medusa: Accelerating Serverless LLM Inference with Materialization
Shaoxun Zeng (Tsinghua University), Minhui Xie (Tsinghua University), Shiwei Gao (Tsinghua University), Youmin Chen (Tsinghua University), Youyou Lu (Tsinghua University)
Session Chair: Alex Reinking (Adobe)
Virgo: Cluster-level Matrix Unit Integration in GPUs for Scalability and Energy Efficiency
Hansung Kim (University of California, Berkeley), Ruohan Richard Yan (University of California, Berkeley), Joshua You (University of California, Berkeley), Tieliang Vamber Yang (NVIDIA Corporation), Yakun Sophia Shao (University of California, Berkeley)

Towards Unified Analysis of GPU Consistency
Haining Tong (University of Helsinki), Natalia Gavrilenko (Huawei Dresden Research Center), Hernan Ponce de Leon (Huawei Dresden Research Center), Keijo Heljanko (University of Helsinki,Helsinki Institute for Information Technology)

Aqua: Network-Accelerated Memory Offloading for LLMs in Scale-Up GPU Domains
Abhishek Vijaya Kumar (Cornell University), Gianni Antichi (Politecnico di Milano), Rachee Singh (Cornell University)

Optimizing Datalog for the GPU
Yihao Sun (Syracuse University), Ahmedur Rahman Shovon (University of Illinois, Chicago), Thomas Gilray (Washington State University), Sidharth Kumar (University of Illinois, Chicago), Kristopher Micinski (Syracuse University)
Session Chair: Baris Kasikci (UW)
MaxEmbed: Maximizing SSD bandwidth utilization for huge embedding models serving
Ruwen Fan (Tsinghua University), Minhui Xie (Tsinghua University), Haodi Jiang (Tsinghua University), Youyou Lu (Tsinghua University)

AnyKey: A Key-Value SSD for All Workload Types
Chanyoung Park (Hanyang University), Jungho Lee (Hanyang University), Chun-Yi Liu (Micron Technology Inc.), Kyungtae Kang (Hanyang University), Mahmut Taylan Kandemir (Pennsylvania State University), Wonil Choi (Hanyang University)

Simplifying and Accelerating NOR Flash I/O Stack for RAM-Restricted Microcontrollers
Hao Huang (Harbin Institute of Technology, Shenzhen), Yanqi Pan (Harbin Institute of Technology, Shenzhen), Wen Xia (Harbin Institute of Technology, Shenzhen), Xiangyu Zou (Harbin Institute of Technology, Shenzhen), Darong Yang (Harbin Institute of Technology, Shenzhen), Liang Shi (East China Normal University), Hongwei Du (Harbin Institute of Technology, Shenzhen)

A Software Caching Runtime for Embedded NVRAM Systems
Harrison Williams (Virginia Tech), Matthew Hicks (Virginia Tech)
Session Chair: Steve Blackburn (Google)
Velosiraptor: Code Synthesis for Memory Translation
Reto Achermann (University of British Columbia), Em Chu (JuliaHub), Ryan Mehri (Replit), Ilias Karimalis (University of British Columbia), Margo Seltzer (University of British Columbia)

vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention
Ramya Prabhu (Microsoft Research), Ajay Nayak (Indian Institute of Science), Jayashree Mohan (Microsoft Research), Ramachandran Ramjee (Microsoft Research), Ashish Panwar (Microsoft Research)

Virtuoso: Enabling Fast and Accurate Virtual Memory Research via an Imitation-based Operating System Simulation Methodology
Konstantinos Kanellopoulos (ETH Zürich), Konstantinos Sgouras (ETH Zürich), Nisa Bostanci (ETH Zürich), Andreas Kosmas Kakolyris (ETH Zürich), Berkin Kerim Konar (ETH Zürich), Rahul Bera (ETH Zürich), Mohammad Sadrosadati (ETH Zürich), Rakesh Kumar (Norwegian University of Science and Technology (NTNU)), Nandita Vijaykumar (University of Toronto), Onur Mutlu (ETH Zürich)

Instruction-Aware Cooperative TLB and Cache Replacement Policies
Dimitrios Chasapis (Barcelona Supercomputing Center (BSC)), Georgios Vavouliotis (Unaffiliated), Daniel A. Jiménez (Texas A&M University), Marc Casas (Barcelona Supercomputing Center (BSC),Universitat Politècnica de Catalunya (UPC))


The program page was generated and formatted using Professor Saugata Ghose’s (UIUC) Conference Program Generator.