Preliminary Program

EuroSys 2025 will be running concurrently with ASPLOS. The EuroSys program is available here.

Monday 31st March

18:00-19:30 Welcome Reception + EuroSys Poster Session

Tuesday 1st April

09:00-10:00 Registration and Welcome Coffee

10:00-12:00 ASPLOS and EuroSys Joint Plenary Session

12:00-13:30 Lunch

13:30-15:10

ML Acceleration (1-A) | Quantum Computing (1-B) | Edge Computing (1-C) | Homomorphic Encryption (1-D)
(1-A) ‘Mosaic: Exploiting Instruction-Level Parallelism on Deep Learning Accelerators with iTex Tessellation’, Jianxing Xu (University of Science and Technology of China; SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Yuanbo Wen (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Zikang Liu (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Ruibai Xu (University of Science and Technology of China; SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Tingfeng Ruan (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Jun Bi (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Rui Zhang (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Di Huang (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Xinkai Song (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Yifan Hao (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Xing Hu (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Zidong Du (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Chongqing Zhao (Tencent), Jie Jiang (Tencent), Qi Guo (SKL of Processors, Institute of Computing Technology, Chinese Academy of Sciences)
(1-B) ‘FMCC: Flexible Measurement-based Quantum Computation over Cluster State’, Yingheng Li (University of Pittsburgh), Aditya Pawar (University of Pittsburgh), Zewei Mo (University of Pittsburgh), Youtao Zhang (University of Pittsburgh), Jun Yang (University of Pittsburgh), Xulong Tang (University of Pittsburgh)
(1-C) ‘Energy-aware Scheduling and Input Buffer Overflow Prevention for Energy-harvesting Systems’, Harsh Desai (Carnegie Mellon University), Xinye Wang (Carnegie Mellon University), Brandon Lucia (Carnegie Mellon University)
(1-D) ‘Orion: A Fully Homomorphic Encryption Framework for Deep Learning’, Austin Ebel (Tandon School of Engineering, New York University), Karthik Garimella (Tandon School of Engineering, New York University), Brandon Reagen (Tandon School of Engineering, New York University)
(1-A) ‘DynaX: Sparse Attention Acceleration with Dynamic X:M Fine-Grained Structured Pruning’, Xiao Xiong (College of Computer Science, Chongqing University), Zhaorui Chen (College of Computer Science, Chongqing University), Yue Liang (College of Computer Science, Chongqing University), Minghao Tian (College of Computer Science, Chongqing University), Jiaxing Shang (College of Computer Science, Chongqing University), Jiang Zhong (College of Computer Science, Chongqing University), Dajiang Liu (College of Computer Science, Chongqing University)
(1-B) ‘QRCC: Evaluating Large Quantum Circuits on Small Quantum Computers through Integrated Qubit Reuse and Circuit Cutting’, Aditya Pawar (Electrical and Computer Engineering Department, University of Pittsburgh), Yingheng Li (Computer Science Department, University of Pittsburgh), Zewei Mo (Computer Science Department, University of Pittsburgh), Yanan Guo (Electrical and Computer Engineering Department, University of Pittsburgh), Xulong Tang (Computer Science Department, University of Pittsburgh), Youtao Zhang (Computer Science Department, University of Pittsburgh), Jun Yang (Electrical and Computer Engineering Department, University of Pittsburgh)
(1-C) ‘Generalizing Reuse Patterns for Efficient DNN on Microcontrollers’, Jiesong Liu (North Carolina State University), Bin Ren (College of William and Mary), Xipeng Shen (North Carolina State University)
(1-D) ‘CIPHERMATCH: Computation using In-flash Processing for Homomorphic Encryption & Reliable String Matching’, Mayank Kabra, Rakesh Nadig, Harshita Gupta, Manos Frouzakis, Rahul Bera, Vamanan Arulchelvan, Yu Liang, Haiyu Mao, Mohammad Sadrosadati, Onur Mutlu
(1-A) ‘Accelerating Retrieval-Augmented Generation’, Derrick Quinn (Cornell University), Mohammad Nouri (Cornell University), Neel Patel (Cornell University), John Salihu (University of Kansas), Alireza Salemi (UMass Amherst), Sukhan Lee (Samsung Electronics), Hamed Zamani (UMass Amherst), Mohammad Alian (Cornell University)
(1-B) ‘Optimizing Quantum Circuits, Fast and Slow’, Amanda Xu (University of Wisconsin-Madison), Abtin Molavi (University of Wisconsin-Madison), Swamit Tannu (University of Wisconsin-Madison), Aws Albarghouthi (University of Wisconsin-Madison)
(1-C) ‘Earth+: On-Board Satellite Imagery Compression Leveraging Historical Earth Observations’, Kuntai Du (University of Chicago), Yihua Cheng (University of Chicago), Peder Olsen (Microsoft Research), Shadi Noghabi (Microsoft Research), Junchen Jiang (University of Chicago)
(1-D) ‘ReSBM: Region-based Scale and Minimal-Level Bootstrapping Management for FHE via Min-Cut’, Yan Liu (Ant Group), Jianxin Lai (Ant Group), Long Li (Ant Group), Tianxiang Sui (Ant Group), Linjie Xiao (Ant Group), Peng Yuan (Ant Group), Xiaojing Zhang (Ant Group), Qing Zhu (Ant Group), Wenguang Chen (Tsinghua University; Ant Group), Jingling Xue (UNSW)
(1-A) ‘GUST: Graph Edge-Coloring Utilization for Accelerating Sparse Matrix Vector Multiplication’, Armin Gerami (Computer Science, University of Maryland), Bahar Asgari (Computer Science, University of Maryland)
(1-B) ‘BQSim: GPU-accelerated Batch Quantum Circuit Simulation using Decision Diagram’, Shui Jiang (The Chinese University of Hong Kong; University of Wisconsin-Madison), Yi-Hua Chung (University of Wisconsin-Madison), Chih-Chun Chang (University of Wisconsin-Madison), Tsung-Yi Ho (The Chinese University of Hong Kong), Tsung-Wei Huang (University of Wisconsin-Madison)
(1-C) ‘Pirate: No Compromise Low-Bandwidth VR Streaming for Edge Devices’, Yingtian Zhang (The Pennsylvania State University), Yan Kang (The Pennsylvania State University), Ziyu Ying (The Pennsylvania State University), Wanhang Lu (The Pennsylvania State University), Sijie Lan (The Pennsylvania State University), Huijuan Xu (The Pennsylvania State University), Kiwan Maeng (The Pennsylvania State University), Anand Sivasubramaniam (The Pennsylvania State University), Mahmut T. Kandemir (The Pennsylvania State University), Chita R. Das (The Pennsylvania State University)
(1-D) ‘HALO: Loop-aware Bootstrapping Management for Fully Homomorphic Encryption’, Seonyoung Cheon (Yonsei University), Yongwoo Lee (Yonsei University), Hoyun Youm (Yonsei University), Dongkwan Kim (Yonsei University), Sungwoo Yun (Yonsei University), Kunmo Jeong (Yonsei University), Dongyoon Lee (Stony Brook University), Hanjun Kim (Yonsei University)
(1-A) ‘RASSM: Residue-based Acceleration of Single Sparse Matrix Computation via Adaptive Tiling’, Anirudh Jain (Georgia Institute of Technology), Pulkit Gupta (Georgia Institute of Technology), Thomas M. Conte (Georgia Institute of Technology)
(1-B) ‘Fat-Tree QRAM: A High-Bandwidth Shared Quantum Random Access Memory for Parallel Queries’, Shifan Xu (Yale Quantum Institute, Yale University), Alvin Lu (Yale Quantum Institute, Yale University), Yongshan Ding (Yale Quantum Institute, Yale University)
(1-C) ‘Nazar: Monitoring and Adapting ML Models on Mobile Devices’, Wei Hao (Columbia University), Zixi Wang (Columbia University), Lauren Hong (Columbia University), Lingxiao Li (Columbia University), Nader Karayanni (Columbia University), AnMei Dasbach-Prisk (University of California San Diego), Chengzhi Mao (Columbia University), Junfeng Yang (Columbia University), Asaf Cidon (Columbia University)
(1-D) ‘Affinity-based Optimizations for TFHE on Processing-in-DRAM’, Kevin Nam (Department of Electrical and Computer Engineering (ECE), Seoul National University; Inter-university Semiconductor Research Center (ISRC), Seoul National University), Heonhui Jung (Department of Electrical and Computer Engineering (ECE), Seoul National University; Inter-university Semiconductor Research Center (ISRC), Seoul National University), Hyunyoung Oh (Department of AI·Software, Gachon University), Yunheung Paek (Department of Electrical and Computer Engineering (ECE), Seoul National University; Inter-university Semiconductor Research Center (ISRC), Seoul National University)

15:10-15:40 Coffee Break

15:40-17:00

Cloud Computing 1 (2-A) | Distributed Computing (2-B) | Accelerators (2-C) | FPGAs (2-D)
(2-A) ‘Coach: Exploiting Temporal Patterns for All-Resource Oversubscription in Cloud Platforms’, Benjamin Reidys (University of Illinois Urbana-Champaign), Pantea Zardoshti (Microsoft), Íñigo Goiri (Microsoft), Celine Irvene (Microsoft), Daniel S. Berger (Microsoft; University of Washington), Haoran Ma (University of California-Los Angeles), Kapil Arya (Microsoft), Eli Cortez (Microsoft), Taylor Stark (Microsoft), Eugene Bak (Microsoft), Mehmet Iyigun (Microsoft), Stanko Novaković (Google), Lisa Hsu (Meta), Karel Trueba (Microsoft), Abhisek Pan (Microsoft), Chetan Bansal (Microsoft), Saravan Rajmohan (Microsoft), Jian Huang (University of Illinois Urbana-Champaign), Ricardo Bianchini (Microsoft)
(2-B) ‘Composing Distributed Computations Through Task and Kernel Fusion’, Rohan Yadav (Stanford University), Shiv Sundram (Stanford University), Wonchan Lee (NVIDIA), Michael Garland (NVIDIA), Michael Bauer (NVIDIA), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University)
(2-C) ‘RANGE-BLOCKS: A Synchronization Facility for Domain-Specific Architectures’, Anagha Molakalmur Anil Kumar (Simon Fraser University), Aditya Prasanna (Simon Fraser University), Arrvindh Shriraman (Simon Fraser University)
(2-D) ‘Salus: A Practical Trusted Execution Environment for CPU-FPGA Heterogeneous Cloud Platforms’, Yu Zou (Alibaba Group), Yiran Li (Alibaba Group), Sheng Wang (Alibaba Group), Le Su (Alibaba Group), Zhen Gu (DAMO Academy, Alibaba Group; Hupan Lab), Yanheng Lu (DAMO Academy, Alibaba Group; Hupan Lab), Yijin Guan (DAMO Academy, Alibaba Group; Hupan Lab), Dimin Niu (DAMO Academy, Alibaba Group; Hupan Lab), Mingyu Gao (Tsinghua University; Shanghai AI Laboratory), Yuan Xie (DAMO Academy, Alibaba Group; Hupan Lab), Feifei Li (Alibaba Group)
(2-A) ‘Cooperative Graceful Degradation in Containerized Clouds’, Kapil Agrawal (University of California, Irvine), Sangeetha Abdu Jyothi (University of California, Irvine and VMware Research)
(2-B) ‘CXLfork: Fast Remote Fork over CXL Fabrics’, Chloe Alverti (University of Illinois Urbana-Champaign), Stratos Psomadakis (National Technical University of Athens), Burak Ocalan (University of Illinois Urbana-Champaign), Shashwat Jaiswal (University of Illinois Urbana-Champaign), Tianyin Xu (University of Illinois Urbana-Champaign), Josep Torrellas (University of Illinois Urbana-Champaign)
(2-C) ‘Enhancing CGRA Efficiency Through Aligned Compute and Communication Provisioning’, Zhaoying Li (National University of Singapore), Pranav Dangi (National University of Singapore), Chenyang Yin (Peking University), Thilini Kaushalya Bandara (National University of Singapore), Rohan Juneja (National University of Singapore), Cheng Tan (Google), Zhenyu Bai (National University of Singapore), Tulika Mitra (National University of Singapore)
(2-D) ‘Harmonia: A Unified Framework for Heterogeneous FPGA Acceleration in the Cloud’, Luyang Li (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Heng Pan (Computer Network Information Center, Chinese Academy of Sciences), Xinchen Wan (Hong Kong University of Science and Technology), Kai Lv (Institute of Computing Technology, Chinese Academy of Sciences), Zilong Wang (Hong Kong University of Science and Technology), Qian Zhao (Douyin Co., Ltd.), Feng Ning (Douyin Co., Ltd.), Qingsong Ning (Douyin Co., Ltd.), Shideng Zhang (Douyin Co., Ltd.), Zhenyu Li (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Layong Luo (Researcher), Gaogang Xie (Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences)
(2-A) ‘DarwinGame: Playing Tournaments for Tuning Applications in Noisy Cloud Environments’, Rohan Basu Roy (University of Utah), Vijay Gadepally (Massachusetts Institute of Technology), Devesh Tiwari (Northeastern University)
(2-B) ‘OS2G: A High-Performance DPU Offloading Architecture for GPU-based Deep Learning with Object Storage’, Zhen Jin (Zhejiang University; Alibaba Group), Yiquan Chen (Alibaba Group), Mingxu Liang (Alibaba Group), Yijing Wang (Alibaba Group), Guoju Fang (Alibaba Group), Ao Zhou (Alibaba Group), Keyao Zhang (Zhejiang University), Jiexiong Xu (Zhejiang University), Wenhai Lin (Zhejiang University), Yiquan Lin (Zhejiang University), Shushu Zhao (Alibaba Group), Wenkai Shi (Alibaba Group), Zhenhua He (Alibaba Group), Shishun Cai (Alibaba Group), Wenzhi Chen (Zhejiang University)
(2-C) ‘Squeezing Operator Performance Potential for the Ascend Architecture’, Yuhang Zhou (State Key Laboratory for Novel Software Technology, Nanjing University), Zhibin Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Guyue Liu (Peking University), Shipeng Li (State Key Laboratory for Novel Software Technology, Nanjing University), Xi Lin (State Key Laboratory for Novel Software Technology, Nanjing University), Zibo Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Yongzhong Wang (Huawei Technologies Co., Ltd.), Fuchun Wei (Huawei Technologies Co., Ltd.), Jingyi Zhang (Huawei Technologies Co., Ltd.), Zhiheng Hu (Huawei Technologies Co., Ltd.), Yanlin Liu (Huawei Technologies Co., Ltd.), Chunsheng Li (Huawei Technologies Co., Ltd.), Ziyang Zhang (Huawei Technologies Co., Ltd.), Yaoyuan Wang (Huawei Technologies Co., Ltd.), Bin Zhou (Shandong University), Wanchun Dou (Nanjing University, State Key Laboratory for Novel Software Technology), Guihai Chen (Nanjing University, State Key Laboratory for Novel Software Technology), Chen Tian (Nanjing University, State Key Laboratory for Novel Software Technology)
(2-D) ‘PhasePrint: Exposing Cloud FPGA Fingerprints by Inducing Timing Faults at Runtime’, Jubayer Mahmod (Virginia Tech), Matthew Hicks (Virginia Tech)
(2-A) ‘Copper and Wire: Bridging Expressiveness and Performance for Service Mesh Policies’, Divyanshu Saxena (The University of Texas at Austin), William Zhang (The University of Texas at Austin), Shankara Pailoor (The University of Texas at Austin), Isil Dillig (The University of Texas at Austin), Aditya Akella (The University of Texas at Austin)
(2-B) ‘pulse: Accelerating Distributed Pointer-Traversals on Disaggregated Memory’, Yupeng Tang (Yale University), Seung-seob Lee (Yale University), Abhishek Bhattacharjee (Yale University), Anurag Khandelwal (Yale University)
(2-C) ‘PICACHU: Plug-In CGRA Handling Upcoming Nonlinear Operations in LLMs’, Jiajun Qin (New York University; Zhejiang University), Tianhua Xia (New York University), Cheng Tan (Google; Arizona State University), Jeff Zhang (Arizona State University), Sai Qian Zhang (New York University)
(2-D) ‘Hassert: Hardware Assertion-Based Verification Framework with FPGA Acceleration’, Ziqing Zhang (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Weijie Weng (Xiamen University of Technology), Yaning Li (University College Dublin), Lijia Cai (Hong Kong University of Science and Technology), Haoyu Wang (Zhejiang University), David Boland (The University of Sydney), Yungang Bao (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Kan Shi (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences)

17:00-17:30 Coffee Break & ASPLOS Poster Session (Tuesday afternoon presentations)

17:30-18:30 Wild and Crazy Ideas (WACI)

19:00-20:00 Business meeting

Wednesday 2nd April

08:30-09:00 Registration and Welcome Coffee

09:00-10:40

Performance Analysis & Tracing (3-A) | Graphics (3-B) | ML Security (3-C) | EDA (3-D)
(3-A) ‘EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters’, Xinkai Wang (Shanghai Jiao Tong University), Xiaofeng Hou (Shanghai Jiao Tong University), Chao Li (Shanghai Jiao Tong University), Yuancheng Li (Shanghai Jiao Tong University), Du Liu (Shanghai Jiao Tong University), Guoyao Xu (Alibaba Group), Guodong Yang (Alibaba Group), Liping Zhang (Alibaba Group), Yuemin Wu (Alibaba Cloud), Xiaopeng Yuan (Alibaba Cloud), Quan Chen (Shanghai Jiao Tong University), Minyi Guo (Shanghai Jiao Tong University)
(3-B) ‘MetaSapiens: Real-Time Neural Rendering with Efficiency-Aware Pruning and Accelerated Foveated Rendering’, Weikai Lin (University of Rochester), Yu Feng (Shanghai Jiao Tong University), Yuhao Zhu (University of Rochester)
(3-C) ‘MPC-Pipe: an Efficient Pipeline Scheme for Semi-honest MPC Machine Learning’, Yongqin Wang (Department of Electrical & Computer Engineering, University of Southern California), Rachit Rajat (Department of Electrical & Computer Engineering, University of Southern California), Murali Annavaram (Department of Electrical & Computer Engineering, University of Southern California)
(3-D) ‘Control Logic Synthesis: Drawing the Rest of the OWL’, Zachary D. Sisco (University of California, Santa Barbara), Andrew David Alex (University of California, Santa Barbara), Zechen Ma (University of California, Santa Barbara), Yeganeh Aghamohammadi (University of California, Santa Barbara), Boming Kong (University of California, Santa Barbara), Benjamin Darnell (University of Illinois Urbana-Champaign), Timothy Sherwood (University of California, Santa Barbara), Ben Hardekopf (University of California, Santa Barbara), Jonathan Balkind (University of California, Santa Barbara)
(3-A) ‘Mint: Cost-Efficient Tracing with All Requests Collection via Commonality and Variability Analysis’, Haiyu Huang (Sun Yat-sen University), Cheng Chen (Alibaba Group), Kunyi Chen (Alibaba Group), Pengfei Chen (Sun Yat-sen University), Guangba Yu (Sun Yat-sen University), Zilong He (Sun Yat-sen University), Yilun Wang (Sun Yat-sen University), Huxing Zhang (Alibaba Group), Qi Zhou (Alibaba Group)
(3-B) ‘D-VSync: Decoupled Rendering and Displaying for Smartphone Graphics’, Yuanpei Wu (IPADS, Shanghai Jiao Tong University; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Dong Du (IPADS, Shanghai Jiao Tong University; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Chao Xu (Fields Lab, Huawei Central Software Institute), Yubin Xia (IPADS, Shanghai Jiao Tong University; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Ming Fu (Fields Lab, Huawei Central Software Institute), Binyu Zang (IPADS, Shanghai Jiao Tong University; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education), Haibo Chen (IPADS, Shanghai Jiao Tong University; Key Laboratory of System Software (Chinese Academy of Sciences))
(3-C) ‘Cinnamon: A Framework for Scale-Out Encrypted AI’, Siddharth Jayashankar (Carnegie Mellon University), Edward Chen (Carnegie Mellon University), Tom Tang (Carnegie Mellon University), Wenting Zheng (Carnegie Mellon University), Dimitrios Skarlatos (Carnegie Mellon University)
(3-D) ‘CRUSH: A Credit-Based Approach for Functional Unit Sharing in Dynamically Scheduled HLS’, Jiahui Xu (ETH Zurich), Lana Josipović (ETH Zurich)
(3-A) ‘Automatic Tracing in Task-Based Runtime Systems’, Rohan Yadav (Stanford University), Michael Bauer (NVIDIA), David Broman (KTH Royal Institute of Technology), Michael Garland (NVIDIA), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University)
(3-B) ‘StreamGrid: Streaming Point Cloud Analytics via Compulsory Splitting and Deterministic Termination’, Yu Feng (Shanghai Jiao Tong University; Shanghai Qi Zhi Institute), Zheng Liu (Shanghai Jiao Tong University), Weikai Lin (University of Rochester), Zihan Liu (Shanghai Jiao Tong University; Shanghai Qi Zhi Institute), Jingwen Leng (Shanghai Jiao Tong University; Shanghai Qi Zhi Institute), Minyi Guo (Shanghai Jiao Tong University; Shanghai Qi Zhi Institute), Zhezhi He (Shanghai Jiao Tong University), Jieru Zhao (Shanghai Jiao Tong University), Yuhao Zhu (University of Rochester)
(3-C) ‘PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption’, Yifan Tan (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University), Cheng Tan (Northeastern University), Zeyu Mi (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University), Haibo Chen (Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University)
(3-D) ‘AMuLeT: Automated Design-Time Testing of Secure Speculation Countermeasures’, Bo Fu (University of Toronto), Leo Tenenbaum (University of Toronto), David Adler (University of Toronto), Assaf Klein (Technion – Israel Institute of Technology), Arpit Gogia (IMDEA Software Institute), Alaa R. Alameldeen (Simon Fraser University), Marco Guarnieri (IMDEA Software Institute), Mark Silberstein (Technion – Israel Institute of Technology), Oleksii Oleksenko (Azure Research, Microsoft), Gururaj Saileshwar (University of Toronto)
(3-A) ‘Enabling Efficient Mobile Tracing with BTrace’, Jiawei Wang (Huawei Dresden Research Center; Huawei Central Software Institute), Nian Liu (Huawei Central Software Institute), Arnau Casadevall-Saiz (Huawei Dresden Research Center; Huawei Central Software Institute), Yutao Liu (Huawei Dresden Research Center; Huawei Central Software Institute), Diogo Behrens (Huawei Dresden Research Center; Huawei Central Software Institute), Ming Fu (Huawei Central Software Institute), Ning Jia (Huawei Central Software Institute), Hermann Härtig (Technische Universität Dresden), Haibo Chen (Huawei Central Software Institute; Shanghai Jiao Tong University)
(3-B) ‘ARC: Warp-level Adaptive Atomic Reduction in GPUs to Accelerate Differentiable Rendering’, Sankeerth Durvasula (Vector Institute, University of Toronto), Adrian Zhao (Vector Institute, University of Toronto), Fan Chen (University of Toronto), Ruofan Liang (Vector Institute, University of Toronto), Pawan Kumar Sanjaya (Vector Institute, University of Toronto), Yushi Guan (Vector Institute, University of Toronto), Christina Giannoula (Vector Institute, University of Toronto), Nandita Vijaykumar (Vector Institute, University of Toronto)
(3-C) ‘Practical Federated Recommendation Model Learning Using ORAM with Controlled Privacy’, Jinyu Liu (The Pennsylvania State University), Wenjie Xiong (Virginia Tech), G. Edward Suh (NVIDIA; Cornell University), Kiwan Maeng (The Pennsylvania State University)
(3-D) ‘Don’t Repeat Yourself! Coarse-Grained Circuit Deduplication to Accelerate RTL Simulation’, Haoyuan Wang (UC Santa Cruz), Thomas Nijssen (UC Santa Cruz), Scott Beamer (UC Santa Cruz)
(3-A) ‘Rethinking Java Performance Analysis’, Stephen M. Blackburn (Google; Australian National University), Zixian Cai (Australian National University), Rui Chen (Unaffiliated-Independent), Xi Yang (IOP Systems), John Zhang (Canva), John Zigman (The University of Sydney)
(3-B) ‘Treelet Accelerated Ray Tracing on GPUs’, Yuan Hsi Chou (University of British Columbia), Tor M. Aamodt (University of British Columbia)
(3-C) ‘Tackling ML-based Dynamic Mispredictions using Statically Computed Invariants for Attack Surface Reduction’, Chris Porter (IBM Research), Sharjeel Khan (Georgia Institute of Technology), Kangqi Ni (Georgia Institute of Technology), Santosh Pande (Georgia Institute of Technology)
(3-D) ‘Parendi: Thousand-Way Parallel RTL Simulation’, Mahyar Emami (EPFL), Thomas Bourgeat (EPFL), James R. Larus (EPFL)

10:40-11:10 Coffee Break & ASPLOS Poster Session (Wednesday afternoon presentations)

11:10-12:30

Cloud Computing 2 (4-A) | Memory & Storage (4-B) | Potpourri 1 (4-C) | Autonomous Systems (4-D)
(4-A) ‘Necro-reaper: Pruning away Dead Memory Traffic in Warehouse-Scale Computers’, Sotiris Apostolakis (Google), Chris Kennelly (Google), Xinliang David Li (Google), Parthasarathy Ranganathan (Google)
(4-B) ‘ZRAID: Leveraging Zone Random Write Area (ZRWA) for Alleviating Partial Parity Tax in ZNS RAID’, Minwook Kim (Seoul National University), Seongyeop Jeong (Seoul National University), Jin-Soo Kim (Seoul National University)
(4-C) ‘Efficient Lossless Compression of Scientific Floating-Point Data on CPUs and GPUs’, Noushin Azami (Department of Computer Science, Texas State University), Alex Fallin (Department of Computer Science, Texas State University), Martin Burtscher (Department of Computer Science, Texas State University)
(4-D) ‘ReCA: Integrated Acceleration for Real-Time and Efficient Cooperative Embodied Autonomous Agents’, Zishen Wan (Georgia Institute of Technology), Yuhang Du (University of Minnesota, Twin Cities), Mohamed Ibrahim (Georgia Institute of Technology), Jiayi Qian (Georgia Institute of Technology), Jason Jabbour (Harvard University), Yang (Katie) Zhao (University of Minnesota, Twin Cities), Tushar Krishna (Georgia Institute of Technology), Arijit Raychowdhury (Georgia Institute of Technology), Vijay Janapa Reddi (Harvard University)
(4-A) ‘Embracing Imbalance: Dynamic Load Shifting among Microservice Containers in Shared Clusters’, Shutian Luo (Yale University), Jianxiong Liao (Sun Yat-sen University; University of Macau), Chenyu Lin (University of Macau), Huanle Xu (University of Macau), Zhi Zhou (Sun Yat-sen University), Chengzhong Xu (University of Macau)
(4-B) ‘Marionette: A RowHammer Attack via Row Coupling’, Seungmin Baek (Seoul National University), Minbok Wi (Seoul National University), Seonyong Park (Seoul National University), Hwayong Nam (Seoul National University), Michael Jaemin Kim (Seoul National University), Nam Sung Kim (University of Illinois), Jung Ho Ahn (Seoul National University)
(4-C) ‘Data Cache for Intermittent Computing Systems with Non-Volatile Main Memory’, Sourav Mohapatra (Department of Computer Science, TU Delft; ARM Limited), Vito Kortbeek (Department of Computer Science, TU Delft; Synopsys), Marco Antonio van Eerden (Department of Computer Science, TU Delft), Jochem Broekhoff (Department of Computer Science, TU Delft; DSP Innovation B.V.), Saad Ahmed (School of Interactive Computing, Georgia Institute of Technology), Przemysław Pawełczak (Department of Computer Science, TU Delft)
(4-D) ‘AnA: An Attentive Autonomous Driving System’, Wonkyo Choe (University of Virginia), Rongxiang Wang (University of Virginia), Felix Xiaozhu Lin (University of Virginia)
(4-A) ‘Tela: A Temporal Load-Aware Cloud Virtual Disk Placement Scheme’, Difan Tan (Huazhong University of Science and Technology), Jiawei Li (Huazhong University of Science and Technology), Hua Wang (Huazhong University of Science and Technology), Xiaoxiao Li (Huazhong University of Science and Technology), Wenbo Liu (Huazhong University of Science and Technology), Zijin Qin (Huazhong University of Science and Technology), Ke Zhou (Huazhong University of Science and Technology), Ming Xie (Tencent Inc.), Mengling Tao (Tencent Inc.)
(4-B) ‘MOAT: Securely Mitigating Rowhammer with Per-Row Activation Counters’, Moinuddin Qureshi (Georgia Institute of Technology), Salman Qazi (Google)
(4-C) ‘Fusion: An Analytics Object Store Optimized for Query Pushdown’, Jianan Lu (Princeton University), Ashwini Raina (Princeton University), Asaf Cidon (Columbia University), Michael J. Freedman (Princeton University)
(4-D) ‘SuperNoVA: Algorithm-Hardware Co-Design for Resource-Aware SLAM’, Seah Kim (University of California, Berkeley), Roger Hsiao (University of California, Berkeley), Borivoje Nikolić (University of California, Berkeley), James Demmel (University of California, Berkeley), Yakun Sophia Shao (University of California, Berkeley)
(4-A) ‘FleetIO: Managing Multi-Tenant Cloud Storage with Multi-Agent Reinforcement Learning’, Jinghan Sun (UIUC), Benjamin Reidys (UIUC), Daixuan Li (UIUC), Jichuan Chang (Google), Marc Snir (UIUC), Jian Huang (UIUC)
(4-B) ‘HyperHammer: Breaking Free from KVM-Enforced Isolation’, Wei Chen (Peking University), Zhi Zhang (University of Western Australia), Xin Zhang (Peking University), Qingni Shen (Peking University), Yuval Yarom (Ruhr University Bochum), Daniel Genkin (Georgia Institute of Technology), Chen Yan (Peking University), Zhe Wang (SKLP, Institute of Computing Technology, Chinese Academy of Sciences; Zhongguancun Laboratory)
(4-C) ‘VertexSurge: Variable Length Graph Pattern Match on Billion-edge Graphs’, Weiyu Xie (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), MingXing Zhang (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), Xia Liao (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), Kang Chen (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), Jinlei Jiang (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University), YongWei Wu (Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University)
(4-D) ‘OctoCache: Caching Voxels for Accelerating 3D Occupancy Mapping in Autonomous Systems’, Peiqing Chen (University of Maryland), Minghao Li (Harvard University), Zishen Wan (Georgia Institute of Technology), Yushun Hsiao (Harvard University), Minlan Yu (Harvard University), Vijay Janapa Reddi (Harvard University), Zaoxing Liu (University of Maryland)

12:30-14:00 Lunch

14:00-15:40

ML Systems 1 (5-A) | Microarchitecture (5-B) | Large Language Models (5-C) | CXL Storage (5-D)
(5-A) ‘Forecasting GPU Performance for Deep Learning Training and Inference’, Seonho Lee (Georgia Institute of Technology), Amar Phanishayee (Meta), Divya Mahajan (Georgia Institute of Technology)
(5-B) ‘Saving Energy with Per-Variable Bitwidth Speculation’, Tommy McMichen (Northwestern University), David Dlott (Northwestern University), Panitan Wongse-ammat (Northwestern University), Nathan Greiner (Northwestern University), Hussain Khajanchi (Northwestern University), Russ Joseph (Northwestern University), Simone Campanoni (Northwestern University)
(5-C) ‘Fast On-device LLM Inference with NPUs’, Daliang Xu (Key Lab of HCST (PKU), MOE; SCS, Peking University), Hao Zhang (Beijing University of Posts and Telecommunications), Liming Yang (Key Lab of HCST (PKU), MOE; SCS, Peking University), Ruiqi Liu (Key Lab of HCST (PKU), MOE; SCS, Peking University), Gang Huang (Key Lab of HCST (PKU), MOE; SCS, Peking University; National Key Laboratory of Data Space Technology and System), Mengwei Xu (Beijing University of Posts and Telecommunications), Xuanzhe Liu (Key Lab of HCST (PKU), MOE; SCS, Peking University)
(5-D) ‘Formalising CXL Cache Coherence’, Chengsong Tan (Kaihong), Alastair F. Donaldson (Imperial College London), John Wickerson (Imperial College London)
(5-A) ‘MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization’, Shuaiting Li (Zhejiang University), Chengxuan Wang (Zhejiang University), Juncan Deng (Zhejiang University), Zeyu Wang (Zhejiang University), Zewen Ye (Zhejiang University), Zongsheng Wang (Zhejiang University), Haibin Shen (Zhejiang University), Kejie Huang (Zhejiang University)
(5-B) ‘ShadowLoad: Injecting State into Hardware Prefetchers’, Lorenz Hetterich (CISPA Helmholtz Center for Information Security), Fabian Thomas (CISPA Helmholtz Center for Information Security), Lukas Gerlach (CISPA Helmholtz Center for Information Security), Ruiyi Zhang (CISPA Helmholtz Center for Information Security), Nils Bernsdorf (CISPA Helmholtz Center for Information Security), Eduard Ebert (CISPA Helmholtz Center for Information Security), Michael Schwarz (CISPA Helmholtz Center for Information Security)
(5-C) ‘Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow’, Yixuan Mei (Carnegie Mellon University), Yonghao Zhuang (Carnegie Mellon University), Xupeng Miao (Carnegie Mellon University), Juncheng Yang (Carnegie Mellon University), Zhihao Jia (Carnegie Mellon University), Rashmi Vinayak (Carnegie Mellon University)
(5-D) ‘CtXnL: A Software-Hardware Co-Designed Solution for Efficient CXL-Based Transaction Processing’, Zhao Wang (Peking University, School of Integrated Circuits; Peking University, School of Computer Science), Yiqi Chen (Peking University, School of Integrated Circuits), Cong Li (Peking University, School of Integrated Circuits), Yijin Guan (Alibaba Group, DAMO Academy; Hupan Lab), Dimin Niu (Alibaba Group, DAMO Academy; Hupan Lab), Tianchan Guan (Alibaba Group, DAMO Academy; Hupan Lab), Zhaoyang Du (Alibaba Group, DAMO Academy; Hupan Lab), Xingda Wei (Shanghai Jiao Tong University, Institute of Parallel and Distributed Systems, SEIEE), Guangyu Sun (Peking University, School of Integrated Circuits; Beijing Advanced Innovation Center for Integrated Circuits)
‘PartIR: Composing SPMD Partitioning Strategies for Machine Learning’, Sami Alabed (Google DeepMind), Daniel Belov (Google DeepMind), Bart Chrzaszcz (Google DeepMind), Juliana Franco (Google DeepMind), Dominik Grewe (Google DeepMind), Dougal Maclaurin (Google DeepMind), James Molloy (Google DeepMind), Tom Natan (Google DeepMind), Tamara Norman (Google DeepMind), Xiaoyue Pan (Google DeepMind), Adam Paszke (Google DeepMind), Norman Alexander Rink (Google DeepMind), Michael Schaarschmidt (Isomorphic Labs), Timur Sitdikov (Google DeepMind), Agnieszka Swietlik (Google DeepMind), Dimitrios Vytiniotis (Google DeepMind), Joel Wee (Google DeepMind)‘Skia: Exposing Shadow Branches’, Chrysanthos Pepi (Texas A&M University,Intel), Bhargav Reddy Godala (Princeton University,Intel), Krishnam Tibrewala (Texas A&M University), Gino A. Chacon (Intel,AheadComputing), Paul V. Gratz (Texas A&M University), Daniel A. Jiménez (Texas A&M University,Barcelona Supercomputing Center), Gilles A. Pokam (Intel), David I. August (Princeton University)‘FlexSP: Accelerating Large Language Model Training via Flexible Sequence Parallelism’, Yujie Wang (Peking University), Shiju Wang (Beihang University), Shenhan Zhu (Peking University), Fangcheng Fu (Peking University), Xinyi Liu (Peking University), Xuefeng Xiao (ByteDance Inc.), Huixia Li (ByteDance Inc.), Jiashi Li (ByteDance Inc.), Faming Wu (ByteDance Inc.), Bin Cui (Peking University)‘ByteFS: System Support for (CXL-based) Memory-Semantic Solid-State Drives’, Shaobo Li (University of Illinois Urbana-Champaign), Yirui (Eric) Zhou (University of Illinois Urbana-Champaign), Hao Ren (University of Illinois Urbana-Champaign), Jian Huang (University of Illinois Urbana-Champaign)
‘Using Analytical Performance/Power Model and Fine-Grained DVFS to Enhance AI Accelerator Energy Efficiency’, Zibo Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Yijia Zhang (Peng Cheng Laboratory), Fuchun Wei (Huawei Technologies Co., Ltd), Bingqiang Wang (Peng Cheng Laboratory), Yanlin Liu (Huawei Technologies Co., Ltd), Zhiheng Hu (Huawei Technologies Co., Ltd), Jingyi Zhang (Huawei Technologies Co., Ltd), Xiaoxin Xu (Huawei Technologies Co., Ltd), Jian He (Huawei Technologies Co., Ltd), Xiaoliang Wang (State Key Laboratory for Novel Software Technology, Nanjing University), Wanchun Dou (State Key Laboratory for Novel Software Technology, Nanjing University), Guihai Chen (State Key Laboratory for Novel Software Technology, Nanjing University), Chen Tian (State Key Laboratory for Novel Software Technology, Nanjing University)‘Hierarchical Prefetching, A Software-Hardware Instruction Prefetcher for Server Applications’, Tingji Zhang (Tsinghua University), Boris Grot (University of Edinburgh,Huawei Research), Wenjian He (Huawei Technologies Co., Ltd.), Yashuai Lv (Huawei Technologies Co., Ltd.), Peng Qu (Tsinghua University), Fang Su (Huawei Technologies Co., Ltd.), Wenxin Wang (Tsinghua University), Guowei Zhang (Huawei Technologies Co., Ltd.), Xuefeng Zhang (Tsinghua University), Youhui Zhang (Tsinghua University,Zhongguancun National Laboratory)‘Spindle: Efficient Distributed Training of Multi-Task Large Models via Wavefront Scheduling’, Yujie Wang (Peking University), Shenhan Zhu (Peking University), Fangcheng Fu (Peking University), Xupeng Miao (Purdue University), Jie Zhang (Alibaba Group), Juan Zhu (Alibaba Group), Fan Hong (Alibaba Group), Yong Li (Alibaba Group), Bin Cui (Peking University)‘M5: Mastering Page Migration and Memory Management for CXL-based Tiered Memory Systems’, Yan Sun (University of Illinois Urbana-Champaign), Jongyul Kim (University of Illinois Urbana-Champaign), Zeduo Yu (University of Illinois Urbana-Champaign), Jiyuan Zhang (University of Illinois Urbana-Champaign), Siyuan Chai (University of Illinois Urbana-Champaign), Michael Jaemin Kim (Seoul National University), Hwayong Nam (Seoul National University), Jaehyun Park (Seoul National University), Eojin Na (Seoul National University), Yifan Yuan (Intel Labs), Ren Wang (Intel Labs), Jung Ho Ahn (Seoul National University), Tianyin Xu (University of Illinois Urbana-Champaign), Nam Sung Kim (University of Illinois Urbana-Champaign)
‘Early Termination for Hyperdimensional Computing Using Inferential Statistics’, Pu (Luke) Yi (Stanford University), Yifan Yang (Stanford University), Chae Young Lee (Stanford University), Sara Achour (Stanford University)‘Bounding Speculative Execution of Atomic Regions to a Single Retry’, Eduardo José Gómez-Hernández (Computer Engineering Department, University of Murcia), Juan M. Cebrian (Computer Engineering Department, University of Murcia), Stefanos Kaxiras (Department of Information Technology, Uppsala University), Alberto Ros (Computer Engineering Department, University of Murcia)‘Vela: A Virtualized LLM Training System with GPU Direct RoCE’, Apoorve Mohan (IBM Research), Robert Walkup (IBM Research), Bengi Karacali (IBM Research), Ming-Hung Chen (IBM Research), Abdullah Kayi (IBM Research), Liran Schour (IBM), Shweta Salaria (IBM Research), Sophia Wen (IBM Research), IHsin Chung (IBM Research), Abdul Alim (IBM Research), Constantinos Evangelinos (IBM Research), Lixiang Luo (IBM Research), Marc Dombrowa (IBM Research), Laurent Schares (IBM Research), Ali Sydney (IBM Research), Pavlos Maniotis (IBM Research), Sandhya Koteshwara (IBM Research), Brent Tang (IBM), Joel Belog (IBM), Rei Odaira (IBM), Vasily Tarasov (IBM Research), Eran Gampel (IBM Cloud), Drew Thorstensen (IBM), Talia Gershon (IBM Research), Seetharami Seelam (IBM Research)‘Systematic CXL Memory Characterization and Performance Analysis at Scale’, Jinshu Liu (Virginia Tech), Hamid Hadian (Virginia Tech), Yuyue Wang (Virginia Tech), Daniel S. Berger (Microsoft and University of Washington), Marie Nguyen (Samsung), Xun Jian (Virginia Tech), Sam H. Noh (Virginia Tech), Huaicheng Li (Virginia Tech)

15:40-16:10 Coffee break & ASPLOS poster session (Wednesday morning presentations)

16:10-17:50

ML Systems 2 (6-A) | Compilers & Languages (6-B) | Mixture of Experts (6-C) | Testing (6-D)
‘Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning’, Shenggan Cheng (National University of Singapore), Shengjie Lin (Georgia Institute of Technology), Lansong Diao (Alibaba Group), Hao Wu (George Mason University), Siyu Wang (Alibaba Group), Chang Si (Alibaba Group), Ziming Liu (National University of Singapore), Xuanlei Zhao (National University of Singapore), Jiangsu Du (Sun Yat-sen University), Wei Lin (Alibaba Group), Yang You (National University of Singapore)‘Validating JVM Compilers via Maximizing Optimization Interactions’, Zifan Xie (Huazhong University of Science and Technology), Ming Wen (Huazhong University of Science and Technology), Shiyu Qiu (Huazhong University of Science and Technology), Hai Jin (Huazhong University of Science and Technology)‘MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs’, Shiyi Cao (UC Berkeley), Shu Liu (UC Berkeley), Tyler Griggs (UC Berkeley), Peter Schafhalter (UC Berkeley), Xiaoxuan Liu (UC Berkeley), Ying Sheng (Stanford University), Joseph E. Gonzalez (UC Berkeley), Matei Zaharia (UC Berkeley), Ion Stoica (UC Berkeley)‘Manta: Hybrid-Sensitive Type Inference Toward Type-Assisted Bug Detection for Stripped Binaries’, Chengfeng Ye (The Hong Kong University of Science and Technology), Yuandao Cai (The Hong Kong University of Science and Technology), Anshunkang Zhou (The Hong Kong University of Science and Technology), Heqing Huang (City University of Hong Kong), Hao Ling (The Hong Kong University of Science and Technology), Charles Zhang (The Hong Kong University of Science and Technology)
‘Design and Operation of Shared Machine Learning Clusters on Campus’, Kaiqiang Xu (Hong Kong University of Science and Technology), Decang Sun (Hong Kong University of Science and Technology), Hao Wang (Hong Kong University of Science and Technology), Zhenghang Ren (Hong Kong University of Science and Technology), Xinchen Wan (Hong Kong University of Science and Technology), Xudong Liao (Hong Kong University of Science and Technology), Zilong Wang (Hong Kong University of Science and Technology), Junxue Zhang (Hong Kong University of Science and Technology), Kai Chen (Hong Kong University of Science and Technology)‘Faster Chaitin-like Register Allocation via Grammatical Decompositions of Control-Flow Graphs’, Xuran Cai (Hong Kong University of Science and Technology), Amir Kafshdar Goharshady (University of Oxford), S. Hitarth (Hong Kong University of Science and Technology), Chun Kit Lam (Hong Kong University of Science and Technology)‘FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models’, Xinglin Pan (The Hong Kong University of Science and Technology (Guangzhou)), Wenxiang Lin (Harbin Institute of Technology, Shenzhen), Lin Zhang (Hong Kong University of Science and Technology), Shaohuai Shi (Harbin Institute of Technology, Shenzhen), Zhenheng Tang (The Hong Kong University of Science and Technology), Rui Wang (The Hong Kong University of Science and Technology (Guangzhou)), Bo Li (Hong Kong University of Science and Technology), Xiaowen Chu (The Hong Kong University of Science and Technology (Guangzhou),Hong Kong University of Science and Technology)‘Selectively Uniform Concurrency Testing’, Huan Zhao (National University of Singapore), Dylan Wolff (National University of Singapore), Umang Mathur (National University of Singapore), Abhik Roychoudhury (National University of Singapore)
‘PCcheck: Persistent Concurrent Checkpointing for ML’, Foteini Strati (ETH Zurich), Michal Friedman (ETH Zurich), Ana Klimovic (ETH Zurich)‘Towards Sound Reassembly of Modern x86-64 Binaries’, Hyungseok Kim (The Affiliated Institute of ETRI), Soomin Kim (KAIST), Sang Kil Cha (KAIST)‘CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory’, Jiashun Suo (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Xiaojian Liao (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Limin Xiao (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Li Ruan (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Jinquan Wang (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Xiao Su (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University), Zhisheng Huo (State Key Laboratory of CCSE and School of Computer Science and Engineering, Beihang University)‘TAOPT: Tool-Agnostic Optimization of Parallelized Automated Mobile UI Testing’, Dezhi Ran (Key Lab of HCST (PKU), MOE), SCS, Peking University), Zihe Song (University of Texas at Dallas), Wenyu Wang (University of Illinois at Urbana-Champaign), Wei Yang (University of Texas at Dallas), Tao Xie (Key Lab of HCST (PKU), MOE), SCS, Peking University)
‘Tally: Non-Intrusive Performance Isolation for Concurrent Deep Learning Workloads’, Wei Zhao (Stanford University,CentML), Anand Jayarajan (University of Toronto,Vector Institute), Gennady Pekhimenko (University of Toronto,Vector Institute)‘SmoothE: Differentiable E-Graph Extraction’, Yaohui Cai (Cornell University), Kaixin Yang (Cornell University), Chenhui Deng (Cornell University), Cunxi Yu (University of Maryland, College Park), Zhiru Zhang (Cornell University)‘Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline’, Zhiyuan Fang (Sun Yat-sen University), Yuegui Huang (Sun Yat-sen University), Zicong Hong (Hong Kong University of Science and Technology), Yufeng Lyu (Huawei Technologies Co. Ltd), Wuhui Chen (Sun Yat-sen University,Peng Cheng Laboratory), Yue Yu (Peng Cheng Laboratory), Fan Yu (Huawei Technologies Co. Ltd), Zibin Zheng (Sun Yat-sen University)‘Debugger Toolchain Validation via Cross-Level Debugging’, Yibiao Yang (State Key Laboratory for Novel Software Technology, Nanjing University), Maolin Sun (State Key Laboratory for Novel Software Technology, Nanjing University), Jiangchang Wu (State Key Laboratory for Novel Software Technology, Nanjing University), Qingyang Li (State Key Laboratory for Novel Software Technology, Nanjing University), Yuming Zhou (State Key Laboratory for Novel Software Technology, Nanjing University)
‘Load and MLP-Aware Thread Orchestration for Recommendation Systems Inference on CPUs’, Rishabh Jain (The Pennsylvania State University), Teyuh Chou (Advanced Micro Devices, Inc.), Onur Kayiran (Advanced Micro Devices, Inc.), John Kalamatianos (Advanced Micro Devices, Inc.), Gabriel H. Loh (Advanced Micro Devices, Inc.), Mahmut T. Kandemir (The Pennsylvania State University), Chita R. Das (The Pennsylvania State University)‘Exo 2: Growing a Scheduling Language’, Yuka Ikarashi (MIT CSAIL), Kevin Qian (MIT CSAIL), Samir Droubi (MIT CSAIL), Alex Reinking (Adobe), Gilbert Louis Bernstein (University of Washington), Jonathan Ragan-Kelley (MIT CSAIL)‘MoC-System: Efficient Fault Tolerance for Sparse Mixture-of-Experts Model Training’, Weilin Cai (The Hong Kong University of Science and Technology (Guangzhou)), Le Qin (The Hong Kong University of Science and Technology (Guangzhou)), Jiayi Huang (The Hong Kong University of Science and Technology (Guangzhou))‘Dynamic Partial Deadlock Detection and Recovery via Garbage Collection’, Georgian-Vlad Saioc (Aarhus University,Programming Systems Group, Uber Technologies, Inc.), I-Ting Angelina Lee (Washington University in St. Louis), Anders Møller (Aarhus University), Milind Chabbi (Programming Systems Group, Uber Technologies, Inc.)

19:00- ASPLOS & EuroSys Banquet

Thursday 3rd April

08:30-09:00 Registration and Welcome Coffee

09:00-10:40

Serving LLMs (7-A) | Quantum Error Correction (7-B) | Side Channels (7-C) | Fuzz Testing (7-D)
‘Accelerating LLM Serving for Multi-turn Dialogues with Efficient Resource Management’, Jinwoo Jeong (Korea University), Jeongseob Ahn (Korea University)‘Clapton: Clifford Assisted Problem Transformation for Error Mitigation in Variational Quantum Algorithms’, Lennart Maximilian Seifert (Department of Computer Science, University of Chicago), Siddharth Dangwal (Department of Computer Science, University of Chicago), Frederic T. Chong (Department of Computer Science, University of Chicago), Gokul Subramanian Ravi (Electrical Engineering and Computer Science Department, University of Michigan)‘Controlled Preemption: Amplifying Side-Channel Attacks from Userspace’, Yongye Zhu (University of California, Berkeley), Boru Chen (University of California, Berkeley), Zirui Neil Zhao (NVIDIA,UT Austin), Christopher W. Fletcher (University of California, Berkeley)‘The Mutators Reloaded: Fuzzing Compilers with Large Language Model Generated Mutation Operators’, Xianfei Ou (Nanjing University), Cong Li (Ant Group,Zhejiang University), Yanyan Jiang (Nanjing University), Chang Xu (Nanjing University)
‘COMET: Towards Practical W4A4KV4 LLMs Serving’, Lian Liu (Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Long Cheng (North China Electric Power University), Haimeng Ren (ShanghaiTech University), Zhaohui Xu (ShanghaiTech University), Yudong Pan (Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Mengdi Wang (Institute of Computing Technology, CAS), Xiaowei Li (Institute of Computing Technology, CAS,Zhongguancun Laboratory), Yinhe Han (Institute of Computing Technology, CAS), Ying Wang (Institute of Computing Technology, CAS)‘QECC-Synth: A Layout Synthesizer for Quantum Error Correction Codes on Sparse Architectures’, Keyi Yin (University of California, San Diego), Hezi Zhang (University of California, San Diego), Xiang Fang (University of California, San Diego), Yunong Shi (AWS Quantum Technologies), Travis S. Humble (Oak Ridge National Laboratory), Ang Li (Pacific Northwest National Laboratory), Yufei Ding (University of California, San Diego)‘Protecting Cryptographic Code Against Spectre-RSB’, Santiago Arranz Olmos (MPI-SP), Gilles Barthe (MPI-SP, IMDEA Software Institute), Chitchanok Chuengsatiansup (University of Melbourne), Benjamin Gregoire (Inria), Vincent Laporte (Université de Lorraine, CNRS, Inria, LORIA), Tiago Oliveira (Sandbox AQ), Peter Schwabe (MPI-SP,Radboud University), Yuval Yarom (Ruhr University Bochum), Zhiyuan Zhang (MPI-SP)‘ClosureX: Compiler Support for Correct Persistent Fuzzing’, Rishi Ranjan (Virginia Tech), Ian Paterson (Virginia Tech), Matthew Hicks (Virginia Tech)
‘Past-Future Scheduler for LLM Serving under SLA Guarantees’, Ruihao Gong (Beihang University), Shihao Bai (SenseTime), Siyu Wu (Beihang University), Yunqian Fan (SenseTime), Zaijun Wang (SenseTime), Xiuhong Li (Peking University), Hailong Yang (Beihang University), Xianglong Liu (Beihang University)‘HetEC: Architectures for Heterogeneous Quantum Error Correction Codes’, Samuel Stein (Future Computing Technologies, Pacific Northwest National Laboratory), Shifan Xu (Yale Quantum Institute, Yale University), Andrew W. Cross (IBM Quantum, IBM T. J. Watson Research Center), Theodore J. Yoder (IBM Quantum, IBM T. J. Watson Research Center), Ali Javadi-Abhari (IBM Quantum, IBM T. J. Watson Research Center), Chenxu Liu (Future Computing Technologies, Pacific Northwest National Laboratory), Kun Liu (Yale Quantum Institute, Yale University), Zeyuan Zhou (Yale Quantum Institute, Yale University), Charlie Guinn (Department of Physics, Princeton University), Yufei Ding (Department of Computer Science & Engineering, University of California San Diego), Yongshan Ding (Yale Quantum Institute, Yale University), Ang Li (Future Computing Technologies, Pacific Northwest National Laboratory,Department of Electrical & Computer Engineering, University of Washington)‘Reload+Reload: Exploiting Cache and Memory Contention Side Channel on AMD SEV’, Li-Chung Chiang (National Taiwan University), Shih-Wei Li (National Taiwan University)‘Ratte: Fuzzing for Miscompilations in Multi-Level Compilers Using Composable Semantics’, Pingshi Yu (Imperial College London), Nicolas Wu (Imperial College London), Alastair F. Donaldson (Imperial College London)
‘POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference’, Aditya K Kamath (Paul G Allen School of Computer Science and Engineering, University of Washington), Ramya Prabhu (Microsoft Research India), Jayashree Mohan (Microsoft Research India), Simon Peter (Paul G Allen School of Computer Science and Engineering, University of Washington), Ramachandran Ramjee (Microsoft Research India), Ashish Panwar (Microsoft Research India)‘Micro Blossom: Accelerated Minimum-Weight Perfect Matching Decoding for Quantum Error Correction’, Yue Wu (Yale University), Namitha Liyanage (Yale University), Lin Zhong (Yale University)‘SMaCk: Efficient Instruction Cache Attacks via Self-Modifying Code Conflicts’, Seonghun Son (Iowa State University), Daniel Moghimi (Google), Berk Gulmezoglu (Iowa State University)‘Snowplow: Effective Kernel Fuzzing with a Learned White-box Test Mutator’, Sishuai Gong (Purdue University), Wang Rui (Purdue University), Deniz Altinbüken (Google DeepMind), Pedro Fonseca (Purdue University), Petros Maniatis (Google DeepMind)
‘TAPAS: Thermal- and Power-Aware Scheduling for LLM Inference in Cloud Platforms’, Jovan Stojkovic (University of Illinois at Urbana-Champaign), Chaojie Zhang (Microsoft Azure Research), Íñigo Goiri (Microsoft Azure Research), Esha Choukse (Microsoft Azure Research), Haoran Qiu (Microsoft Azure Research), Rodrigo Fonseca (Microsoft Azure Research), Josep Torrellas (University of Illinois at Urbana-Champaign), Ricardo Bianchini (Microsoft Azure)‘RESCQ: Realtime Scheduling for Continuous Angle Quantum Error Correction Architectures’, Sayam Sethi (Department of Electrical and Computer Engineering, The University of Texas at Austin), Jonathan Mark Baker (Department of Electrical and Computer Engineering, The University of Texas at Austin)‘FlexProf: Flexible, Side-Channel-Free Memory Access’, Jarrett Minton (University of Utah), Rajeev Balasubramonian (University of Utah)‘KernelGPT: Enhanced Kernel Fuzzing via Large Language Models’, Chenyuan Yang (University of Illinois at Urbana-Champaign), Zijie Zhao (University of Illinois at Urbana-Champaign), Lingming Zhang (University of Illinois at Urbana-Champaign)

10:40-11:10 Coffee break & ASPLOS poster session (Thursday afternoon presentations)

11:10-12:30

ML Training (8-A) | Networking (8-B) | Potpourri 2 (8-C) | Security (8-D)
‘GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism’, Byungsoo Jeon (NVIDIA), Mengdi Wu (Carnegie Mellon University), Shiyi Cao (UC Berkeley), Sunghyun Kim (Massachusetts Institute of Technology), Sunghyun Park (NVIDIA), Neeraj Aggarwal (Carnegie Mellon University), Colin Unger (Stanford University), Daiyaan Arfeen (Carnegie Mellon University), Peiyuan Liao (Carnegie Mellon University), Xupeng Miao (Carnegie Mellon University), Mohammad Alizadeh (Massachusetts Institute of Technology), Gregory R. Ganger (Carnegie Mellon University), Tianqi Chen (Carnegie Mellon University), Zhihao Jia (Carnegie Mellon University)‘EDM: An Ultra-Low Latency Ethernet Fabric for Memory Disaggregation’, Weigao Su (Purdue University), Vishal Shrivastav (Purdue University)‘Extended User Interrupts (xUI): Fast and Flexible Notification without Polling’, Berk Aydogmus (Purdue University), Linsong Guo (UC San Diego), Danial Zuberi (UC San Diego), Tal Garfinkel (UC San Diego), Dean Tullsen (UC San Diego), Amy Ousterhout (UC San Diego), Kazem Taram (Purdue University)‘TaintEMU: Decoupling Tracking from Functional Domains for Architecture-Agnostic and Efficient Whole-System Taint Tracking’, Lei Cui (Guangxi Normal University), Youquan Xian (Guangxi Normal University), Peng Liu (Guangxi Normal University), Longjin Lu (Independent Researcher)
‘Cascade: A Dependency-aware Efficient Training Framework for Temporal Graph Neural Network’, Yue Dai (Department of Computer Science, University of Pittsburgh), Xulong Tang (Department of Computer Science, University of Pittsburgh), Youtao Zhang (Department of Computer Science, University of Pittsburgh)‘Performance Prediction of On-NIC Network Functions with Multi-Resource Contention and Traffic Awareness’, Shaofeng Wu (The Chinese University of Hong Kong), Qiang Su (The Chinese University of Hong Kong), Zhixiong Niu (Microsoft Research), Hong Xu (The Chinese University of Hong Kong)‘Stramash: A Fused-kernel Operating System For Cache-Coherent, Heterogeneous-ISA Platforms’, Tong Xing (The University of Edinburgh), Cong Xiong (Imperial College London), Tianrui Wei (UC Berkeley), April Sanchez (Google), Binoy Ravindran (Virginia Tech), Jonathan Balkind (UC Santa Barbara), Antonio Barbalace (The University of Edinburgh)‘Segue & ColorGuard: Optimizing SFI Performance and Scalability on Modern Architectures’, Shravan Narayan (UT Austin), Tal Garfinkel (UC San Diego), Evan Johnson (UC San Diego), Zachary Yedidia (Stanford University), Yingchen Wang (UC Berkeley), Andrew Brown (Intel), Anjo Vahldiek-Oberwagner (Intel Labs), Michael LeMay (Intel Labs), Wenyong Huang (Intel), Xin Wang (Intel), Mingqiu Sun (Intel), Dean Tullsen (UC San Diego), Deian Stefan (UC San Diego)
‘Frugal: Efficient and Economic Embedding Model Training with Commodity GPUs’, Minhui Xie (Tsinghua University,Renmin University of China), Shaoxun Zeng (Tsinghua University), Hao Guo (Tsinghua University), Shiwei Gao (Tsinghua University), Youyou Lu (Tsinghua University)‘Gigaflow: Pipeline-Aware Sub-Traversal Caching for Modern SmartNICs’, Annus Zulfiqar (University of Michigan), Ali Imran (University of Michigan), Venkat Kunaparaju (Purdue University), Ben Pfaff (Feldera Inc.), Gianni Antichi (Politecnico di Milano), Muhammad Shahbaz (University of Michigan)‘H-Houdini: Scalable Invariant Learning’, Sushant Dinesh (University of California, Berkeley), Yongye Zhu (University of California, Berkeley), Christopher W. Fletcher (University of California, Berkeley)‘Pave: Information Flow Control for Privacy-preserving Online Data Processing Services’, Minkyung Park (University of Texas at Dallas), Jaeseung Choi (Sogang University), Hyeonmin Lee (University of Virginia), Taekyoung Kwon (Seoul National University)
‘FastGL: A GPU-Efficient Framework for Accelerating Sampling-Based GNN Training at Large Scale’, Zeyu Zhu (Institute of Automation, Chinese Academy of Sciences,School of Future Technology, University of Chinese Academy of Sciences), Peisong Wang (Institute of Automation, Chinese Academy of Sciences), Qinghao Hu (Institute of Automation, Chinese Academy of Sciences,AiRiA), Gang Li (Shanghai Jiao Tong University), Xiaoyao Liang (Shanghai Jiao Tong University), Jian Cheng (Institute of Automation, Chinese Academy of Sciences,AiRiA)‘TNIC: A Trusted NIC Architecture’, Dimitra Giantsidi (The University of Edinburgh), Julian Pritzi (Technical University of Munich), Felix Gust (Technical University of Munich), Antonios Katsarakis (Huawei Research), Atsushi Koshiba (Technical University of Munich), Pramod Bhatotia (Technical University of Munich)‘Target-Aware Implementation of Real Expressions’, Brett Saiki (University of Washington), Jackson Brough (University of Utah), Jonas Regehr (University of Utah), Jesus Ponce (University of Utah), Varun Pradeep (University of Washington), Aditya Akhileshwaran (University of Washington), Zachary Tatlock (University of Washington), Pavel Panchekha (University of Utah)‘Sharing is leaking: blocking transient-execution attacks with core-gapped confidential VMs’, Charly Castes (EPFL,Google), Andrew Baumann (Google)

12:30-14:00 Lunch

14:00-15:40

ML Compilers (9-A) | Processing in Memory (9-B) | Verification & Reliability (9-C) | Trust (9-D)
‘Einsum Trees: An Abstraction for Optimizing the Execution of Tensor Expressions’, Alexander Breuer (Friedrich Schiller University Jena), Mark Blacher (Friedrich Schiller University Jena), Max Engel (Friedrich Schiller University Jena), Joachim Giesen (Friedrich Schiller University Jena), Alexander Heinecke (Intel Corporation), Julien Klaus (Friedrich Schiller University Jena), Stefan Remke (Friedrich Schiller University Jena)‘PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System’, Yintao He (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Haiyu Mao (King’s College London,ETH Zürich), Christina Giannoula (University of Toronto,Vector Institute), Mohammad Sadrosadati (ETH Zürich), Juan Gómez-Luna (NVIDIA), Huawei Li (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Xiaowei Li (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Ying Wang (SKLP, Institute of Computing Technology, CAS), Onur Mutlu (ETH Zürich)‘RTL Verification for Secure Speculation Using Contract Shadow Logic’, Qinhan Tan (Princeton University), Yuheng Yang (Massachusetts Institute of Technology), Thomas Bourgeat (École Polytechnique Fédérale de Lausanne), Sharad Malik (Princeton University), Mengjia Yan (Massachusetts Institute of Technology)‘TensorTEE: Unifying Heterogeneous TEE Granularity for Efficient Secure Collaborative Tensor Computing’, Husheng Han (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Xinyao Zheng (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Yuanbo Wen (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Yifan Hao (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Erhu Feng (IPADS, Shanghai Jiao Tong University,Engineering Research Center for Domain-specific Operating Systems (MoE)), Ling Liang (Peking University), Jianan Mu (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Xiaqing Li (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Tianyun Ma (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,Cambricon Technologies), Pengwei Jin (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,University of Chinese Academy of Sciences), Xinkai Song (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Zidong Du (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,Shanghai Innovation Center for Processor Technologies, SHIC), Qi Guo (SKLP, Institute of Computing Technology, Chinese Academy of Sciences), Xing Hu (SKLP, Institute of Computing Technology, Chinese Academy of Sciences,ZGC LAB)
‘Optimizing Deep Learning Inference Efficiency through Block Dependency Analysis’, Zhanyuan Di (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Leping Wang (SKLP, Institute of Computing Technology, CAS), En Shao (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Zhaojia Ma (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Ziyi Ren (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Feng Hua (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Lixian Ma (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Jie Zhao (Hunan University), Guangming Tan (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences), Ninghui Sun (SKLP, Institute of Computing Technology, CAS,University of Chinese Academy of Sciences)‘PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference’, Yufeng Gu (University of Michigan), Alireza Khadem (University of Michigan), Sumanth Umesh (University of Michigan), Ning Liang (University of Michigan), Xavier Servot (ETH Zurich), Onur Mutlu (ETH Zurich), Ravi Iyer (Google), Reetuparna Das (University of Michigan)‘ElasticMiter: Formally Verified Dataflow Circuit Rewrites’, Ayatallah Elakhras (EPFL), Jiahui Xu (ETH Zurich), Martin Erhart (ETH Zurich), Paolo Ienne (EPFL), Lana Josipovic (ETH Zurich)‘MDPeek: Breaking Balanced Branches in SGX with Memory Disambiguation Unit Side Channels’, Chang Liu (Tsinghua University), Shuaihu Feng (Zhongguancun Laboratory), Yuan Li (Zhongguancun Laboratory), Dongsheng Wang (Tsinghua University), Wenjian He (Huawei Technologies Co., Ltd.), Yongqiang Lyu (Tsinghua University), Trevor E. Carlson (National University of Singapore)
‘Pruner: A Draft-then-Verify Exploration Mechanism to Accelerate Tensor Program Tuning’, Liang Qiao (University of Science and Technology of China), Jun Shi (University of Science and Technology of China), Xiaoyu Hao (University of Science and Technology of China), Xi Fang (University of Science and Technology of China), Sen Zhang (University of Science and Technology of China), Minfan Zhao (University of Science and Technology of China), Ziqi Zhu (University of Science and Technology of China), Junshi Chen (University of Science and Technology of China), Hong An (University of Science and Technology of China), Xulong Tang (University of Pittsburgh), Bing Li (NIO), Honghui Yuan (NIO), Xinyang Wang (NIO)‘CINM (Cinnamon): A Compilation Infrastructure for Heterogeneous Compute In-Memory and Compute Near-Memory Paradigms’, Asif Ali Khan (Technische Universität Dresden), Hamid Farzaneh (Technische Universität Dresden), Karl Friedrich Alexander Friebel (Technische Universität Dresden), Clément Fournier (Technische Universität Dresden), Lorenzo Chelini (Intel), Jeronimo Castrillon (Technische Universität Dresden)‘Robustness Verification for Checking Crash Consistency of Non-volatile Memory’, Zhilei Han (School of Software, Tsinghua University), Fei He (School of Software, Tsinghua University)‘Accelerating Number Theoretic Transform with Multi-GPU Systems for Efficient Zero Knowledge Proof’, Zhuoran Ji (School of Cyber Science and Technology, Shandong University), Jianyu Zhao (School of Cyber Science and Technology, Shandong University), Peimin Gao (School of Cyber Science and Technology, Shandong University), Xiangkai Yin (School of Cyber Science and Technology, Shandong University), Lei Ju (Quan Cheng Laboratory)
‘Relax: Composable Abstractions for End-to-End Dynamic Machine Learning’, Ruihang Lai (Carnegie Mellon University), Junru Shao (OpenAI), Siyuan Feng (Shanghai Jiao Tong University), Steven Lyubomirsky (NVIDIA), Bohan Hou (Carnegie Mellon University), Wuwei Lin (OpenAI), Zihao Ye (University of Washington), Hongyi Jin (Carnegie Mellon University), Yuchen Jin (Hyperbolic), Jiawei Liu (University of Illinois Urbana-Champaign), Lesheng Jin (Hyperbolic), Yaxing Cai (NVIDIA), Ziheng Jiang (ByteDance), Yong Wu (NVIDIA), Sunghyun Park (NVIDIA), Prakalp Srivastava (Netflix), Jared Roesch (NVIDIA), Todd C. Mowry (Carnegie Mellon University), Tianqi Chen (Carnegie Mellon University; NVIDIA) | ‘Toleo: Scaling Freshness to Tera-scale Memory Using CXL and PIM’, Juechu Dong (University of Michigan), Jonah Rosenblum (University of Michigan), Satish Narayanasamy (University of Michigan) | ‘Proactive Runtime Detection of Aging-Related Silent Data Corruptions: A Bottom-Up Approach’, Jiacheng Ma (University of Michigan), Majd Ganaiem (Technion – Israel Institute of Technology), Madeline Burbage (University of Washington), Theo Gregersen (University of Washington), Rachel McAmis (University of Washington), Freddy Gabbay (The Hebrew University of Jerusalem), Baris Kasikci (University of Washington; Google) | ‘BatchZK: A Fully Pipelined GPU-Accelerated System for Batch Generation of Zero-Knowledge Proofs’, Tao Lu (Zhejiang University; National University of Singapore), Yuxun Chen (Zhejiang University), Zonghui Wang (Zhejiang University), Xiaohang Wang (Zhejiang University), Wenzhi Chen (Zhejiang University), Jiaheng Zhang (National University of Singapore)
‘Towards End-to-End Optimization of LLM-based Applications with Ayo’, Xin Tan (The Chinese University of Hong Kong), Yimin Jiang (Unaffiliated), Yitao Yang (The Chinese University of Hong Kong), Hong Xu (The Chinese University of Hong Kong) | ‘Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators’, Shixin Zhao (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Yuming Li (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Bing Li (Institute of Microelectronics, Chinese Academy of Sciences), Yintao He (Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences), Mengdi Wang (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Yinhe Han (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences), Ying Wang (State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences) | ‘Hardware Sentinel: Protecting Software Applications from Hardware Silent Data Corruptions’, Rhea Dutta (Meta Platforms, Inc.), Harish Dattatraya Dixit (Meta Platforms, Inc.), Rik Van Riel (Meta Platforms, Inc.), Gautham Vunnam (Meta Platforms, Inc.), Sriram Sankar (Meta Platforms, Inc.) | ‘UniZK: Accelerating Zero-Knowledge Proof with Unified Hardware and Flexible Kernel Mapping’, Cheng Wang (Xi’an Jiaotong University; Institute for Interdisciplinary Information Core Technology), Mingyu Gao (Tsinghua University; Shanghai Qi Zhi Institute)

15:40-16:10 Coffee break & ASPLOS poster session (Thursday morning presentations)

16:10-17:30

GPGPU (10-A) | Solid State Storage (10-B) | Serverless Computing (10-C) | Memory Management (10-D)
‘Virgo: Cluster-level Matrix Unit Integration in GPUs for Scalability and Energy Efficiency’, Hansung Kim (University of California, Berkeley), Ruohan Richard Yan (University of California, Berkeley), Joshua You (University of California, Berkeley), Tieliang Vamber Yang (NVIDIA Corporation), Yakun Sophia Shao (University of California, Berkeley) | ‘MaxEmbed: Maximizing SSD bandwidth utilization for huge embedding models serving’, Ruwen Fan (Tsinghua University), Minhui Xie (Tsinghua University), Haodi Jiang (Tsinghua University), Youyou Lu (Tsinghua University) | ‘Litmus: Fair Pricing for Serverless Computing’, Qi Pei (Computer Science / Watson School, The State University of New York at Binghamton), Yipeng Wang (Intel Lab., Intel), Seunghee Shin (Computer Science / Watson School, The State University of New York at Binghamton) | ‘Velosiraptor: Code Synthesis for Memory Translation’, Reto Achermann (University of British Columbia), Em Chu (JuliaHub), Ryan Mehri (Replit), Ilias Karimalis (University of British Columbia), Margo Seltzer (University of British Columbia)
‘Towards Unified Analysis of GPU Consistency’, Haining Tong (University of Helsinki), Natalia Gavrilenko (Huawei Dresden Research Center), Hernan Ponce de Leon (Huawei Dresden Research Center), Keijo Heljanko (University of Helsinki; Helsinki Institute for Information Technology) | ‘AnyKey: A Key-Value SSD for All Workload Types’, Chanyoung Park (Hanyang University), Jungho Lee (Hanyang University), Chun-Yi Liu (Micron Technology Inc.), Kyungtae Kang (Hanyang University), Mahmut Taylan Kandemir (Pennsylvania State University), Wonil Choi (Hanyang University) | ‘Concurrency-Informed Orchestration for Serverless Functions’, Qichang Liu (University of Virginia), Yue Cheng (University of Virginia), Haiying Shen (University of Virginia), Ao Wang (Alibaba Group), Bharathan Balaji (Amazon) | ‘vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention’, Ramya Prabhu (Microsoft Research), Ajay Nayak (Indian Institute of Science), Jayashree Mohan (Microsoft Research), Ramachandran Ramjee (Microsoft Research), Ashish Panwar (Microsoft Research)
‘Aqua: Network-Accelerated Memory Offloading for LLMs in Scale-Up GPU Domains’, Abhishek Vijaya Kumar (Cornell University), Gianni Antichi (Politecnico di Milano), Rachee Singh (Cornell University) | ‘Simplifying and Accelerating NOR Flash I/O Stack for RAM-Restricted Microcontrollers’, Hao Huang (Harbin Institute of Technology, Shenzhen), Yanqi Pan (Harbin Institute of Technology, Shenzhen), Wen Xia (Harbin Institute of Technology, Shenzhen), Xiangyu Zou (Harbin Institute of Technology, Shenzhen), Darong Yang (Harbin Institute of Technology, Shenzhen), Liang Shi (East China Normal University), Hongwei Du (Harbin Institute of Technology, Shenzhen) | ‘Dilu: Enabling GPU Resourcing-on-Demand for Serverless DL Serving via Introspective Elasticity’, Cunchi Lv (ICT, CAS; UCAS), Xiao Shi (ICT, CAS; Nanjing Institute of InforSuperBahn), Zhengyu Lei (ICT, CAS; UCAS), Jinyue Huang (ICT, CAS; UCAS), Wenting Tan (ICT, CAS), Xiaohui Zheng (ICT, CAS), Xiaofang Zhao (ICT, CAS; IICT, Suzhou, CAS) | ‘Virtuoso: Enabling Fast and Accurate Virtual Memory Research via an Imitation-based Operating System Simulation Methodology’, Konstantinos Kanellopoulos (ETH Zürich), Konstantinos Sgouras (ETH Zürich), Nisa Bostanci (ETH Zürich), Andreas Kosmas Kakolyris (ETH Zürich), Berkin Kerim Konar (ETH Zürich), Rahul Bera (ETH Zürich), Mohammad Sadrosadati (ETH Zürich), Rakesh Kumar (Norwegian University of Science and Technology (NTNU)), Nandita Vijaykumar (University of Toronto), Onur Mutlu (ETH Zürich)
‘Optimizing Datalog for the GPU’, Yihao Sun (Syracuse University), Ahmedur Rahman Shovon (University of Illinois, Chicago), Thomas Gilray (Washington State University), Sidharth Kumar (University of Illinois, Chicago), Kristopher Micinski (Syracuse University) | ‘A Software Caching Runtime for Embedded NVRAM Systems’, Harrison Williams (Virginia Tech), Matthew Hicks (Virginia Tech) | ‘Medusa: Accelerating Serverless LLM Inference with Materialization’, Shaoxun Zeng (Tsinghua University), Minhui Xie (Tsinghua University), Shiwei Gao (Tsinghua University), Youmin Chen (Tsinghua University), Youyou Lu (Tsinghua University) | ‘Instruction-Aware Cooperative TLB and Cache Replacement Policies’, Dimitrios Chasapis (Barcelona Supercomputing Center (BSC)), Georgios Vavouliotis (Unaffiliated), Daniel A. Jiménez (Texas A&M University), Marc Casas (Barcelona Supercomputing Center (BSC); Universitat Politècnica de Catalunya (UPC))