[ Software Research Lunch ]

The Stanford Software Research Lunch is a weekly event, held on Thursdays, where students and researchers present their latest work to peers. Talks are open to anybody, but regular attendees are expected to give a presentation on their own work.

Mailing list: software-research-lunch@lists.stanford.edu (subscribe via mailman)

Calendar: ical

Format: The lunch is held every week during the fall, winter, and spring quarters. The first week of each quarter is an organizational lunch where people can sign up to give a talk. If you'd like to give a talk, please contact Matthew Sotoudeh or Anjiang Wei.

Past quarters: Fall 2023, Spring 2023, Winter 2023, Fall 2022, Winter 2021, Fall 2020, Winter 2020, Fall 2019, Spring 2019, Winter 2019, Fall 2018, Spring 2018, Winter 2018, Fall 2017, Spring 2017, Winter 2017, Fall 2016.

Upcoming quarters: Spring 2024.

Ordering Food: If you are ordering food for the lunch, see here for suggestions.

4/4: Computation-Centric Networking

Time: Thursday, April 4, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Akshay Srivatsan

Abstract: We propose putting computation at the center of what networked computers and cloud services do for their users. We envision a shared representation of a computation: a deterministic procedure, run in an environment of well-specified dependencies. This suggests an end-to-end argument for serverless computing, shifting the service model from “renting CPUs by the second” to “providing the unambiguously correct result of a computation.” Accountability to these higher-level abstractions could permit agility and innovation on other axes.


4/11: Deegen: a compiler-compiler for high performance VMs at low engineering cost

Time: Thursday, April 11, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Haoran Xu

Abstract: Building a high-performance VM for a dynamic language has traditionally required a huge amount of time, money, and expertise. To reduce this high engineering cost, we present Deegen, a compiler-compiler that generates a high-performance VM automatically from a semantic description of the bytecodes. The generated VM has three execution tiers, similar to state-of-the-art VMs like V8 or JSC: an optimized interpreter, a baseline JIT, and an optimizing JIT (the work on the optimizing JIT is still in progress). This allows the user to get a high-performance VM for their own language at an engineering cost similar to writing an interpreter. To demonstrate Deegen's capability in the real world, we implemented LuaJIT Remake (LJR), a standard-compliant VM for Lua 5.1. Across a variety of benchmarks, we demonstrated that LJR's interpreter significantly outperforms LuaJIT's interpreter, and LJR's baseline JIT generates high-quality code with a negligible compilation cost.


4/18: Zero-Knowledge, Maximum Security: Hardening Blockchain with Formal Methods

Time: Thursday, April 18, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Yu Feng

Abstract: Zero-knowledge proofs are powerful cryptographic protocols for enhancing privacy and scalability in blockchains. However, ensuring the correctness and security of zero-knowledge proofs is a challenging task due to their complex nature. This talk aims to address this challenge by leveraging formal methods with a pipeline of increasing confidence, ranging from a domain-specific solver for detecting under-constrained circuits (PLDI'23), to formal verification of functional correctness using refinement types (Oakland'24). By applying formal methods with complementary strengths, we have been developing rigorous techniques to detect vulnerabilities, verify correctness, and enhance the resilience of zero-knowledge proof systems against attacks.


4/25: Mapple: A High-Level Programming Interface for Distributed Mapping

Time: Thursday, April 25, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Anjiang Wei

Abstract: The mapping of data and computation to processors in a machine is crucial for performance in distributed programming. In this work, we propose Mapple, a high-level programming interface for distributed mapping, which overcomes the limitations of generality, programmability, and complexity in existing designs. The core abstraction of Mapple is the mapping from a multi-dimensional iteration space (defined by the logical loop structure of an application) onto a multi-dimensional processor space (defined by a hierarchical machine). Mapple features a set of transformation primitives to bridge the potential dimension difference between the two spaces, and also provides a way for users to write generic mappers that can perform well across different input sizes and machine sizes. We demonstrate that Mapple provides the programmability to concisely express mappings for six distributed matrix-multiplication algorithms, and show that an input-sensitive, machine-aware mapper in Mapple can outperform an input-agnostic mapper by up to 83% for applications with a nearest-neighbor communication pattern.


5/2: Composing distributed computations through task and kernel fusion

Time: Thursday, May 2, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Rohan Yadav

Abstract: We introduce Diffuse, a system that dynamically performs task and kernel fusion in distributed, task-based runtime systems. The key component of Diffuse is an intermediate representation of distributed computation that enables the analyses necessary for fusing distributed tasks to be performed in a scalable manner. We pair task fusion with a JIT compiler to fuse together the kernels within fused tasks. We show empirically that Diffuse's intermediate representation is general enough to be a target for two real-world, task-based libraries (cuNumeric and Legate Sparse), letting Diffuse find optimization opportunities across function and library boundaries. Diffuse accelerates unmodified applications developed by composing task-based libraries by 1.86x on average (geo-mean), and by between 0.93x and 10.7x on up to 128 GPUs. Diffuse also finds optimization opportunities missed by the original application developers, enabling high-level Python programs to match or exceed the performance of an explicitly parallel MPI library.


5/9: Compiling Geometric Queries on Accelerator Data Structures

Time: Thursday, May 9, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Alexander J. Root

Abstract: Geometric queries are everywhere in graphics: ray tracing, collision detection, closest point queries, and nearest neighbor queries are central to many important graphics algorithms and systems. The database community similarly cares about geometric queries, such as rectangle filters, range joins, and distance joins. Each of these queries can be accelerated via various bounding volume hierarchies (e.g. AABBs, Bounding Spheres, OBBs, SVVs, k-DOPs), each with various trade-offs. Further optimizations, such as parallelism, vectorization, restart trails, bit-packing, and quantization, are manually employed to achieve even greater performance for many of these queries, but they typically require developers to hand-write low-level primitives (e.g. directly in assembly). These performance optimizations generally reduce portability, both across data structures and across architectures (e.g. various CPU and GPU types). In this (exploratory) work, we sketch an approach to achieving high-performance geometric queries that are portable across accelerator data structures and machines, by decoupling the query from the accelerator data structure and the various performance optimizations, similar to Halide's decoupling of algorithm and schedule. We draft a functional, map-reduce-like programming language for geometric queries; describe an algorithm for compiling our query language onto an abstract bounding volume hierarchy, coupled with a scheduling language inspired by a long line of performance optimizations from the ray tracing community; and discuss an in-the-works language for optimizing memory layouts.


5/16: Efficient SAT-Based Bounded Model Checking of Evolving Systems

Time: Thursday, May 16, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Sophie Andrews

Abstract: SAT-based bounded model checking is a common technique used by industry practitioners to find bugs in computer systems. However, these systems are rarely designed in a single step: instead, designers repeatedly make small modifications, reverifying after each change. With current tools, this reverification step takes as long as a full, from-scratch verification, even if the design has only been modified slightly. We propose a novel technique for SAT-based bounded model checking that performs significantly better than the naive approach in the setting of evolving systems. The key idea is to store information learned during the verification of earlier versions of the system and then reuse this information to speed up the verification of later versions. We instantiate our technique in a bounded model checking tool for SystemVerilog code. We also construct a new benchmark set based on the real edit history of a set of open-source RISC-V cores and apply our technique to it. Our tool shows strong performance on that benchmark set, more than halving the time required to verify properties on later versions of the cores compared to the current state-of-the-art, verify-from-scratch approach.


5/23: Building the Tools to Program a Quantum Computer

Time: Thursday, May 23, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Charles Yuan

Abstract: Bringing the promise of quantum computation into reality requires not only building a quantum computer but also correctly programming it to run a quantum algorithm. To obtain asymptotic advantage over classical algorithms, quantum algorithms rely on the ability of data in quantum superposition to exhibit phenomena such as interference and entanglement. In turn, an implementation of the algorithm as a program must correctly orchestrate these phenomena in the states of qubits. Otherwise, the algorithm would yield incorrect outputs or lose its computational advantage.
Given a quantum algorithm, what are the challenges and costs to realizing it as an executable program? In this talk, I answer this question by showing how basic programming abstractions – such as data structures and control flow – upon which many quantum algorithms rely can fail to work correctly or efficiently on a quantum computer. I then show how we can leverage insights from programming languages to re-invent these abstractions to meet the demands of quantum algorithms. This approach holds out a promise of expressive and efficient tools for programming a quantum computer.


5/30: Lightning Talks

Time: Thursday, May 30, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: Matthew Sotoudeh, Shiv Sundram, Scott Kovach, Jibiana Kajpor, (others TBD)

Abstract: If you would like to present a short lightning talk, please contact Matthew Sotoudeh.


6/6: TBD (David Broman)

Time: Thursday, June 6, 2024, 12 noon - 1pm
Location: Gates 415

Speaker: David Broman