Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10822 publications
    mmMUSE: An mmWave-based Motion-resilient Universal Speech Enhancement System
    Chenming He
    Yanyong Zhang
    Kai Wang
    Dequan Wang
    Lingyu Wang
    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), ACM (2026) (to appear)
    Voice-based smart systems can greatly enhance user experiences by allowing higher-quality interactions through better voice perception. Speech enhancement can benefit such systems by isolating noise from speech. Recently, integrating millimeter-wave (mmWave) with audio for speech perception has gained increasing attention due to microphones' limitations in noisy environments. However, mmWave-based vocal extraction is severely affected by motion, which disperses vocal signals across ranges and introduces distortions. In this paper, we propose an mmWave-based motion-resilient universal speech enhancement system called mmMUSE, which fuses mmWave and audio signals. To mitigate motion interference, we develop a Doppler-based method for motion-robust vocal signal extraction. Moreover, by introducing the Vocal-Noise-Ratio metric to assess the prominence of vocal signals from mmWave, we achieve real-time voice activity detection that gains 3.81 dB of SISDR in noisy speech. Additionally, we design a two-stage complex-valued network that includes an attention-based fusion network for cross-modal complementing and a time-frequency masking network for correcting amplitude and phase of speech to isolate noise. Using mmWave and audio datasets from 46 participants, mmMUSE outperforms the state-of-the-art speech enhancement models, achieving an average SISDR improvement of 3.12 dB. Additionally, mmMUSE achieves SISDR improvements of 16.51 dB, 17.93 dB, 14.93 dB, and 18.95 dB in controlled environments involving intense noise, extensive motion, multiple speakers, and various obstructive materials, respectively. Finally, we evaluate mmMUSE in real-world scenarios including running, public spaces, and driving, maintaining a word error rate (WER) below 10%.
    Productionizing Quantum Mass Production
    Bill Huggins
    Nathan Wiebe
    arXiv for now (2026) (to appear)
    For many practical applications of quantum computing, the slowest and most costly steps involve coherently accessing classical data. We help address this challenge by applying mass production techniques, which can sometimes allow us to perform operations many times in parallel for a cost that is comparable to a single execution [1-3]. We combine existing mass-production results with modern approaches for loading classical data using "quantum read-only memory." We show that quantum mass production techniques offer no benefit when we consider a cost model that focuses purely on the number of non-Clifford gates. However, analyzing the constant factors in a more nuanced cost model, we find that it may be possible to obtain a reduction in cost of an order of magnitude or more for a variety of reasonably sized fault-tolerant quantum algorithms. We present several applications of quantum mass-production techniques beyond naive parallelization, including a strategy for reducing the cost of serial calls to the same data loading step.
    FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration
    Diganta Misra
    Yanqi Luo
    Anjali Sridhar
    Justine Gehring
    Silvio Soares Ribeiro Junior
    2026
    AI coding assistants are rapidly becoming integral to modern software development. A key challenge in this space is the continual need to migrate and modernize codebases in response to evolving software ecosystems. Traditionally, such migrations have relied on rule-based systems and human intervention. With the advent of powerful large language models (LLMs), AI-driven agentic frameworks offer a promising alternative, but their effectiveness remains underexplored. In this paper, we introduce FreshBrew, a novel benchmark for evaluating AI-based agentic frameworks on project-level Java migrations. We benchmark several such frameworks, powered by state-of-the-art LLMs, and compare their performance against established rule-based tools. Our evaluation of AI agents on this benchmark of 228 repositories shows that the top-performing model, Gemini 2.5 Flash, can successfully migrate 56.5% of projects to JDK 17. Our empirical analysis reveals novel insights into the critical strengths and limitations of current agentic approaches, offering actionable insights into their real-world applicability. By releasing FreshBrew publicly upon acceptance, we aim to facilitate rigorous, reproducible evaluation and catalyze progress in AI-driven codebase modernization.
    Toward Sensor-In-the-Loop LLM Agent: Benchmarks and Implications
    Zhiwei Ren
    Junbo Li
    Minjia Zhang
    Di Wang
    Longfei Shangguan
    SenSys 2025 - The 23rd ACM Conference on Embedded Networked Sensor Systems (2025)
    This paper advocates for sensor-informed personal agents that can take advantage of sensor hints on wearables to enhance the personal agent's response. We demonstrate that such a sensor-in-the-loop design paradigm can be easily integrated into existing LLM agents by building a prototype named WellMax based on existing well-developed techniques such as structured prompt tuning and few-shot prompting. The head-to-head comparison with a non-sensor-informed agent across five use scenarios demonstrates that this sensor-in-the-loop design can better address users' needs and improve their overall experience. The deep dive into agents' replies and participants' feedback further reveals that sensor-in-the-loop agents not only provide more contextually relevant responses but also exhibit a greater understanding of user priorities and situational nuances. We further conduct two case studies to examine the potential pitfalls and distill key insights from this sensor-in-the-loop agent. We believe this work sets the stage for more intelligent, empathetic, and effective interactions in future AI-driven personal assistants.
    Non-stationary Bandit Convex Optimization: A Comprehensive Study
    Shirley Liu
    Dorian Baudry
    Patrick Rebeschini
    Arya Akhavan
    2025
    Bandit Convex Optimization is a fundamental class of sequential decision-making problems, where the learner selects actions from a continuous domain and observes a loss (but not its gradient) at only one point per round. We study this problem in non-stationary environments, and aim to minimize the regret under three standard measures of non-stationarity: the number of switches S in the comparator sequence, the total variation Delta of the loss functions, and the path-length P of the comparator sequence. We propose a polynomial-time algorithm, Tilted Exponentially Weighted Average with Sleeping Experts (TEWA-SE), which adapts the sleeping experts framework from online convex optimization to the bandit setting. For strongly convex losses, we prove that TEWA-SE is minimax-optimal with respect to known S and Delta by establishing matching upper and lower bounds. By equipping TEWA-SE with the Bandit-over-Bandit framework, we extend our analysis to environments with unknown non-stationarity measures. For general convex losses, we introduce a second algorithm, clipped Exploration by Optimization (cExO), based on exponential weights over a discretized action space. While not polynomial-time computable, this method achieves minimax-optimal regret with respect to known S and Delta, and improves on the best existing bounds with respect to P.
    Study of Arterials in the City of Rio de Janeiro for Traffic Coordination
    Ori Rottenstreich
    Eliav Buchnik
    Danny Veikherman
    Dan Karliner
    Tom Kalvari
    Shai Ferster
    Ron Tsibulsky
    Jack Haddad
    2025
    Urban traffic congestion is a growing challenge, and optimizing signal timing strategies is crucial for improving traffic flow and reducing emissions. The coordination of signalized intersections improves both traffic operations and environmental aspects. Coordination is particularly important along arterials, sequences of signalized intersections that serve as the primary routes and carry a high volume of traffic. In this paper we analyze real data from the city of Rio de Janeiro to study properties of arterials. We consider their length, the distance between intersections, and the properties of the traffic light plans such as cycle time. We then study their in-practice level of coordination in terms of the number of stops and their common locations along the arterials. We examine particular arterials in depth and provide insights that can be useful for the efficient design of arterials in additional cities. Based on the analysis, we show how simple traffic properties can indicate the potential of coordinating two adjacent intersections as part of an arterial to improve traffic performance.
    Applying multimodal AI to physiological waveforms improves genetic prediction of cardiovascular traits
    Yuchen Zhou
    Mahantesh I. Biradar
    Jacqueline Shreibati
    Dongbing Lai
    Tae-Hwi Schwantes-An
    Robert Luben
    Zachary R. McCaw
    Jorgen Engmann
    Rui Providencia
    Amand Floriaan Schmidt
    Patricia B. Munroe
    Howard Yang
    Andrew Carroll
    Anthony Khawaja
    Babak Behsaz
    American Journal of Human Genetics, 112 (2025), pp. 1562 - 1579
    Electronic health records, biobanks, and wearable biosensors enable the collection of multiple health modalities from many individuals. Access to multimodal health data provides a unique opportunity for genetic studies of complex traits because different modalities relevant to a single physiological system (e.g., circulatory system) encode complementary and overlapping information. We propose a multimodal deep learning method, multimodal representation learning for genetic discovery on low-dimensional embeddings (M-REGLE), for discovering genetic associations from a joint representation of complementary electrophysiological waveform modalities. M-REGLE jointly learns a lower-dimensional representation (i.e., latent factors) of multimodal physiological waveforms using a convolutional variational autoencoder, performs genome-wide association studies (GWASs) on each latent factor, then combines the results to study the genetics of the underlying system. To validate the advantages of M-REGLE and multimodal learning, we apply it to common cardiovascular modalities (photoplethysmogram [PPG] and electrocardiogram [ECG]) and compare its results to unimodal learning methods in which representations are learned from each data modality separately but are statistically combined for downstream genetic comparison. M-REGLE identifies 19.3% more loci on the 12-lead ECG dataset, 13.0% more loci on the ECG lead I + PPG dataset, and its genetic risk score significantly outperforms the unimodal risk score at predicting cardiac phenotypes, such as atrial fibrillation (Afib), in multiple biobanks.
    Benchmarking and improving algorithms for attributing satellite-observed contrails to flights
    Vincent Rudolf Meijer
    Rémi Chevallier
    Allie Duncan
    Kyle McConnaughay
    Atmospheric Measurement Techniques, 18 (2025), pp. 3495-3532
    Condensation trail (contrail) cirrus clouds cause a substantial fraction of aviation's climate impact. One proposed method for the mitigation of this impact involves modifying flight paths to avoid particular regions of the atmosphere that are conducive to the formation of persistent contrails, which can transform into contrail cirrus. Determining the success of such avoidance maneuvers can be achieved by ascertaining which flight formed each nearby contrail observed in satellite imagery. The same process can be used to assess the skill of contrail forecast models. The problem of contrail-to-flight attribution is complicated by several factors, such as the time required for a contrail to become visible in satellite imagery, high air traffic densities, and errors in wind data. Recent work has introduced automated algorithms for solving the attribution problem, but it lacks an evaluation against ground-truth data. In this work, we present a method for producing synthetic contrail detections with predetermined contrail-to-flight attributions that can be used to evaluate – or “benchmark” – and improve such attribution algorithms. The resulting performance metrics can be employed to understand the implications of using these observational data in downstream tasks, such as forecast model evaluation and the analysis of contrail avoidance trials, although the metrics do not directly quantify real-world performance. We also introduce a novel, highly scalable contrail-to-flight attribution algorithm that leverages the characteristic compounding of error induced by simulating contrail advection using numerical weather models. The benchmark shows an improvement of approximately 25 % in precision versus previous contrail-to-flight attribution algorithms, without compromising recall.
    Nearly Tight Regret Bounds for Revenue Maximization in Bilateral Trade
    Simone di Gregorio
    Paul Duetting
    Federico Fusco
    Chris Schwiegelshohn
    FOCS 2025
    Bilateral trade models the task of intermediating between two strategic agents, a seller and a buyer, willing to trade a good for which they hold private valuations. We study this problem from the perspective of a broker, in a regret minimization framework. At each time step, a new seller and buyer arrive, and the broker has to propose a mechanism that is incentive-compatible and individually rational, with the goal of maximizing profit. We propose a learning algorithm that guarantees a nearly tight regret in the stochastic setting when seller and buyer valuations are drawn i.i.d. from a fixed and possibly correlated unknown distribution. We further show that it is impossible to achieve sublinear regret in the non-stationary scenario where valuations are generated upfront by an adversary. Our ambitious benchmark for these results is the best incentive-compatible and individually rational mechanism. This separates us from previous works on efficiency maximization in bilateral trade, where the benchmark is a single number: the best fixed price in hindsight. A particular challenge we face is that uniform convergence for all mechanisms' profits is impossible. We overcome this difficulty via a careful chaining analysis that proves convergence for a provably near-optimal mechanism at (essentially) optimal rate. We further showcase the broader applicability of our techniques by providing nearly optimal results for the joint ads problem.
    Pragmatic Fairness: Evaluating ML Fairness Within the Constraints of Industry
    Jessie Smith
    Michael Madaio
    Robin Burke
    Casey Fiesler
    2025
    Machine learning (ML) fairness evaluation in real-world, industry settings presents unique challenges due to business-driven constraints that influence decision-making processes. While prior research has proposed fairness frameworks and evaluation methodologies, these approaches often focus on idealized conditions and may lack consideration for the practical realities faced by industry practitioners. To understand these practical realities, we conducted a semi-structured interview study with 21 experts from academia and industry specializing in ML fairness. Through this study, we explore three constraints of ML fairness evaluation in industry (balancing competing interests, lacking power/access, and getting buy-in) and how these constraints lead to satisficing, seeking satisfactory rather than ideal outcomes. We define the path from these constraints to satisficing as pragmatic fairness. Using recommender systems as a case study, we explore how practitioners navigate these constraints and highlight actionable strategies to improve fairness evaluations within these business-minded boundaries. This paper provides practical insights to guide fairness evaluations in industry while also showcasing how the FAccT community can better align research goals with the operational realities of practitioners.
    Balancing AI and Human Insights in Scientific Discovery: Challenges and Guidelines
    Javier García-Martínez
    Pilar Manchon
    Ricardo Vinuesa
    Sergio Hoyas
    The Innovation (2025)
    Recent advancements in large language models (LLMs) have enabled AI systems to autonomously assist in scientific research, from hypothesis generation to laboratory experimentation, transforming how research proposals are written and experiments are designed. Tools like AI "co-scientists" promise to enhance scientific productivity but raise concerns about diminishing human intuition, reinforcing incremental research, and concentrating power among a few entities. As LLMs become increasingly integrated into research processes, there is a risk of reduced creativity, ethical misconduct, and overreliance on AI-driven evaluation systems. To address these challenges, in this article we propose ethical guidelines focusing on transparency, accountability, fairness, and safeguarding transformative research. Ultimately, AI should be used to augment, not replace, human insight in scientific discovery.
    Collaborative Diffusion Model for Recommender System
    Gyuseok Lee
    Yaochen Zhu
    Hwanjo Yu
    Yao Zhou
    Jundong Li
    2025
    Diffusion-based recommender systems (DR) have gained increasing attention for their advanced generative and denoising capabilities. However, existing DR face two central limitations: (i) a trade-off between enhancing generative capacity via noise injection and retaining personalized information, and (ii) the underutilization of rich item-side information. To address these challenges, we present a Collaborative Diffusion model for Recommender System (CDiff4Rec). Specifically, CDiff4Rec generates pseudo-users from item features and leverages collaborative signals from both real and pseudo personalized neighbors identified through behavioral similarity, thereby effectively reconstructing nuanced user preferences. Experimental results on three public datasets show that CDiff4Rec outperforms competitors by effectively mitigating the loss of personalized information through the integration of item content and collaborative signals.
    Advances in QEC Experiments
    Alexandre Bourassa
    (2025)
    Quantum error correction (QEC) is critical for achieving useful quantum computers, since it allows us to combine many noisy physical qubits into one high-quality logical qubit with exponentially decreasing logical error rate. In this talk, we will discuss Google’s latest error correction results [1], where we achieved below-threshold surface code performance with logical qubits at distances 3, 5, and 7. In a logical memory demonstration, we show that each increase in distance reduces errors by a factor of 2.14. Additionally, we report the ability to decode these QEC experiments in real time for up to 1 million rounds. Finally, we present a 10,000x reduction in the rare correlated errors by measuring the repetition code in the very low error regime. Ultimately, our results show device performance that, if scaled, could realize the operational requirements of large-scale fault-tolerant quantum algorithms. [1] Quantum error correction below the surface code threshold, Google Quantum AI, arXiv:2408.13687 (2024)
    The proliferation of IoT in cities, combined with Digital Twins, creates a rich data foundation for Smart Cities aimed at improving urban life and operations. Generative AI (GenAI) significantly enhances this potential, moving beyond traditional AI analytics by processing multimodal content and generating novel outputs like text and simulations. Using specialized or foundational models, GenAI's natural language abilities such as Natural Language Understanding (NLU) and Generation (NLG) can power tailored applications and unified interfaces, dramatically lowering barriers for users interacting with complex smart city systems. In this paper, we focus on GenAI applications based on conversational interfaces within the context of three critical user archetypes in a Smart City - Citizens, Operators and Planners. We identify and review GenAI models and techniques that have been proposed or deployed for various urban subsystems in the contexts of these user archetypes. We also consider how GenAI can be built on the existing data foundation of official city records, IoT data streams and Urban Digital Twins. We believe this work represents the first comprehensive summarization of GenAI techniques for Smart Cities from the lens of the critical users in a Smart City.
    Despite the advent of legislation such as the General Data Protection Regulation (GDPR) with its associated "Right to be Forgotten" (RTBF), few, if any, studies have measured user reactions to realistic edge cases with public-interest content. Surveying both users covered by and excluded from RTBF, this vignette-based survey experiment sought to better understand how users think of delisting content from search engine results and what factors influence user perceptions. While leaving information accessible in search engine results generally leads to warmer feelings towards those search engines than delisting it, we find that users do prefer different outcomes depending on contextual elements specific to given cases. We also find that whether a country has active RTBF legislation does seem to be associated with both knowledge and attitudes about RTBF, but is unlikely to explain all of it. These results indicate a complex context around removing public-interest content from search engines’ results; it is essential that experts sensitive to local context perform the review in order to ensure that removal requests are handled in a way that meets users’ expectations.