FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video

Yue Gao*¹,²    Hong-Xing "Koven" Yu*¹    Bo Zhu³    Jiajun Wu¹
¹Stanford University    ²Microsoft    ³Georgia Institute of Technology
*Equal contribution
Novel View Synthesis

From a single input video, we reconstruct the fluid and render it from novel viewpoints. Two novel views are shown for each dataset example.

Future Prediction

Counterfactual Interaction Simulation

Re-simulation on Novel Views

Re-simulation on In-the-wild Scene

Ablation Study on Novel-View Video Synthesizer

Ablation Study on Physics Losses

Abstract

We introduce FluidNexus, a framework for reconstructing and predicting 3D fluid dynamics from a single video by synthesizing novel-view videos for reference, integrating video generation with physics simulation. FluidNexus combines realistic video synthesis via diffusion-based refinement and a physics-integrated particle representation for fluid reconstruction and prediction. We validate our approach with two new real-world fluid datasets and enable dynamic novel view synthesis, future prediction, and interaction simulation. Code and datasets will be released.

Method Overview

FluidNexus in reconstruction. We represent 3D fluid dynamics with a novel physics-integrated representation: two-layer fluid particles. During reconstruction, we synthesize multiple novel-view videos from the single input video to serve as references, then optimize the two-layer particle representation over time. The optimization is supervised by a visual loss, computed against the multi-view video frames, and a physics loss, computed from the physics constraints. The reconstruction output is the 3D fluid appearance and velocity fields over all input frames.
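
As a rough illustration of how a visual loss and a physics loss can be combined into one objective, here is a minimal sketch. It is not the FluidNexus implementation: this page does not specify the loss terms, so the mean-squared-error visual loss, the divergence (incompressibility) penalty, the grid-sampled velocity field, and the `physics_weight` parameter are all assumptions chosen for illustration.

```python
import numpy as np

def visual_loss(rendered, reference):
    """Mean squared error between rendered and reference frames (assumed form)."""
    return float(np.mean((rendered - reference) ** 2))

def divergence_penalty(velocity, spacing=1.0):
    """Mean squared divergence of a velocity field of shape (3, X, Y, Z).

    Assumption: incompressibility is used as the example physics constraint,
    and the velocity is sampled on a regular grid rather than on particles.
    """
    # d(v_x)/dx + d(v_y)/dy + d(v_z)/dz via finite differences
    div = sum(np.gradient(velocity[i], spacing, axis=i) for i in range(3))
    return float(np.mean(div ** 2))

def total_loss(rendered, reference, velocity, physics_weight=0.1):
    """Weighted sum of the two terms; physics_weight is a hypothetical setting."""
    return visual_loss(rendered, reference) + physics_weight * divergence_penalty(velocity)
```

In this sketch, a uniform velocity field has zero divergence and adds no penalty, so only fields that violate the assumed constraint increase the objective.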

The FluidNexus-Smoke and FluidNexus-Ball Datasets

The FluidNexus-Smoke and FluidNexus-Ball datasets each contain 120 scenes. Each scene has 5 synchronized multi-view videos, captured by cameras placed along a horizontal arc of approximately 120°.
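
To make the camera layout concrete, the following sketch computes positions for 5 cameras on a 120° horizontal arc. Only the camera count and the approximate arc angle come from the dataset description. The even angular spacing, the unit radius, the shared camera height, and the y-up coordinate convention are assumptions, and `arc_camera_positions` is a hypothetical helper, not part of any released code.

```python
import math

def arc_camera_positions(num_cameras=5, arc_degrees=120.0, radius=1.0):
    """Return (x, y, z) positions evenly spaced on a horizontal arc.

    The arc is centered on the +z axis and the scene sits at the origin.
    Even spacing and a constant radius are assumptions for illustration.
    """
    half = math.radians(arc_degrees) / 2.0
    positions = []
    for i in range(num_cameras):
        # t runs from 0 to 1 across the arc; a single camera sits at the center
        t = i / (num_cameras - 1) if num_cameras > 1 else 0.5
        angle = -half + t * 2.0 * half
        positions.append((radius * math.sin(angle), 0.0, radius * math.cos(angle)))
    return positions

if __name__ == "__main__":
    for x, y, z in arc_camera_positions():
        print(f"{x:+.3f} {y:+.3f} {z:+.3f}")
```

Under these assumptions, adjacent cameras are 30° apart and the middle camera faces the scene head-on.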

BibTeX

@InProceedings{gao2025fluidnexus,
    title     = {FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video},
    author    = {Gao, Yue and Yu, Hong-Xing and Zhu, Bo and Wu, Jiajun},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025},
}