Dynamic MRI reconstruction, an inverse problem, has advanced rapidly through the use of deep learning techniques. In particular, the practical difficulty of obtaining ground-truth data has led to the emergence of unsupervised learning approaches. A promising recent method among them is implicit neural representation (INR), which defines the data as a continuous function that maps coordinate values to the corresponding signal values. This makes it possible to fill in missing information from incomplete measurements alone and to solve the inverse problem effectively. Nevertheless, previous works incorporating this method have faced drawbacks such as long optimization times and the need for extensive hyperparameter tuning. To address these issues, we propose Dynamic-Aware INR (DA-INR), an INR-based model for dynamic MRI reconstruction that captures the spatial and temporal continuity of dynamic MRI data in the image domain and explicitly incorporates the temporal redundancy of the data into the model structure. As a result, DA-INR outperforms other models in reconstruction quality even at extreme undersampling ratios, while significantly reducing optimization time and requiring minimal hyperparameter tuning.
Overall pipeline of DA-INR. A deformation network \( \Psi_t \) takes a spatio-temporal coordinate \( (x, y, t) \) as input and outputs a deformation field \( \Delta \mathbf{x} = (\Delta x, \Delta y) \) relative to a canonical space. A pretrained feature extractor extracts features from the undersampled data in the image domain. A canonical network \( \Psi_x \) takes the deformed coordinate \( \mathbf{x}' \) and the features \( \mathbf{f}' \) and predicts the \( t^{\text{th}} \) frame in the image domain, \( d_\theta \). These two models are optimized with an L1 loss computed in the frequency domain via the Non-uniform Fast Fourier Transform (NuFFT). "Sampling" denotes upscaling the coordinates or the features by nearest-neighbor or bilinear interpolation. F.E. and H.E. denote Frequency Encoding and Hash Encoding, respectively.
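Under one reading of the caption, the pipeline can be written compactly as below, with \( \mathbf{x} = (x, y) \). The additive form of the deformation, the per-frame NuFFT operator \( \mathcal{F}_t \), and the measured k-space samples \( \mathbf{k}_t \) are notation assumed here for illustration and are not taken from the original; coil sensitivities and any regularization terms are omitted.

\[
\Delta \mathbf{x} = \Psi_t(x, y, t), \qquad \mathbf{x}' = \mathbf{x} + \Delta \mathbf{x}, \qquad d_\theta(\mathbf{x}, t) = \Psi_x(\mathbf{x}', \mathbf{f}'),
\]
\[
\mathcal{L} = \sum_t \left\| \mathcal{F}_t \, d_\theta(\cdot, t) - \mathbf{k}_t \right\|_1 .
\]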
Encoding Temporal Redundancy in DA-INR. In DA-INR, the cells of the image in the canonical space play a regularizing role for those of all other frames. The purplish lines between frames indicate that DA-INR is continuous in time, yet it does not merely represent dynamic MRI data as a single 3D volume, as existing methods do.
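The role of the canonical space can be illustrated with a toy example in which every frame is produced by reading one shared canonical image at deformed coordinates. This is only a sketch under stated assumptions: `toy_deformation` is a hypothetical hand-written stand-in for the deformation network, the canonical image is a fixed grid rather than a learned network, and the deformation is assumed to be additive. None of the names below come from the DA-INR code.

```python
import math


def bilinear_sample(image, x, y):
    """Sample image[row][col] at a continuous (x, y); coordinates are clamped to the grid."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
    bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
    return (1 - wy) * top + wy * bottom


def toy_deformation(x, y, t):
    """Hypothetical deformation field: a horizontal shift that grows linearly with t."""
    return (0.5 * t, 0.0)


def render_frame(canonical, t):
    """Build frame t by sampling the shared canonical image at deformed coordinates."""
    h, w = len(canonical), len(canonical[0])
    frame = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = toy_deformation(x, y, t)
            row.append(bilinear_sample(canonical, x + dx, y + dy))
        frame.append(row)
    return frame


# One canonical image is shared by all frames; only the deformation depends on t.
canonical = [[0.0, 1.0, 2.0, 3.0],
             [4.0, 5.0, 6.0, 7.0]]
frames = [render_frame(canonical, t) for t in (0.0, 1.0, 2.0)]
```

Because every frame is tied to the same canonical image, a change to a canonical pixel affects all frames that map onto it, which is the sense in which the canonical space acts as a regularizer in this toy setting.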
Feature Extractor | PSNR (dB) | SSIM | GPU Memory Usage (GB) | Runtime (sec) |
---|---|---|---|---|
w/o Encoder | 29.59 | 0.8807 | 1.9 | 1332.80 |
EDSR [1] | 29.53 | 0.8790 | 3.5 | 2826.45 |
RDN [2] | 29.28 | 0.8750 | 12.0 | 6024.91 |
SwinIR [3] | 29.34 | 0.8816 | 18.3 | 5889.91 |
MDSR [4] (Ours) | 30.13 | 0.8835 | 3.5 | 1445.50 |
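
For readers unfamiliar with the PSNR column, the standard definition is \( 10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE}) \). The sketch below implements that definition on flat lists of pixel values; the peak value of 1.0 and the use of real-valued (for example, magnitude) images are assumptions, since the original does not state how images were normalized for evaluation.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# A uniform error of 0.1 with peak 1.0 gives an MSE of 0.01, i.e. about 20 dB.
example_db = psnr([0.0, 0.0], [0.1, 0.1])
```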
@misc{baik2025dynamicawarespatiotemporalrepresentationlearning,
title={Dynamic-Aware Spatio-temporal Representation Learning for Dynamic MRI Reconstruction},
author={Dayoung Baik and Jaejun Yoo},
year={2025},
eprint={2501.09049},
archivePrefix={arXiv},
primaryClass={eess.IV},
url={https://arxiv.org/abs/2501.09049},
}