A deep learning classification task for brain navigation during functional ultrasound imaging

Positioning and navigation are essential components of neuroimaging as they improve the quality and reliability of data acquisition, leading to advances in diagnosis, treatment outcomes, and fundamental understanding of the brain. Functional ultrasound (fUS) imaging is an emerging technology providing high-resolution images of the brain vasculature, allowing for the monitoring of brain activity. However, as the technology is relatively new, there is no standardized tool for inferring the position in the brain from the vascular images. This study presents a deep learning-based framework designed to address this challenge. Our approach uses an image classification task coupled with a regression on the resulting probabilities to determine the position of a single image. We conducted experiments using a dataset of 51 rat brain scans to evaluate its performance. The training positions were extracted at intervals of 375 µm, resulting in a positioning error of 176 µm. Further GradCAM analysis revealed that the predictions were primarily driven by subcortical vascular structures. Finally, we assessed the robustness of our method in a cortical stroke model, in which the brain vasculature is severely impaired. Remarkably, no specific increase in the number of misclassifications was observed, confirming the method’s reliability in challenging conditions. Overall, our framework provides accurate and flexible positioning, not relying on a pre-registered reference but on conserved vascular patterns.


Our method uses a neural network trained on a classification task to find the position of a given image.

Three main elements are required for the implementation: 1) an input dataset of images aligned to a predefined reference, 2) a neural network trained to classify images depending on their position, and 3) a probability-based regression that infers the position of a new input image. These parts are detailed below in the context of rat brains and with a DenseNet121-CNN. Nevertheless, it is essential to note that this operating principle is neither species- nor neural-network-specific.

In this study, we used a set of n=51 rat brains. Each brain was scanned with 77 micro-Doppler images extending from Bregma +3.0 to -6.5 mm with an in-between image spacing of 125 µm (Fig. 1-a). A micro-Doppler image has an in-plane resolution of 100×110 µm and a 300-µm slice thickness in this setting 13,24,32. Example micro-Doppler images on which major anatomical structures and vessels were annotated are shown in Fig. 1-b and Supplementary Fig. S1. Scans were aligned on a common reference (Fig. 1-c). The hyperparameters were tuned on a validation set of 13 rats, and the performance was assessed on a testing set of 13 rats.

Fig. 1: Methodological approach and spacing selection. a- Acquisition setup for large-scale micro-Doppler imaging of rat brains. The ultrasonic probe is moved along the postero-anterior axis using a motorized linear stage. The imaging was performed from Bregma -6.5 to +3.0 mm with a 125 µm spacing for a total of 89 images. b- Set of micro-Doppler images extracted from a single scan, overlaid with a simplified version of the Paxinos brain atlas 6 in white. Main anatomical structures are identified in black: Cortex (Ctx), Hippocampus (Hip), Thalamus (Tha), and Striatum (Str). The Bregma position (in mm) of the micro-Doppler images is shown in the lower right corner. Scale bar: 2 mm. c- Schematic representation of the position inference procedure. Left: a set of positions (K1, …, KN) with spacing λ µm is defined over the brain. Center: an image with unknown position K is fed to a neural network trained to classify input micro-Doppler images depending on their position. Right: the network outputs a probability distribution over the positions (K1, …, KN). It can be used either to determine K as the most likely position within (K1, …, KN) (classification) or to estimate K as a weighted sum of all positions using their respective probabilities (regression). d- Down-sampling procedure used to determine an optimal set of positions. Each scan is down-sampled using 5 factors, corresponding to an increase in the spacing λ between two consecutive images: 250, 375, 500, 625, and 750 µm. e- Classification accuracy (%) of the DenseNet121-CNN (black) and HOG-SVM (grey) models for each spacing (testing, n=13 rats). f- Regression error (µm) obtained from the DenseNet121-CNN model for each spacing (testing, n=13 rats). A: anterior, L: left, R: right, V: ventral.
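The two read-outs described above, the most likely class and the probability-weighted position, can be sketched as follows. This is a minimal NumPy illustration; the function name `infer_position` and the toy values are ours, not from the paper:

```python
import numpy as np

def infer_position(probs, positions):
    """Turn the network's softmax output into a position estimate.

    probs     : probability assigned to each training position (K1..KN)
    positions : Bregma coordinates (mm) of those positions
    Returns (classification, regression) estimates of the position K.
    """
    probs = np.asarray(probs, dtype=float)
    positions = np.asarray(positions, dtype=float)
    classified = positions[np.argmax(probs)]     # most likely position
    regressed = float(np.dot(probs, positions))  # probability-weighted sum
    return classified, regressed

# Toy example: five positions with 375 um (0.375 mm) spacing
positions = np.array([-0.75, -0.375, 0.0, 0.375, 0.75])
probs = np.array([0.05, 0.25, 0.45, 0.20, 0.05])
cls, reg = infer_position(probs, positions)
```

The regression estimate falls between grid positions, which is what allows a positioning error (176 µm) smaller than the 375 µm training spacing.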

To assess the performance of our approach, we evaluated the accuracy of the classifier and the associated positioning error, analyzed their spatial dependence, and extracted the anatomical regions supporting the inference.

Effect of the scanning spacing in the input dataset

To determine a suitable spacing to scan the brains in the input dataset, we created five datasets by down-sampling the original scans (Fig. 1-d).

Both DenseNet121-CNN and HOG-SVM show classification accuracy increasing with the spacing (250 to 750 µm), from 56.2% to 98.2% and from 58% to 95.9%, respectively (Fig. 1-e and Supplementary Table S2). The training and validation accuracies follow similar trends, indicating that the models did not overfit the training data in a problematic way (Supplementary Fig. S2). As an additional control, we performed a 5-fold cross-validation on the 375 µm dataset, which resulted in comparable testing accuracy (80.9 ± 2.3%). It should be noted that the CNN did not converge with the 125-µm spacing.

The testing accuracy is lower for the HOG-SVM than for the DenseNet121-CNN, irrespective of the spacing.

Although beneficial for classifier accuracy, increasing the spacing decreases the resolution of the inferred positions.

Spatial dependency of the classification

To assess the reliability of the inference in different parts of the brain, we examined the classification accuracies by position on the dataset with 375 µm spacing. The values are non-uniformly distributed and range from 30.8% to 100% (Fig. 2-a). The DenseNet121-CNN and HOG-SVM displayed similar results overall, and the anterior part of the brain generally elicited lower accuracies for both models (5 of the 8 positions with accuracy below the mean are between Bregma 0.0 and +3.0 mm). The DenseNet121-CNN-associated confusion matrix reveals that misclassified images were mapped to neighboring positions (Fig. 2-b) and were not concentrated in a subset of rats.

We then computed the positioning error per position for the DenseNet121-CNN model (Fig. 2).

The GradCAM results were averaged across animals, and a threshold was applied to mitigate the effect of interpolating such low-resolution heatmaps (Fig. 3-a).

The average maps were overlaid per position on the corresponding registered micro-Doppler images (Fig. 3-b). Interestingly, a single part of the micro-Doppler image drives the classification regardless of the location in the brain, except for Bregma -2.625 and -1.875 mm, which exhibit two small connected areas. Furthermore, the GradCAM heatmaps are almost entirely located in the subcortex up to Bregma, where the proportion starts to decrease in favor of the cortex (Fig. 3-c). This corresponds to the general increase in the proportion of cortex in the image.

As the GradCAM maps appeared to be conserved across rats, we identified the associated local brain vasculature. This revealed that several branches of large vessels play a significant role in the classification process. Most of these vessels supply subcortical regions such as the thalamus, the hippocampus, and the striatum 35,36, as shown in Fig. 1-b and Fig. 3.

The positions of the post-stroke images were predicted without prior retraining, and the proportion of misclassifications did not specifically increase (Fig. 4-b). Likewise, the regression error did not display a significant post-stroke change in average error (176 vs. 185 µm pre/post, respectively), with consistently higher positioning error from Bregma (Fig. 4-c).

In this study, we proposed a classification framework suited for accurate and robust brain positioning from micro-Doppler images.

Further analysis using the GradCAM visualization technique revealed that the classification was mainly driven by highly consistent vascular structures in subcortical regions. Since ImageNet pretraining is known to introduce a bias towards texture differences 44, the richer vascular texture of the subcortex may account for its prevalence compared to the cortex. It can also explain why the cortical curvature and thickness variation across anatomical locations are not essential in our approach.

Finally, we validated the CNN's predictions in rats subjected to cortical stroke, which can be considered a challenging condition in which the brain vasculature is severely impaired.

Overall, automated brain positioning and neuro-navigation with micro-Doppler images are issues recently raised by new fUS users but not yet widely investigated. To date, only one work has addressed the positioning problem, through the automated registration of a micro-Doppler scan to a reference 46.

Our approach offers more flexibility, as a single image is sufficient to find a position, while CNNs are fast enough for real-time implementation in the neuro-navigation context.

These benefits come at the cost of a need for large datasets to train the classifier. For this proof-of-concept study, we used a standardized dataset with limited sample size and validated the performance on a stroke dataset. Yet, datasets comprising more subjects and more variability (transducer orientation and position, different animal preparation and imaging sequences, further data augmentation, …) will be required to build a model suitable for routine use. In this regard, the increasing adoption of fUS technology by the neuroscience community will facilitate the construction of large databases, especially for mice, the leading mammalian model.

Although our work is validated on anesthetized rats, our strategy is directly applicable to awake imaging and is not conceptually limited to rodents, as micro-Doppler imaging has been successfully applied to other species.

Each compound image is computed by adding nine plane waves (4.5 kHz) with angles from -12° to 12° with a 3° step. The blood signal was extracted from 300 compound images using a singular value decomposition filter and removing the 30 first singular vectors 21. The micro-Doppler image is computed as the mean intensity of the blood signal in these 300 frames, which is an estimator of the cerebral blood volume 13,24. This sequence enables a temporal resolution of 0.6 sec, an in-plane resolution of 100×110 µm, and an off-plane resolution (image thickness) of 300 µm 32. Using these parameters, a scan of the brain vasculature consisting of 89 coronal planes spaced by 125 µm was performed between Bregma -6.5 and +3.0 mm.
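The clutter-filtering step described above (SVD of the compound-frame stack, removal of the leading singular vectors, then mean blood-signal intensity) can be sketched as follows. The function name `svd_clutter_filter` is ours, and a toy stack would be far smaller than the 300-frame acquisitions used in the paper:

```python
import numpy as np

def svd_clutter_filter(stack, n_remove=30):
    """SVD clutter filter for a stack of beamformed compound frames.

    stack    : (n_frames, nz, nx) array of frames
    n_remove : number of leading singular vectors discarded as tissue
               clutter (30 in the paper, for 300 frames)
    Returns the power-Doppler image: mean blood-signal intensity over
    the frames, an estimator of cerebral blood volume.
    """
    n_frames, nz, nx = stack.shape
    casorati = stack.reshape(n_frames, nz * nx)  # space-time (Casorati) matrix
    u, s, vh = np.linalg.svd(casorati, full_matrices=False)
    s[:n_remove] = 0.0                           # drop the tissue/clutter subspace
    blood = (u * s) @ vh                         # rebuild the blood signal
    power = np.mean(np.abs(blood) ** 2, axis=0)  # mean intensity per pixel
    return power.reshape(nz, nx)
```

The leading singular vectors capture the highly correlated, high-energy tissue motion, so zeroing them leaves mostly the blood signal.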

The micro-Doppler 2D scans from all animals were aligned along the anteroposterior axis with respect to recognizable anatomical and vascular patterns (Fig. 1-c). This alignment is necessary for correcting potential shifts occurring either during surgery, imaging, or due to inter-animal variability. The process was performed independently by two experts, and any disagreement was resolved post-hoc by consensus. Each micro-Doppler image is then identified by its anatomical position with respect to the Bregma reference point, e.g., Bregma -3.0 mm.

Datasets generation

Several datasets were extracted from the initial scans using a down-sampling factor ranging from 2 to 5. This corresponds to an artificial increase in the spacing between two consecutive micro-Doppler images. To create the dataset associated with a given factor F, we extracted images from Bregma -3.0 mm with a spacing of F×125 µm, within the limits of the cranial windows, thus resulting in 77 common positions across animals (Fig. 1-b).
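Under the stated assumption that down-sampling means keeping every F-th position of the native grid, anchored at Bregma -3.0 mm, the selection can be sketched as follows (helper name ours):

```python
def downsampled_positions(base_positions_mm, factor, anchor_mm=-3.0):
    """Positions kept for a given down-sampling factor F.

    base_positions_mm : scan positions at the native 125 um spacing
    factor            : down-sampling factor F (2 to 5 in the paper),
                        giving a new spacing of F x 125 um
    anchor_mm         : extraction is anchored at Bregma -3.0 mm
    """
    spacing = factor * 0.125  # new spacing in mm
    kept = []
    for p in base_positions_mm:
        k = (p - anchor_mm) / spacing
        if abs(k - round(k)) < 1e-6:  # p lies on the down-sampled grid
            kept.append(p)
    return kept
```

For example, with factor 3 (375 µm spacing) the grid runs through -3.0 mm in steps of 0.375 mm in both directions.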

For each of the datasets used in this work, images were resized to 224×320 pixels by bicubic interpolation, and their grey channel extended in RGB to fit the ImageNet format imposed by the pre-training 55. All the data were normalized with the mean and standard deviation of the full dataset. We augmented the size of the training set with rotations of ±4° and ±8° applied to all micro-Doppler images.
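The channel replication and normalization steps can be sketched as below. The function name is ours; the bicubic resize to 224×320 and the ±4°/±8° rotation augmentation would use an image library such as Pillow and are omitted to keep the sketch dependency-free:

```python
import numpy as np

def to_imagenet_format(gray, dataset_mean, dataset_std):
    """Normalize a single-channel micro-Doppler image with the
    dataset-wide mean/std and replicate its grey channel into the
    3-channel (RGB) layout expected by an ImageNet-pretrained CNN.

    gray : (H, W) array; returns a (3, H, W) array.
    """
    norm = (np.asarray(gray, dtype=np.float32) - dataset_mean) / dataset_std
    return np.stack([norm, norm, norm], axis=0)  # identical R, G, B planes
```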

The weights of the network were optimized with the Adam algorithm using a cross-entropy loss function.

The hyperparameters were selected through a random search (see Supplementary Table S3 and S4).
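A random search of this kind can be sketched as follows; the search ranges below are illustrative assumptions, not the values of Supplementary Tables S3 and S4:

```python
import random

def sample_hyperparameters(rng, n_trials=20):
    """Draw candidate hyperparameter sets for a random search.

    rng      : random.Random instance (seeded for reproducibility)
    n_trials : number of candidate configurations to evaluate
    """
    trials = []
    for _ in range(n_trials):
        trials.append({
            "lr": 10 ** rng.uniform(-5, -3),           # Adam learning rate
            "batch_size": rng.choice([16, 32, 64]),    # mini-batch size
            "weight_decay": 10 ** rng.uniform(-6, -3),
        })
    return trials
```

Each candidate would then be trained with Adam and cross-entropy loss, and the configuration with the best validation accuracy retained.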

The positioning error is defined as the standard deviation σ of the difference between the predicted and the true positions.

We extracted the pixels in the input image driving the classification using the Gradient-weighted Class Activation Mapping (GradCAM) technique. The low-resolution heatmaps were interpolated to the size of the input image, and a threshold of 0.7 was applied to limit the effect of the interpolation.
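The thresholding step can be sketched as below, assuming the heatmap is min-max normalized to [0, 1] before applying the 0.7 cutoff (function name ours):

```python
import numpy as np

def threshold_heatmap(cam, thr=0.7):
    """Min-max normalize a GradCAM heatmap and zero out values below
    the threshold (0.7 in the paper) to limit interpolation artefacts.

    cam : (H, W) heatmap interpolated to the input-image size.
    """
    cam = np.asarray(cam, dtype=float)
    cam = cam - cam.min()         # shift minimum to 0
    if cam.max() > 0:
        cam = cam / cam.max()     # scale maximum to 1
    return np.where(cam >= thr, cam, 0.0)
```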

GradCAM registration on the atlas for anatomical extraction

We used a digital version of the rat brain Paxinos atlas 6,43 to extract the anatomical regions associated with the GradCAM maps.