Virtual brain grafting: Enabling whole brain parcellation in the presence of large lesions

Brain atlases and templates are at the heart of neuroimaging analyses, for which they facilitate multimodal registration, enable group comparisons and provide anatomical reference. However, as atlas-based approaches rely on correspondence mapping between images they perform poorly in the presence of structural pathology. Whilst several strategies exist to overcome this problem, their performance is often dependent on the type, size and homogeneity of any lesions present. We therefore propose a new solution, referred to as Virtual Brain Grafting (VBG), which is a fully-automated, open-source workflow to reliably parcellate MR images in the presence of a broad spectrum of focal brain pathologies, including large, bilateral, intra- and extra-axial, heterogeneous lesions with and without mass effect. The core of the VBG approach is the generation of a lesion-free T1-weighted input image which enables further image processing operations that would otherwise fail. Here we validated our solution based on Freesurfer recon-all parcellation in a group of 10 patients with heterogeneous gliomatous lesions, and a realistic synthetic cohort of glioma patients (n=100) derived from healthy control data and patient data. We demonstrate that VBG outperforms a non-VBG approach assessed qualitatively by expert neuroradiologists and Mann-Whitney U tests to compare corresponding parcellations (real patients U(6,6) = 33, z = 2.738, P < .010, synthetic patients U(48,48) = 2076, z = 7.336, P < .001). Results were also quantitatively evaluated by comparing mean dice scores from the synthetic patients using one-way ANOVA (unilateral VBG = 0.894, bilateral VBG = 0.903, and non-VBG = 0.617, P < .001). Additionally, we used linear regression to show the influence of lesion volume, lesion overlap with, and distance from the Freesurfer volumes of interest, on labelling accuracy.
VBG may benefit the neuroimaging community by enabling automated state-of-the-art MRI analyses in clinical populations, for example by providing input data for automated solutions for fiber tractography or resting-state fMRI analyses that could also be used in the clinic. To maximize its availability, VBG is provided as open software under a Mozilla 2.0 license (https://github.com/KUL-Radneuron/KUL_VBG).

Figure 1: Schematic representation of VBG: Part 1 generates the lesion-free image. It starts in the top left with basic processing and Atropos segmentation (light blue), donor image generation and initial filling (yellow), then final lesion filling (pink). Part 2 is concerned with parcellation (orange). Lastly a text report is generated detailing lesion overlap with various labelled brain structures.
medRxiv preprint, this version posted October 2, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

prior TPMs, subject's brain image and TPMs with the lesion excluded. First the input brain images are normalized by a two-pass mean division, and the template (source) TPMs are binarized at a lower threshold of 0.1 and used as masks to isolate each tissue from the source image. The target TPMs are binarized at 95% probability and used to calculate target tissue specific mean intensity with mrstats (Tournier et al., 2019). Each source tissue map is multiplied by the corresponding target tissue mean, and a forced correction of CSF signal intensity is used to scale its maximum to 0.2 of the gray matter mean signal intensity. Finally, all tissues are combined using scalar addition with ImageMath "addto-

If the lesion is unilateral, the lesioned hemisphere is replaced with a synthetic hemisphere, while the non-lesioned native hemisphere is retained. The resulting image is called a stitched brain. An initial filled brain image is generated by replacing the lesion with healthy tissue from this stitched brain. Next, the stitched brain is further deformed to match the initial filled one, then segmented into different tissues with ANTs Atropos (Avants et al., 2011). If the lesion is bilateral the synthetic brain is directly used to derive the initial filled brain, which is then segmented with Atropos (Avants et al., 2011).
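The tissue-wise intensity matching described above (binarize the source TPMs at 0.1, take target tissue means from voxels with at least 95% probability, rescale each tissue, cap CSF at 0.2 of the gray matter mean, then sum) can be illustrated with a minimal NumPy sketch. This is not the VBG implementation, which uses mrstats and ImageMath. The function name, the dictionary keys `gm`, `wm` and `csf`, the use of `>=` for both thresholds, and the assumption that the source image has already been mean-normalized are all illustrative choices that do not come from the source.

```python
import numpy as np

def match_template_intensities(source_img, source_tpms, target_img, target_tpms,
                               source_thr=0.1, target_thr=0.95, csf_fraction=0.2):
    """Rescale template tissues to the target's tissue means and sum them.

    Hypothetical helper: source_img is assumed to be mean-normalized already;
    source_tpms and target_tpms are dicts of arrays keyed by "gm", "wm", "csf".
    """
    scaled = {}
    target_means = {}
    for tissue in ("gm", "wm", "csf"):
        # Target tissue mean, taken from high-confidence voxels only.
        confident = target_tpms[tissue] >= target_thr
        target_means[tissue] = float(target_img[confident].mean())
        # Isolate the tissue in the source image with the binarized source TPM.
        source_mask = source_tpms[tissue] >= source_thr
        isolated = np.where(source_mask, source_img, 0.0)
        # Multiply the source tissue map by the target tissue mean.
        scaled[tissue] = isolated * target_means[tissue]
    # Forced CSF correction: scale so its maximum is 0.2 x the gray matter mean.
    csf_max = scaled["csf"].max()
    if csf_max > 0:
        scaled["csf"] *= csf_fraction * target_means["gm"] / csf_max
    # Combine all tissues by voxel-wise addition.
    return scaled["gm"] + scaled["wm"] + scaled["csf"]
```

The sketch assumes every target tissue has at least one voxel above the 95% threshold; an empty mask would need separate handling.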

C. Final donor image generation and fill
This constitutes the last stage in lesion filling. Here the initial filled brain is warped back to native space and sharpened with ANTs (Avants et al., 2011) to create the final donor brain image. The lesion replacement graft is harvested using a 2 mm FWHM smoothed mask and inserted into the recipient image (the reoriented input brain). The skull and noise are added back to generate a realistic lesion-free whole head T1 WI. Finally, the initial transformation is reversed to generate the image in native orientation. Whole head, brain extracted, and brain mask images are saved in original and standard orientations.

Part 2, brain parcellation: This involves running recon-all (FreeSurferWiki, 2020a) on the output lesion-free image in native orientation. First recon-all (FreeSurferWiki, 2020a) is run until the brain extraction stage, then the VBG generated brain mask is applied to the recon-all (FreeSurferWiki, 2020a) results, after which it is restarted and run to the end. Finally, the real T1 map and the lesion mask are converted to .mgz, the output aparc+aseg parcellations are converted to .nii, a version is created with a zero-filled lesion patch (aparc+aseg_minL.nii.gz), and another with the lesion mask given the value 99 (aparc+aseg+L.nii.gz), which remains unused in the Freesurfer (Fischl, 2012) v6.0
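The graft insertion in stage C relies on a 2 mm FWHM smoothed lesion mask so that the donor tissue does not meet the recipient image at a sharp edge. One common way to use such a mask is as a voxel-wise blending weight, sketched below with SciPy. This is an assumption about how a smoothed mask could be applied, not a description of the VBG code, and the function names and the isotropic `voxel_size_mm` parameter are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fwhm_to_sigma(fwhm_mm, voxel_size_mm):
    """Convert a Gaussian FWHM in mm to a standard deviation in voxels."""
    return fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0))) / voxel_size_mm

def insert_graft(recipient, donor, lesion_mask, fwhm_mm=2.0, voxel_size_mm=1.0):
    """Blend donor tissue into the recipient using a smoothed lesion mask."""
    sigma = fwhm_to_sigma(fwhm_mm, voxel_size_mm)
    # Smoothing the binary mask turns its hard edge into a soft 0..1 ramp.
    weight = gaussian_filter(lesion_mask.astype(float), sigma=sigma)
    weight = np.clip(weight, 0.0, 1.0)
    # Weighted average: donor inside the lesion, recipient elsewhere.
    return weight * donor + (1.0 - weight) * recipient
```

With this weighting, voxels far from the lesion keep the recipient intensity unchanged, while voxels near the mask boundary receive a mixture of donor and recipient values.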
analyses of segmentation accuracy in the presence of a ground truth. In the case of brain pathology, a manual delineation is typically used as the ground truth. However, for the whole brain this process would be highly time consuming and was not feasible for the current work. Thus, we resorted to two approaches: first, we processed and parcellated T1 WIs from 10 clinical patient participants (real-patients) with gliomas, with and without the proposed method. We included patients with gliomatous lesions of different sizes and locations. A group of healthy control (HC) volunteers (N=10) was also included. We generated a synthetic cohort (N=200), consisting of two groups. First, a lesion-free group (N=100), which was created by non-linear deformation of the HC images to match the mass effect of the patients, referred to henceforth as the synthetic-mass-effect group. Second, a synthetic-patients group (N=100) was generated containing both the mass effect and the lesions. All synthetic-mass-effect images were parcellated with recon-all (FreeSurferWiki, 2020a) and the synthetic-patients images were parcellated after VBG filling. We also attempted to parcellate the real-patients' and synthetic-patients' images without VBG.
medRxiv preprint doi: https://doi.org/10.1101/2020.09.30.20204701

All completed non-VBG parcellations (HC, real-patients, and synthetic-mass-effect), real-patients' non-VBG uVBG, synthetic-patients non-VBG and corresponding synthetic patients' uVBG parcellations were qualitatively evaluated (section 3.4.1.). For the synthetic-mass-effect parcellations this

components into a single binary mask, which is required for VBG. The remainder of the work used the original unprocessed pre-contrast T1-weighted images. The pathological mass effect of the real patients was subjectively rated (by AR) into none, mild, moderate, and severe for descriptive purposes.

Following lesion segmentation, the uVBG approach was applied to generate lesion-free T1 brain images from the real-patients' data. None were excluded upon visual inspection. All images were included with their original resolution since VBG is designed to accommodate varying spatial resolutions. The lesion-free images generated here were used to create the synthetic cohort.

Synthetic cohort creation
brain mask inserted, as done in VBG. The real-patients (N=10) and synthetic-patients (N=100) data were parcellated with uVBG. Parcellation was also attempted after only zero-filling the lesion patch, similar to CFM, and without VBG filling (non-VBG). bVBG was applied to a subset of synthetic-patients (N=25), since there were no true bilateral lesions in our sample. Non-VBG real-patients and synthetic-patients parcellations were used as control parcellations to compare recon-all (FreeSurferWiki, 2020a) success/failure rates and parcellation quality with and without VBG. A total of 455 recon-all (FreeSurferWiki, 2020a) analyses were attempted, each with a runtime cap of 8 hours using GNU timeout. Those that quit with an error or exceeded the timeout duration were considered to have failed, and all parcellations were allowed two attempts in case of failure. We imposed the 8-hour time limit after an initial test of non-VBG recon-all (FreeSurferWiki, 2020a) on the first 18 synthetic-patients (SP) images, using 2 cores per subject, ran for 24 hours with a completion rate below 50%, while VBG driven recon-all tended to finish in under 7 hours.

tions and assigned a quality score. We attempted to minimize rater bias, particularly for the evaluation of lesioned brain parcellations. More specifically, our main concern was blinding the raters to whether the parcellation came from VBG or not, and whether the source was real or synthetic data. To this end we resorted to (a) coding and mixing datasets together, and (b) standardizing the parcellations by using the lesion-free T1-weighted images as underlays and subtracting the lesion patch from all parcellations except the HC group, which we used to imply a gold standard.
This allowed us to blind our expert raters to the source image of all parcellations of interest. Images were given to the raters as high-resolution multi-frame panels in axial, coronal and sagittal planes with a 10 mm interslice gap, generated using fsleyes render (McCarthy, 2020). We used Gwet's AC2 (gamma) (Gwet, 2016) as a measure of inter-observer agreement for the parcellations evaluated by both experts using Matlab r2018a and

Evaluation
The qualitative evaluation protocol we used can be described as follows:

A defect (error) may be minor, intermediate or major. A minor defect was defined as an unlabeled cluster of voxels (e.g. 10) belonging to gray matter, or non-brain tissue e.g. dura labelled as gray matter, or a more subtle focal underestimation of the cortical ribbon thickness. An intermediate defect/error was a larger falsely labelled or unlabeled area on the scale of the inferior frontal gyrus pars orbitalis, or anterior temporal pole. Finally, a major defect was any defect on the level of a whole gyrus or larger. We defined four categories for the quality of a parcellation: 3 - Good, up to 3 minor unconnected defects, which can happen with structurally normal data (FreeSurferWiki, 2020b; Guenette et

This is referred to simply as "distance". Lastly, we calculated the percent of volume by which each VOI overlaps with the lesion mask, referred to for simplicity as "percent overlap", another factor we hypothesized would influence accuracy.
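As an illustration of the "percent overlap" measure, the following sketch computes, for each labelled VOI, the percentage of its voxels that fall inside the lesion mask. It assumes the parcellation and the lesion mask are NumPy arrays on the same voxel grid and that label 0 is background; the function name is hypothetical and this is not taken from the VBG code.

```python
import numpy as np

def percent_overlap(parcellation, lesion_mask):
    """Map each nonzero label to the % of its voxels inside the lesion mask."""
    lesion = lesion_mask.astype(bool)
    overlap = {}
    for label in np.unique(parcellation):
        if label == 0:
            continue  # skip background
        voi = parcellation == label
        inside = np.count_nonzero(voi & lesion)
        overlap[int(label)] = 100.0 * inside / np.count_nonzero(voi)
    return overlap
```

Because the voxel grid is shared, voxel counts are proportional to volume, so no voxel size is needed for a percentage.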

Statistical analyses
To ensure the validity of the synthetic-mass-effect group parcellations we first compared their visual scores to those of the HCs plus the a-posteriori warped HC parcellations using an unpaired Mann-Whitney U test. Secondly, we asked "Are VBG driven parcellations qualitatively rated higher
tions using a paired Mann-Whitney U test.

The remaining analyses focused on the synthetic-patients quantitative evaluation results. Here, we asked "Do VBG driven parcellations have higher dice scores than non-VBG driven parcella-

First, the parcellations derived from the synthetic-mass-effect group were comparable in visual quality scores (88 scored 3, 11 scored 2, and 1 scored 1) to the HC group and a-posteriori warped HC parcellations (15 scored 3, and 5 scored 2). A two-tailed Mann-Whitney U test comparing both
overall agreement between both experts using all common parcellations (gamma = 0.943, observed agreement = 0.960, chance agreement = 0.318). Thirdly, we report on the results of parcellation, qualitative and quantitative evaluations, as well as the paired statistical comparison of parcellation methods and exploratory analyses (see supplementary information for results of the unpaired comparison).

non-VBG patients parcellations. We used the minimum of the visual scores assigned for the common parcellations by both experts, and the remaining unique scores from each expert.

Qualitative evaluation showed real-patients uVBG parcellations scored significantly higher than corresponding non-VBG ones, U(6,6) = 33, z = 2.738, P < .01. uVBG also outperformed non-VBG parcellations in the synthetic-patients group. This was confirmed on a paired U-test comparing corresponding parcellations from both methods U(48,48) = 2076, z = 7.336, P < .001. Figure 5 shows plotted visual scores from all synthetic-patients non-VBG and uVBG parcellations plus the average DSC scores from the synthetic-patients common to the three parcellation approaches. Unpaired analysis results are detailed in supplementary information.
t-critical two-tailed = 2.684). One-way ANOVA and post hoc paired t-tests comparing the common parcellations from the three approaches (uVBG, bVBG and non-VBG, N = 20 each) showed significant differences between the three groups (P < .001), with both VBG approaches scoring significantly higher than non-VBG.
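The DSC values compared above follow the standard Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), computed per label between a test parcellation and a reference. A minimal sketch is given below, assuming both parcellations are integer label arrays on the same grid with 0 as background; the function name is hypothetical, and which parcellation served as the reference in the study is not restated here.

```python
import numpy as np

def dice_per_label(test_parc, reference_parc):
    """Return {label: Dice coefficient} for every nonzero label in either input."""
    labels = np.union1d(np.unique(test_parc), np.unique(reference_parc))
    scores = {}
    for label in labels:
        if label == 0:
            continue  # skip background
        a = test_parc == label
        b = reference_parc == label
        intersection = np.count_nonzero(a & b)
        # The denominator is nonzero because the label occurs in at least one input.
        scores[int(label)] = 2.0 * intersection / (np.count_nonzero(a) + np.count_nonzero(b))
    return scores
```

A label present in only one of the two parcellations receives a score of 0, which is how a completely missed structure would be penalized under this definition.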

The first aim of this work was to propose and explain VBG as a workflow for heterogeneous brain lesion filling and optional subsequent structural mapping using Freesurfer (Fischl, 2012) recon-all (FreeSurferWiki, 2020a). Our second aim for this study was to test and evaluate the quality and accuracy of the VBG driven whole brain parcellations. We chose a test sample of preoperative patients with gliomas for this work as they provide a variety of lesion sizes, locations, and mass effect.

Our evaluation shows a significant benefit from using VBG in this group, both qualitatively and quantitatively in the real and synthetic cohorts. Put simply, VBG allows an accurate parcellation for patients where recon-all (FreeSurferWiki, 2020a) would otherwise fail to complete. Our exploratory analysis partially explained the variation in VOI dice values in the VBG driven parcellations of the synthetic cohort. In intuitive terms, our analysis showed that VOIs lost 0.004 DSC for every 100 mL increase in lesion volume, 0.05 DSC was gained for a 10-fold increase in distance to the lesion, and 0.003 DSC was lost for every 1% of label volume lost to overlap with the lesion. However, there is an inherent multicollinearity between the three parameters, as in the case of larger lesions there is less probability for VOIs to be more distant, and more chance for overlap with the lesion mask.

Only 1 of 135 VBG recon-all (FreeSurferWiki, 2020a) runs failed, a real patient's postcontrast T1 weighted image that was appropriately filled by VBG but lagged in the automated topological error correction (FreeSurferWiki, 2020c) of recon-all (FreeSurferWiki, 2020a). We hypothesized that the cause was the gadolinium signal confounding cortical surface morphology.
This was confirmed upon inspecting recon-all (FreeSurferWiki, 2020a) inflated surfaces, which showed major defects in the vicinity of the bright blood vessels (supplementary figure 5). Thus, we repeated recon-all (FreeSurferWiki, 2020a) without a time-limit. This finished without error after 18 hours and the dataset was included. Such cases would be better addressed using another modality to assist with the
cortical surface reconstruction, e.g. T2 or FLAIR. Alternatively, a different surface reconstruction approach, e.g. FastSurfer (Henschel et al., 2020) could be used, which may be adopted in the next version of VBG as this also promises to significantly shorten runtime.

Adequate lesion filling must conform to the healthy brain tissue morphology and image intensity pattern. Furthermore, sharp interfaces between the lesion fill graft and surrounding recipient brain image must be avoided. VBG achieves this by generating a synthetic donor image that matches the input brain image to ensure the filling tissue does not carry concomitant pathologies or parts of the main lesion from the relatively non-lesioned side. In bilateral lesions, there is no healthy hemisphere to rely on, and the unilateral approach would simply duplicate the whole lesion into the filling patch (supplementary figure 6). Thus, the synthetic images are used directly for the initial and final lesion filling. Here, we did not observe a significant difference in performance between unilateral and bilateral approaches, as quantified with mean DSC.
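The regression effect sizes quoted earlier in this discussion can be read as a rough rule of thumb. The sketch below simply adds the three reported effects, treating the distance effect as linear in log10 of the distance ratio. Combining them additively and independently is an assumption made here for illustration, and the function name is hypothetical; the multicollinearity noted above means the three terms should not be interpreted as separable in practice.

```python
import math

def approx_dsc_change(delta_lesion_ml=0.0, distance_ratio=1.0, delta_overlap_pct=0.0):
    """Approximate change in VOI DSC from the three reported effect sizes."""
    return (
        -0.004 * delta_lesion_ml / 100.0      # 0.004 DSC lost per 100 mL of lesion
        + 0.05 * math.log10(distance_ratio)   # 0.05 DSC gained per 10-fold distance
        - 0.003 * delta_overlap_pct           # 0.003 DSC lost per 1% overlap
    )
```

For example, a lesion 100 mL larger, at ten times the distance, with 1% more overlap would give a net change of roughly -0.004 + 0.05 - 0.003 = 0.043 DSC under these assumptions.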