Optimal mass transport (OMT) theory is used to preprocess brain tumor datasets for the first time. The goal is to transform an irregular 3D object (here, the brain) into a regular shape without causing significant distortion. The first stage of a two-stage OMT (TSOMT) procedure transforms the brain onto the unit solid ball. The spherical boundary constraint is necessary to ensure OMT convergence. The second stage transforms the unit ball into a unit cube, because a 3D convolutional neural network (CNN) is easier to apply on rectangular coordinates. Small variations in the local mass-measured stretch ratio across all the brain tumor datasets demonstrate the robustness of the transform. Additionally, the distortion is kept small (<0.1) at a reasonable transport cost. The original 240 × 240 × 155 × 4 dataset is thus reduced to a cube of 128 × 128 × 128 × 4, which is approximately a 76.5% reduction in the total number of voxels, without losing much detail. Two mass-preserving OMT functions (one based on fluid-attenuated inversion recovery (FLAIR) only and the other on the average of the four modalities) are used to double the amount of training data, which helps to reduce overfitting when training a large machine learning (ML) model. Three typical U-Nets are trained separately to predict the whole tumor (WT), tumor core (TC), and contrast-enhancing tumor (CET) from the cube. A training accuracy of 0.9901 on the WT cube is achieved at 1000 epochs. The inverse TSOMT is then applied to the predicted cube to obtain the results in the brain. The conversion loss from OMT to inverse OMT is found to be less than one percent. For training, Dice scores of 0.9852 for the WT, 0.9743 for the TC, and 0.9433 for the CET are obtained. Significant improvements in brain tumor detection and segmentation accuracy are achieved. For validation, rotation-based postprocessing is added to the pipeline of TSOMT, U-Net prediction, and inverse TSOMT, improving the accuracy by a few percent.
The whole segmentation process takes 200 seconds for each new brain tumor dataset. Finally, the false predictions in the worst WT, TC, and CET validation cases are discussed. Because the conversion loss is small, the TSOMT and inverse TSOMT do not introduce any new failure mode.
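
The two quantities quoted above, the voxel reduction and the Dice score, can be illustrated with a minimal Python sketch. This is not the evaluation code used in this work: the function names `voxel_reduction` and `dice_score` are hypothetical, the masks are assumed to be flattened binary sequences, and the convention that two empty masks score 1.0 is an assumption made only for this example.

```python
from typing import Sequence


def voxel_reduction(original_shape: Sequence[int], reduced_shape: Sequence[int]) -> float:
    """Fraction of voxels removed when resampling from original_shape to reduced_shape."""
    def count(shape: Sequence[int]) -> int:
        total = 1
        for n in shape:
            total *= n
        return total

    return 1.0 - count(reduced_shape) / count(original_shape)


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient 2|P ∩ T| / (|P| + |T|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        # Assumption for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / size_sum


if __name__ == "__main__":
    # Grid sizes taken from the text: brain volume versus the OMT cube.
    reduction = voxel_reduction((240, 240, 155, 4), (128, 128, 128, 4))
    print(f"voxel reduction: {reduction:.1%}")

    # Toy masks only; these are not data from the study.
    pred = [1, 1, 0, 0]
    truth = [1, 0, 0, 0]
    print(f"Dice: {dice_score(pred, truth):.4f}")
```

In practice the same Dice computation would be applied to each predicted label (WT, TC, or CET) after the inverse TSOMT has mapped the cube prediction back onto the brain grid.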