Hypothesis 1.2 - Early detection of Mild Cognitive Impairment from structural MRI scans

Organization

Alzheimer’s disease (AD) is a type of dementia that affects memory, thinking and behavior. It is the most common cause of dementia [1], accounting for 60-80% of dementia cases. AD is also a progressive disease: in the early stages the memory loss is mild, but in late-stage AD individuals lose the ability to carry on a conversation and respond to their environment.

Alzheimer's has no cure. However, some treatments demonstrate that removing beta-amyloid plaques from the brain reduces cognitive and functional decline in people living with early Alzheimer’s; the accumulation of these plaques between neuron connections is one of the important causes of the disease. Other treatments can temporarily slow the worsening of dementia symptoms and improve quality of life for those with Alzheimer's and their caregivers.

Based on the reference paper [2], the progression of Alzheimer’s disease can be broken down into 3 general stages - Cognitively Normal, Mild Cognitive Impairment (MCI) and dementia. It is described more precisely by the 7-stage model mentioned in [2]:

No impairment
Very mild cognitive impairment
Mild cognitive impairment (MCI)
Moderate cognitive decline (early stage dementia)
Moderately severe cognitive decline (early mid-stage dementia)
Severe cognitive decline (late mid-stage dementia)
Very severe cognitive decline (late-stage dementia)

It is also mentioned in [2] that Alzheimer’s disease diagnosis is made at stage 2 or 3 of the above model. At this stage the individual can still function independently and is free of dementia. This phase of Alzheimer’s disease can be detected well before the onset of dementia symptoms, up to 8 years earlier in some cases. Thus we need an approach for early detection of the Mild Cognitive Impairment (MCI) stage, so that treatments can be adopted to slow down the progression of the disease.

The rest of the page is organized as below

Description
Modeling
Results

Note: The background information on deep learning techniques and quantum computing principles is added HERE!

Description

Dataset availability

The dataset used to train our model is taken from ADNI [3] (Alzheimer’s Disease Neuroimaging Initiative). The ADNI dataset is an open-sourced dataset containing various reports, scan images and other features for many cognition-related diseases. From it we gather the structural MRI scans of patients, grouped into 3 classes - Cognitively Normal, Mild Cognitive Impairment and Alzheimer’s disease. Around 3370 raw structural MRI scans have been gathered from ADNI for training the model.

Dataset Pre-Processing

The data pre-processing of raw 3D structural MRI scans (done based on reference [4] and [5]) includes the following

Spatial Normalization - This ensures that the spatial structure of all images in the dataset is as similar as possible. As a first step, the images are resampled to an isotropic resolution of 2 mm. This means that every voxel (3D pixel) represents a 2 mm x 2 mm x 2 mm volume (8 mm^3) of space in the real world.

Registration - This step transforms the images into the coordinate system of a template using FSL FLIRT. We use MNI152_T1_1mm.nii as the reference template for registration. A detailed explanation of image registration is provided in reference [6].

Skull Stripping - This step removes irrelevant information from the image and leaves only the brain tissue. It is controlled by a hyperparameter called the fractional intensity threshold, which is set to 0.4 in our case.

Bias Field Correction - At higher field strengths [7], structural images sometimes acquire an intensity gradient across the image, making some parts of the image brighter than others. This intensity gradient can mislead segmentation algorithms, so it is removed from the image by a method known as bias field correction.

Enhancement - At this stage, we normalize the histogram of the image to provide enhancement. This step adjusts the contrast of an image by using its histogram.

Segmentation - In this stage, we segment out the tissue portion of the brain exactly and remove any of the background information surrounding it.

After this preprocessing step, the output will be a segmented 3D MRI image with shape 182x218x182.
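As an illustration of the Enhancement step, here is a minimal NumPy sketch of histogram equalization on a 3D volume. It is only a sketch under assumptions: the function name, the 256-bin choice and the rescaling of the output to [0, 1] are illustrative and are not taken from the pipeline in [4] or [5], which may implement the step differently.

```python
import numpy as np

def equalize_histogram(volume, n_bins=256):
    """Map voxel intensities through their empirical CDF (output in [0, 1])."""
    flat = volume.ravel().astype(np.float64)
    hist, bin_edges = np.histogram(flat, bins=n_bins)
    # Cumulative distribution of intensities, normalized so the last value is 1.
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]
    # Evaluate the CDF at every voxel by interpolating between bin centers.
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2.0
    equalized = np.interp(flat, centers, cdf)
    return equalized.reshape(volume.shape)

# Toy volume standing in for a preprocessed scan (random data, not real MRI).
toy = np.random.default_rng(0).normal(size=(8, 8, 8))
enhanced = equalize_histogram(toy)
print(enhanced.shape, enhanced.min(), enhanced.max())
```

Because the mapping is monotone, the ordering of voxel intensities is preserved while the contrast is spread more evenly over the available range.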

Approaches for final dataset generation

We will be using 4 different approaches of data generation to train the model. They are described below.

16 images in a 2D plane - Based on the evidence provided in reference [8], the middle slices convey more information about the disease than slices far away at the top and bottom of the structural MRI. Reference [8] also provides a method to extract the middle 16 slices and arrange them in a single 2D image as a 4 x 4 grid.

Middle Image - Based on the evidence in reference [8], we will use only one of the middle 16 slices to analyze the performance. We will be using only the middle image from the 3D structural image in this case.

Middle 16 images trained individually - We will also explore training on the middle 16 images (which carry the most information, as mentioned in [8]) individually, so as to reduce the input size compared with the 2D plane image of the first approach.

3D Image - We will also explore using the 3D representation directly in the model and analyze the performance.
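The first two approaches can be sketched in NumPy as follows. This is a sketch under assumptions: the volume is taken to be an array of shape 182x218x182, slices are taken along the last axis (the axis is an assumption), and the default slice range 30 to 60 with step 2 is the one reported for Approach 1 in the Results. The function names are illustrative.

```python
import numpy as np

def select_slices(volume, start=30, stop=60, step=2, axis=2):
    """Pick every `step`-th slice from `start` to `stop` inclusive."""
    idx = np.arange(start, stop + 1, step)
    return np.moveaxis(np.take(volume, idx, axis=axis), axis, -1)  # (H, W, N)

def tile_4x4(slices):
    """Arrange 16 slices of shape (H, W) into one (4H, 4W) image, row by row."""
    h, w, n = slices.shape
    if n != 16:
        raise ValueError("expected exactly 16 slices")
    grid = slices.transpose(2, 0, 1).reshape(4, 4, h, w)
    return grid.transpose(0, 2, 1, 3).reshape(4 * h, 4 * w)

def middle_slice(volume, axis=2):
    """Single middle slice, as used by the Middle Image approach."""
    return np.take(volume, volume.shape[axis] // 2, axis=axis)

volume = np.zeros((182, 218, 182), dtype=np.float32)  # placeholder volume
plane = tile_4x4(select_slices(volume))
print(plane.shape, middle_slice(volume).shape)
```

With these assumptions the tiled image is four slices high and four slices wide, which is why it is considerably larger than a single middle slice.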

Data Augmentation

In one of the approaches, we will use data augmentation techniques to enhance the performance of the model by adding some newer images which are augmented from the train set. We will be using the middle images dataset for augmentation purposes. We will be using the Roboflow [9] framework for this purpose.

The Roboflow framework uses 2465 images from the Middle image dataset to perform the augmentation. The images are resized to 64x64 and adaptive equalization is applied to auto-adjust the contrast. The augmentation techniques are listed below.

Flipping - horizontal and vertical
Rotation - 90 degrees clockwise, counterclockwise and upside down
Brightness variation - -15% to 15%
Exposure variation - -20% to 20%
Blur - slight blur

The new dataset size (Train - Val - Test) is 5.2K - 490 - 248. Augmentation is not applied to the val and test sets.
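To make the geometric and brightness augmentations concrete, here is a small NumPy sketch. It does not reproduce Roboflow's implementation: the 50% flip probability, the multiplicative brightness model and the assumption that images are 64x64 arrays scaled to [0, 1] are all illustrative choices, and the exposure and blur steps are left out.

```python
import numpy as np

def augment(image, rng):
    """Return one randomly augmented copy of a 2D image with values in [0, 1]."""
    out = image
    if rng.random() < 0.5:
        out = np.fliplr(out)              # horizontal flip
    if rng.random() < 0.5:
        out = np.flipud(out)              # vertical flip
    # 0, 90, 180 or 270 degrees (covers clockwise, counterclockwise, upside down)
    out = np.rot90(out, k=int(rng.integers(0, 4)))
    factor = 1.0 + rng.uniform(-0.15, 0.15)   # brightness variation of +/-15%
    return np.clip(out * factor, 0.0, 1.0)

rng = np.random.default_rng(42)
image = rng.random((64, 64))              # placeholder for a resized middle image
augmented = [augment(image, rng) for _ in range(3)]
print([a.shape for a in augmented])
```

Only training images would be passed through such a function; the validation and test images stay unchanged.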

Modeling

We will be using various classical deep learning techniques along with quantum computation to build the model that produces the desired prediction. The models used are described in the sections below. An introduction to the deep learning techniques and quantum computing principles is added in the models page HERE!

Deep Residual learning - Resnet18

These classical models use deep learning techniques for training and inference on the dataset. The main advantage of Deep Residual learning [10] comes from its architecture. As the depth of a plain neural network increases to obtain more trainable parameters, the gradient propagated from the final layer back to the initial layers either vanishes (the vanishing gradient problem) or explodes (the exploding gradient problem). Deep Residual networks overcome this issue by adding skip connections between the neural network layers. The skip connections in the architecture are shown in the figure below.

During forward propagation, the input of each block is added to the block's output through the skip connection, and the result moves on through the remaining layers to produce the prediction. During backpropagation, the gradients can also flow through the skip connections, which mitigates the vanishing and exploding gradient problems while training.
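The effect of skip connections on the gradient can be illustrated with a toy calculation. The sketch below assumes a chain of scalar linear layers, which is a strong simplification of a real network: a plain layer computes y = w * x, a residual layer computes y = x + w * x, and the gradient reaching the first layer is the product of the local derivatives.

```python
def backprop_factor(weights, residual):
    """d(output)/d(input) for a chain of scalar layers.

    Plain layer:    y = w * x      -> local derivative w
    Residual layer: y = x + w * x  -> local derivative 1 + w
    """
    grad = 1.0
    for w in weights:
        grad *= (1.0 + w) if residual else w
    return grad

weights = [0.1] * 20  # 20 layers with small weights (illustrative values)
plain = backprop_factor(weights, residual=False)
skip = backprop_factor(weights, residual=True)
print(f"plain chain: {plain:.3e}, residual chain: {skip:.3e}")
```

With small weights the plain chain's gradient shrinks towards zero as depth grows, while the identity term contributed by each skip connection keeps the residual chain's gradient away from zero.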

One example of a Deep Residual learning network is Resnet18. The architecture is shown in the figure below. Note the skip connections, which are present to address the vanishing and exploding gradient problems.

VGG - 11 - Batch Normalization

The VGG (Visual Geometry Group) architecture is a standard deep convolutional neural network architecture with multiple layers. Depending on the variant (VGG16 or VGG19) it has 16 or 19 weight layers; the VGG11 variant used here has 11. The batch normalization layers are inserted after the convolutional layers and before the nonlinear activation. Their main purpose is to provide a regularization effect while training with smaller batch sizes. The architecture of VGG11-bn is shown below. Note that there are no skip connections, as the model is not that deep.
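For reference, the training-time computation of a batch normalization layer can be sketched in NumPy as below. The (N, C, H, W) tensor layout and the epsilon value are assumptions made for illustration, and the running statistics used at inference are omitted.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each channel of x (N, C, H, W) over the batch and spatial axes."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Learnable per-channel scale (gamma) and shift (beta).
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
features = rng.normal(loc=2.0, scale=5.0, size=(8, 3, 4, 4))  # toy activations
normalized = batch_norm(features, gamma=np.ones(3), beta=np.zeros(3))
print(normalized.mean(axis=(0, 2, 3)), normalized.std(axis=(0, 2, 3)))
```

The nonlinear activation would then be applied to the normalized output.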

Densenet121

Densenet is a convolutional neural network architecture in which each layer is connected to all the layers that are deeper in the network. The architectural diagram is shown below. We can think of the features as a global state of the network, with each layer adding K features on top of that global state.
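The growth of this global state is simple arithmetic, sketched below. The example values (64 input channels, K = 32, 6 layers) correspond to the first dense block of the standard Densenet121 configuration; the function name is illustrative.

```python
def dense_block_channels(k0, growth_rate, num_layers):
    """Channel count seen by each layer of a dense block, plus the block output.

    Every layer receives all earlier feature maps and appends `growth_rate`
    (K) new ones, so the count grows linearly: k0, k0 + K, k0 + 2K, ...
    """
    return [k0 + growth_rate * i for i in range(num_layers + 1)]

channels = dense_block_channels(k0=64, growth_rate=32, num_layers=6)
print(channels)
```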

Hybrid Quantum Classical Machine learning model

The figure below shows the architecture of the hybrid quantum classical model. The classical model acts as a feature extractor for the dataset. The final set of neurons of this classical model is downsized to the number of qubits used in the quantum model. The measurement output of the quantum model is then passed to a linear layer whose size is based on the number of output classes.

The figure below shows the architecture of the quantum model [11] being used. The quantum model architecture comprises 4 sections - initial state preparation, transformation, trainable ansatz and measurement.

▪ Initial State preparation - In this stage, the initial qubit state is prepared. It can be the all-zeros state or a superposition of all 2^n basis states of an n-qubit circuit.

▪ Transformation - In this stage, the classical data is transformed into the Hilbert space. In this hybrid architecture, we use angle embedding to achieve this: the outputs of the classical model are taken as angles of rotation within the Hilbert space. The embedding can be done by an X, Y or Z rotation.

▪ Trainable Ansatz - Here, we perform the training operation. The ansatz contains a set of rotation gates along with linear entanglement. The angles of these gates are optimized to converge to the final solution.

▪ Measurement - This is the final stage of the model, where we convert the data from the Hilbert space back to the classical world.

We will be using the Angle Embedding and Amplitude embedding techniques ([11], [12], [13] and [14]) as the transformation circuit followed by a basic entangler trainable ansatz. The depth and number of qubits are being varied and performance is analyzed.
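The four sections can be illustrated with a small statevector simulation in plain NumPy. This is not the PennyLane circuit used in this work; it is a sketch that assumes RY angle embedding, RY rotations in the ansatz, a linear chain of CNOT gates as the entanglement, and a Pauli-Z expectation value measured on every qubit. All function names are illustrative.

```python
import numpy as np

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, wire, n):
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [wire]))
    return np.moveaxis(psi, 0, wire).reshape(-1)

def apply_cnot(state, control, target, n):
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1                      # subspace where the control qubit is 1
    t_axis = target - 1 if target > control else target
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=t_axis).copy()
    return psi.reshape(-1)

def expval_z(state, wire, n):
    probs = np.abs(state.reshape([2] * n)) ** 2
    p = probs.sum(axis=tuple(a for a in range(n) if a != wire))
    return p[0] - p[1]

def circuit(features, weights):
    n = len(features)
    state = np.zeros(2 ** n)
    state[0] = 1.0                        # initial state |0...0>
    for w, x in enumerate(features):      # transformation: RY angle embedding
        state = apply_1q(state, ry(x), w, n)
    for layer in weights:                 # trainable ansatz, one row per layer
        for w, theta in enumerate(layer):
            state = apply_1q(state, ry(theta), w, n)
        for w in range(n - 1):            # linear entanglement
            state = apply_cnot(state, w, w + 1, n)
    return np.array([expval_z(state, w, n) for w in range(n)])  # measurement

features = np.array([0.1, 0.5, 0.9, 1.3])   # stand-in for classical model outputs
weights = np.zeros((2, 4))                  # depth 2, 4 qubits, untrained
print(circuit(features, weights))
```

In the hybrid model the `features` would come from the downsized classical layer, and the returned expectation values would feed the final linear layer.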

Hybrid Quanvolutional Neural Networks

The Quanvolutional [15] model works similarly to convolution in the classical system. A window of a specified size (say 2x2) with some weights is moved across the image to extract features from it. The key difference is that the window operation is performed by a quantum circuit.

For a 2x2 window, the 4 flattened pixels from the image are passed to a quantum circuit with standard or trainable weights. The pixels are encoded directly through angle embedding. A random trainable or standard ansatz is then applied and the results are measured. The 4 values from the qubit measurements are written to 4 feature maps, and the window keeps moving through the image with a stride of 1. The process is described below.
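The windowing logic can be sketched as follows. To stay self-contained the quantum circuit is replaced by its simplest analytic case: each pixel x is embedded as RY(pi * x) on |0> and measured in Pauli-Z, which gives cos(pi * x). The scaling by pi and the omission of the random or trainable ansatz are assumptions made for illustration, so this shows only how the 4 feature maps are filled, not the feature extraction a real ansatz would add.

```python
import numpy as np

def quanv_2x2(image):
    """Slide a 2x2 window with stride 1 and write 4 values per position."""
    h, w = image.shape
    out = np.zeros((h - 1, w - 1, 4))
    for i in range(h - 1):
        for j in range(w - 1):
            window = image[i:i + 2, j:j + 2].ravel()   # 4 flattened pixels
            # Stand-in for the quantum circuit: <Z> after RY(pi * x) on |0>.
            out[i, j] = np.cos(np.pi * window)
    return out

image = np.random.default_rng(0).random((8, 8))   # toy image with values in [0, 1]
feature_maps = quanv_2x2(image)
print(feature_maps.shape)
```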

A sample result for a single middle image after the Quanvolution operation using a 2x2 window with a random ansatz is shown below.

Quantum Image Similarity detection using FRQI (Flexible Representation of Quantum Images)

The FRQI representation [17] is an efficient way to transform an image into the Hilbert space. Previously, the amplitude embedding technique was used to transform a 64x64 image. This requires the image to be flattened into 4096 values, thereby removing any locality relationships. The amplitude embedding requires 12 qubits, and the depth of the circuit increases because of the multi-qubit rotation gates in the embedding structure.

Using FRQI, we can simply represent the position of the image pixels along with the pixel value. The case of a grayscale 2x2 image, taken from [14], is shown below.

An image of size 64x64 (2^n x 2^n) requires 2n + 1 qubits [8] (2^6 = 64, so 2(6) + 1 = 13 qubits), but the depth remains minimal compared to amplitude embedding.
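As a concrete illustration, the FRQI state vector of a small grayscale image can be written out directly in NumPy. The sketch assumes pixel intensities scaled to [0, 1] and mapped linearly to angles in [0, pi/2], and it places the color qubit as the most significant qubit; both are conventions chosen here for illustration.

```python
import numpy as np

def frqi_state(image):
    """State (1/sqrt(N)) * sum_i (cos t_i |0> + sin t_i |1>) (x) |i>, N pixels."""
    pixels = np.asarray(image, dtype=float).ravel()
    theta = pixels * np.pi / 2            # intensity -> rotation angle
    # Amplitude index = color * N + position (color qubit most significant).
    return np.concatenate([np.cos(theta), np.sin(theta)]) / np.sqrt(pixels.size)

def frqi_qubits(side):
    """Qubits needed for a (2^n x 2^n) image: 2n position qubits + 1 color qubit."""
    n = side.bit_length() - 1
    if 2 ** n != side:
        raise ValueError("side length must be a power of two")
    return 2 * n + 1

state = frqi_state([[0.0, 1.0], [0.5, 0.25]])   # a 2x2 grayscale example
print(len(state), np.linalg.norm(state), frqi_qubits(64))
```

The position of each pixel is kept in the basis-state index, so the pixel layout is encoded together with the intensities.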

Based on [18], the quantum kernel SVM works by creating a kernel that measures the distance between 2 points in the Hilbert space into which the classical data is transformed. This distance kernel is used to train an SVM. Based on that idea, these kernel methods can be used to find the similarity between two images embedded using FRQI. The sample circuit [11] for the approach is shown below.

If the output is zero then the images are similar; otherwise they are different. In our approach, the images of each class in the middle image dataset are clustered to find the distinct images. These are used as reference images and compared with the test set to determine the class.
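The comparison against reference images can be sketched classically by computing the squared overlap of two FRQI state vectors, which equals 1 for identical images. How the measured circuit output maps onto this overlap is an assumption here, and the helper names, the angle convention and the toy reference images are illustrative only.

```python
import numpy as np

def frqi_state(image):
    """FRQI amplitudes for intensities in [0, 1] (color qubit most significant)."""
    theta = np.asarray(image, dtype=float).ravel() * np.pi / 2
    return np.concatenate([np.cos(theta), np.sin(theta)]) / np.sqrt(theta.size)

def similarity(image_a, image_b):
    """Squared overlap |<a|b>|^2 of the two FRQI states (1.0 means identical)."""
    return float(np.abs(np.vdot(frqi_state(image_a), frqi_state(image_b))) ** 2)

def classify(test_image, references):
    """Return the label whose reference images contain the closest match."""
    return max(
        references,
        key=lambda label: max(similarity(test_image, r) for r in references[label]),
    )

# Toy 2x2 reference images per label (hypothetical data, not from the dataset).
references = {"bright": [np.ones((2, 2))], "dark": [np.zeros((2, 2))]}
print(classify(np.full((2, 2), 0.9), references))
```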

Results

The dataset contains the pre-processed structural MRI scans of various patients, labeled by class. The hypothesis was tested on the 4 approaches of data preprocessing listed below

16 images in a 2D plane
Middle images
Middle 16 images trained individually
3D images

We also test this hypothesis with data augmentation using the Roboflow framework. The model training and evaluation code is developed in Python with the PyTorch and PennyLane packages. It was developed in Google Colaboratory (based on Jupyter notebooks) and run on a Google Cloud instance with a CUDA-enabled T4 GPU. The hyperparameters used are listed below

Batch Size - 8 for 16 images in 2D slice and 32 for other approaches
Learning Rate - 1E-5
Epochs - maximum 30
Loss (Cost) Function - Cross Entropy or Weighted Cross Entropy
Optimizer - Adam
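The weighted cross entropy loss can be sketched in NumPy as below. The inverse-frequency weighting scheme is an assumption (the weights actually used are not stated here); the loss itself is normalized by the sum of the sample weights, which is how a weighted mean cross entropy is commonly defined.

```python
import numpy as np

def class_weights(labels, num_classes):
    """Inverse-frequency weights: rarer classes get larger weights (assumed scheme)."""
    counts = np.bincount(np.asarray(labels), minlength=num_classes)
    return len(labels) / (num_classes * counts)

def weighted_cross_entropy(logits, labels, weights):
    """Weighted mean of the per-sample negative log-likelihood."""
    labels = np.asarray(labels)
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]
    w = weights[labels]
    return float((w * nll).sum() / w.sum())

labels = np.array([0, 0, 1, 2])            # toy labels for the 3 classes
logits = np.zeros((4, 3))                  # an untrained model: uniform scores
weights = class_weights(labels, num_classes=3)
print(weights, weighted_cross_entropy(logits, labels, weights))
```

Setting all weights to 1 recovers the plain cross entropy.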

Packages Used

FLIRT - fsl
Nipype
ANTs
Medfilt
SimpleITK

Approach 1 - 16 images in a 2D plane

We have used the dataset with 3370 MRI scan images labeled under 3 classes. We take slices from the 30th to the 60th with a step size of 2, which gives 16 images in total. In this case, we use a section slightly lower than the middle section. The results are summarized in the table below.

Classical Models	16 images in 2d plane classification Accuracy (%)
VGG11_bn Epoch 17	93.24%
Resnet18 Epoch 11	91.7%
Densenet121 Epoch 20	92.29%

We can see that the classical VGG11_bn model has achieved the maximum accuracy compared to the other two models.

The results of using Hybrid QML are summarized in the table below.

Hybrid - based on Resnet18	16 images in 2d plane classification Accuracy (%)
5 qubits depth 1	64.92%
5 qubits depth 4	66.18%
5 qubits depth 1 Balanced dataset	62.5%
5 qubits depth 4 Balanced dataset	60.95%
10 qubits depth 1	62.23%
10 qubits depth 4	66%
15 qubits depth 1	67.62%
4 qubits depth 2 RY angle embed and basic RY ansatz	81.834%

From the above table we can see that all the accuracy values except the last one are very poor. The reason is that, in the initial design of the hybrid QML model, the angle embedding was performed about the Y axis and the trainable ansatz had rotation gates about the X axis. The assumption behind this design was that a trainable rotation in a perpendicular direction would enable classifying the points in the Hilbert space by inserting planes.

This assumption only holds if the points that are transformed about the Y axis by angle embedding lie on different sides of the Hilbert space. Unfortunately, that was not the case, and rotating the trainable ansatz about Y solved the problem. It can also be seen that the Hybrid QML model did not gain much advantage over the classical model.
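One way to see the difference between the two ansatz choices is a single-qubit calculation, assuming a Pauli-Z measurement (an assumption; the measured observable is not stated here). With an RY embedding of the feature x followed by a trainable RY(theta), the output is cos(x + theta), so training can shift where the output changes sign. With a trainable RX(theta) instead, the output is cos(x) * cos(theta), so training can only rescale cos(x).

```python
import numpy as np

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rx(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def expval_z(x, theta, ansatz):
    """<Z> after RY(x) embedding and one trainable rotation on |0>."""
    state = ansatz(theta) @ ry(x) @ np.array([1.0, 0.0], dtype=complex)
    return float(abs(state[0]) ** 2 - abs(state[1]) ** 2)

x, theta = 0.4, 1.1   # arbitrary illustrative feature and weight
print("RY ansatz:", expval_z(x, theta, ry), "vs cos(x + theta) =", np.cos(x + theta))
print("RX ansatz:", expval_z(x, theta, rx), "vs cos(x) * cos(theta) =", np.cos(x) * np.cos(theta))
```

This toy case is consistent with the observation above, although the full multi-qubit model with entanglement is more involved.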

The reason for this is the preprocessing strategy. The 2D plane image is large and contains a lot of important features. This complexity was not handled by our quantum model with depth 2. We may need more trainable parameters, but that comes at the cost of a bottleneck in the trainability of the quantum model because of the barren plateau problem.

Resnet18 based approaches	16 images in 2D plane classification Accuracy (%)
Downsized to 5 (no nonlinearity), then 5 to 3 classes	92.88%
Downsized to 5 (with ReLU nonlinearity), then 5 to 3 classes	92.29%
Classical Resnet18 model weights reused, training only the QML part, 5 qubits depth 8, Epoch 24	91.63%
Amplitude embedding, 5 qubits, 32 input features, 25 epochs	65.65%

Furthermore, the above table shows that the classical Resnet18 model with a downsizing layer, with or without nonlinearity, performs similarly to the non-downsized case. Thus it is clear that downsizing is not the problem. The problem lies in the complexity of the features inside the Hilbert space, which prevents the hybrid approach from gaining more accuracy.

Approach 2 - Middle Images

We have used the dataset with 3370 MRI scan images being labeled under 3 classes. The results are summarized in the table below.

Classical Models	Middle Image classification Accuracy (%)
VGG11_bn Epoch 19	89.16%
Resnet18 Epoch 21	86.85%
Densenet121 Epoch 21	85.51%

It can be seen that the VGG11-bn model again performs better compared to the rest of the models. The summarized results of using Hybrid QML models are shown below.

It can be seen that the Hybrid QML model performs similarly to the classical Resnet18 model in this case. The main thing to note is that the Angle Embedding was performed in Y axis and the trainable ansatz also rotates in Y axis.

We have also gathered the results with a dataset of 2294 images and summarized them in the table below.

It is also clear that the number of training samples in the dataset impacts the performance of the model at inference. With fewer training images, even the classical model achieves lower accuracy.

We also summarized the results of using Quanvolutional models, classical and hybrid models using Roboflow data augmented dataset, and FRQI based image similarity in the below table.

Final Model

16 images in a 2D plane preprocessed MRI
VGG11_bn
Accuracy - 93.24%

Middle Images
VGG11_bn
Accuracy - 89.16%

The heatmap after a successful prediction of MCI is shown below

References

[1] What is Alzheimer’s? Alzheimer’s Disease and Dementia. (n.d.). https://www.alz.org/alzheimers-dementia/what-is-alzheimers
[2] Rasmussen J, Langerman H. Alzheimer's Disease - Why We Need Early Diagnosis. Degener Neurol Neuromuscul Dis. 2019 Dec 24;9:123-130. doi: 10.2147/DNND.S228939. PMID: 31920420; PMCID: PMC6935598.
[3] Alzheimer’s Disease Neuroimaging Initiative. ADNI. (n.d.). https://adni.loni.usc.edu/
[4] Plasencia, O. D. (2021, June 4). Alzheimer diagnosis with Deep Learning: Data preprocessing. Medium. https://towardsdatascience.com/alzheimer-diagnosis-with-deep-learning-data-preprocessing-4521d6e6ebeb
[5] Quqixun. (n.d.). Quqixun/BrainPrep: Preprocessing pipeline on Brain MR Images through FSL and ANTs, including registration, skull-stripping, bias field correction, enhancement and segmentation. GitHub. https://github.com/quqixun/BrainPrep
[6] Mehta, S. (2022, May 6). What is image registration and how does it work? Analytics India Magazine. https://analyticsindiamag.com/what-is-image-registration-and-how-does-it-work/
[7] 10. Structural image bias field correction. Legacy Bioimage Suite. (n.d.). https://medicine.yale.edu/bioimaging/suite/manual/guide/correction/
[8] A deep learning model to predict a diagnosis of Alzheimer disease by ... (n.d.). https://pubs.rsna.org/doi/pdf/10.1148/radiol.2018180958
[9] Give your software the power to see objects in images and video. Roboflow. (n.d.). https://roboflow.com/
[10] He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. arXiv. https://doi.org/10.48550/arXiv.1512.03385
[11] PennyLane. (n.d.). https://pennylane.ai/
[12] Mottonen, M., Vartiainen, J. J., Bergholm, V., & Salomaa, M. M. (2005). Transformation of quantum states using uniformly controlled rotations. Quantum Information and Computation, 5(6), 467–473. https://doi.org/10.26421/qic5.6-5
[13] Schuld, M., & Petruccione, F. (2019). Supervised learning with Quantum Computers. Springer.
[14] Qiskit. (n.d.). https://qiskit.org
[15] Quanvolutional neural networks: Powering image recognition ... arXiv. (n.d.). https://arxiv.org/pdf/1904.04767.pdf
[16] Quantum Image Processing. arXiv. (n.d.). https://arxiv.org/pdf/2203.01831
[17] Improved FRQI on superconducting processors and its restrictions in the ... (n.d.). https://arxiv.org/pdf/2110.15672.pdf
[18] arXiv:1804.11326v2 [quant-ph], 5 Jun 2018. https://arxiv.org/pdf/1804.11326.pdf
[19] PrinceJavier. (n.d.). GitHub. https://github.com/PrinceJavier/qnt_alzmrs_pred
[20] Contents. FLIRT - FslWiki. (n.d.). https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT
[21] Neuroimaging in Python - pipelines and interfaces - nipy pipeline and interfaces package. (n.d.). https://nipype.readthedocs.io/en/latest/
[22] Advanced Normalization Tools (ANTs). Andy’s Brain Book 1.0 documentation. (n.d.). https://andysbrainbook.readthedocs.io/en/latest/ANTs/ANTs_Overview.html
[23] scipy.signal.medfilt. SciPy v1.11.3 Manual. (n.d.). https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.medfilt.html
[24] Home. SimpleITK. (n.d.). https://simpleitk.org/
[25] ImageJ. (n.d.). https://imagej.net/ij/