Thursday, September 23, 2010

Compression: The Smaller The Better

All the processes we have studied so far can be applied to millions of images, with endless possibilities. But then comes storage. Images can be a hassle to store, with even a single camera holding up to hundreds of them.

But how can we compress the images? Luckily, there is a method called Principal Component Analysis (PCA). In PCA, you transform the coordinates of an image to a coordinate system in which the image depends on only a few axes.
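To make the idea of "only a few axes" concrete, here is a minimal sketch in Python with NumPy (not Scilab, and not part of the original exercise). The point cloud is made up for illustration: 50 two-dimensional points lying almost on a line. After rotating to the principal axes, nearly all of the variance sits on the first axis, so the second one can be dropped with little loss.

```python
import numpy as np

# Hypothetical toy data: 2-D points lying almost on the line y = 2x.
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x + 0.05 * np.sin(40.0 * x)   # small deterministic wiggle
points = np.column_stack([x, y])

# Center the data and diagonalize its covariance matrix.
centered = points - points.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order

# Share of the total variance along each principal axis, largest first.
share = eigvals[::-1] / eigvals.sum()

# Coordinates of the points in the new (principal) coordinate system.
rotated = centered @ eigvecs[:, ::-1]

print("variance share per axis:", share)
```

The first entry of the printed share should be very close to 1, which is the situation PCA compression relies on.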

So how do we do it?

In Scilab, using Principal Component Analysis is as simple as typing pca(x). Granted, some preparation goes into it. First, we divide the image into blocks of 10x10 pixels. Next, we flatten each 10x10 block into a single row of 100 elements and stack these rows into one matrix, one row per block. This matrix is now ready to be passed to the pca function. The pca function returns three variables: the eigenvalues, the eigenvectors and the principal components. By looking at the eigenvalues in percent, we can determine how many eigenvectors we need to reconstruct the image without losing too much information. Next, we multiply the principal components by their corresponding eigenvectors. Each element of the principal components, multiplied by its eigenvector, produces a 10x10 matrix. If we choose to use more than one eigenvector, we just superimpose (add) the resulting 10x10 matrices, and so on.
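The steps above can be sketched in Python with NumPy. This is a stand-in for the Scilab workflow, not a copy of it: the eigen-decomposition of the covariance matrix replaces the call to pca(x), which may preprocess the data differently. The test image is synthetic, the function names are made up for illustration, and adding the mean block back during reconstruction is an assumption of this sketch.

```python
import numpy as np

def image_to_blocks(img, b=10):
    """Cut a 2-D grayscale image into b x b blocks, one flattened block per row."""
    h, w = img.shape
    assert h % b == 0 and w % b == 0, "sketch assumes dimensions divisible by b"
    return img.reshape(h // b, b, w // b, b).swapaxes(1, 2).reshape(-1, b * b)

def blocks_to_image(rows, shape, b=10):
    """Inverse of image_to_blocks: put the flattened blocks back in place."""
    h, w = shape
    return rows.reshape(h // b, w // b, b, b).swapaxes(1, 2).reshape(h, w)

def block_pca(rows):
    """Return mean block, eigenvalues, eigenvectors and principal components."""
    mean = rows.mean(axis=0)
    centered = rows - mean
    cov = centered.T @ centered / (len(rows) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending order
    order = np.argsort(eigvals)[::-1]             # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                   # principal components
    return mean, eigvals, eigvecs, scores

def reconstruct(mean, eigvecs, scores, k):
    """Superimpose the contributions of the first k eigenvectors."""
    return mean + scores[:, :k] @ eigvecs[:, :k].T

# Synthetic 250x250 "image" (an assumption; any grayscale array would do).
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:250, 0:250]
img = 128 + 60 * np.sin(xx / 20.0) + 40 * np.cos(yy / 35.0) + rng.normal(0, 2, (250, 250))

rows = image_to_blocks(img)                       # 625 blocks x 100 pixels
mean, eigvals, eigvecs, scores = block_pca(rows)
percent = 100 * eigvals / eigvals.sum()           # eigenvalues in percent

errors = {}
for k in (1, 5, 10, 100):
    approx = blocks_to_image(reconstruct(mean, eigvecs, scores, k), img.shape)
    errors[k] = float(np.sqrt(np.mean((img - approx) ** 2)))
    print(f"k={k:3d}  RMS error={errors[k]:.4f}")
```

The cumulative sum of percent is one way to decide how many eigenvectors to keep: stop once it is close enough to 100 for your purposes.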

Reconstruction from Eigenvectors

So let us explore what happens when we reduce the number of eigenvectors to be used.


Comparison of Reconstructions: (A) is the Original Image, (B) is the Grayscaled Image, (C) is using only 1 Eigenvector, (D) is using only 5 Eigenvectors, (E) is using only 10 Eigenvectors and (F) is using all 100 Eigenvectors

We can see that the resolution of the reconstruction drops as we reduce the number of eigenvectors, but also that beyond a certain number, additional eigenvectors no longer contribute visibly to the image. The reconstruction from only one eigenvector is obviously pixelated compared to the one from 5 eigenvectors. But there is no visible difference between using 10 eigenvectors and using 100 eigenvectors.

If we compute the number of bytes that the compression uses, we arrive at 725 bytes for 1 eigenvector, 3625 bytes for 5 eigenvectors, 7250 bytes for 10 eigenvectors and 72500 bytes for 100 eigenvectors. An uncompressed 250x250 pixel image takes up 62500 bytes. Notice how 100 eigenvectors do not compress the image but increase its size. But using 10 eigenvectors, which showed no visible difference from using 100 eigenvectors or from the grayscale image, reduces the file size by 88.4%. If that reduction is performed for thousands of images, the number of images that you will be able to save will grow more than eightfold.
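One way to reproduce these figures, sketched in Python: for each eigenvector kept, count its 100 elements plus one coefficient for each of the 625 blocks, at one byte per stored value. That accounting is an assumption that happens to match the numbers here (100 + 625 = 725 bytes per eigenvector); the function name is made up for illustration.

```python
def pca_storage_bytes(k, height=250, width=250, block=10):
    """Bytes needed to keep k eigenvectors, assuming one byte per stored value."""
    n_blocks = (height // block) * (width // block)   # 625 blocks for 250x250
    per_component = block * block + n_blocks          # eigenvector + coefficients
    return k * per_component

raw_bytes = 250 * 250   # uncompressed grayscale image, one byte per pixel

for k in (1, 5, 10, 100):
    size = pca_storage_bytes(k)
    saving = 100 * (1 - size / raw_bytes)
    print(f"{k:3d} eigenvectors: {size:6d} bytes ({saving:+.1f}% saving)")
```

A negative saving in the last row means the "compressed" version is larger than the original.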

Also, for a 250x250 image, using more than 86 eigenvectors will enlarge the image rather than compress it.
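The break-even point can be checked with a few lines of Python. As a hedged sketch, it assumes each eigenvector costs 725 bytes (its 100 elements plus 625 block coefficients, one byte each) and that the raw image takes one byte per pixel; the function name is hypothetical.

```python
def max_compressing_eigenvectors(height=250, width=250, block=10):
    """Largest number of eigenvectors whose storage is still below the raw size."""
    raw = height * width
    per_component = block * block + (height // block) * (width // block)
    return (raw - 1) // per_component   # largest k with k * per_component < raw

limit = max_compressing_eigenvectors()
print("largest number of eigenvectors that still compresses:", limit)
```

With these assumptions, 86 eigenvectors take 62350 bytes, just under the 62500 bytes of the raw image, while 87 take 63075 bytes.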

For this exercise I give myself a grade of 9 because of the lateness of my submission.
