Thread Subject:
3D Image Registration to sub pixel accuracy

Subject: 3D Image Registration to sub pixel accuracy

From: Sean

Date: 7 Jan, 2010 21:39:04

Message: 1 of 11

Hi all,

I am working with 3D image registration and I do not completely understand how to get sub pixel accuracy.
I am trying to find the displacement vector of I1 (16x16x16 ) into I2 (32x32x32) block surrounding the same location as I1. All images are uint8 CT scans of concrete.

1. embed the 16x16x16 I1 in zeros(32,32,32)
2. cast both I1 and I2 to double
3. FTsmall = conj(fftn(I1));
4. FTbig = fftn(I2);
5. FTR = FTbig.*FTsmall;
6. FTRN = FTR/.abs(FTR);
7. peak_correlation = ifftn(FTRN);
This works to pixel accuracy
I understand that the traditional way to achieve sub pixel accuracy is to embed the FTRN in zeros of size(FTRN*1/(%accuracy)) before taking the IFT. However, I can’t get this to work and it would take forever with the net size of the entire image, approximately 900x900x900, with this calculation being done sparsely at every other voxel.

I also tried interpolating the values around the peak_correlation; but this didn’t yield accurate results.

The solution of Guizar et al. (dftregistration.m) on the file exchange seems to be exactly what I want; however, it only works in 2D. It also doesn’t seem to be extendable to 3D because they compute the DFT from kernels and matrix multiplication. The corresponding paper also does not mention 3D. Is this method of computing the DFT undefined in 3D? (Sorry I’m still inexperienced with Fourier transforms)
If not how would I go about selecting a few voxels to zero pad and IFT?

Any help would be much appreciated.
Thank you in advance,
-Sean

Subject: 3D Image Registration to sub pixel accuracy

From: Matt J

Date: 7 Jan, 2010 21:47:03

Message: 2 of 11

"Sean " <sean.dewolski@Idontwantspam.umit.maine.edu> wrote in message <hi5k9o$2hj$1@fred.mathworks.com>...
> Hi all,
>
> I am working with 3D image registration and I do not completely understand how to get sub pixel accuracy.
> I am trying to find the displacement vector of I1 (16x16x16 ) into I2 (32x32x32) block surrounding the same location as I1. All images are uint8 CT scans of concrete.
>
> 1. embed the 16x16x16 I1 in zeros(32,32,32)
> 2. cast both I1 and I2 to double
> 3. FTsmall = conj(fftn(I1));
> 4. FTbig = fftn(I2);
> 5. FTR = FTbig.*FTsmall;
> 6. FTRN = FTR/.abs(FTR);
> 7. peak_correlation = ifftn(FTRN);
> This works to pixel accuracy

If you do the ifft with zero padding on FTRN, your peak_correlation array will be upsampled to below pixel accuracy. Something like the following perhaps


peak_correlation = abs( ifftn(FTRN,[320 320 320]) );

Subject: 3D Image Registration to sub pixel accuracy

From: Sean

Date: 8 Jan, 2010 16:59:05

Message: 3 of 11

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <hi5kon$3fc$1@fred.mathworks.com>...
> "Sean " <sean.dewolski@Idontwantspam.umit.maine.edu> wrote in message <hi5k9o$2hj$1@fred.mathworks.com>...
> > Hi all,
> >
> > I am working with 3D image registration and I do not completely understand how to get sub pixel accuracy.
> > I am trying to find the displacement vector of I1 (16x16x16 ) into I2 (32x32x32) block surrounding the same location as I1. All images are uint8 CT scans of concrete.
> >
> > 1. embed the 16x16x16 I1 in zeros(32,32,32)
> > 2. cast both I1 and I2 to double
> > 3. FTsmall = conj(fftn(I1));
> > 4. FTbig = fftn(I2);
> > 5. FTR = FTbig.*FTsmall;
> > 6. FTRN = FTR/.abs(FTR);
> > 7. peak_correlation = ifftn(FTRN);
> > This works to pixel accuracy
>
> If you do the ifft with zero padding on FTRN, your peak_correlation array will be upsampled to below pixel accuracy. Something like the following perhaps
>
>
> peak_correlation = abs( ifftn(FTRN,[320 320 320]) );


Thank you Matt,
That works really accurately! (finally!)
However, it very time consuming to get the accuracy I want, probably 0.2pixel. What is the correlation between FTRN and the peak_correlation as far as location?
I.e. if I take the ifftn(FTRN) at pixel accuracy, can I then segment part of FTRN and only IFT a zero padded area around this point?
e.g:
peak_correlation = 32x32x32 with let's say a peak at (10,12,10). Is this point (10,12,10) and surrounding points the part of FTRN that I need?
i.e:
>>abs(ifftn(FTRN(10-x:10+x,12-x:12+x,10-x:12+x), [10*x 10*x 10*x]));

Thanks again!

 

Subject: 3D Image Registration to sub pixel accuracy

From: Matt J

Date: 8 Jan, 2010 19:03:04

Message: 4 of 11

"Sean " <sean.dewolski@Idontwantspam.umit.maine.edu> wrote in message <hi7o8p$8su$1@fred.mathworks.com>...

> However, it very time consuming to get the accuracy I want, probably 0.2pixel. What is the correlation between FTRN and the peak_correlation as far as location?
> I.e. if I take the ifftn(FTRN) at pixel accuracy, can I then segment part of FTRN and only IFT a zero padded area around this point?
> e.g:
> peak_correlation = 32x32x32 with let's say a peak at (10,12,10). Is this point (10,12,10) and surrounding points the part of FTRN that I need?

Hi Sean,

No, you can't obtain the same result by working with only a portion of FTRN, but what you can do is evaluate the IDFT of FTRN (zero-padded) at select locations, to cut down on the computation. To do this for 0.2 pixel accuracy, first zero-pad FTRN to size 160x160x160, which will give you a factor of 5 upsampling.

Staying with your example, this means that if the coarse peak is at (10,12,10) in peak_correlation, you might want to probe the 3x3x3 pixel box containing this point. That is, the box with corners (9, 11,9) and (11,13,11). After upsampling by a factor of 5, these corners will be at new indices (41,51,41) and (51,61,51).

If you were taking a 1D IDFT of a vector F of length 160 between indices n1 and n2, you would construct an IDFT matrix

Q=exp( -j* (2*pi/160) * (0:160).' * (n1:n2) );

Then Q*F would evaluate the 1D IDFT between n1 and n2.

Since this is in the 3D, however you need to construct a matrix like this for each of the different axes of FTRN

Qx=exp( -j* (2*pi/160) * (0:160).' * (41:51) );
Qz=Qx;
Qy=exp( -j* (2*pi/160) * (0:160).' * (51:61) );


So now you must multiply all columns of FTRN by Qx, all rows by Qy, and all slices by Qz.


To do this, you might find it helpful to use my KronProd class:

http://www.mathworks.com/matlabcentral/fileexchange/25969-efficient-object-oriented-kronecker-product-manipulation

In your particular case, this would look like


Q=KronProd( {Qx,Qy}, [1 2 1], [160 160 160]); %The tensor product of Qx, Qy,Qz

peak_correlation_zoomed=abs(Q*FTRN);


You can now probe the zoomed version of peak_correlation to get a finely sampled fix on the peak in that local 3x3x3 pixel box.

Subject: 3D Image Registration to sub pixel accuracy

From: Matt J

Date: 8 Jan, 2010 19:17:05

Message: 5 of 11

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <hi7vh8$t4h$1@fred.mathworks.com>...

> If you were taking a 1D IDFT of a vector F of length 160 between indices n1 and n2, you would construct an IDFT matrix
>
> Q=exp( -j* (2*pi/160) * (0:160).' * (n1:n2) );

A few small mistakes here. This should be

 Q=(1/160) * exp( -j* (2*pi/160) * (n1-1:n2-1).' * (0:159) );

and similarly for Qx, Qy, and Qz in the rest of the post.

Subject: 3D Image Registration to sub pixel accuracy

From: Matt J

Date: 8 Jan, 2010 20:35:04

Message: 6 of 11

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <hi80bh$m28$1@fred.mathworks.com>...
> "Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <hi7vh8$t4h$1@fred.mathworks.com>...
>
> > If you were taking a 1D IDFT of a vector F of length 160 between indices n1 and n2, you would construct an IDFT matrix
> >
> > Q=exp( -j* (2*pi/160) * (0:160).' * (n1:n2) );
>
> A few small mistakes here. This should be
>
> Q=(1/160) * exp( -j* (2*pi/160) * (n1-1:n2-1).' * (0:159) );
>
> and similarly for Qx, Qy, and Qz in the rest of the post.

Sorry, I just realized one more thing. If you omit certain columns from the above Q matrix, you will not only reduce the size of Q and the associated computations, but you also can get the same effect as zero-padding FTRN without actually doing so.

So what I mean is you would actually compute Q as

Q=(1/160) * exp( -j* (2*pi/160) * (n1-1:n2-1).' * (k1-1:k2-1) );

where k1:k2 is the range of indices that the non-zero portion of FTRN would occupy if it were zero-padded.

If you construct Q as above, you would only need to do the following


Q=KronProd( {Qx,Qy}, [1 2 1], [32 32 32]); %The tensor product of Qx, Qy,Qz

peak_correlation_zoomed=abs(Q*FTRN); %FTRN here is NOT zero-padded here!!!!


and of course it will all be much faster...

Subject: 3D Image Registration to sub pixel accuracy

From: Sean

Date: 8 Jan, 2010 21:47:03

Message: 7 of 11

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message
> So what I mean is you would actually compute Q as
>
> Q=(1/160) * exp( -j* (2*pi/160) * (n1-1:n2-1).' * (k1-1:k2-1) );
>
> where k1:k2 is the range of indices that the non-zero portion of FTRN would occupy if it were zero-padded.
>
> If you construct Q as above, you would only need to do the following
>
>
> Q=KronProd( {Qx,Qy}, [1 2 1], [32 32 32]); %The tensor product of Qx, Qy,Qz
>
> peak_correlation_zoomed=abs(Q*FTRN); %FTRN here is NOT zero-padded here!!!!
>
>
> and of course it will all be much faster...

Thank you for your continued assistance,

I've implemented what you have here and KronProd is returning an error. The error occurs on lines 299/300 and seems to be because the first statement:
>> isnum(M.opset{ii}) fails
M.opset{:} are 3x11 doubles
size(2) = 11
and M.domainsizes(cc) = 32
it was called with:
and M.domainsizes(cc) = 32
>> Q = KronProd({Qx,Qy,Qz},[1 2 3],[32 32 32]);
and Qx,Qy,Qz are all complex 3x11 doubles.

Thanks again for your assistance and have a great weekend!
-Sean

Subject: 3D Image Registration to sub pixel accuracy

From: Matt J

Date: 8 Jan, 2010 23:26:05

Message: 8 of 11

"Sean " <sean.dewolski@Idontwantspam.umit.maine.edu> wrote in message <hi894n$b3m$1@fred.mathworks.com>...

> I've implemented what you have here and KronProd is returning an error. The error occurs on lines 299/300 and seems to be because the first statement:
> >> isnum(M.opset{ii}) fails
> M.opset{:} are 3x11 doubles
> size(2) = 11
> and M.domainsizes(cc) = 32
> it was called with:
> and M.domainsizes(cc) = 32
> >> Q = KronProd({Qx,Qy,Qz},[1 2 3],[32 32 32]);
> and Qx,Qy,Qz are all complex 3x11 doubles.

Hi Sean,

When reporting errors in this NG, you should get into the habit of copy/pasting the error messages you see into your posts. This is to minimize the guesswork needed on our part and so that nothing gets lost in translation.

In the meantime, it is not clear to me how you derived Qx, Qy, Qz, etc... to be 3x11, so you should probably post your reasoning on that one, too.

If this is indeed their size, I do expect that you would get an error message, but I would expect it to look something like this:

??? Error using ==> KronProd.KronProd at 303
Dimension length #1 not consistent with operator.

This error is expected because your 3rd argument to KronProd() is
domainsize=[32 32 32] which tells KronProd() that it should expect size(Qx,2) to be 32 and not 11 and similarly for Qx and Qy. The only time it wouldn't complain about this inconsistency is if Qx were a scalar, for reasons explained in the help doc.

If FTRN is indeed 32x32x32, as we've been hypothesizing, and if you are indeed upsampling a 3x3x3 box by a factor of 5, then I would expect all the Q matrices to be 11x32. Remember that the formula for the Q matrix that we landed on is

Q=(1/160) * exp( j* (2*pi/160) * (n1-1:n2-1).' * (k1-1:k2-1) );

Here, k1:k2 is the range of indices covered by FTRN. Since we now agree that we shouldn't be zero-padding FTRN, you would want k1=1 and k2=32.

Conversely, n1:n2 is the range of indices in the UPSAMPLED version of peak_correlation at which we want to know the IDFT. In our above discussion, we had, for Qx, that the range of indices is given by n1=41 and n2=51, assuming you had a coarse peak at (10,12,10). Plugging this data into the above formula for Q, I obtain

>> Qx=(1/160) * exp( j* (2*pi/160) * (41-1:51-1).' * (1-1:32-1) ); whos Qx
  Name Size Bytes Class Attributes

  Qx 11x32 5632 double complex

which has the dimensions of 11x32 that I expected. If the coarse peak is at a different location, though, I would still expect this size since the separation n2-n1 will be the same whenever you upsample a 3x3x3 box by a factor of 5.


Anyway, stew over this for a while and clarify things for me as needed. Hope you have a good weekend as well, hopefully not working on this. :-)


-Matt


PS: you will not be able to execute isnum() from the command line as in the following

> >> isnum(M.opset{ii}) fails

The reason is because isnum.m resides in a private/ directory and so is only visible to the files in its parent directory (see the MATLAB documentation on private directories).

Subject: 3D Image Registration to sub pixel accuracy

From: Matt J

Date: 12 Jan, 2010 16:07:04

Message: 9 of 11

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <hi8euc$fcc$1@fred.mathworks.com>...
> "Sean " <sean.dewolski@Idontwantspam.umit.maine.edu> wrote in message <hi894n$b3m$1@fred.mathworks.com>...
>
> > I've implemented what you have here and KronProd is returning an error. The error occurs on lines 299/300 and seems to be because the first statement:
> > >> isnum(M.opset{ii}) fails
> > M.opset{:} are 3x11 doubles
> > size(2) = 11
> > and M.domainsizes(cc) = 32
> > it was called with:
> > and M.domainsizes(cc) = 32
> > >> Q = KronProd({Qx,Qy,Qz},[1 2 3],[32 32 32]);
> > and Qx,Qy,Qz are all complex 3x11 doubles.

Hi Sean,

Don't know if you're still wrestling with this, but I just wanted to mention that I've posted an update to KronProd() on the FEX that removes a lot of the hassle with the "domainsizes" parameter.

So now, you can construct Q just by doing,

Q = KronProd({Qx,Qy,Qz},[1 2 3]);

Of course, you still have to make the dimensions of Qx,Qy, and Qz compatible with the dimensions of FTRN.

So, if size(FTRN)=[32,32,32], then you must have size(Qx,2)=32, and similarly for Qy and Qz.

Subject: 3D Image Registration to sub pixel accuracy

From: Sean

Date: 12 Jan, 2010 18:44:07

Message: 10 of 11

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message
> Hi Sean,
>
> Don't know if you're still wrestling with this, but I just wanted to mention that I've posted an update to KronProd() on the FEX that removes a lot of the hassle with the "domainsizes" parameter.
>
> So now, you can construct Q just by doing,
>
> Q = KronProd({Qx,Qy,Qz},[1 2 3]);
>
> Of course, you still have to make the dimensions of Qx,Qy, and Qz compatible with the dimensions of FTRN.
>
> So, if size(FTRN)=[32,32,32], then you must have size(Qx,2)=32, and similarly for Qy and Qz.

Hi Matt,

I did figure out what was wrong with my dimensions before. I had the (k1-1,k2-1) and (n1-1,n2-1) swapped. Also, the error you specified before was the one I was seeing.

I still have an issue that I'm trying to figure out and that's why I've been waiting to post.

max(abs(peak_correlation_zoomed(:))) gives the indices [x y z] of the sub pixel shift.
So with peak_correlation_zoomed = 11x11x11 this seems to mean that the shift to add to the integer pixel values would be shift = [x y z]*0.2 - 0.2 -1.
e.g. if [x y z] = [4 6 8];
The amount to add would be [-0.4 0 0.4] so total shift = [10 12 10] + [-0.4 0 0.4] = [9.6 12 10.4] This doesn't seem to work; however.
Instead the result is half of the sub pixel part of the translation.
So the translation (linearly interpolated affine transformation) of [1.8 2.6 -4.4] yields:
[2.0(returned 0 shift), 2.8, -4.2]
If instead I use an accuracy of 0.1 pixel and multiply the fractional part by 2 it works great (1.8 2.6 -4.4)!. Is this what you would expect or is there something wrong with my understanding?

actual code:
            Q = KronProd({Qx, Qy, Qz},[1 2 3],[Bsize Bsize Bsize]); %Bsize=32
            pc_zoomed = abs(Q*FTRN);
            [m, idl] = max(pc_zoomed(:));
            [xfract, yfract, zfract] = ind2sub(size(pc_zoomed), idl);
            fpart = [xfract yfract zfract];
            disp_vec_fract = (fpart*accuracy - accuracy - 1);%*2; %accuracy = 0.2 for our example

I also have another 'computational efficiency' type question:
Since the integer pixel shift returns the maximum in each dimension of peak_correlation, is it necessary to extend all the way to the next adjacent pixel when creating the Q matrices?
So with our example: instead of having, for dimension 1, (n1-1:n2-1) = 40:50 could I use 42:48 since if the maximum was in location 41 the integer pixel shift would've returned 9 instead of 10? I'm not sure if this is justified or not, it was just something I noticed. It's probably not even time intensive enough to worry about...

Once again,
Thanks for everything you've done!

-Sean

Subject: 3D Image Registration to sub pixel accuracy

From: Matt J

Date: 12 Jan, 2010 20:54:03

Message: 11 of 11

"Sean " <sean.dewolski@Idontwantspam.umit.maine.edu> wrote in message <hiiftm$oit$1@fred.mathworks.com>...

> I still have an issue that I'm trying to figure out and that's why I've been waiting to post.
>
> max(abs(peak_correlation_zoomed(:))) gives the indices [x y z] of the sub pixel shift.
> So with peak_correlation_zoomed = 11x11x11 this seems to mean that the shift to add to the integer pixel values would be shift = [x y z]*0.2 - 0.2 -1.
==========

That looks right to me.

> e.g. if [x y z] = [4 6 8];
> The amount to add would be [-0.4 0 0.4] so total shift = [10 12 10] + [-0.4 0 0.4] = [9.6 12 10.4] This doesn't seem to work; however.
> Instead the result is half of the sub pixel part of the translation.
> So the translation (linearly interpolated affine transformation) of [1.8 2.6 -4.4] yields:
> [2.0(returned 0 shift), 2.8, -4.2]
===================

So clearly you're off by +0.2 pixels in all coordinates, or in other words 1 sub-pixel spacing. That seems to suggest that your fpart is inaccurate by +1

One thing that's not clear to me is what formula you're using to derive the translation from the location of the peak. Since your translation is capable of having a negative sign, e.g. -4.4, it must mean that a translation of [0 0 0] must correpond to array index coordinates somewhere near the middle of pc_zoomed. If so, what are those coordinates and how are they derived?
 

>
> actual code:
> Q = KronProd({Qx, Qy, Qz},[1 2 3],[Bsize Bsize Bsize]); %Bsize=32
> pc_zoomed = abs(Q*FTRN);
> [m, idl] = max(pc_zoomed(:));
> [xfract, yfract, zfract] = ind2sub(size(pc_zoomed), idl);
> fpart = [xfract yfract zfract];
> disp_vec_fract = (fpart*accuracy - accuracy - 1);%*2; %accuracy = 0.2 for our example
=========================

One thing to be careful of in
[m, idl] = max(pc_zoomed(:));
is that it doesn't allow for the possibility that there may be multiple maxima. The max() function will only report the first maximizer it finds.

So for example, if the maximum in pc_zoomed were attained not only at idl, but also at a neighbouring coordinate, that could also explain why you are biased away from the true translation by a whole sub-pixel.
 

> I also have another 'computational efficiency' type question:
> Since the integer pixel shift returns the maximum in each dimension of peak_correlation, is it necessary to extend all the way to the next adjacent pixel when creating the Q matrices?
> So with our example: instead of having, for dimension 1, (n1-1:n2-1) = 40:50 could I use 42:48 since if the maximum was in location 41 the integer pixel shift would've returned 9 instead of 10? I'm not sure if this is justified or not, it was just something I noticed. It's probably not even time intensive enough to worry about...
===================

No, when you interpolate using iffts (which is also equivalent to sinc-interpolation) there is no way to derive the location of the interpolated maximum from the surrounding pixel values. The interpolated sub-pixel samples do not very monotonically between the pixel samples with sinc-interpolation (unlike tri-linear interpolation for example).

It would certainly be a reasonable strategy to "guess" that the maximum is in the location 42:48 and if it turns out to be at an interior point 44, say, you'll know you were right, and you can stop. If the maximum is at the boundary coordinates 42 or 48 however, you would have to search the neighbouring regions outside this interval. Statistically, though you would be saving yourself some computation because you know the peak will be in 42:48 at least some of the time.

Another strategy you might consider doing is to search first with a larger sub-pixel size(e.g., 0.5 pixels ) in the zoomed area and then sub-divide further once you've localized the peak to a .5 pixel interval...

Tags for this Thread

Everyone's Tags:

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Tag Activity for This Thread
Tag Applied By Date/Time
kronprod class Matt J 8 Jan, 2010 14:17:18
registration Sean 7 Jan, 2010 16:39:16
3d Sean 7 Jan, 2010 16:39:16
fourier transform Sean 7 Jan, 2010 16:39:16
phase correlation Sean 7 Jan, 2010 16:39:16
sub pixel Sean 7 Jan, 2010 16:39:15
rssFeed for this Thread

Contact us