Showing posts with label matlab. Show all posts
Showing posts with label matlab. Show all posts

19 December 2012

294. Bruker 1D processing using octave/matlab

I wanted a set of scripts that behaved a little bit like the commands in bruker xwin-nmr/topspin, so that I could do some quick processing for visual inspection without having to do too much coding.

I also wanted some simple modules that I can plug into automated processing routines for large numbers of spectra (e.g. when doing kinetics).

So here are a few simple octave routines which should work in matlab as well. They won't change the world, but should be good enough for some basic 1D processing.

Because of the groupdelay (GRPDLY) used in by Bruker (see e.g. here (16th of June post) and here), you need to use the bruk2ana converter. There's little science behind the values which are applied since they are hardware specific.



Example workflow:

./bin2ascii experiment_1/1 fid
getpar experiment_1/1 my.par

octave:1> [fid,pars]=loadfid('fid.ascii','my.par');
octave:2> [zfid,pars]=zf(fid,pars);
octave:3> [test,phc1]=bruk2ana(zfid,pars);
octave:4> test=em(test,0.5);
octave:5> plot(test(:,1),test(:,3));
octave:6> spectrum=ft(test,pars);
octave:7> phased=apk(spectrum,phc1);
octave:8> final=absd(spectrum);
octave:9> pltspec(final)




Linux shell-scripts

bin2ascii
#!/bin/bash
#bin2ascii dir fid
cp $1/$2 $1/$2.bak
ls $1/$2.bak | cpio -o | cpio -i --swap -u
od -An -t dI -v -w8 $1/$2.bak| gawk '{print NR,$1,$2}' > $2.ascii

getpars
 #!/bin/bash
 #getpars $1 $2
 # $1 is the location (directory or .) and $2 is the root of the output file name
 SW=`cat $1/acqus | grep 'SW_h' | sed 's/\=/\t/g' | gawk '{print $2}'| tr -d '\n'`
 TD=`cat $1/acqus | grep 'TD=' | sed 's/\=/\t/g' | gawk '{print $2}'| tr -d '\n'`
 O=`cat $1/acqus | grep '$O1=' | sed 's/\=/\t/g' | gawk '{print $2}'`
 SFO=`cat $1/acqus | grep 'SFO1=' | sed 's/\=/\t/g' | gawk '{print $2}'`
 DECIM=`cat $1/acqus | grep 'DECIM=' | sed 's/\=/\t/g' | gawk '{print $2}'`
 DSPFVS=`cat $1/acqus | grep 'DSPFVS=' | sed 's/\=/\t/g' | gawk '{print $2}'`
 echo $SW > $2
 echo $TD >> $2
 echo $O >> $2
 echo $SFO >> $2
 echo $DECIM >> $2
 echo $DSPFVS >> $2

Octave scripts

loadfid.m
function [fid,pars]=loadfid(infile,parfile)
%%Usage: [fid,pars]=loadfile(infile,parfile)
%%reads a pts, re, im ascii array 
%%generated by bin2ascii
%%and a parfile generated with genpar
 fid=load(infile);
 pars=load(parfile);
 t=linspace(0,(1/(pars(1)/(pars(2)/2))),pars(2)/2);
 fid=[t' fid(:,2) fid(:,3)];
end

zf.m
function [newfid,pars]=zf(fid,pars)
%% Usage:[newfid,updatedpars]=zf(fid,pars)
%% Doubles the number of points by zerofilling
 dims=size(fid);
 newfid=[fid' zeros(3,dims(1))]';
 pars(2)=pars(2)*2;
end

bruk2ana
function [fid,phc1]=bruk2ana(fid,pars)
%% Usage: [fid,phc1]=bruk2ana(fid,pars)
%% where phc1 is the first-order phase correction
%% Using https,//nmrglue.googlecode.com/svn-history/r44/trunk/nmrglue/fileio/bruker.py
%% and https,//ucdb.googlecode.com/hg/application/ProSpectND/html/dmx_digital_filters.html
%% The short version is: bruker fid data needs pre-processing and it's hardware dependent

%%D contains the digital filter parameters
D=[[10,2, 44.75];
[10,3, 33.5];
[10,4, 66.625];
[10,6, 59.083333333333333];
[10,8, 68.5625];
[10,12, 60.375];
[10,16, 69.53125];
[10,24, 61.020833333333333];
[10,32, 70.015625];
[10,48, 61.34375];
[10,64, 70.2578125];
[10,96, 61.505208333333333];
[10,128, 70.37890625];
[10,192, 61.5859375];
[10,256, 70.439453125];
[10,384, 61.626302083333333];
[10,512, 70.4697265625];
[10,768, 61.646484375];
[10,1024, 70.48486328125];
[10,1536, 61.656575520833333];
[10,2048,70.492431640625];
[11,2, 46.];
[11,3, 36.5];
[11,4, 48.];
[11,6, 50.166666666666667];
[11,8, 53.25];
[11,12, 69.5];
[11,16, 72.25];
[11,24, 70.166666666666667];
[11,32, 72.75];
[11,48, 70.5];
[11,64, 73.];
[11,96, 70.666666666666667];
[11,128, 72.5];
[11,192, 71.333333333333333];
[11,256, 72.25];
[11,384, 71.666666666666667];
[11,512, 72.125];
[11,768, 71.833333333333333];
[11,1024, 72.0625];
[11,1536, 71.916666666666667];
[11,2048, 72.03125];
[12,2, 46. ];
[12,3, 36.5];
[12,4, 48.];
[12,6, 50.166666666666667];
[12,8, 53.25];
[12,12, 69.5];
[12,16, 71.625];
[12,24, 70.166666666666667];
[12,32, 72.125];
[12,48, 70.5];
[12,64, 72.375];
[12,96, 70.666666666666667];
[12,128, 72.5];
[12,192, 71.333333333333333];
[12,256, 72.25];
[12,384, 71.666666666666667];
[12,512, 72.125];
[12,768, 71.833333333333333];
[12,1024, 72.0625];
[12,1536, 71.916666666666667];
[12,2048, 72.03125];
[13,2, 2.75]; 
[13,3, 2.8333333333333333];
[13,4, 2.875];
[13,6, 2.9166666666666667];
[13,8, 2.9375];
[13,12, 2.9583333333333333];
[13,16, 2.96875];
[13,24, 2.9791666666666667];
[13,32, 2.984375];
[13,48, 2.9895833333333333];
[13,64, 2.9921875];
[13,96, 2.9947916666666667];];

 h=find(D(:,2)==pars(5));
 j=find(D(h,1)==pars(6));
 magickey=D(h(j),3);
 chop=floor(magickey);

 phc1=(magickey-chop)*2*pi; %the first-order phase correction gets mangled by bruker

 tmp=size(fid); %matlab workaround. rows/columns would be more elegant
 
 newfid=[fid(chop:tmp(1),2:3)' fid(1:chop-1,2:3)']';
 fid=[fid(:,1) newfid(:,1) newfid(:,2)];
 
end

em.m
function fid=em(fid,lb)
%%Usage: fid=em(fid,lb)
%%Exponential multiplication window function
%%Increases Signal-to-noise at the expense
%%of resolution
 fid(:,2)=fid(:,2).*exp(-lb.*fid(:,1));
 fid(:,3)=fid(:,3).*exp(-lb.*fid(:,1));
end

gm.m
function fid=gm(fid,lb)
%%Usage: fid=gm(fid,lb)
%%Gaussian multiplication window function
%%Increases Signal-to-noise at the expense
%%of resolution
 fid(:,2)=fid(:,2).*exp(-(lb.*fid(:,1)).^2);
 fid(:,3)=fid(:,3).*exp(-(lb.*fid(:,1)).^2);
end

de.m
function fid=de(fid,lb,gm)
%%Usage fid=de(fid,lb,gm)
%%Double-exponential window function
%%Increases resolution at the expense of
%%signal-to-noise
 at=max(fid(:,1));
 defun= @(lb,gm,t) (exp(-(t.*lb-gm*at))).^2;
 fid(:,2)=fid(:,2).*defun(lb,gm,fid(:,1));
 fid(:,3)=fid(:,3).*defun(lb,gm,fid(:,1));
end

traf.m
function fid=traf(fid,lb)
%%Usage: fid=traf(fid,lb)
%%TRAF window function
%%Increases resolution at the expense of
%%the signal-to-noise
 at=max(fid(:,1));
 traffun= @(lb,t) (exp(-t.*lb)).^2./((exp(-t.*lb)).^3+(exp(-at*lb)).^3);
 fid(:,2)=fid(:,2).*traffun(lb,fid(:,1));
 fid(:,3)=fid(:,3).*traffun(lb,fid(:,1));
end

ft.m
function spectrum=ft(fid,pars)
%%Usage: spectrum=ft(fid,pars)
%% Spectrum is a complex array with the frequency in
%%the first column and the real and imaginary parts
%%in the second column
%%pars(3)=centrefreq, pars(1)=SW
 spectrum=fftshift(fft(fid(:,3)+i*fid(:,2)));
 tmp=size(spectrum);%matlab workaround
 freq=linspace(pars(3)+pars(1)/2,pars(3)-pars(1)/2,tmp(1));
 spectrum=[freq' spectrum];
endfunction

apk.m
function spectrum=apk(spectrum,phc1)
%%Usage spectrum=apk(spectrum,phc1)
%%Spectrum is a complex matrix with
%%the frequency in the first column
%%and the complex spectrum in the 
%%second column. phc1 is the first order
%%phase correection

 tmp=size(spectrum);
 m=720;
 ph=linspace(-2*pi,2*pi,m);
 maxsig=0;k=1;
 minsig=-inf;
 for n=1:m;
  spex=real( (spectrum(:,2)).*exp(i*(ph(n)+phc1*i/tmp(1))) );
  localmin=min(spex);
        localmax=max(spex);
        if (localmin>minsig) 
                minsig=localmin;
                k=n;
        end
 end
 ph0=ph(k);
 spectrum(:,2)=spectrum(:,2).*exp(i*(ph0+phc1*i/tmp(1)));
end

altapk.m
function [spectrum,ph]=altapk(spectrum,phc0,phc1)
%%Usage -spectrum,ph]=altapk(spectrum,phc0,phc1)
%%Spectrum is a complex matrix with the frequency in the first column
%%and the complex spectrum in the second column. phc0 and phc1 are the first order
%%phase correction parameters, respectively, and are used as initial guesses.
%%This is an implementation of Chen, Weng, Goh and Garland, J. Mag. Res., 2002, 158, 164-168 and depends on entropy.m.

    ph=[phc0;phc1];
        ph=minimize("entropy",{ph,spectrum});

        %compute spectrum with optimal phase params
        pts=linspace(1,size(spectrum(:,2),1),size(spectrum(:,2),1));
        phi=(ph(1)+ph(2).*pts./max(pts))';
        spectrum(:,2)=spectrum(:,2).*exp(i*phi);
end

entropy.m
function E=entropy(ph,spectrum)
%%Used by altapk.m
    pts=linspace(1,size(spectrum(:,2),1),size(spectrum(:,2),1));
        penalty=5.53;

    phi=(ph(1)+ph(2).*pts./max(pts))';
        size(phi);
        spectrum(:,2)=spectrum(:,2).*exp(i*phi);
        R=real(spectrum(:,2));
        size(R);
    Rm=firstderiv(R);
        size(Rm);
    h=abs(Rm)/sum(abs(Rm));
        size(h);

        negs= imag((R).^(1/2));
        negs(find(negs>1))=1;
    P= @(R) penalty.*sum((negs).*R.^2);

    E=-sum(h.*log(h))+P(R);
end

absd.m
function spectrum=absd(spectrum)
%%Usage spectrum=absd(spectrum)
%%Simple (linear) baseline correction
        bsline=@(m) sum(abs(real(spectrum(:,2))-m));
        guess=0;
        p=0;
        newm=minimize(bsline,p);
        spectrum(:,2)=spectrum(:,2)-newm;
end

pltspec.m
function pltspec(spectrum)
%%Usage: pltspec(spectrum)
%%Where spectrum is a complex matrix
%%with the frequency in the first column
%%and the complex spectrum (a+i*b) in the
%%second column
 plot(spectrum(:,1),real(spectrum(:,2)))
end

29 September 2012

249. Quick but precise isotopic pattern (isotope envelope) calculator in Octave

UPDATE: Below is an accurate calculator,  but it is impractically slow for large molecules. A practical AND accurate calculator is found here:http://verahill.blogspot.com.au/2012/10/isotopic-pattern-caculator-in-python.html

Use the post below to learn about the fundamental theory, but then look at the other post to understand how to implement it.

Old post:
Getting fast and accurate isotopic patterns can be tricky using tools available online, for download or which form part of commercial packages. A particular problem is that different tools give slightly different values -- so which one to trust?

The answer: the tool for which you know that the algorithm is sound.

The extreme conclusion of that way of thinking is to write your own calculator.
Below is the conceptual process of calculating the isotopic pattern of a molecule using GNU Octave.

You need the linear algebra package:
sudo apt-get install octave octave-linear-algebra

b is the isotopic distribution for an element, and bb are the masses of those isotopes.

Once you've got a computational engine it's not too difficult to expand it for more general cases, account for charge, and instrument resolution.


Molecule: Cl4

b=[0.7578,0.2422];
bb=[34.96885,36.96885];
e=prod(cartprod(b,b,b,b),2);
ee=sum(cartprod(bb,bb,bb,bb),2);
n=4;
g=histc([ee e],linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1)),2);
h=linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1));
distr=e'*g;
plot(h,100.*distr/max(distr))
[h' (100.*distr/max(distr))']
Here's the output for n=1:
   139.87540    78.22048
   140.87540     0.00000
   141.87540   100.00000
   142.87540     0.00000
   143.87540    47.94141
   144.87540     0.00000
   145.87540    10.21502
   146.87540     0.00000
   147.87540     0.81620

And here's the output from Matt Monroe's calculator:
Isotopic Abundances for Cl4
  Mass/Charge Fraction  Intensity
   139.87541 0.3297755   78.22
   140.87541 0.0000000    0.00
   141.87541 0.4215974  100.00
   142.87541 0.0000000    0.00
   143.87541 0.2021197   47.94
   144.87541 0.0000000    0.00
   145.87541 0.0430662   10.22
   146.87541 0.0000000    0.00
   147.87541 0.0034411    0.82


Another molecule: Li2Cl2

Here's the code:
a=[0.0759,0.9241];
aa=[6.01512,7.01512];
b=[0.7578,0.2422];
bb=[34.96885,36.96885];
e=prod(cartprod(a,a,b,b),2);
ee=sum(cartprod(aa,aa,bb,bb),2);
n=1;
g=histc([ee e],linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1)),2);
h=linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1));
distr=e'*g;
plot(h,100.*distr/max(distr))
[h' (100.*distr/max(distr))']

ans =

    81.96794     0.67170
    82.96794    16.35626
    83.96794   100.00000
    84.96794    10.45523
    85.96794    63.71604
    86.96794     1.67079
    87.96794    10.17116

vs Matt Monroe's calculator:
Isotopic Abundances for Li2Cl2
  Mass/Charge Fraction  Intensity
    81.96795 0.0033082    0.67
    82.96795 0.0805564   16.36
    83.96795 0.4925109  100.00
    84.96795 0.0514932   10.46
    85.96795 0.3138084   63.72
    86.96795 0.0082288    1.67
    87.96795 0.0500941   10.17

We can then expand the code to allow for plotting
a=[0.0759,0.9241];
aa=[6.01512,7.01512];
b=[0.7578,0.2422];
bb=[34.96885,36.96885];
e=prod(cartprod(a,a,b,b),2);
ee=sum(cartprod(aa,aa,bb,bb),2);
n=1;

g=histc([ee e],linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1)),2);
h=linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1));
distr=e'*g;
gauss= @(x,c,r,s) r.*1./(s.*sqrt(2*pi)).*exp(-0.5*((x-c)./s).^2);
k=100.*distr/max(distr);

npts=1000;
resolution=0.25;

x=linspace(min(ee)-1,max(ee)+1,npts);
l=cumsum(gauss(x,h',k',resolution));
l=100*l./max(l(rows(l),:));
plot(x,l(rows(l),:))

which gives:

Compare with Matt Monroe's calculator: