Home > . > enz_cent_RCM_dir_quatexelero.m

enz_cent_RCM_dir_quatexelero

PURPOSE ^

Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites which could be used as an input for QuateXelero Algorithm.

SYNOPSIS ^

function [Output] = enz_cent_RCM_dir_quatexelero(fileName)

DESCRIPTION ^

 Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites which could be used as an input for QuateXelero Algorithm.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 The function reads a Metabolic Network SBML file, 
 and builds Directed Enzyme-Enzyme Networks which is compatible with QuateXelero Algorithm.
 The QuateXelero is one of the best motif finding algorithms which is recently developed by Kavosh developer team.
 http://lbb.ut.ac.ir/Download/LBBsoft/QuateXelero
 So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network.
 The Remove Currency Metabolites (RCM) algorithm removes currency metabolites in the metabolic network automatically.
 Note: COBRA Toolbox must be installed in MATLAB before running this function

 [Output] = enz_cent_RCM_dir_quatexelero(fileName)

INPUTS
 fileName                                 The metabolic Network in the SBML format
 
OUTPUTS
 *_Enzyme_Cent_RCM_Dir_Index.dat          Matrix Indeces of Enzyme-Enzyme Connections
 *_Enzyme_Cent_RCM_Dir_QuateXelero.dat    Directed-Enzyme-Enzyme Network - QuateXelero Compatible
 
 Yazdan Asgari 12/07/2012              http://lbb.ut.ac.ir
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [Output] = enz_cent_RCM_dir_quatexelero(fileName)
0002 % Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites which could be used as an input for QuateXelero Algorithm.
0003 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0004 % The function reads a Metabolic Network SBML file,
0005 % and builds Directed Enzyme-Enzyme Networks which is compatible with QuateXelero Algorithm.
0006 % The QuateXelero is one of the best motif finding algorithms which is recently developed by Kavosh developer team.
0007 % http://lbb.ut.ac.ir/Download/LBBsoft/QuateXelero
0008 % So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network.
0009 % The Remove Currency Metabolites (RCM) algorithm removes currency metabolites in the metabolic network automatically.
0010 % Note: COBRA Toolbox must be installed in MATLAB before running this function
0011 %
0012 % [Output] = enz_cent_RCM_dir_quatexelero(fileName)
0013 %
0014 %INPUTS
0015 % fileName                                 The metabolic Network in the SBML format
0016 %
0017 %OUTPUTS
0018 % *_Enzyme_Cent_RCM_Dir_Index.dat          Matrix Indeces of Enzyme-Enzyme Connections
0019 % *_Enzyme_Cent_RCM_Dir_QuateXelero.dat    Directed-Enzyme-Enzyme Network - QuateXelero Compatible
0020 %
0021 % Yazdan Asgari 12/07/2012              http://lbb.ut.ac.ir
0022 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0023 
0024 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0025 % check validity of input file format
0026 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0027 check=regexp(fileName,'.xml');
0028 assert(~isempty(check),'The SBML fileName must contain .xml at its end')
0029 
0030 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0031 % start time evaluation of program
0032 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0033 tic;
0034 
0035 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0036 % reading the SBML file using COBRA Toolbox Command, and sets size of the S matrix
0037 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0038 model=readCbModel(fileName);
0039 [m,n]=size(model.S);
0040 
0041 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0042 % calculate summation of each columns (i.e. How many metabolites each enzyme correlates)
0043 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0044 S_bin=zeros(size(model.S));
0045 S_bin(find(model.S))=1;
0046 CB=sum(S_bin,1);
0047 A=zeros(m,n);
0048 B=zeros(m,1);
0049 N3=zeros(m,1);
0050 
0051 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0052 % for each binary S-matrix element, subtracts its value from the column summation and put the result in the A matrix.
0053 % A(q) means the metabolite q connects to how many other metabolites through the enzyme i.
0054 % for each row, sums the binary S-matrix over all columns.
0055 % B(q) means how many enzymes the metabolite q correlates.
0056 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0057 for q=1:m
0058     for i=1:n
0059         if S_bin(q,i)~=0
0060             A(q,i)=CB(1,i)-S_bin(q,i);
0061         end
0062         B(q,1)=B(q,1)+S_bin(q,i);
0063     end
0064 end
0065 
0066 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0067 % Assumption: Generally, every metabolite is connected to the other one through a specific enzyme.
0068 % If a metabolite connects to more than one metabolite through an enzyme, this will be considered as a suspicious case.
0069 % Therefore, every N3(q) value equal to 3 will be marked for further analysis.
0070 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0071 for q=1:m
0072     for i=1:n
0073         if A(q,i)==3
0074             N3(q,1)=N3(q,1)+1;
0075         end
0076     end
0077 end
0078 
0079 s=0;
0080 for q=1:m
0081     if N3(q,1)~=0
0082         s=1;
0083     end
0084 end
0085 
0086 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0087 % If there is any value for N3 array, the RCM algorithm will be done.
0088 % This algorithm will be deleted the most probable metabolite among all (i.e. the one with the maximum value of N3 and C)
0089 % The selected metabolite will be deleted in the binary S-Matrix, and the "WHILE LOOP" repeated.
0090 % The algorithm is ended if there is not any suspicious case.
0091 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0092 while s==1
0093     C=zeros(m,1);
0094     max1=max(N3,[],1);
0095     for q=1:m
0096         if N3(q,1)==max1
0097             C(q,1)=B(q,1);
0098         else
0099             C(q,1)=0;
0100         end
0101     end
0102     max2=max(C,[],1);
0103     for q=1:m
0104         if ( (N3(q,1)==max1) && (C(q,1)==max2) )
0105             for i=1:n
0106                 S_bin(q,i)=0;
0107                 model.S(q,i)=0;
0108             end
0109         end
0110     end    
0111     CB=sum(S_bin,1);
0112     A=zeros(m,n);
0113     B=zeros(m,1);
0114     N3=zeros(m,1);
0115     for q=1:m
0116         for i=1:n
0117             if S_bin(q,i)~=0
0118                 A(q,i)=CB(1,i)-S_bin(q,i);
0119             end
0120             B(q,1)=B(q,1)+S_bin(q,i);
0121         end
0122     end
0123     for q=1:m
0124         for i=1:n
0125             if A(q,i)==3
0126                 N3(q,1)=N3(q,1)+1;
0127             end
0128         end
0129     end
0130     s=0;
0131     for q=1:m
0132         if N3(q,1)~=0
0133             s=1;
0134         end
0135     end
0136 end
0137 
0138 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0139 % building the output file name
0140 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0141 outname1=strrep(fileName,'.xml','_Enzyme_Cent_RCM_Dir_Index.dat')
0142 fout1 = fopen(outname1, 'w+');
0143 
0144 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0145 % finds non-zero elements of the S-matrix (in order to make the algorithm faster),
0146 % parses through each row, and considers an edge for every unlike-signs,
0147 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0148 num=zeros(size(model.rxns));
0149 for j=1:m
0150     indices=find(model.S(j,:));
0151     [a,b]=size(indices);
0152     r=0;
0153     if b~=0
0154         r=1;
0155     end 
0156     while r<b
0157         i=1;
0158         while i<(b-r+1)
0159             if model.S(j,indices(1,r))<0 && model.S(j,indices(1,r+i))>0
0160                 fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i));
0161                 num(1,indices(1,r))=1;
0162                 num(1,indices(1,r+i))=1;
0163             elseif model.S(j,indices(1,r))>0 && model.S(j,indices(1,r+i))<0
0164                 fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r));
0165                 num(1,indices(1,r))=1;
0166                 num(1,indices(1,r+i))=1;
0167             end
0168             i=i+1;
0169         end
0170         r=r+1;
0171     end
0172 end
0173 
0174 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0175 % considering nodes which do not contain any edges
0176 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0177 for k=1:n
0178     if num(k,1)==0
0179         fprintf(fout1,'%d\n',k);
0180     end
0181 end
0182 fclose(fout1);
0183 
0184 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0185 % building the output file name
0186 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0187 outname2=strrep(fileName,'.xml','_Enzyme_Cent_RCM_Dir_QuateXelero.dat')    
0188 fout2=fopen(outname2,'w+');
0189 
0190 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0191 % reading the constructed Enzyme-Enzyme network file and re-format it to a Kavosh-compatible file.
0192 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0193 fid = fopen(outname1);
0194 fgetl(fid);
0195 C=fscanf(fid,'%d');
0196 i=1;
0197 while isinteger(fid)
0198     C(i)=fscanf(fid,'%d',C);
0199     i=i+1;
0200 end
0201 g=size(C);
0202 A=size(unique(C));
0203 if g~=0
0204     n=1;
0205 else
0206     disp('Error in reading the file, No Edge detected')
0207 end
0208 k=1;
0209 j=2;
0210 last=g/2;
0211 fprintf(fout2,'%d\n',A(1,1));   % total number of uniques nodes in the network (needed for Kavosh Algorithm)
0212 for i=1:last
0213     fprintf(fout2,'%d\t%d\n ',C(k),C(j));
0214     k=k+2;
0215     j=j+2;
0216 end
0217 fclose(fid);
0218 fclose(fout2);
0219 
0220 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0221 % End of time evaluation of program
0222 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0223 toc;

Generated on Thu 13-Dec-2012 14:17:37 by m2html © 2005