Home > scan_toolbox > enz_cent_RCMLib_dir_quatexelero.m

enz_cent_RCMLib_dir_quatexelero

PURPOSE ^

Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for QuateXelero Algorithm.

SYNOPSIS ^

function [Output] = enz_cent_RCMLib_dir_quatexelero(fileName1,fileName2)

DESCRIPTION ^

 Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for QuateXelero Algorithm.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 The function reads a Metabolic Network SBML file, 
 and builds Directed Enzyme-Enzyme Networks which is compatible with QuateXelero Algorithm.
 The QuateXelero is one of the best motif finding algorithms which is recently developed by Kavosh developer team.
 http://lbb.ut.ac.ir/Download/LBBsoft/QuateXelero
 So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network.
 The Remove Currency Metabolites based-on Library (RCMLib) algorithm removes currency metabolites 
 in the metabolic network automatically IF AND ONLY IF the currency metabolits exists in the Library file.
 Note: COBRA Toolbox must be installed in MATLAB before running this function

 [Output] = enz_cent_RCMLib_dir_quatexelero(fileName1,fileName2)

INPUTS
 fileName1                                   The Library file includes pre-defined currency metabolites (in .txt format)
 Note: Library text file must include one metabolites per line (all in one column) 
 fileName2                                   The metabolic Network in the SBML format
 
OUTPUTS
 *_Enzyme_Cent_RCMLib_Dir_Index.dat          Matrix Indeces of Enzyme-Enzyme Connections 
 *_Enzyme_Cent_RCMLib_Dir_QuateXelero.dat    Directed-Enzyme-Enzyme Network - QuateXelero Compatible
 
 Yazdan Asgari 07/16/2016                 http://yazdan59.ir/scan
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [Output] = enz_cent_RCMLib_dir_quatexelero(fileName1,fileName2)
0002 % Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for QuateXelero Algorithm.
0003 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0004 % The function reads a Metabolic Network SBML file,
0005 % and builds Directed Enzyme-Enzyme Networks which is compatible with QuateXelero Algorithm.
0006 % The QuateXelero is one of the best motif finding algorithms which is recently developed by Kavosh developer team.
0007 % http://lbb.ut.ac.ir/Download/LBBsoft/QuateXelero
0008 % So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network.
0009 % The Remove Currency Metabolites based-on Library (RCMLib) algorithm removes currency metabolites
0010 % in the metabolic network automatically IF AND ONLY IF the currency metabolits exists in the Library file.
0011 % Note: COBRA Toolbox must be installed in MATLAB before running this function
0012 %
0013 % [Output] = enz_cent_RCMLib_dir_quatexelero(fileName1,fileName2)
0014 %
0015 %INPUTS
0016 % fileName1                                   The Library file includes pre-defined currency metabolites (in .txt format)
0017 % Note: Library text file must include one metabolites per line (all in one column)
0018 % fileName2                                   The metabolic Network in the SBML format
0019 %
0020 %OUTPUTS
0021 % *_Enzyme_Cent_RCMLib_Dir_Index.dat          Matrix Indeces of Enzyme-Enzyme Connections
0022 % *_Enzyme_Cent_RCMLib_Dir_QuateXelero.dat    Directed-Enzyme-Enzyme Network - QuateXelero Compatible
0023 %
0024 % Yazdan Asgari 07/16/2016                 http://yazdan59.ir/scan
0025 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0026 
0027 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0028 % check validity of input files format
0029 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0030 check1=regexp(fileName1,'.txt');
0031 assert(~isempty(check1),'Error in the first input: The fileName1 must contain .txt at its end')
0032 check2=regexp(fileName2,'.xml');
0033 assert(~isempty(check2),'Error in the second input: The fileName2 must contain .xml at its end')
0034 
0035 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0036 % start time evaluation of program
0037 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0038 tic;
0039 
0040 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0041 % reading the Library text file and construct array of currency metabolites
0042 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0043 fid = fopen(fileName1);
0044 tline = fgetl(fid);
0045 i=1;
0046 Curr_met={};
0047 while ischar(tline)
0048     Curr_met{i,1}=tline;
0049     tline = fgetl(fid);
0050     i=i+1;
0051 end
0052 fclose(fid);
0053 [h,g]=size(Curr_met);
0054 
0055 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0056 % reading the SBML file using COBRA Toolbox Command, and sets size of the S matrix
0057 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0058 model=readCbModel(fileName2);
0059 [m,n]=size(model.S);
0060 
0061 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0062 % calculate summation of each columns (i.e. How many metabolites each enzyme correlates)
0063 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0064 S_bin=zeros(size(model.S));
0065 S_bin(find(model.S))=1;
0066 CB=sum(S_bin,1);
0067 A=zeros(m,n);
0068 B=zeros(m,1);
0069 N3=zeros(m,1);
0070 N_curr=zeros(m,1);
0071 
0072 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0073 % reading the Metabolites array and check their availability in the library text file
0074 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0075 for q=1:m
0076     for i=1:h
0077         if strcmp(model.metNames{q},Curr_met{i,1})==1
0078             N_curr(q,1)=N_curr(q,1)+1;
0079         end
0080     end
0081 end
0082 
0083 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0084 % for each binary S-matrix element, subtracts its value from the column summation and put the result in the A matrix.
0085 % A(q) means the metabolite q connects to how many other metabolites through the enzyme i.
0086 % for each row, sums the binary S-matrix over all columns.
0087 % B(q) means how many enzymes the metabolite q correlates.
0088 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0089 for q=1:m
0090     for i=1:n
0091         if S_bin(q,i)~=0
0092             A(q,i)=CB(1,i)-S_bin(q,i);
0093         end
0094         B(q,1)=B(q,1)+S_bin(q,i);
0095     end
0096 end
0097 
0098 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0099 % Assumption: Generally, every metabolite is connected to the other one through a specific enzyme.
0100 % If a metabolite connects to more than one metabolite through an enzyme, this will be considered as a suspicious case.
0101 % Therefore, every A(q) value equal to 3 will be marked for further analysis.
0102 % In addition, the availability of the metabolite in the library file will be checked.
0103 % So, the metabolites which do not exist in the library file, will not select for further analysis.
0104 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0105 for q=1:m
0106     for i=1:n
0107         if A(q,i)==3 && N_curr(q,1)~=0
0108             N3(q,1)=N3(q,1)+1;
0109         end
0110     end
0111 end
0112 
0113 s=0;
0114 for q=1:m
0115     if N3(q,1)~=0
0116         s=1;
0117     end
0118 end
0119 
0120 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0121 % If there is any value for N3 array, the RCM algorithm will be done.
0122 % This algorithm will be deleted the most probable metabolite among all (i.e. the one with the maximum value of N3 and C)
0123 % The selected metabolite will be deleted in the binary S-Matrix, and the "WHILE LOOP" repeated.
0124 % The algorithm is ended if there is not any suspicious case.
0125 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0126 while s==1
0127     C=zeros(m,1);
0128     max1=max(N3,[],1);
0129     for q=1:m
0130         if N3(q,1)==max1
0131             C(q,1)=B(q,1);
0132         else
0133             C(q,1)=0;
0134         end
0135     end
0136     max2=max(C,[],1);
0137     for q=1:m
0138         if ( (N3(q,1)==max1) && (C(q,1)==max2) )
0139             for i=1:n
0140                 S_bin(q,i)=0;
0141                 model.S(q,i)=0;
0142             end
0143         end
0144     end
0145     
0146     CB=sum(S_bin,1);
0147     A=zeros(m,n);
0148     B=zeros(m,1);
0149     N3=zeros(m,1);
0150     for q=1:m
0151         for i=1:n
0152             if S_bin(q,i)~=0
0153                 A(q,i)=CB(1,i)-S_bin(q,i);
0154             end
0155             B(q,1)=B(q,1)+S_bin(q,i);
0156         end
0157     end
0158     for q=1:m
0159         for i=1:n
0160             if A(q,i)==3 && N_curr(q,1)~=0
0161                 N3(q,1)=N3(q,1)+1;
0162             end
0163         end
0164     end
0165     s=0;
0166     for q=1:m
0167         if N3(q,1)~=0
0168             s=1;
0169         end
0170     end
0171 end
0172 
0173 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0174 % building the output file name
0175 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0176 outname1=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib_Dir_Index.dat')
0177 fout1 = fopen(outname1, 'w+');
0178 
0179 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0180 % finds non-zero elements of the S-matrix (in order to make the algorithm faster),
0181 % parses through each row, and considers an edge for every unlike-signs.
0182 % It also consider Reversibility for Enzyme-Enzyme network.
0183 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0184 num=zeros(size(model.rxns));
0185 for j=1:m
0186     indices=find(model.S(j,:));
0187     [a,b]=size(indices);
0188     r=0;
0189     if b~=0
0190         r=1;
0191     end
0192     while r<b
0193         i=1;
0194         while i<(b-r+1)
0195             if model.S(j,indices(1,r))>0 && model.S(j,indices(1,r+i))<0
0196                 if model.rev(indices(r+i))==1 && model.rev(indices(r))==1
0197                     fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i));
0198                     fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r));
0199                     num(1,indices(1,r))=1;
0200                     num(1,indices(1,r+i))=1;
0201                 else
0202                     fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i));
0203                     num(1,indices(1,r))=1;
0204                     num(1,indices(1,r+i))=1;
0205                 end
0206             elseif model.S(j,indices(1,r))<0 && model.S(j,indices(1,r+i))>0
0207                 if model.rev(indices(r+i))==1 && model.rev(indices(r))==1
0208                     fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r));
0209                     fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i));
0210                     num(1,indices(1,r))=1;
0211                     num(1,indices(1,r+i))=1;
0212                 else
0213                     fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r));
0214                     num(1,indices(1,r))=1;
0215                     num(1,indices(1,r+i))=1;
0216                 end
0217             end
0218             i=i+1;
0219         end
0220         r=r+1;
0221     end
0222 end
0223 
0224 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0225 % building the output file name
0226 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0227 outname2=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib_Dir_QuateXelero.dat')    
0228 fout2=fopen(outname2,'w+');
0229 
0230 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0231 % reading the constructed Enzyme-Enzyme network file and re-format it to a Kavosh-compatible file.
0232 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0233 fid = fopen(outname1);
0234 C=fscanf(fid,'%d');
0235 i=1;
0236 while isinteger(fid)
0237     C(i)=fscanf(fid,'%d',C);
0238     i=i+1;
0239 end
0240 g=size(C);
0241 A=size(unique(C));
0242 if g~=0
0243     n=1;
0244 else
0245     disp('Error in reading the file, No Edge detected')
0246 end
0247 k=1;
0248 j=2;
0249 last=g/2;
0250 fprintf(fout2,'%d\n',A(1,1));   % total number of uniques nodes in the network (needed for Kavosh Algorithm)
0251 for i=1:last
0252     fprintf(fout2,'%d\t%d\n ',C(k),C(j));
0253     k=k+2;
0254     j=j+2;
0255 end
0256 fclose(fid);
0257 fclose(fout2);
0258 
0259 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0260 % End of time evaluation of program
0261 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0262 toc;

Generated on Sat 16-Jul-2016 17:47:22 by m2html © 2003