Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites which could be used as an input for Kavosh Algorithm. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% The function reads a Metabolic Network SBML file, and builds Directed Enzyme-Enzyme Networks which is compatible with Kavosh Algorithm. The Kavosh is one of the best motif finding algorithms. Its Cytoscape plugins is also called CytoKavosh. http://lbb.ut.ac.ir/Download/LBBsoft/Kavosh/ & http://www.ncbi.nlm.nih.gov/pubmed/19799800 http://lbb.ut.ac.ir/Download/LBBsoft/CytoKavosh/CytoKavosh-Manual/cytoKavoshTutorial.html So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network. The Remove Currency Metabolites (RCM) algorithm removes currency metabolites in the metabolic network automatically. Note: COBRA Toolbox must be installed in MATLAB before running this function [Output] = enz_cent_RCM_dir_kavosh(fileName) INPUTS fileName The metabolic Network in the SBML format OUTPUTS *_Enzyme_Cent_RCM_Dir_Index.dat Matrix Indeces of Enzyme-Enzyme Connections *_Enzyme_Cent_RCM_Dir_Kavosh.dat Directed-Enzyme-Enzyme Network - Kavosh Compatible Yazdan Asgari 12/07/2012 http://lbb.ut.ac.ir %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0001 function [Output] = enz_cent_RCM_dir_kavosh(fileName) 0002 % Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites which could be used as an input for Kavosh Algorithm. 0003 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0004 % The function reads a Metabolic Network SBML file, 0005 % and builds Directed Enzyme-Enzyme Networks which is compatible with Kavosh Algorithm. 0006 % The Kavosh is one of the best motif finding algorithms. Its Cytoscape plugins is also called CytoKavosh. 0007 % http://lbb.ut.ac.ir/Download/LBBsoft/Kavosh/ & http://www.ncbi.nlm.nih.gov/pubmed/19799800 0008 % http://lbb.ut.ac.ir/Download/LBBsoft/CytoKavosh/CytoKavosh-Manual/cytoKavoshTutorial.html 0009 % So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network. 0010 % The Remove Currency Metabolites (RCM) algorithm removes currency metabolites in the metabolic network automatically. 0011 % Note: COBRA Toolbox must be installed in MATLAB before running this function 0012 % 0013 % [Output] = enz_cent_RCM_dir_kavosh(fileName) 0014 % 0015 %INPUTS 0016 % fileName The metabolic Network in the SBML format 0017 % 0018 %OUTPUTS 0019 % *_Enzyme_Cent_RCM_Dir_Index.dat Matrix Indeces of Enzyme-Enzyme Connections 0020 % *_Enzyme_Cent_RCM_Dir_Kavosh.dat Directed-Enzyme-Enzyme Network - Kavosh Compatible 0021 % 0022 % Yazdan Asgari 12/07/2012 http://lbb.ut.ac.ir 0023 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0024 0025 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0026 % check validity of input file format 0027 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0028 check=regexp(fileName,'.xml'); 0029 assert(~isempty(check),'The SBML fileName must contain .xml at its end') 0030 0031 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0032 % start time evaluation of program 0033 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0034 tic; 0035 0036 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0037 % reading the SBML file using COBRA Toolbox Command, and sets size of the S matrix 0038 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0039 model=readCbModel(fileName); 0040 [m,n]=size(model.S); 0041 0042 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0043 % calculate summation of each columns (i.e. How many metabolites each enzyme correlates) 0044 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0045 S_bin=zeros(size(model.S)); 0046 S_bin(find(model.S))=1; 0047 CB=sum(S_bin,1); 0048 A=zeros(m,n); 0049 B=zeros(m,1); 0050 N3=zeros(m,1); 0051 0052 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0053 % for each binary S-matrix element, subtracts its value from the column summation and put the result in the A matrix. 0054 % A(q) means the metabolite q connects to how many other metabolites through the enzyme i. 0055 % for each row, sums the binary S-matrix over all columns. 0056 % B(q) means how many enzymes the metabolite q correlates. 0057 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0058 for q=1:m 0059 for i=1:n 0060 if S_bin(q,i)~=0 0061 A(q,i)=CB(1,i)-S_bin(q,i); 0062 end 0063 B(q,1)=B(q,1)+S_bin(q,i); 0064 end 0065 end 0066 0067 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0068 % Assumption: Generally, every metabolite is connected to the other one through a specific enzyme. 0069 % If a metabolite connects to more than one metabolite through an enzyme, this will be considered as a suspicious case. 0070 % Therefore, every N3(q) value equal to 3 will be marked for further analysis. 0071 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0072 for q=1:m 0073 for i=1:n 0074 if A(q,i)==3 0075 N3(q,1)=N3(q,1)+1; 0076 end 0077 end 0078 end 0079 0080 s=0; 0081 for q=1:m 0082 if N3(q,1)~=0 0083 s=1; 0084 end 0085 end 0086 0087 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0088 % If there is any value for N3 array, the RCM algorithm will be done. 0089 % This algorithm will be deleted the most probable metabolite among all (i.e. the one with the maximum value of N3 and C) 0090 % The selected metabolite will be deleted in the binary S-Matrix, and the "WHILE LOOP" repeated. 0091 % The algorithm is ended if there is not any suspicious case. 0092 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0093 while s==1 0094 C=zeros(m,1); 0095 max1=max(N3,[],1); 0096 for q=1:m 0097 if N3(q,1)==max1 0098 C(q,1)=B(q,1); 0099 else 0100 C(q,1)=0; 0101 end 0102 end 0103 max2=max(C,[],1); 0104 for q=1:m 0105 if ( (N3(q,1)==max1) && (C(q,1)==max2) ) 0106 for i=1:n 0107 S_bin(q,i)=0; 0108 model.S(q,i)=0; 0109 end 0110 end 0111 end 0112 CB=sum(S_bin,1); 0113 A=zeros(m,n); 0114 B=zeros(m,1); 0115 N3=zeros(m,1); 0116 for q=1:m 0117 for i=1:n 0118 if S_bin(q,i)~=0 0119 A(q,i)=CB(1,i)-S_bin(q,i); 0120 end 0121 B(q,1)=B(q,1)+S_bin(q,i); 0122 end 0123 end 0124 for q=1:m 0125 for i=1:n 0126 if A(q,i)==3 0127 N3(q,1)=N3(q,1)+1; 0128 end 0129 end 0130 end 0131 s=0; 0132 for q=1:m 0133 if N3(q,1)~=0 0134 s=1; 0135 end 0136 end 0137 end 0138 0139 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0140 % building the output file name 0141 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0142 outname1=strrep(fileName,'.xml','_Enzyme_Cent_RCM_Dir_Index.dat') 0143 fout1 = fopen(outname1, 'w+'); 0144 0145 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0146 % finds non-zero elements of the S-matrix (in order to make the algorithm faster), 0147 % parses through each row, and considers an edge for every unlike-signs, 0148 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0149 num=zeros(size(model.rxns)); 0150 for j=1:m 0151 indices=find(model.S(j,:)); 0152 [a,b]=size(indices); 0153 r=0; 0154 if b~=0 0155 r=1; 0156 end 0157 while r<b 0158 i=1; 0159 while i<(b-r+1) 0160 if model.S(j,indices(1,r))<0 && model.S(j,indices(1,r+i))>0 0161 fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i)); 0162 num(1,indices(1,r))=1; 0163 num(1,indices(1,r+i))=1; 0164 elseif model.S(j,indices(1,r))>0 && model.S(j,indices(1,r+i))<0 0165 fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r)); 0166 num(1,indices(1,r))=1; 0167 num(1,indices(1,r+i))=1; 0168 end 0169 i=i+1; 0170 end 0171 r=r+1; 0172 end 0173 end 0174 0175 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0176 % considering nodes which do not contain any edges 0177 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0178 for k=1:n 0179 if num(k,1)==0 0180 fprintf(fout1,'%d\n',k); 0181 end 0182 end 0183 fclose(fout1); 0184 0185 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0186 % building the output file name 0187 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0188 outname2=strrep(fileName,'.xml','_Enzyme_Cent_RCM_Dir_Kavosh.dat') 0189 fout2=fopen(outname2,'w+'); 0190 0191 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0192 % reading the constructed Enzyme-Enzyme network file and re-format it to a Kavosh-compatible file. 0193 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0194 fid = fopen(outname1); 0195 fgetl(fid); 0196 C=fscanf(fid,'%d'); 0197 i=1; 0198 while isinteger(fid) 0199 C(i)=fscanf(fid,'%d',C); 0200 i=i+1; 0201 end 0202 g=size(C); 0203 A=size(unique(C)); 0204 if g~=0 0205 n=1; 0206 else 0207 disp('Error in reading the file, No Edge detected') 0208 end 0209 k=1; 0210 j=2; 0211 last=g/2; 0212 fprintf(fout2,'%d\n',A(1,1)); % total number of uniques nodes in the network (needed for Kavosh Algorithm) 0213 for i=1:last 0214 fprintf(fout2,'%d\t%d\n ',C(k),C(j)); 0215 k=k+2; 0216 j=j+2; 0217 end 0218 fclose(fid); 0219 fclose(fout2); 0220 0221 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0222 % End of time evaluation of program 0223 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0224 toc;