Builds Binary Stoichiometric Matrix and Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% The function reads a Metabolic Network SBML file and builds Enzyme-Enzyme Networks. The Remove Currency Metabolites based-on Library (RCMLib) algorithm removes currency metabolites in the metabolic network automatically IF AND ONLY IF the currency metabolits exists in the Library file. Note: COBRA Toolbox must be installed in MATLAB before running this function [Output] = enz_cent_RCMLib(fileName1,fileName2) INPUTS fileName1 The Library file includes pre-defined currency metabolites (in .txt format) Note: Library text file must include one metabolites per line (all in one column) fileName2 The metabolic Network in the SBML format OUTPUTS *_Removed_Mets_RCMLib.dat file contains removed metabolits from the original model *_Enzyme_Cent_RCMLib.dat Undirected-Enzyme-Enzyme Network (comma separated Format) *_Enzyme_Cent_RCMLib_Cyt.dat Undirected-Enzyme-Enzyme Network - Cytoscape Compatible Yazdan Asgari 12/07/2012 http://lbb.ut.ac.ir %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0001 function [Output] = enz_cent_RCMLib(fileName1,fileName2) 0002 % Builds Binary Stoichiometric Matrix and Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file) 0003 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0004 % The function reads a Metabolic Network SBML file and builds Enzyme-Enzyme Networks. 0005 % The Remove Currency Metabolites based-on Library (RCMLib) algorithm removes currency metabolites 0006 % in the metabolic network automatically IF AND ONLY IF the currency metabolits exists in the Library file. 0007 % Note: COBRA Toolbox must be installed in MATLAB before running this function 0008 % 0009 % [Output] = enz_cent_RCMLib(fileName1,fileName2) 0010 % 0011 %INPUTS 0012 % fileName1 The Library file includes pre-defined currency metabolites (in .txt format) 0013 % Note: Library text file must include one metabolites per line (all in one column) 0014 % fileName2 The metabolic Network in the SBML format 0015 % 0016 %OUTPUTS 0017 % *_Removed_Mets_RCMLib.dat file contains removed metabolits from the original model 0018 % *_Enzyme_Cent_RCMLib.dat Undirected-Enzyme-Enzyme Network (comma separated Format) 0019 % *_Enzyme_Cent_RCMLib_Cyt.dat Undirected-Enzyme-Enzyme Network - Cytoscape Compatible 0020 % 0021 % Yazdan Asgari 12/07/2012 http://lbb.ut.ac.ir 0022 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0023 0024 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0025 % check validity of input files format 0026 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0027 check1=regexp(fileName1,'.txt'); 0028 assert(~isempty(check1),'Error in the first input: The fileName1 must contain .txt at its end') 0029 check2=regexp(fileName2,'.xml'); 0030 assert(~isempty(check2),'Error in the second input: The fileName2 must contain .xml at its end') 0031 0032 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0033 % start time evaluation of program 0034 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0035 tic; 0036 0037 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0038 % reading the Library text file and construct array of currency metabolites 0039 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0040 fid = fopen(fileName1); 0041 tline = fgetl(fid); 0042 i=1; 0043 Curr_met={}; 0044 while ischar(tline) 0045 Curr_met{i,1}=tline; 0046 tline = fgetl(fid); 0047 i=i+1; 0048 end 0049 fclose(fid); 0050 [h,g]=size(Curr_met); 0051 0052 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0053 % reading the SBML file using COBRA Toolbox Command, and sets size of the S matrix 0054 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0055 model=readCbModel(fileName2); 0056 [m,n]=size(model.S); 0057 0058 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0059 % calculate summation of each columns (i.e. How many metabolites each enzyme correlates) 0060 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0061 S_bin=zeros(size(model.S)); 0062 S_bin(find(model.S))=1; 0063 CB=sum(S_bin,1); 0064 A=zeros(m,n); 0065 B=zeros(m,1); 0066 N3=zeros(m,1); 0067 N_curr=zeros(m,1); 0068 0069 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0070 % reading the Metabolites array and check their availability in the library text file 0071 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0072 for q=1:m 0073 for i=1:h 0074 if strcmp(model.metNames{q},Curr_met{i,1})==1 0075 N_curr(q,1)=N_curr(q,1)+1; 0076 end 0077 end 0078 end 0079 0080 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0081 % for each binary S-matrix element, subtracts its value from the column summation and put the result in the A matrix. 0082 % A(q) means the metabolite q connects to how many other metabolites through the enzyme i. 0083 % for each row, sums the binary S-matrix over all columns. 0084 % B(q) means how many enzymes the metabolite q correlates. 0085 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0086 for q=1:m 0087 for i=1:n 0088 if S_bin(q,i)~=0 0089 A(q,i)=CB(1,i)-S_bin(q,i); 0090 end 0091 B(q,1)=B(q,1)+S_bin(q,i); 0092 end 0093 end 0094 0095 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0096 % Assumption: Generally, every metabolite is connected to the other one through a specific enzyme. 0097 % If a metabolite connects to more than one metabolite through an enzyme, this will be considered as a suspicious case. 0098 % Therefore, every N3(q) value equal to 3 will be marked for further analysis. 0099 % In addition, the availability of the metabolite in the library file will be checked. 0100 % So, the metabolites which do not exist in the library file, will not select for further analysis. 0101 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0102 for q=1:m 0103 for i=1:n 0104 if A(q,i)==3 && N_curr(q,1)~=0 0105 N3(q,1)=N3(q,1)+1; 0106 end 0107 end 0108 end 0109 0110 s=0; 0111 for q=1:m 0112 if N3(q,1)~=0 0113 s=1; 0114 end 0115 end 0116 0117 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0118 % building the output file name for writing removed metabolites 0119 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0120 outname1=strrep(fileName2,'.xml','_Removed_Mets_RCMLib.dat') 0121 fout1 = fopen(outname1, 'w+'); 0122 fprintf(fout1, 'Metabolite\t\tMetabolite Name\t\tMax1\t\tMax2\n'); 0123 fprintf(fout1, '----------------------------------------------\n'); 0124 0125 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0126 % If there is any value for N3 array, the RCM algorithm will be done. 0127 % This algorithm will be deleted the most probable metabolite among all (i.e. the one with the maximum value of N3 and C) 0128 % The selected metabolite will be deleted in the binary S-Matrix, and the "WHILE LOOP" repeated. 0129 % The algorithm is ended if there is not any suspicious case. 0130 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0131 while s==1 0132 C=zeros(m,1); 0133 max1=max(N3,[],1); 0134 for q=1:m 0135 if N3(q,1)==max1 0136 C(q,1)=B(q,1); 0137 else 0138 C(q,1)=0; 0139 end 0140 end 0141 max2=max(C,[],1); 0142 for q=1:m 0143 if ( (N3(q,1)==max1) && (C(q,1)==max2) ) 0144 for i=1:n 0145 S_bin(q,i)=0; 0146 end 0147 fprintf(fout1,'%s\t\t%s\t\t%d\t\t%d\n',model.mets{q},model.metNames{q},max1,max2); 0148 end 0149 end 0150 0151 CB=sum(S_bin,1); 0152 A=zeros(m,n); 0153 B=zeros(m,1); 0154 N3=zeros(m,1); 0155 for q=1:m 0156 for i=1:n 0157 if S_bin(q,i)~=0 0158 A(q,i)=CB(1,i)-S_bin(q,i); 0159 end 0160 B(q,1)=B(q,1)+S_bin(q,i); 0161 end 0162 end 0163 for q=1:m 0164 for i=1:n 0165 if A(q,i)==3 && N_curr(q,1)~=0 0166 N3(q,1)=N3(q,1)+1; 0167 end 0168 end 0169 end 0170 s=0; 0171 for q=1:m 0172 if N3(q,1)~=0 0173 s=1; 0174 end 0175 end 0176 end 0177 0178 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0179 % construction of Undirected-Enzyme-Enzyme Network based on the new binary S-matrix(comma separated Format) 0180 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0181 Aenz=S_bin'*S_bin; 0182 outname2=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib.dat') 0183 dlmwrite(outname2,full(Aenz)); 0184 0185 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0186 % re-format of Undirected-Enzyme-Enzyme Network it to a Cytoscape-compatible file. 0187 % One could import the file using "File/Import/Network from Table(Text/MS Excel)..." 0188 % Select "first column" as "Source Interaction" and "second column" as "Target Interaction" 0189 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0190 [m,n]=size(Aenz); 0191 outname3=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib_Cyt.dat') 0192 fout2 = fopen(outname3, 'w+'); 0193 for row=1:m 0194 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0195 % because cell(i,j)=cell(j,i) we must delete duplicate entries by putting 0196 % col=row:n in the second if command. since we must ignor diagonal elements, 0197 % the counter will be col=row+1:n 0198 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0199 for col=row+1:n 0200 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0201 % edge are those which includes number not equal to zero 0202 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0203 if Aenz(row,col)~=0 0204 fprintf(fout2, '%s\t%s\t%d\n',model.rxns{row},model.rxns{col},Aenz(row,col)); 0205 end 0206 end 0207 end 0208 fclose(fout1); 0209 fclose(fout2); 0210 0211 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0212 % End of time evaluation of program 0213 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0214 toc; 0215