Builds Undirected Enzyme-Enzyme Networks, removing currency metabolites (based on a Library file) and keeping single nodes (nodes without any edges)

The function reads a metabolic network SBML file and builds enzyme-enzyme networks. The Remove Currency Metabolites based on Library (RCMLib) algorithm automatically removes a currency metabolite from the metabolic network only if that metabolite is listed in the Library file. The Cytoscape-compatible output file also lists single nodes (nodes without any edges).

Note: COBRA Toolbox must be installed in MATLAB before running this function.

[Output] = enz_cent_RCMLib_single_node(fileName1,fileName2)

INPUTS
  fileName1   The Library file of pre-defined currency metabolites (in .txt format)
              Note: the Library text file must contain one metabolite per line (all in one column)
  fileName2   The metabolic network in SBML format

OUTPUTS
  *_Removed_Mets_RCMLib.dat                   the metabolites removed from the original model
  *_Enzyme_Cent_RCMLib.dat                    Undirected Enzyme-Enzyme Network (comma-separated format)
  *_Enzyme_Cent_RCMLib_single_node_Cyt.dat    Undirected Enzyme-Enzyme Network, Cytoscape compatible

Yazdan Asgari 12/07/2012 http://lbb.ut.ac.ir
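The inputs described above can be prepared as in the following minimal sketch. The file names `currency_lib.txt` and `my_model.xml` and the three metabolite names are hypothetical, chosen only for illustration; the names in a real library file have to match the entries of `model.metNames` exactly, because the function compares them with `strcmp`.

```matlab
% Hypothetical file names, used only for illustration.
libFile   = 'currency_lib.txt';
modelFile = 'my_model.xml';

% Write a small library file with one metabolite name per line.
currency = {'ATP'; 'ADP'; 'H2O'};   % assumed names; adjust to your model
fid = fopen(libFile, 'w');
assert(fid ~= -1, 'Could not open %s for writing', libFile);
fprintf(fid, '%s\n', currency{:});
fclose(fid);

% Run only when the COBRA Toolbox reader and the model file are both available.
if exist('readCbModel', 'file') == 2 && exist(modelFile, 'file') == 2
    enz_cent_RCMLib_single_node(libFile, modelFile);
end
```

The call is made without an output argument because the listing never assigns a value to `Output`; the results are written to the three `.dat` files next to the SBML file.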
function [Output] = enz_cent_RCMLib_single_node(fileName1,fileName2)
% Builds Undirected Enzyme-Enzyme Networks Removing Currency Metabolites (based on a Library file) and considering single nodes (without any edges)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The function reads a Metabolic Network SBML file and builds Enzyme-Enzyme Networks.
% The Remove Currency Metabolites based on Library (RCMLib) algorithm removes a currency metabolite
% from the metabolic network automatically, only if the metabolite exists in the Library file.
% The Cytoscape-compatible output file also contains single nodes (without any edges).
% Note: COBRA Toolbox must be installed in MATLAB before running this function
%
% [Output] = enz_cent_RCMLib_single_node(fileName1,fileName2)
%
%INPUTS
% fileName1     The Library file of pre-defined currency metabolites (in .txt format)
%               Note: the Library text file must include one metabolite per line (all in one column)
% fileName2     The metabolic network in the SBML format
%
%OUTPUTS
% *_Removed_Mets_RCMLib.dat                   file containing the metabolites removed from the original model
% *_Enzyme_Cent_RCMLib.dat                    Undirected Enzyme-Enzyme Network (comma-separated format)
% *_Enzyme_Cent_RCMLib_single_node_Cyt.dat    Undirected Enzyme-Enzyme Network, Cytoscape compatible
%
% Yazdan Asgari 12/07/2012 http://lbb.ut.ac.ir
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% check validity of input file formats
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% the dot is escaped and the pattern is anchored, so only a real extension matches
check1=regexp(fileName1,'\.txt$','once');
assert(~isempty(check1),'Error in the first input: The fileName1 must contain .txt at its end')
check2=regexp(fileName2,'\.xml$','once');
assert(~isempty(check2),'Error in the second input: The fileName2 must contain .xml at its end')

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% start time evaluation of program
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
tic;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% read the Library text file and construct the array of currency metabolites
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
fid = fopen(fileName1);
assert(fid~=-1,'Error in the first input: The Library file could not be opened')
tline = fgetl(fid);
i=1;
Curr_met={};
while ischar(tline)
    Curr_met{i,1}=tline;
    tline = fgetl(fid);
    i=i+1;
end
fclose(fid);
h=size(Curr_met,1);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% read the SBML file using the COBRA Toolbox command, and get the size of the S matrix
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
model=readCbModel(fileName2);
[m,n]=size(model.S);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% calculate the sum of each column (i.e. how many metabolites each enzyme is associated with)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
S_bin=zeros(size(model.S));
S_bin(model.S~=0)=1;
CB=sum(S_bin,1);
A=zeros(m,n);
B=zeros(m,1);
N3=zeros(m,1);
N_curr=zeros(m,1);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% read the metabolite names and check whether each one is listed in the library text file
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for q=1:m
    for i=1:h
        if strcmp(model.metNames{q},Curr_met{i,1})==1
            N_curr(q,1)=N_curr(q,1)+1;
        end
    end
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% for each nonzero binary S-matrix element, subtract its value from the column sum and store the result in A.
% A(q,i) is the number of other metabolites that metabolite q is connected to through enzyme i.
% for each row, sum the binary S-matrix over all columns.
% B(q) is the number of enzymes that metabolite q is associated with.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for q=1:m
    for i=1:n
        if S_bin(q,i)~=0
            A(q,i)=CB(1,i)-S_bin(q,i);
        end
        B(q,1)=B(q,1)+S_bin(q,i);
    end
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Assumption: Generally, every metabolite is connected to one other metabolite through a specific enzyme.
% If a metabolite connects to more than one metabolite through an enzyme, this is considered a suspicious case.
% Therefore, every A(q,i) value equal to 3 is counted in N3(q) and marks metabolite q for further analysis.
% In addition, the presence of the metabolite in the library file is checked.
% So, metabolites which do not exist in the library file are not selected for further analysis.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for q=1:m
    for i=1:n
        if A(q,i)==3 && N_curr(q,1)~=0
            N3(q,1)=N3(q,1)+1;
        end
    end
end

s=0;
for q=1:m
    if N3(q,1)~=0
        s=1;
    end
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% build the output file name for writing the removed metabolites
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% (no semicolon: the output file name is echoed to the console)
outname1=strrep(fileName2,'.xml','_Removed_Mets_RCMLib.dat')
fout1 = fopen(outname1, 'w+');
fprintf(fout1, 'Metabolite\t\tMetabolite Name\t\tMax1\t\tMax2\n');
fprintf(fout1, '----------------------------------------------\n');

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% If N3 has any nonzero value, the RCM algorithm is run.
% The algorithm deletes the most probable currency metabolite (i.e. the one with the maximum value of N3 and C).
% The selected metabolite is deleted from the binary S-matrix, and the WHILE loop is repeated.
% The algorithm ends when no suspicious case remains.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
while s==1
    C=zeros(m,1);
    max1=max(N3,[],1);
    for q=1:m
        if N3(q,1)==max1
            C(q,1)=B(q,1);
        else
            C(q,1)=0;
        end
    end
    max2=max(C,[],1);
    for q=1:m
        if ( (N3(q,1)==max1) && (C(q,1)==max2) )
            for i=1:n
                S_bin(q,i)=0;
            end
            fprintf(fout1,'%s\t\t%s\t\t%d\t\t%d\n',model.mets{q},model.metNames{q},max1,max2);
        end
    end

    % recompute CB, A, B and N3 on the reduced binary S-matrix
    CB=sum(S_bin,1);
    A=zeros(m,n);
    B=zeros(m,1);
    N3=zeros(m,1);
    for q=1:m
        for i=1:n
            if S_bin(q,i)~=0
                A(q,i)=CB(1,i)-S_bin(q,i);
            end
            B(q,1)=B(q,1)+S_bin(q,i);
        end
    end
    for q=1:m
        for i=1:n
            if A(q,i)==3 && N_curr(q,1)~=0
                N3(q,1)=N3(q,1)+1;
            end
        end
    end
    s=0;
    for q=1:m
        if N3(q,1)~=0
            s=1;
        end
    end
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% construct the Undirected Enzyme-Enzyme Network from the new binary S-matrix (comma-separated format)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Aenz=S_bin'*S_bin;
outname2=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib.dat')
dlmwrite(outname2,full(Aenz));

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% re-format the Undirected Enzyme-Enzyme Network into a Cytoscape-compatible file.
% One could import the file using "File/Import/Network from Table(Text/MS Excel)..."
% Select "first column" as "Source Interaction" and "second column" as "Target Interaction"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[m,n]=size(Aenz);
outname3=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib_single_node_Cyt.dat')
fout2 = fopen(outname3, 'w+');
for row=1:m
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % because cell(i,j)=cell(j,i), duplicate entries are avoided by starting
    % the inner for loop at col=row. since diagonal elements must be ignored
    % as well, the counter is col=row+1:n
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    for col=row+1:n
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        % an edge is any entry that is not equal to zero
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        if Aenz(row,col)~=0
            fprintf(fout2, '%s\t%s\t%d\n',model.rxns{row},model.rxns{col},Aenz(row,col));
        end
    end
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % consider nodes which do not have any edges: a node is single only if its
    % whole row, apart from the diagonal, is zero (edges to lower-numbered
    % enzymes were already written in earlier rows)
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    rowEdges=Aenz(row,:);
    rowEdges(row)=0;
    if ~any(rowEdges)
        fprintf(fout2,'%s\n',model.rxns{row});
    end
end
fclose(fout1);
fclose(fout2);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% End of time evaluation of program
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
toc
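The matrix steps of the listing can be followed by hand on a toy network. The sketch below is not the author's code: it is a vectorized restatement of the same counting and removal rule, run on a hypothetical three-reaction stoichiometric matrix with invented metabolite names and an assumed two-entry library.

```matlab
% Toy stoichiometric matrix (hypothetical): rows = metabolites, columns = reactions.
% R1: glc + atp -> g6p + adp,   R2: atp -> adp,   R3: g6p -> f6p
S = [-1  0  0;
     -1 -1  0;
      1  0 -1;
      1  1  0;
      0  0  1];
metNames = {'glc'; 'atp'; 'g6p'; 'adp'; 'f6p'};
library  = {'atp'; 'adp'};                 % assumed currency metabolites

[m, n]  = size(S);
S_bin   = double(S ~= 0);                  % binary S-matrix
isCurr  = ismember(metNames, library);     % plays the role of N_curr ~= 0
removed = {};

while true
    CB = sum(S_bin, 1);                    % metabolites per reaction
    A  = S_bin .* repmat(CB - 1, m, 1);    % partners of metabolite q in reaction i
    B  = sum(S_bin, 2);                    % reactions per metabolite
    N3 = sum(A == 3, 2) .* isCurr;         % count of A(q,i)==3, library members only
    if ~any(N3), break; end                % no suspicious case left
    max1 = max(N3);
    max2 = max(B(N3 == max1));
    hit  = (N3 == max1) & (B == max2);     % most probable currency metabolites
    removed = [removed; metNames(hit)];
    S_bin(hit, :) = 0;                     % delete them from the binary S-matrix
end

Aenz = S_bin' * S_bin;                     % shared-metabolite counts between reactions
[r, c]  = find(triu(Aenz, 1));             % upper triangle: each edge once, no diagonal
edges   = [r, c];
degree  = sum(Aenz ~= 0, 2) - (diag(Aenz) ~= 0);
singles = find(degree == 0);               % reactions without any edge
```

In this toy case `atp` and `adp` are removed in a single pass, R1 and R3 remain linked through `g6p`, and R2 is left as a single node. The diagonal of `Aenz` holds the number of metabolites remaining in each reaction, which is why it is excluded when edges are listed.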