Home > . > enz_cent_RCMLib_dir_kavosh.m

enz_cent_RCMLib_dir_kavosh

PURPOSE ^

Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for Kavosh Algorithm.

SYNOPSIS ^

function [Output] = enz_cent_RCMLib_dir_kavosh(fileName1,fileName2)

DESCRIPTION ^

 Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for Kavosh Algorithm.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 The function reads a Metabolic Network SBML file, 
 and builds Directed Enzyme-Enzyme Networks which is compatible with Kavosh Algorithm.
 The Kavosh is one of the best motif finding algorithms. Its Cytoscape plugins is also called CytoKavosh.
 http://lbb.ut.ac.ir/Download/LBBsoft/Kavosh/  &   http://www.ncbi.nlm.nih.gov/pubmed/19799800
 http://lbb.ut.ac.ir/Download/LBBsoft/CytoKavosh/CytoKavosh-Manual/cytoKavoshTutorial.html
 So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network.
 The Remove Currency Metabolites based-on Library (RCMLib) algorithm removes currency metabolites 
 in the metabolic network automatically IF AND ONLY IF the currency metabolits exists in the Library file.
 Note: COBRA Toolbox must be installed in MATLAB before running this function

 [Output] = enz_cent_RCMLib_dir_kavosh(fileName1,fileName2)

INPUTS
 fileName1                               The Library file includes pre-defined currency metabolites (in .txt format)
 Note: Library text file must include one metabolites per line (all in one column) 
 fileName2                               The metabolic Network in the SBML format
 
OUTPUTS
 *_Enzyme_Cent_RCMLib_Dir_Index.dat      Matrix Indeces of Enzyme-Enzyme Connections 
 *_Enzyme_Cent_RCMLib_Dir_Kavosh.dat     Directed-Enzyme-Enzyme Network - Kavosh Compatible
 
 Yazdan Asgari 12/07/2012             http://lbb.ut.ac.ir
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [Output] = enz_cent_RCMLib_dir_kavosh(fileName1,fileName2)
0002 % Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for Kavosh Algorithm.
0003 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0004 % The function reads a Metabolic Network SBML file,
0005 % and builds Directed Enzyme-Enzyme Networks which is compatible with Kavosh Algorithm.
0006 % The Kavosh is one of the best motif finding algorithms. Its Cytoscape plugins is also called CytoKavosh.
0007 % http://lbb.ut.ac.ir/Download/LBBsoft/Kavosh/  &   http://www.ncbi.nlm.nih.gov/pubmed/19799800
0008 % http://lbb.ut.ac.ir/Download/LBBsoft/CytoKavosh/CytoKavosh-Manual/cytoKavoshTutorial.html
0009 % So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network.
0010 % The Remove Currency Metabolites based-on Library (RCMLib) algorithm removes currency metabolites
0011 % in the metabolic network automatically IF AND ONLY IF the currency metabolits exists in the Library file.
0012 % Note: COBRA Toolbox must be installed in MATLAB before running this function
0013 %
0014 % [Output] = enz_cent_RCMLib_dir_kavosh(fileName1,fileName2)
0015 %
0016 %INPUTS
0017 % fileName1                               The Library file includes pre-defined currency metabolites (in .txt format)
0018 % Note: Library text file must include one metabolites per line (all in one column)
0019 % fileName2                               The metabolic Network in the SBML format
0020 %
0021 %OUTPUTS
0022 % *_Enzyme_Cent_RCMLib_Dir_Index.dat      Matrix Indeces of Enzyme-Enzyme Connections
0023 % *_Enzyme_Cent_RCMLib_Dir_Kavosh.dat     Directed-Enzyme-Enzyme Network - Kavosh Compatible
0024 %
0025 % Yazdan Asgari 12/07/2012             http://lbb.ut.ac.ir
0026 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0027 
0028 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0029 % check validity of input files format
0030 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0031 check1=regexp(fileName1,'.txt');
0032 assert(~isempty(check1),'Error in the first input: The fileName1 must contain .txt at its end')
0033 check2=regexp(fileName2,'.xml');
0034 assert(~isempty(check2),'Error in the second input: The fileName2 must contain .xml at its end')
0035 
0036 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0037 % start time evaluation of program
0038 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0039 tic;
0040 
0041 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0042 % reading the Library text file and construct array of currency metabolites
0043 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0044 fid = fopen(fileName1);
0045 tline = fgetl(fid);
0046 i=1;
0047 Curr_met={};
0048 while ischar(tline)
0049     Curr_met{i,1}=tline;
0050     tline = fgetl(fid);
0051     i=i+1;
0052 end
0053 fclose(fid);
0054 [h,g]=size(Curr_met);
0055 
0056 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0057 % reading the SBML file using COBRA Toolbox Command, and sets size of the S matrix
0058 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0059 model=readCbModel(fileName2);
0060 [m,n]=size(model.S);
0061 
0062 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0063 % calculate summation of each columns (i.e. How many metabolites each enzyme correlates)
0064 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0065 S_bin=zeros(size(model.S));
0066 S_bin(find(model.S))=1;
0067 CB=sum(S_bin,1);
0068 A=zeros(m,n);
0069 B=zeros(m,1);
0070 N3=zeros(m,1);
0071 N_curr=zeros(m,1);
0072 
0073 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0074 % reading the Metabolites array and check their availability in the library text file
0075 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0076 for q=1:m
0077     for i=1:h
0078         if strcmp(model.metNames{q},Curr_met{i,1})==1
0079             N_curr(q,1)=N_curr(q,1)+1;
0080         end
0081     end
0082 end
0083 
0084 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0085 % for each binary S-matrix element, subtracts its value from the column summation and put the result in the A matrix.
0086 % A(q) means the metabolite q connects to how many other metabolites through the enzyme i.
0087 % for each row, sums the binary S-matrix over all columns.
0088 % B(q) means how many enzymes the metabolite q correlates.
0089 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0090 for q=1:m
0091     for i=1:n
0092         if S_bin(q,i)~=0
0093             A(q,i)=CB(1,i)-S_bin(q,i);
0094         end
0095         B(q,1)=B(q,1)+S_bin(q,i);
0096     end
0097 end
0098 
0099 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0100 % Assumption: Generally, every metabolite is connected to the other one through a specific enzyme.
0101 % If a metabolite connects to more than one metabolite through an enzyme, this will be considered as a suspicious case.
0102 % Therefore, every N3(q) value equal to 3 will be marked for further analysis.
0103 % In addition, the availability of the metabolite in the library file will be checked.
0104 % So, the metabolites which do not exist in the library file, will not select for further analysis.
0105 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0106 for q=1:m
0107     for i=1:n
0108         if A(q,i)==3 && N_curr(q,1)~=0
0109             N3(q,1)=N3(q,1)+1;
0110         end
0111     end
0112 end
0113 
0114 s=0;
0115 for q=1:m
0116     if N3(q,1)~=0
0117         s=1;
0118     end
0119 end
0120 
0121 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0122 % If there is any value for N3 array, the RCM algorithm will be done.
0123 % This algorithm will be deleted the most probable metabolite among all (i.e. the one with the maximum value of N3 and C)
0124 % The selected metabolite will be deleted in the binary S-Matrix, and the "WHILE LOOP" repeated.
0125 % The algorithm is ended if there is not any suspicious case.
0126 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0127 while s==1
0128     C=zeros(m,1);
0129     max1=max(N3,[],1);
0130     for q=1:m
0131         if N3(q,1)==max1
0132             C(q,1)=B(q,1);
0133         else
0134             C(q,1)=0;
0135         end
0136     end
0137     max2=max(C,[],1);
0138     for q=1:m
0139         if ( (N3(q,1)==max1) && (C(q,1)==max2) )
0140             for i=1:n
0141                 S_bin(q,i)=0;
0142                 model.S(q,i)=0;
0143             end
0144         end
0145     end
0146     
0147     CB=sum(S_bin,1);
0148     A=zeros(m,n);
0149     B=zeros(m,1);
0150     N3=zeros(m,1);
0151     for q=1:m
0152         for i=1:n
0153             if S_bin(q,i)~=0
0154                 A(q,i)=CB(1,i)-S_bin(q,i);
0155             end
0156             B(q,1)=B(q,1)+S_bin(q,i);
0157         end
0158     end
0159     for q=1:m
0160         for i=1:n
0161             if A(q,i)==3 && N_curr(q,1)~=0
0162                 N3(q,1)=N3(q,1)+1;
0163             end
0164         end
0165     end
0166     s=0;
0167     for q=1:m
0168         if N3(q,1)~=0
0169             s=1;
0170         end
0171     end
0172 end
0173 
0174 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0175 % building the output file name
0176 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0177 outname1=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib_Dir_Index.dat')
0178 fout1 = fopen(outname1, 'w+');
0179 
0180 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0181 % finds non-zero elements of the S-matrix (in order to make the algorithm faster),
0182 % parses through each row, and considers an edge for every unlike-signs,
0183 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0184 num=zeros(size(model.rxns));
0185 for j=1:m
0186     indices=find(model.S(j,:));
0187     [a,b]=size(indices);
0188     r=0;
0189     if b~=0
0190         r=1;
0191     end 
0192     while r<b
0193         i=1;
0194         while i<(b-r+1)
0195             if model.S(j,indices(1,r))<0 && model.S(j,indices(1,r+i))>0
0196                 fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i));
0197                 num(1,indices(1,r))=1;
0198                 num(1,indices(1,r+i))=1;
0199             elseif model.S(j,indices(1,r))>0 && model.S(j,indices(1,r+i))<0
0200                 fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r));
0201                 num(1,indices(1,r))=1;
0202                 num(1,indices(1,r+i))=1;
0203             end
0204             i=i+1;
0205         end
0206         r=r+1;
0207     end
0208 end
0209 
0210 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0211 % considering nodes which do not contain any edges
0212 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0213 for k=1:n
0214     if num(k,1)==0
0215         fprintf(fout1,'%d\n',k);
0216     end
0217 end
0218 fclose(fout1);
0219 
0220 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0221 % building the output file name
0222 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0223 outname2=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib_Dir_Kavosh.dat')    
0224 fout2=fopen(outname2,'w+');
0225 
0226 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0227 % reading the constructed Enzyme-Enzyme network file and re-format it to a Kavosh-compatible file.
0228 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0229 fid = fopen(outname1);
0230 fgetl(fid);
0231 C=fscanf(fid,'%d');
0232 i=1;
0233 while isinteger(fid)
0234     C(i)=fscanf(fid,'%d',C);
0235     i=i+1;
0236 end
0237 g=size(C);
0238 A=size(unique(C));
0239 if g~=0
0240     n=1;
0241 else
0242     disp('Error in reading the file, No Edge detected')
0243 end
0244 k=1;
0245 j=2;
0246 last=g/2;
0247 fprintf(fout2,'%d\n',A(1,1));   % total number of uniques nodes in the network (needed for Kavosh Algorithm)
0248 for i=1:last
0249     fprintf(fout2,'%d\t%d\n ',C(k),C(j));
0250     k=k+2;
0251     j=j+2;
0252 end
0253 fclose(fid);
0254 fclose(fout2);
0255 
0256 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0257 % End of time evaluation of program
0258 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0259 toc;

Generated on Thu 13-Dec-2012 14:17:37 by m2html © 2005