Home > scan_toolbox > enz_cent_RCMLib_dir_kavosh.m

enz_cent_RCMLib_dir_kavosh

PURPOSE ^

Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for Kavosh Algorithm.

SYNOPSIS ^

function [Output] = enz_cent_RCMLib_dir_kavosh(fileName1,fileName2)

DESCRIPTION ^

 Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for Kavosh Algorithm.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 The function reads a Metabolic Network SBML file, 
 and builds Directed Enzyme-Enzyme Networks which is compatible with Kavosh Algorithm.
 The Kavosh is one of the best motif finding algorithms. Its Cytoscape plugins is also called CytoKavosh.
 http://lbb.ut.ac.ir/Download/LBBsoft/Kavosh/  &   http://www.ncbi.nlm.nih.gov/pubmed/19799800
 http://lbb.ut.ac.ir/Download/LBBsoft/CytoKavosh/CytoKavosh-Manual/cytoKavoshTutorial.html
 So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network.
 The Remove Currency Metabolites based-on Library (RCMLib) algorithm removes currency metabolites 
 in the metabolic network automatically IF AND ONLY IF the currency metabolits exists in the Library file.
 Note: COBRA Toolbox must be installed in MATLAB before running this function

 [Output] = enz_cent_RCMLib_dir_kavosh(fileName1,fileName2)

INPUTS
 fileName1                               The Library file includes pre-defined currency metabolites (in .txt format)
 Note: Library text file must include one metabolites per line (all in one column) 
 fileName2                               The metabolic Network in the SBML format
 
OUTPUTS
 *_Enzyme_Cent_RCMLib_Dir_Index.dat      Matrix Indeces of Enzyme-Enzyme Connections 
 *_Enzyme_Cent_RCMLib_Dir_Kavosh.dat     Directed-Enzyme-Enzyme Network - Kavosh Compatible
 
 Yazdan Asgari 07/16/2016             http://yazdan59.ir/scan
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [Output] = enz_cent_RCMLib_dir_kavosh(fileName1,fileName2)
0002 % Builds Directed Enzyme-Enzyme Networks Removing Currency Metabolites (based-on a Library file), which could be used as an input for Kavosh Algorithm.
0003 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0004 % The function reads a Metabolic Network SBML file,
0005 % and builds Directed Enzyme-Enzyme Networks which is compatible with Kavosh Algorithm.
0006 % The Kavosh is one of the best motif finding algorithms. Its Cytoscape plugins is also called CytoKavosh.
0007 % http://lbb.ut.ac.ir/Download/LBBsoft/Kavosh/  &   http://www.ncbi.nlm.nih.gov/pubmed/19799800
0008 % http://lbb.ut.ac.ir/Download/LBBsoft/CytoKavosh/CytoKavosh-Manual/cytoKavoshTutorial.html
0009 % So, one could easily use this algorithm in order to find motifs in different sizes for the metabolic network.
0010 % The Remove Currency Metabolites based-on Library (RCMLib) algorithm removes currency metabolites
0011 % in the metabolic network automatically IF AND ONLY IF the currency metabolits exists in the Library file.
0012 % Note: COBRA Toolbox must be installed in MATLAB before running this function
0013 %
0014 % [Output] = enz_cent_RCMLib_dir_kavosh(fileName1,fileName2)
0015 %
0016 %INPUTS
0017 % fileName1                               The Library file includes pre-defined currency metabolites (in .txt format)
0018 % Note: Library text file must include one metabolites per line (all in one column)
0019 % fileName2                               The metabolic Network in the SBML format
0020 %
0021 %OUTPUTS
0022 % *_Enzyme_Cent_RCMLib_Dir_Index.dat      Matrix Indeces of Enzyme-Enzyme Connections
0023 % *_Enzyme_Cent_RCMLib_Dir_Kavosh.dat     Directed-Enzyme-Enzyme Network - Kavosh Compatible
0024 %
0025 % Yazdan Asgari 07/16/2016             http://yazdan59.ir/scan
0026 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0027 
0028 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0029 % check validity of input files format
0030 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0031 check1=regexp(fileName1,'.txt');
0032 assert(~isempty(check1),'Error in the first input: The fileName1 must contain .txt at its end')
0033 check2=regexp(fileName2,'.xml');
0034 assert(~isempty(check2),'Error in the second input: The fileName2 must contain .xml at its end')
0035 
0036 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0037 % start time evaluation of program
0038 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0039 tic;
0040 
0041 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0042 % reading the Library text file and construct array of currency metabolites
0043 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0044 fid = fopen(fileName1);
0045 tline = fgetl(fid);
0046 i=1;
0047 Curr_met={};
0048 while ischar(tline)
0049     Curr_met{i,1}=tline;
0050     tline = fgetl(fid);
0051     i=i+1;
0052 end
0053 fclose(fid);
0054 [h,g]=size(Curr_met);
0055 
0056 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0057 % reading the SBML file using COBRA Toolbox Command, and sets size of the S matrix
0058 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0059 model=readCbModel(fileName2);
0060 [m,n]=size(model.S);
0061 
0062 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0063 % calculate summation of each columns (i.e. How many metabolites each enzyme correlates)
0064 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0065 S_bin=zeros(size(model.S));
0066 S_bin(find(model.S))=1;
0067 CB=sum(S_bin,1);
0068 A=zeros(m,n);
0069 B=zeros(m,1);
0070 N3=zeros(m,1);
0071 N_curr=zeros(m,1);
0072 
0073 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0074 % reading the Metabolites array and check their availability in the library text file
0075 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0076 for q=1:m
0077     for i=1:h
0078         if strcmp(model.metNames{q},Curr_met{i,1})==1
0079             N_curr(q,1)=N_curr(q,1)+1;
0080         end
0081     end
0082 end
0083 
0084 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0085 % for each binary S-matrix element, subtracts its value from the column summation and put the result in the A matrix.
0086 % A(q) means the metabolite q connects to how many other metabolites through the enzyme i.
0087 % for each row, sums the binary S-matrix over all columns.
0088 % B(q) means how many enzymes the metabolite q correlates.
0089 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0090 for q=1:m
0091     for i=1:n
0092         if S_bin(q,i)~=0
0093             A(q,i)=CB(1,i)-S_bin(q,i);
0094         end
0095         B(q,1)=B(q,1)+S_bin(q,i);
0096     end
0097 end
0098 
0099 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0100 % Assumption: Generally, every metabolite is connected to the other one through a specific enzyme.
0101 % If a metabolite connects to more than one metabolite through an enzyme, this will be considered as a suspicious case.
0102 % Therefore, every A(q) value equal to 3 will be marked for further analysis.
0103 % In addition, the availability of the metabolite in the library file will be checked.
0104 % So, the metabolites which do not exist in the library file, will not select for further analysis.
0105 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0106 for q=1:m
0107     for i=1:n
0108         if A(q,i)==3 && N_curr(q,1)~=0
0109             N3(q,1)=N3(q,1)+1;
0110         end
0111     end
0112 end
0113 
0114 s=0;
0115 for q=1:m
0116     if N3(q,1)~=0
0117         s=1;
0118     end
0119 end
0120 
0121 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0122 % If there is any value for N3 array, the RCM algorithm will be done.
0123 % This algorithm will be deleted the most probable metabolite among all (i.e. the one with the maximum value of N3 and C)
0124 % The selected metabolite will be deleted in the binary S-Matrix, and the "WHILE LOOP" repeated.
0125 % The algorithm is ended if there is not any suspicious case.
0126 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0127 while s==1
0128     C=zeros(m,1);
0129     max1=max(N3,[],1);
0130     for q=1:m
0131         if N3(q,1)==max1
0132             C(q,1)=B(q,1);
0133         else
0134             C(q,1)=0;
0135         end
0136     end
0137     max2=max(C,[],1);
0138     for q=1:m
0139         if ( (N3(q,1)==max1) && (C(q,1)==max2) )
0140             for i=1:n
0141                 S_bin(q,i)=0;
0142                 model.S(q,i)=0;
0143             end
0144         end
0145     end
0146     
0147     CB=sum(S_bin,1);
0148     A=zeros(m,n);
0149     B=zeros(m,1);
0150     N3=zeros(m,1);
0151     for q=1:m
0152         for i=1:n
0153             if S_bin(q,i)~=0
0154                 A(q,i)=CB(1,i)-S_bin(q,i);
0155             end
0156             B(q,1)=B(q,1)+S_bin(q,i);
0157         end
0158     end
0159     for q=1:m
0160         for i=1:n
0161             if A(q,i)==3 && N_curr(q,1)~=0
0162                 N3(q,1)=N3(q,1)+1;
0163             end
0164         end
0165     end
0166     s=0;
0167     for q=1:m
0168         if N3(q,1)~=0
0169             s=1;
0170         end
0171     end
0172 end
0173 
0174 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0175 % building the output file name
0176 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0177 outname1=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib_Dir_Index.dat')
0178 fout1 = fopen(outname1, 'w+');
0179 
0180 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0181 % finds non-zero elements of the S-matrix (in order to make the algorithm faster),
0182 % parses through each row, and considers an edge for every unlike-signs.
0183 % It also consider Reversibility for Enzyme-Enzyme network.
0184 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0185 num=zeros(size(model.rxns));
0186 for j=1:m
0187     indices=find(model.S(j,:));
0188     [a,b]=size(indices);
0189     r=0;
0190     if b~=0
0191         r=1;
0192     end 
0193     while r<b
0194         i=1;
0195         while i<(b-r+1)
0196             if model.S(j,indices(1,r))>0 && model.S(j,indices(1,r+i))<0
0197                 if model.rev(indices(r+i))==1 && model.rev(indices(r))==1
0198                     fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i));
0199                     fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r));
0200                     num(1,indices(1,r))=1;
0201                     num(1,indices(1,r+i))=1;
0202                 else
0203                     fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i));
0204                     num(1,indices(1,r))=1;
0205                     num(1,indices(1,r+i))=1;
0206                 end
0207             elseif model.S(j,indices(1,r))<0 && model.S(j,indices(1,r+i))>0
0208                 if model.rev(indices(r+i))==1 && model.rev(indices(r))==1
0209                     fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r));
0210                     fprintf(fout1,'%d\t%d\n',indices(1,r),indices(1,r+i));
0211                     num(1,indices(1,r))=1;
0212                     num(1,indices(1,r+i))=1;
0213                 else
0214                     fprintf(fout1,'%d\t%d\n',indices(1,r+i),indices(1,r));
0215                     num(1,indices(1,r))=1;
0216                     num(1,indices(1,r+i))=1;
0217                 end
0218             end
0219             i=i+1;
0220         end
0221         r=r+1;
0222     end
0223 end
0224 
0225 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0226 % building the output file name
0227 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0228 outname2=strrep(fileName2,'.xml','_Enzyme_Cent_RCMLib_Dir_Kavosh.dat')    
0229 fout2=fopen(outname2,'w+');
0230 
0231 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0232 % reading the constructed Enzyme-Enzyme network file and re-format it to a Kavosh-compatible file.
0233 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0234 fid = fopen(outname1);
0235 C=fscanf(fid,'%d');
0236 i=1;
0237 while isinteger(fid)
0238     C(i)=fscanf(fid,'%d',C);
0239     i=i+1;
0240 end
0241 g=size(C);
0242 A=size(unique(C));
0243 if g~=0
0244     n=1;
0245 else
0246     disp('Error in reading the file, No Edge detected')
0247 end
0248 k=1;
0249 j=2;
0250 last=g/2;
0251 fprintf(fout2,'%d\n',A(1,1));   % total number of uniques nodes in the network (needed for Kavosh Algorithm)
0252 for i=1:last
0253     fprintf(fout2,'%d\t%d\n ',C(k),C(j));
0254     k=k+2;
0255     j=j+2;
0256 end
0257 fclose(fid);
0258 fclose(fout2);
0259 
0260 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0261 % End of time evaluation of program
0262 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0263 toc;

Generated on Sat 16-Jul-2016 17:47:22 by m2html © 2003