SWARM: a scientific workflow for supporting Bayesian approaches to improve metabolic models
Source
International Workshop on Challenges of Large Applications in Distributed Environments
Proceedings of the 6th international workshop on Challenges of large applications in distributed environments
Boston, MA, USA
SESSION: Scientific workflow
Pages 25-34
Year of Publication: 2008
ISBN:978-1-60558-156-9
Authors
Xinghua Shi, University of Chicago, Chicago, IL, USA
Rick Stevens, Argonne National Laboratory, Argonne, IL, USA
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1383529.1383535

ABSTRACT

With the exponential growth in the number of complete genome sequences, analyzing these sequences is becoming a powerful approach to building genome-scale metabolic models. These models can be used to study individual molecular components and their relationships, and eventually to study cells as systems. However, constructing genome-scale metabolic models manually is time-consuming and labor-intensive. As a result, far fewer genome-scale metabolic models are available than the hundreds of genome sequences available. To tackle this problem, we design SWARM, a scientific workflow that can be used to improve genome-scale metabolic models in a high-throughput fashion. SWARM deals with a range of issues, including the integration of data across distributed resources, data format conversion, data updates, and data provenance. Taken together, SWARM streamlines the whole modeling process: extracting data from various resources, deriving training datasets to train a set of predictors, applying Bayesian techniques to assemble the predictors, running inference on the ensemble of predictors to fill in missing data, and eventually improving draft metabolic networks automatically. By enhancing metabolic model construction, SWARM enables scientists to generate many genome-scale metabolic models within a short period of time and with less effort.
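
The abstract does not say which Bayesian technique SWARM uses to assemble its predictors. As a rough illustration of the general idea only, the sketch below combines yes/no votes from several predictors with a naive-Bayes rule and keeps candidate gene-reaction assignments whose posterior probability clears a threshold. Every name here (`Predictor`, `posterior`, `fill_gaps`), the error rates, the prior, and the threshold are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Predictor:
    """A hypothetical yes/no evidence source with rates estimated from training data."""
    name: str
    tpr: float  # P(predictor says yes | assignment is correct)
    fpr: float  # P(predictor says yes | assignment is wrong)


def posterior(prior, votes, predictors):
    """Naive-Bayes combination: P(assignment correct | votes), assuming independent predictors."""
    log_odds = math.log(prior / (1.0 - prior))
    for p in predictors:
        if votes[p.name]:
            log_odds += math.log(p.tpr / p.fpr)
        else:
            log_odds += math.log((1.0 - p.tpr) / (1.0 - p.fpr))
    return 1.0 / (1.0 + math.exp(-log_odds))


def fill_gaps(candidates, predictors, prior=0.05, threshold=0.5):
    """Return the candidate assignments whose posterior reaches the threshold."""
    accepted = {}
    for key, votes in candidates.items():
        prob = posterior(prior, votes, predictors)
        if prob >= threshold:
            accepted[key] = prob
    return accepted


if __name__ == "__main__":
    # Illustrative predictors and candidates; all values are made up.
    predictors = [
        Predictor("homology", tpr=0.9, fpr=0.1),
        Predictor("context", tpr=0.8, fpr=0.2),
    ]
    candidates = {
        ("geneA", "rxn1"): {"homology": True, "context": True},
        ("geneB", "rxn1"): {"homology": True, "context": False},
    }
    for key, prob in fill_gaps(candidates, predictors).items():
        print(key, round(prob, 3))
```

The independence assumption keeps the combination to a sum of log-likelihood ratios; a real workflow would estimate the rates from curated training data and might well use a different model.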

