Loading, converting, and saving models

COBREXA can load models stored in .mat, .json, and .xml formats (with the latter denoting SBML formatted models).

We will primarily use the E. Coli "core" model to demonstrate the utilities found in COBREXA. First, let's download the model in several formats.

# Downloads the model files if they don't already exist
!isfile("e_coli_core.mat") &&
    download("http://bigg.ucsd.edu/static/models/e_coli_core.mat", "e_coli_core.mat");
!isfile("e_coli_core.json") &&
    download("http://bigg.ucsd.edu/static/models/e_coli_core.json", "e_coli_core.json");
!isfile("e_coli_core.xml") &&
    download("http://bigg.ucsd.edu/static/models/e_coli_core.xml", "e_coli_core.xml");

Now, load the package:

using COBREXA

Save bandwidth!

The published models usually do not change very often. It is therefore pretty useful to save them to a central location and load them from there. That saves your time, and does not unnecessarily consume the connectivity resources of the model repository.

Loading models

Load the models using the load_model function. Each model is able to "pretty-print" itself, hiding the inner complexity.

mat_model = load_model("e_coli_core.mat")

Metabolic model of type MATModel
sparse([9, 51, 55, 64, 65, 34, 44, 59, 66, 64  …  20, 22, 23, 25, 16, 17, 34, 44, 57, 59], [1, 1, 1, 1, 1, 2, 2, 2, 2, 3  …  93, 93, 94, 94, 95, 95, 95, 95, 95, 95], [1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, 1.0  …  1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0], 72, 95)
Number of reactions: 95
Number of metabolites: 72

json_model = load_model("e_coli_core.json")

Metabolic model of type JSONModel
sparse([9, 51, 55, 64, 65, 34, 44, 59, 66, 64  …  20, 22, 23, 25, 16, 17, 34, 44, 57, 59], [1, 1, 1, 1, 1, 2, 2, 2, 2, 3  …  93, 93, 94, 94, 95, 95, 95, 95, 95, 95], [1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, 1.0  …  1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0], 72, 95)
Number of reactions: 95
Number of metabolites: 72

sbml_model = load_model("e_coli_core.xml")

Metabolic model of type SBMLModel
sparse([41, 23, 51, 67, 61, 65, 1, 7, 19, 28  …  72, 3, 8, 33, 57, 66, 31, 45, 46, 57], [1, 2, 2, 2, 3, 3, 4, 4, 4, 4  …  93, 94, 94, 94, 94, 94, 95, 95, 95, 95], [-1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0  …  1.0, -1.0, 1.0, -1.0, 1.0, 1.0, 1.0, -1.0, -1.0, 1.0], 72, 95)
Number of reactions: 95
Number of metabolites: 72

Note: `load_model` infers the output type from the file extension

Notice how each model was read into memory as a model type corresponding to its file type, i.e. the file ending with .json loaded as a JSONModel, the file ending with .mat loaded as MATModel, and the file ending with .xml loaded as an SBMLModel.

You can directly inspect the model objects, although only with a specific way for each specific type.

JSON models contain their corresponding JSON:

json_model.json

Dict{String, Any} with 6 entries:
  "metabolites"  => Any[Dict{String, Any}("compartment"=>"e", "name"=>"D-Glucos…
  "id"           => "e_coli_core"
  "compartments" => Dict{String, Any}("c"=>"cytosol", "e"=>"extracellular space…
  "reactions"    => Any[Dict{String, Any}("name"=>"Phosphofructokinase", "metab…
  "version"      => "1"
  "genes"        => Any[Dict{String, Any}("name"=>"adhE", "id"=>"b1241", "notes…

SBML models contain a complicated structure from SBML.jl package:

typeof(sbml_model.sbml)

SBML.Model

MAT models contain MATLAB data:

mat_model.mat

Dict{String, Any} with 17 entries:
  "description" => "e_coli_core"
  "c"           => [0.0; 0.0; … ; 0.0; 0.0;;]
  "rev"         => [0; 0; … ; 1; 0;;]
  "mets"        => Any["glc__D_e"; "gln__L_c"; … ; "g3p_c"; "g6p_c";;]
  "grRules"     => Any["b3916 or b1723"; "((b0902 and b0903) and b2579) or (b09…
  "subSystems"  => Any["Glycolysis/Gluconeogenesis"; "Pyruvate Metabolism"; … ;…
  "b"           => [0.0; 0.0; … ; 0.0; 0.0;;]
  "metFormulas" => Any["C6H12O6"; "C5H10N2O3"; … ; "C3H5O6P"; "C6H11O9P";;]
  "rxnGeneMat"  => sparse([6, 10, 6, 11, 27, 82, 93, 94, 12, 12  …  37, 38, 38,…
  "S"           => [0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0…
  "metNames"    => Any["D-Glucose"; "L-Glutamine"; … ; "Glyceraldehyde 3-phosph…
  "lb"          => [0.0; 0.0; … ; -1000.0; 0.0;;]
  "metCharge"   => [0.0; 0.0; … ; -2.0; -2.0;;]
  "ub"          => [1000.0; 1000.0; … ; 1000.0; 1000.0;;]
  "rxnNames"    => Any["Phosphofructokinase"; "Pyruvate formate lyase"; … ; "O2…
  "rxns"        => Any["PFK"; "PFL"; … ; "O2t"; "PDH";;]
  "genes"       => Any["b1241"; "b0351"; … ; "b2935"; "b3919";;]

Using the generic interface to access model details

To prevent the complexities of object representation, COBREXA.jl uses a set of generic interface functions that extract various important information from all supported model types. This approach ensures that the analysis functions can work on any data.

For example, you can check the reactions and metabolites contained in SBML and JSON models using the same accessor:

reactions(json_model)

95-element Vector{String}:
 "PFK"
 "PFL"
 "PGI"
 "PGK"
 "PGL"
 "ACALD"
 "AKGt2r"
 "PGM"
 "PIt2r"
 "ALCD2x"
 ⋮
 "MALt2_2"
 "MDH"
 "ME1"
 "ME2"
 "NADH16"
 "NADTRHD"
 "NH4t"
 "O2t"
 "PDH"

reactions(sbml_model)

95-element Vector{String}:
 "R_EX_fum_e"
 "R_ACONTb"
 "R_TPI"
 "R_SUCOAS"
 "R_GLNS"
 "R_EX_pi_e"
 "R_PPC"
 "R_O2t"
 "R_G6PDH2r"
 "R_TALA"
 ⋮
 "R_THD2"
 "R_EX_h2o_e"
 "R_GLUSy"
 "R_ME1"
 "R_GLUN"
 "R_EX_o2_e"
 "R_FRUpts2"
 "R_ALCD2x"
 "R_PIt2r"

issetequal(reactions(json_model), reactions(mat_model)) # do models contain the same reactions?

true

All accessors are defined in a single file in COBREXA source code; you may therefore get a list of all accessors as follows:

using InteractiveUtils

for method in filter(
    x -> endswith(string(x.file), "MetabolicModel.jl"),
    InteractiveUtils.methodswith(MetabolicModel, COBREXA),
)
    println(method.name)
end

balance
bounds
coupling
coupling_bounds
gene_annotations
gene_name
gene_notes
genes
metabolite_annotations
metabolite_charge
metabolite_compartment
metabolite_formula
metabolite_name
metabolite_notes
metabolites
n_coupling_constraints
n_genes
n_metabolites
n_reactions
objective
precache!
reaction_annotations
reaction_gene_association
reaction_name
reaction_notes
reaction_stoichiometry
reaction_subsystem
reactions
stoichiometry

Converting between model types

It is possible to convert model types to-and-fro. To do this, use the convert function, which is overloaded from Julia's Base.

Data loss may occur when converting between models

The conversion of models only uses the data accessible through the generic accessors. Other data may get lost.

m = convert(MATModel, json_model)

Metabolic model of type MATModel
sparse([9, 51, 55, 64, 65, 34, 44, 59, 66, 64  …  20, 22, 23, 25, 16, 17, 34, 44, 57, 59], [1, 1, 1, 1, 1, 2, 2, 2, 2, 3  …  93, 93, 94, 94, 95, 95, 95, 95, 95, 95], [1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, 1.0  …  1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0], 72, 95)
Number of reactions: 95
Number of metabolites: 72

m will now contain the MATLAB-style matrix representation of the model:

Matrix(m.mat["S"])

72×95 Matrix{Float64}:
  0.0  0.0   0.0  0.0   0.0  0.0   0.0  …  0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0  …  0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0  -1.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  1.0  0.0   0.0  0.0   1.0  1.0   1.0     0.0  0.0  -4.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0  -1.0     0.0  0.0   3.0  0.0  0.0  0.0  0.0
  ⋮                          ⋮          ⋱             ⋮                   
 -1.0  0.0   1.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  1.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  1.0   0.0  0.0   0.0  0.0   0.0  …  0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0   0.0  0.0   0.0  0.0   0.0  …  0.0  0.0   0.0  0.0  0.0  0.0  0.0
  0.0  0.0  -1.0  0.0   0.0  0.0   0.0     0.0  0.0   0.0  0.0  0.0  0.0  0.0

The loading and conversion can be combined using a shortcut:

m = load_model(MATModel, "e_coli_core.json")

Metabolic model of type MATModel
sparse([9, 51, 55, 64, 65, 34, 44, 59, 66, 64  …  20, 22, 23, 25, 16, 17, 34, 44, 57, 59], [1, 1, 1, 1, 1, 2, 2, 2, 2, 3  …  93, 93, 94, 94, 95, 95, 95, 95, 95, 95], [1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, 1.0  …  1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0], 72, 95)
Number of reactions: 95
Number of metabolites: 72

Saving and exporting models

COBREXA.jl supports exporting the models in JSON and MAT format, using save_model.

save_model(m, "converted_model.json")
save_model(m, "converted_model.mat")

If you need a non-standard suffix, use the type-specific saving functions:

save_json_model(m, "file.without.a.good.suffix")
save_mat_model(m, "another.file.matlab")

If you are saving the models only for future processing in Julia environment, it is often wasteful to encode the models to external formats and decode them back. Instead, you can use the "native" Julia data format, accessible with package Serialization.

This way, you can use serialize to save even the StandardModel that has no file format associated:

using Serialization

sm = convert(StandardModel, m)

open(f -> serialize(f, sm), "myModel.stdmodel", "w")

The models can then be loaded back using deserialize:

sm2 = deserialize("myModel.stdmodel")
issetequal(metabolites(sm), metabolites(sm2))

true

This form of loading operation is usually pretty quick:

t = @elapsed deserialize("myModel.stdmodel")
@info "Deserialization took $t seconds"

[ Info: Deserialization took 0.002182626 seconds

Notably, large and complicated models with thousands of reactions and annotations can take seconds to decode properly. Serialization allows you to almost completely remove this overhead, and scales well to tens of millions of reactions.

This page was generated using Literate.jl.