Gene knockouts
Here we will use the knockout
function to modify the optimization model before solving, in order to simulate genes knocked out. We can pass knockout
to many analysis functions that support parameter modifications
, including flux_balance_analysis
, flux_variability_analysis
, and others.
Deleting a single gene
!isfile("e_coli_core.xml") &&
download("http://bigg.ucsd.edu/static/models/e_coli_core.xml", "e_coli_core.xml")
using COBREXA, GLPK
model = load_model("e_coli_core.xml")
Metabolic model of type SBMLModel sparse([8, 10, 21, 43, 50, 51, 8, 9, 6, 12 … 33, 66, 68, 72, 23, 26, 33, 72, 22, 33], [1, 1, 1, 1, 1, 1, 2, 2, 3, 3 … 93, 93, 93, 93, 94, 94, 94, 94, 95, 95], [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0 … 1.0, -1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0], 72, 95) Number of reactions: 95 Number of metabolites: 72
First, let's compute the "original" flux, with no knockouts.
original_flux = flux_balance_analysis_dict(model, GLPK.Optimizer);
One can find gene IDs that we can knock out using genes
and gene_name
functions:
genes(model)
137-element Vector{String}: "G_b0008" "G_b0114" "G_b0115" "G_b0116" "G_b0118" "G_b0351" "G_b0356" "G_b0451" "G_b0474" "G_b0485" ⋮ "G_b4122" "G_b4151" "G_b4152" "G_b4153" "G_b4154" "G_b4232" "G_b4301" "G_b4395" "G_s0001"
It is possible to sort the genes by gene name to allow easier lookups:
sort(gene_name.(Ref(model), genes(model)) .=> genes(model))
137-element Vector{Pair{String, String}}: "aceA" => "G_b4015" "aceB" => "G_b4014" "aceE" => "G_b0114" "aceF" => "G_b0115" "ackA" => "G_b2296" "acnA" => "G_b1276" "acnB" => "G_b0118" "adhE" => "G_b1241" "adhP" => "G_b1478" "adk" => "G_b0474" ⋮ "talB" => "G_b0008" "tdcD" => "G_b3115" "tdcE" => "G_b3114" "tktA" => "G_b2935" "tktB" => "G_b2465" "tpiA" => "G_b3919" "ydjI" => "G_b1773" "ytjC" => "G_b4395" "zwf" => "G_b1852"
Compute the flux with a genes knocked out:
flux_with_knockout =
flux_balance_analysis_dict(model, GLPK.Optimizer, modifications = [knockout("G_b3236")])
Dict{String, Float64} with 95 entries: "R_EX_fum_e" => 0.0 "R_ACONTb" => 8.33976 "R_GLNS" => 0.211162 "R_SUCOAS" => -7.44879 "R_TPI" => 9.17856 "R_EX_pi_e" => -3.03794 "R_PPC" => 9.81525 "R_O2t" => 23.9021 "R_G6PDH2r" => 1.53023e-15 "R_TALA" => -0.147739 "R_PPCK" => -1.24923e-14 "R_EX_lac__D_e" => 2.65889e-14 "R_PGL" => 1.45805e-15 "R_H2Ot" => -30.8724 "R_GLNabc" => 0.0 "R_EX_co2_e" => 24.8568 "R_EX_gln__L_e" => 0.0 "R_EX_nh4_e" => -4.50303 "R_MALt2_2" => 0.0 ⋮ => ⋮
We can see there is a small decrease in production upon knocking out the gene:
biomass_id = "R_BIOMASS_Ecoli_core_w_GAM"
flux_with_knockout[biomass_id] / original_flux[biomass_id]
0.9449581959159197
Similarly, we can explore how the flux variability has changed once the gene is knocked out:
variability_with_knockout =
flux_variability_analysis(model, GLPK.Optimizer, modifications = [knockout("G_b3236")])
95×2 Matrix{Float64}: 2.9513e-13 -2.67554e-15 2.97805e-13 0.0 6.12843e-13 4.26326e-14 8.33976 8.33976 8.33976 8.33976 6.12843e-13 4.26326e-14 3.67464 3.67464 7.44879 7.44879 1.53157e-13 0.0 2.40799e-13 5.35109e-15 ⋮ 2.08722e-13 6.57758e-11 0.0 6.57758e-11 7.44879 1000.0 -7.44879 -7.44879 -0.147739 -0.147739 0.0 9.86631e-11 -0.147739 -0.147739 -0.44586 -0.44586 9.17856 9.17856
Knocking out multiple genes
Multiple genes can be knocked out by simply passing a vector of genes to the knockout modification. This knocks out all genes that can run the FBA reaction:
reaction_gene_association(model, "R_FBA")
3-element Vector{Vector{String}}: ["G_b1773"] ["G_b2097"] ["G_b2925"]
flux_with_double_knockout = flux_balance_analysis_dict(
model,
GLPK.Optimizer,
modifications = [knockout(["G_b2097", "G_b1773", "G_b2925"])],
)
Dict{String, Float64} with 95 entries: "R_EX_fum_e" => 0.0 "R_ACONTb" => 0.759585 "R_GLNS" => 0.180022 "R_SUCOAS" => -1.36381e-15 "R_TPI" => 6.75636e-16 "R_EX_pi_e" => -2.58994 "R_PPC" => 2.01749 "R_O2t" => 27.5263 "R_G6PDH2r" => 27.8991 "R_TALA" => 9.17374 "R_PPCK" => 4.66459e-15 "R_EX_lac__D_e" => 1.39942e-15 "R_PGL" => 27.8991 "R_H2Ot" => -31.7697 "R_GLNabc" => 0.0 "R_EX_co2_e" => 26.6412 "R_EX_gln__L_e" => 0.0 "R_EX_nh4_e" => -3.83897 "R_MALt2_2" => 0.0 ⋮ => ⋮
flux_with_double_knockout[biomass_id] / original_flux[biomass_id]
0.8056066159777874
Processing all single gene knockouts
Function screen
provides a parallelizable and extensible way to run the flux balance analysis with the knockout over all genes:
knockout_fluxes = screen(
model,
args = tuple.(genes(model)),
analysis = (m, gene) -> begin
res = flux_balance_analysis_dict(m, GLPK.Optimizer, modifications = [knockout(gene)])
if !isnothing(res)
res[biomass_id]
end
end,
)
137-element Vector{Union{Nothing, Float64}}: 0.8739215069684118 0.7966959254309448 0.7966959254309448 0.7823510529477573 0.8739215069684118 0.8739215069684118 0.8739215069684118 0.8739215069684118 0.8739215069684118 0.8739215069684118 ⋮ 0.8739215069684118 0.8739215069684213 0.8739215069684213 0.8739215069684213 0.8739215069684213 0.8739215069684118 0.8739215069684118 0.8739215069684118 0.21114065257211534
It is useful to display the biomass growth rates of the knockout models together with the gene name:
sort(gene_name.(Ref(model), genes(model)) .=> knockout_fluxes, by = first)
137-element Vector{Pair{String}}: "aceA" => 0.8739215069684099 "aceB" => 0.8739215069684118 "aceE" => 0.7966959254309448 "aceF" => 0.7966959254309448 "ackA" => 0.8739215069684118 "acnA" => 0.8739215069684118 "acnB" => 0.8739215069684118 "adhE" => 0.8739215069684118 "adhP" => 0.8739215069684118 "adk" => 0.8739215069684118 ⋮ "talB" => 0.8739215069684118 "tdcD" => 0.8739215069684118 "tdcE" => 0.8739215069684118 "tktA" => 0.8739215069684118 "tktB" => 0.8739215069684118 "tpiA" => 0.7040369478590172 "ydjI" => 0.8739215069684118 "ytjC" => 0.8739215069684118 "zwf" => 0.8638133095039895
Processing all multiple-gene deletions
Double gene knockouts
Since we can generate any kind of argument matrix for screen
to process, it is straightforward to generate the matrix of all double gene knockouts and let the function process it. This computes the biomass production of all double-gene knockouts:
gene_groups = [[g1, g2] for g1 in genes(model), g2 in genes(model)];
double_knockout_fluxes = screen(
model,
args = tuple.(gene_groups),
analysis = (m, gene_groups) -> begin
res = flux_balance_analysis_dict(
m,
GLPK.Optimizer,
modifications = [knockout(gene_groups)],
)
if !isnothing(res)
res[biomass_id]
end
end,
)
137×137 Matrix{Union{Nothing, Float64}}: 0.873922 0.796696 0.796696 0.782351 … 0.873922 0.873922 0.211141 0.796696 0.796696 0.796696 0.782351 0.796696 0.796696 0.210794 0.796696 0.796696 0.796696 0.782351 0.796696 0.796696 0.210794 0.782351 0.782351 0.782351 0.782351 0.782351 0.782351 0.210794 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 … 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 -6.45106e-16 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 ⋮ ⋱ ⋮ 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 … 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 0.873922 0.873922 0.211141 0.873922 0.796696 0.796696 0.782351 … 0.873922 0.873922 0.211141 0.211141 0.210794 0.210794 0.210794 0.211141 0.211141 0.211141
The results can be converted to an easily scrutinizable form as follows:
reshape([gene_name.(Ref(model), p) for p in gene_groups] .=> double_knockout_fluxes, :)
18769-element Vector{Pair{Vector{String}}}: ["talB", "talB"] => 0.8739215069684118 ["aceE", "talB"] => 0.7966959254309448 ["aceF", "talB"] => 0.7966959254309448 ["lpd", "talB"] => 0.7823510529477573 ["acnB", "talB"] => 0.8739215069684118 ["mhpF", "talB"] => 0.8739215069684118 ["frmA", "talB"] => 0.8739215069684118 ["amtB", "talB"] => 0.8739215069684118 ["adk", "talB"] => 0.8739215069684118 ["glsA", "talB"] => 0.8739215069684118 ⋮ ["fumB", "s0001"] => 0.21114065257211534 ["frdD", "s0001"] => 0.21114065257211645 ["frdC", "s0001"] => 0.21114065257211645 ["frdB", "s0001"] => 0.21114065257211645 ["frdA", "s0001"] => 0.21114065257211645 ["fbp", "s0001"] => 0.21114065257211534 ["sgcE", "s0001"] => 0.21114065257211534 ["ytjC", "s0001"] => 0.21114065257211534 ["s0001", "s0001"] => 0.21114065257211534
Triple gene knockouts (and others)
We can extend the same analysis to triple or other gene knockouts by generating a different array of gene pairs. For example, one can generate gene_groups for triple gene deletion screening:
gene_groups = [[g1, g2, g3] for g1 in genes(model), g2 in genes(model), g3 in genes(model)];
!!! warning Full triple gene deletion analysis may take a long time to compute. We may use parallel processing with screen
to speed up the analysis. Alternatively, process only a subset of the genes triples.
This page was generated using Literate.jl.