Title: | Calculates Microbial Lag Duration (on the Population Level) from Provided Growth Curve Data |
---|---|
Description: | Microbial growth is often measured by growth curves, i.e. tables of population sizes and measurement times. This package uses such growth curve data to determine the duration of the "microbial lag phase", i.e. the time microbes need to restart divisions. It implements the most commonly used methods for calculating the lag duration; these methods are discussed and described in Opalek et al. 2022. Citation: "How to determine microbial lag phase duration?", M. Opalek, B. Smug, D. Wloch-Salamon (2022) <doi:10.1101/2022.11.16.516631>. |
Authors: | Bogna Smug [aut, cre] |
Maintainer: | Bogna Smug <[email protected]> |
License: | GPL-3 |
Version: | 1.0.4 |
Built: | 2024-11-05 04:12:32 UTC |
Source: | https://github.com/cran/miLAG |
Calculates the lag by fitting the Baranyi model to the data
calc_baranyi_fit_lag( data, n0, init_lag = NULL, init_gr_rate = NULL, algorithm = "auto", max_iter = 100 )
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
n0 |
a data frame describing initial biomass for each of the curves, i.e. it has two obligatory columns: "curve_id", "N0" |
init_lag |
initial value for the lag parameter, defaults to NULL in which case it will be approximated based on the data |
init_gr_rate |
initial value for the growth rate, defaults to NULL in which case it will be approximated based on the data |
algorithm |
e.g. "auto", "Levenberg-Marquardt", "port"; defaults to "auto" |
max_iter |
Maximum number of iterations, defaults to 100 |
growth curve data with additional columns ('lag', and predicted biomass 'predicted')
The main function that calculates lags based on growth curve data, the selected method and parameters. It returns the growth curve data frame extended by multiple columns with parameters related to the lag calculation.
calc_lag(data, method, pars)
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
method |
method of lag calculation; choose one of the following: "exponential", "biomass increase", "max growth acceleration", "parameter fitting to a model" |
pars |
a list of parameters. The get_def_pars function can be used to get the default ones. Otherwise create your own list with the following names:
- model: if method = "parameter fitting to a model", one of the following models needs to be chosen: "logistic", "baranyi"
- n0_method: "first.observation" if the first point is taken as the initial biomass, or "minimal.observation" if the minimal biomass is taken as the initial point. In "healthy" growth curves these options should be equivalent, but sometimes a drop in OD/biomass is observed at the beginning of a growth curve. In this case it is not obvious what to assume the initial biomass is.
- tangent_method: "local.regression" (if the tangent is fitted to a number of points around the maximal growth rate) or "to.point" (if the tangent is fitted only to the point where the growth rate is maximal); defaults to "to.point"
- threshold: a value of the biomass increase that we can surely associate with the end of the lag phase rather than random variation during the lag; defaults to 10^2
- curve_points: if tangent_method = "local.regression", the number of points the line is fitted to; defaults to 3, i.e. the point with the maximal uptake rate, one point before and one point after
- init_gr_rate: initial value for the growth rate, used if a logistic model is fitted; defaults to NULL, in which case the initial value will be based on the data
- init_lag: initial value for the lag, used if a logistic model is fitted; defaults to NULL, in which case the initial value will be based on the data
- algorithm: if method = "parameter fitting to a model", the nls algorithm used to run the model fit; defaults to "auto", which will choose the best between bounded and unbounded "Levenberg-Marquardt" and bounded "port"
- max_iter: if method = "parameter fitting to a model", the maximum number of nls iterations; defaults to 100 |
growth curve data (time, biomass, curve_id) with the following additional columns: log_biomass, lag, line_slope, line_intercept, lag_calc_method, predicted_data, diff, second_deriv_b, tangent_point, threshold
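As a rough illustration of the pars argument, the list could be assembled by hand as below. This is only a sketch: the element names are taken from the description above, the choice of "logistic" is illustrative, and `growth_data` in the commented-out call is a hypothetical data frame with "time" and "biomass" columns.

```r
# Hand-built parameter list; element names follow the pars description.
pars <- list(
  model = "logistic",               # illustrative choice, "baranyi" is the alternative
  n0_method = "first.observation",
  tangent_method = "to.point",
  threshold = 10^2,
  curve_points = 3,
  init_gr_rate = NULL,              # NULL: initial value derived from the data
  init_lag = NULL,                  # NULL: initial value derived from the data
  algorithm = "auto",
  max_iter = 100
)

# With the package installed, the call would then look like this
# (growth_data is a hypothetical data frame):
# result <- calc_lag(growth_data, method = "parameter fitting to a model", pars = pars)
```

Note that `list(name = NULL)` keeps the element in the list, so `init_gr_rate` and `init_lag` are present with a NULL value rather than dropped.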
Runs nlsLM/nls algorithms with three different parameter setups to fit the best Baranyi model parameters to our data and chooses the best model
calc_lag_fit_to_baranyi_with_lag( gr_curve, LOG10N0 = NULL, init_lag = NULL, init_mumax = NULL, init_LOG10Nmax = NULL, algorithm = "auto", max_iter = 100, lower_bound = c(0, 0, 0, 0) )
gr_curve |
data from one specific growth curve with these two columns: time and biomass |
LOG10N0 |
the decimal logarithm of initial biomass |
init_lag |
initial value for the lag parameter |
init_mumax |
initial value for the mumax parameter |
init_LOG10Nmax |
initial value for the LOG10Nmax parameter |
algorithm |
defaults to "auto" which chooses between bounded and unbounded Levenberg-Marquardt method and the bounded port method |
max_iter |
max. number of iterations; defaults to 100 |
lower_bound |
lower bound for the bounded nls optimization; defaults to 0 |
lag and the nls fitting object with parameters fitted to the Baranyi model
Runs nlsLM/nls algorithm of the user's choice to fit the Logistic model parameters to our data
calc_lag_fit_to_logistic_with_lag( gr_curve, n0, init_gr_rate = init_gr_rate, init_K = init_K, init_lag = init_lag, algorithm = "auto", max_iter = 100, lower_bound = c(0, 0, 0) )
gr_curve |
data from one specific growth curve with these two columns: time and biomass |
n0 |
the initial biomass |
init_gr_rate |
initial value for the growth rate |
init_K |
initial value for the saturation parameter K |
init_lag |
initial value for the lag parameter |
algorithm |
defaults to "auto" which chooses between bounded and unbounded Levenberg-Marquardt method and the bounded port method |
max_iter |
max. number of iterations; defaults to 100 |
lower_bound |
lower bound for the bounded nls optimization; defaults to 0 |
lag and the nls fitting object with parameters fitted to logistic model
Calculates lag based on fitting logistic model to data
calc_lagistic_fit_lag( data, n0, init_gr_rate = NULL, init_K = NULL, init_lag = NULL, algorithm, max_iter, return_all_params = FALSE, min_b = 0.2, min_a = 0.8 )
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
n0 |
a data frame describing initial biomass for each of the curves, i.e. it has two obligatory columns: "curve_id", "N0" |
init_gr_rate |
initial value for the growth rate, defaults to NULL in which case it will be approximated based on the data |
init_K |
initial value for the saturation parameter K, defaults to NULL in which case it will be approximated based on the data |
init_lag |
initial value for the lag parameter, defaults to NULL in which case it will be approximated based on the data |
algorithm |
e.g. "auto", "Levenberg-Marquardt", "port" |
max_iter |
Maximum number of iterations |
return_all_params |
defaults to FALSE; set to TRUE if you also want to get K and the growth rate in addition to the lag |
min_b |
defaults to 0.2; min_a and min_b define where to look for the exponential phase: it will be where the biomass is between min + (max-min)*min_b and min + (max-min)*min_a |
min_a |
defaults to 0.8 |
growth curve data with additional columns ('lag', and predicted biomass 'predicted'), and the fitting object if return_all_params was set to TRUE
Runs nlsLM/nls algorithms with three different parameter setups to fit the best Baranyi parameters to our data and chooses the best model
choose_lag_fit_algorithm_baranyi( gr_curve, LOG10N0, init_lag, init_mumax, init_LOG10Nmax, max_iter, lower_bound )
gr_curve |
data from one specific growth curve with the following columns: LOG10N, t |
LOG10N0 |
init value for the LOG10N0 parameter |
init_lag |
initial value for the lag |
init_mumax |
initial value for the mumax parameter |
init_LOG10Nmax |
initial value for the LOG10Nmax parameter |
max_iter |
max. number of iterations |
lower_bound |
lower bound for the bounded nls optimization |
the best nls fitting object with parameters fitted to Baranyi model (lowest Res.Sum Sq provided that all coefficients are nonnegative)
Runs nlsLM/nls algorithms with three different parameter setups to fit the best Logistic model parameters to our data and chooses the best model
choose_lag_fit_algorithm_logistic( gr_curve, n0, init_gr_rate = init_gr_rate, init_K = init_K, init_lag = init_lag, max_iter = 100, lower_bound = c(0, 0, 0) )
gr_curve |
data from one specific growth curve with the following columns: LOG10N, t |
n0 |
the initial biomass |
init_gr_rate |
initial value for the growth rate |
init_K |
initial value for the saturation parameter K |
init_lag |
initial value for the lag parameter |
max_iter |
max. number of iterations; defaults to 100 |
lower_bound |
lower bound for the bounded nls optimization; defaults to 0 |
the best nls fitting object with parameters fitted to logistic model (lowest Res.Sum Sq provided that all coefficients are nonnegative)
Compares results of 3 objects obtained from running nls
compare_algorithms(nls_LM_no_bound, nls_PORT, nlsres_LM)
nls_LM_no_bound |
first object resulting from running nls |
nls_PORT |
second object resulting from running nls |
nlsres_LM |
third object resulting from running nls |
the best fitting object (lowest Res.Sum Sq provided that all coefficients are nonnegative)
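The selection rule stated above (lowest residual sum of squares among fits whose coefficients are all nonnegative) can be sketched in base R. The helper `pick_best_fit` and the toy data are hypothetical and are not the package's implementation; only `stats::nls` is used here, whereas the package also relies on nlsLM.

```r
# Hypothetical helper: among the supplied nls fits, keep those with all
# coefficients >= 0 and return the one with the lowest residual sum of squares.
pick_best_fit <- function(fits) {
  ok <- Filter(function(f) !is.null(f) && all(coef(f) >= 0), fits)
  if (length(ok) == 0) return(NULL)
  rss <- vapply(ok, deviance, numeric(1))  # deviance() of an nls fit is its RSS
  ok[[which.min(rss)]]
}

# Toy data: a noisy straight line (made-up numbers).
x <- 1:10
y <- 2 + 0.5 * x + rep(c(0.1, -0.1, 0.05, -0.05, 0.08), 2)

fit_unbounded <- nls(y ~ a + b * x, start = list(a = 1, b = 1))
fit_port <- nls(y ~ a + b * x, start = list(a = 1, b = 1),
                algorithm = "port", lower = c(a = 0, b = 0))
fit_poor <- nls(y ~ b * x, start = list(b = 1))  # deliberately worse model

best <- pick_best_fit(list(fit_unbounded, fit_port, fit_poor))
```

Passing NULL for a fit that failed to converge is tolerated by the helper, which is one plausible way to handle optimizers that error out on difficult curves.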
cut_the_data Subsets the data frame so that it contains only the observations up to the specified maximum time
cut_the_data(data, max_time)
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
max_time |
max. time at which we want to cut the growth curve data |
cut data
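The effect of such a cut can be pictured with a plain base R subset. This is a sketch with made-up numbers, and it assumes that observations at exactly max_time are kept, which the description does not state explicitly.

```r
# Made-up growth curve data.
growth <- data.frame(
  time = c(0, 1, 2, 3, 4, 5),
  biomass = c(10, 11, 15, 30, 60, 110)
)
max_time <- 3

# Keep observations up to (and, by assumption, including) max_time.
growth_cut <- growth[growth$time <= max_time, ]
```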
Fits the lag to multiple growth curves based on the basic tangent method
fit_exp_lag(data, tangent_method, n0, curve_points = 3)
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
tangent_method |
"local.regression" (if the tangent is fitted to a number of points around the maximal growth rate) or "to.point" (if the tangent is fitted only to the point where the growth rate is maximal); defaults to "to.point" |
n0 |
the initial biomass (a tangent line crossing N0 line will determine the lag) |
curve_points |
if tangent_method = "local.regression" then curve_points is the number of points the line is fitted to; defaults to 3, i.e. the point with the maximal uptake rate, one point before and one point after |
growth curve data (as input) together with additional columns: lag, line.intercept and line.slope
Fits the lag to one growth curve based on the basic tangent method
fit_exp_lag_to_curve(data, n0, tangent_method = "to.point", curve_points = 3)
data |
a data frame with two required columns, "time" and "biomass". This is data from one growth curve only, with one (mean) observation per time point |
n0 |
the initial biomass (a tangent line crossing N0 line will determine the lag) |
tangent_method |
"local.regression" (if the tangent is fitted to a number of points around the maximal growth rate) or "to.point" (if the tangent is fitted only to the point where the growth rate is maximal); defaults to "to.point" |
curve_points |
if tangent_method = "local.regression" then curve_points is the number of points the line is fitted to; defaults to 3 i.e. the point with the maximal uptake rate one point before and one point after |
line_slope: slope of the tangent line, line_intercept: intercept of the tangent line, lag: lag, tangent_points: a data frame of all points selected for fitting the line
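The tangent idea can be sketched in base R for a single curve. The sketch assumes a "to.point"-style tangent, estimates the growth rate by finite differences of the natural-log biomass, and uses made-up numbers; the package's own derivative estimate and log base may differ. The resulting lag does not depend on the log base.

```r
# Made-up single growth curve: flat for two hours, then doubling every hour.
time <- c(0, 1, 2, 3, 4, 5, 6)
biomass <- c(100, 100, 100, 200, 400, 800, 1600)
n0 <- biomass[1]

log_biomass <- log(biomass)
# Growth rate on each interval between consecutive observations.
rate <- diff(log_biomass) / diff(time)
i <- which.max(rate)                      # an interval with the maximal rate

line_slope <- rate[i]
# The tangent passes through the right end point of that interval.
line_intercept <- log_biomass[i + 1] - line_slope * time[i + 1]

# Lag: the time at which the tangent line crosses the log(n0) level.
lag <- (log(n0) - line_intercept) / line_slope
```

For this curve the tangent crosses the initial biomass level at time 2, which matches the point where doubling starts.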
Fits the lag to multiple growth curves based on the max growth acceleration method. It finds where the second derivative is the largest
fit_max_infl_lag(data)
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
growth curve data (as input) together with additional columns: lag, log.biomass, time.diff, time.av, second.deriv.b, biomass.increase
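A minimal base R sketch of the max growth acceleration idea follows. It assumes the second derivative of log biomass is estimated by repeated finite differences and uses made-up numbers; the package's exact differencing scheme is not shown in this documentation.

```r
# Made-up single growth curve.
time <- c(0, 1, 2, 3, 4, 5, 6)
biomass <- c(100, 101, 102, 110, 220, 440, 880)
log_biomass <- log(biomass)

# First derivative, located at the interval midpoints.
d1 <- diff(log_biomass) / diff(time)
t_mid <- (time[-length(time)] + time[-1]) / 2

# Second derivative, located between consecutive midpoints.
d2 <- diff(d1) / diff(t_mid)
t_d2 <- (t_mid[-length(t_mid)] + t_mid[-1]) / 2

# Lag: the time at which the second derivative is the largest.
lag <- t_d2[which.max(d2)]
```

Because second differences amplify measurement noise, smoothing the curve first is often worthwhile before applying this kind of estimate.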
Runs the main function that calculates lags from growth curve data using all available methods.
get_all_methods_lag(data, biomass_incr_threshold, pars = NULL)
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
biomass_incr_threshold |
A value of the biomass increase that we can surely associate with the end of the lag phase rather than random variation during the lag. Needs to be set specifically to avoid unconscious use of the value set by default. If set to NULL, the value from pars will be taken |
pars |
a list of parameters; defaults to the ones set by the get_def_pars function. Otherwise create your own list with the following names:
- model: if method = "parameter fitting to a model", one of the following models needs to be chosen: "logistic", "baranyi"
- n0_method: "first.observation" if the first point is taken as the initial biomass, or "minimal.observation" if the minimal biomass is taken as the initial point. In "healthy" growth curves these options should be equivalent, but sometimes a drop in OD/biomass is observed at the beginning of a growth curve. In this case it is not obvious what to assume the initial biomass is.
- tangent_method: "local.regression" (if the tangent is fitted to a number of points around the maximal growth rate) or "to.point" (if the tangent is fitted only to the point where the growth rate is maximal); defaults to "to.point"
- threshold: a value of the biomass increase that we can surely associate with the end of the lag phase rather than random variation during the lag; defaults to 10^2
- curve_points: if tangent_method = "local.regression", the number of points the line is fitted to; defaults to 3, i.e. the point with the maximal uptake rate, one point before and one point after
- init_gr_rate: initial value for the growth rate, used if a logistic model is fitted; defaults to NULL, in which case the initial value will be based on the data
- init_lag: initial value for the lag, used if a logistic model is fitted; defaults to NULL, in which case the initial value will be based on the data
- algorithm: if method = "parameter fitting to a model", the nls algorithm used to run the model fit; defaults to "auto", which will choose the best between bounded and unbounded "Levenberg-Marquardt" and bounded "port"
- max_iter: if method = "parameter fitting to a model", the maximum number of nls iterations; defaults to 100 |
growth curve data (time, biomass, curve_id) with the column: lag_calculation_method, and with the following additional columns: log_biomass, lag, line_slope, line_intercept, lag_calculation_method, predicted_data, diff, second_deriv_b, tangent_point, threshold Note that each growth curve will appear
get_def_pars Sets the default parameters used by the calc_lag function
get_def_pars()
list of parameters
Finds a reasonable approximation for the Baranyi growth curve parameters (init_mumax, lag) based on the growth curve and some initial values. These approximations will be used as the initial values for the proper optimization algorithm run later.
get_init_pars_baranyi( data_this_curve, this_n0, init_lag, init_gr_rate, min_b = 0.2, min_a = 0.8 )
data_this_curve |
data from one specific growth curve with these two columns: time and biomass |
this_n0 |
the initial biomass |
init_lag |
initial value for the lag parameter |
init_gr_rate |
initial value for the growth rate |
min_b |
defaults to 0.2; min_a and min_b define where to look for the exponential phase: it will be where the biomass is between min + (max-min)*min_b and min + (max-min)*min_a |
min_a |
defaults to 0.8 |
list of parameters: init_mumax, init_lag
Finds a reasonable approximation for the logistic growth curve parameters (K, lag, growth rate) based on the growth curve and some initial values. These approximations will be used as the initial values for the proper optimization algorithm run later.
get_init_pars_logistic( data_this_curve, this_n0, init_K, init_lag, init_gr_rate, min_b = 0.2, min_a = 0.8 )
data_this_curve |
data from one specific growth curve with these two columns: time and biomass |
this_n0 |
the initial biomass |
init_K |
initial value for the saturation parameter K |
init_lag |
initial value for the lag parameter |
init_gr_rate |
initial value for the growth rate |
min_b |
defaults to 0.2; min_a and min_b define where to look for the exponential phase: it will be where the biomass is between min + (max-min)*min_b and min + (max-min)*min_a |
min_a |
defaults to 0.8 |
list of parameters: init_K, init_lag, init_gr_rate
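The window used to locate the exponential phase can be sketched as follows. This assumes min_b gives the lower fraction and min_a the upper fraction of the biomass range (consistent with the defaults 0.2 and 0.8) and that the bounds are inclusive; the numbers are made up.

```r
# Made-up biomass values from one growth curve.
biomass <- c(10, 12, 20, 40, 80, 95, 100)
min_b <- 0.2
min_a <- 0.8

biomass_range <- max(biomass) - min(biomass)
lower_limit <- min(biomass) + biomass_range * min_b
upper_limit <- min(biomass) + biomass_range * min_a

# Observations assumed to lie in the exponential phase.
in_exp_phase <- biomass >= lower_limit & biomass <= upper_limit
exp_phase_biomass <- biomass[in_exp_phase]
```

Here only the observations 40 and 80 fall inside the window, so an initial growth rate would be estimated from that part of the curve.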
The most basic function that calculates lags based on growth curve data, selected method and parameters. It uses calc_lag function and strips the results to only get lag parameter for each growth curve id.
get_lag(data, method, pars)
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
method |
method of lag calculation; choose one of the following: "exponential", "biomass increase", "max growth acceleration", "parameter fitting to a model" |
pars |
a list of parameters. The get_def_pars function can be used to get the default ones. Otherwise create your own list with the following names:
- model: if method = "parameter fitting to a model", one of the following models needs to be chosen: "logistic", "baranyi"
- n0_method: "first.observation" if the first point is taken as the initial biomass, or "minimal.observation" if the minimal biomass is taken as the initial point. In "healthy" growth curves these options should be equivalent, but sometimes a drop in OD/biomass is observed at the beginning of a growth curve. In this case it is not obvious what to assume the initial biomass is.
- tangent_method: "local.regression" (if the tangent is fitted to a number of points around the maximal growth rate) or "to.point" (if the tangent is fitted only to the point where the growth rate is maximal); defaults to "to.point"
- threshold: a value of the biomass increase that we can surely associate with the end of the lag phase rather than random variation during the lag; defaults to 10^2
- curve_points: if tangent_method = "local.regression", the number of points the line is fitted to; defaults to 3, i.e. the point with the maximal uptake rate, one point before and one point after
- init_gr_rate: initial value for the growth rate, used if a logistic model is fitted; defaults to NULL, in which case the initial value will be based on the data
- init_lag: initial value for the lag, used if a logistic model is fitted; defaults to NULL, in which case the initial value will be based on the data
- algorithm: if method = "parameter fitting to a model", the nls algorithm used to run the model fit; defaults to "auto", which will choose the best between bounded and unbounded "Levenberg-Marquardt" and bounded "port"
- max_iter: if method = "parameter fitting to a model", the maximum number of nls iterations; defaults to 100 |
lag per each curve_id
Gets the initial biomass to relate to
get_n0(biomass, n0_method)
biomass |
vector of biomass (chronologically ordered as in growth curve) |
n0_method |
"first.observation" if the first point is taken as the initial biomass, or "minimal.observation" if the minimal biomass is taken as the initial point. In "healthy" growth curves these options should be equivalent, but sometimes a drop in OD/biomass is observed at the beginning of a growth curve. In this case it is not obvious what to assume the initial biomass is. |
a value of the initial biomass (either the first observation or the minimum value, depending on the parameter n0_method)
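The difference between the two options shows up on a curve with an initial drop. The function `n0_sketch` below is a hypothetical re-implementation for illustration only, not the package source, and the OD values are made up.

```r
# Hypothetical re-implementation of the two n0_method options.
n0_sketch <- function(biomass, n0_method) {
  switch(n0_method,
    "first.observation" = biomass[1],
    "minimal.observation" = min(biomass),
    stop("unknown n0_method: ", n0_method)
  )
}

# Made-up OD readings with a small drop right after inoculation.
biomass <- c(0.05, 0.04, 0.045, 0.08, 0.16)

n0_first <- n0_sketch(biomass, "first.observation")
n0_min <- n0_sketch(biomass, "minimal.observation")
```

On a curve without an initial drop both calls return the same value.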
This function sets a ggplot theme without grid. The theme removes the major and minor grid lines, sets a white background with a gray border and adjusts the text size.
get_theme(text_size = 12)
text_size |
defaults to 12 |
a ggplot theme
Fits the lag to multiple growth curves based on the biomass increase method
lag_biomass_incr(data, threshold, n0)
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
threshold |
A value of the biomass increase that we can surely associate with the end of the lag phase rather than random variation during the lag |
n0 |
the initial biomass (lag will be defined as the time point where the difference between biomass and N0 reaches a predefined threshold) |
growth curve data (as input) together with additional columns: N0, increase.from.N0, lag
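The biomass increase rule can be sketched for one curve in base R. The sketch assumes the lag is reported as the first observed time point at which the increase over N0 reaches the threshold, with no interpolation between observations; whether the package interpolates is not stated here. The numbers are made up.

```r
# Made-up single growth curve.
time <- c(0, 1, 2, 3, 4, 5)
biomass <- c(1000, 1010, 1050, 1150, 1400, 2000)
n0 <- biomass[1]
threshold <- 10^2

increase_from_n0 <- biomass - n0
reached <- which(increase_from_n0 >= threshold)

# First observed time at which the threshold is reached; NA if it never is.
lag <- if (length(reached) > 0) time[min(reached)] else NA_real_
```

The result depends directly on the threshold, which is why the threshold should be chosen deliberately for the measurement scale in use.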
Creates a growth curve data frame that can later be passed to the lag calculation functions
make_grwoth_curve_df(time, biomass, curve_id = NULL)
time |
numeric vector of times when biomass was measured (chronologically ordered as in growth curve) |
biomass |
numeric vector of measured biomass values (chronologically ordered as in growth curve) |
curve_id |
character vector of growth curve identifiers (i.e. if there are multiple measurements done at the same time point, they should have different curve_id) |
a data frame representing growth curve data
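The expected shape of such a data frame, as described by the "time", "biomass" and "curve_id" columns throughout this manual, can be illustrated with plain `data.frame()`. The values and curve identifiers are made up, and the package function may add checks or defaults not shown here.

```r
# Two made-up growth curves measured at the same three time points.
growth <- data.frame(
  time = rep(c(0, 1, 2), times = 2),
  biomass = c(100, 110, 150, 90, 95, 140),
  curve_id = rep(c("curve_1", "curve_2"), each = 3),
  stringsAsFactors = FALSE
)
```

Measurements taken at the same time point carry different curve_id values, which is what lets the lag functions treat the curves separately.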
Plots the provided growth curve (one single growth curve) on logarithmic scale
plot_data(data_new, log10_transform = TRUE)
data_new |
a data frame with two required columns names: "time" and "biomass" |
log10_transform |
whether to plot the y axis (biomass) on a log10 scale |
ggplot object with a growth curve
Plots the provided growth curve (one single growth curve) together with the calculated lag and the rationale for the lag calculation
plot_lag_fit(data_new, print_lag_info = TRUE, log10_transform = TRUE)
data_new |
a data frame output by the calc_lag function; it needs to have the following columns: "time", "biomass", "tangent.point", "predicted.data", "threshold", "N0", "second.deriv.b", "line.intercept", "line.slope" |
print_lag_info |
if set to TRUE, prints the lag length on the graph |
log10_transform |
whether to plot the y axis (biomass) on a log10 scale |
ggplot object with a growth curve
smooth_data Smooths growth curve data
smooth_data(data, smooth_kind = "3RS3R")
data |
a data frame with two required columns, "time" and "biomass", and one optional column, "curve_id". The data may come from multiple growth curves |
smooth_kind |
kind used for the smooth functions, defaults to "3RS3R" |
smoothed data
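The default kind "3RS3R" is also the name of a running-median smoother available in base R's `stats::smooth()`. The sketch below applies that base R function to made-up biomass values to show what this kind of smoothing does to an isolated spike; it assumes, but does not confirm, that smooth_data relies on the same smoother.

```r
# Made-up biomass readings with one outlier (the third value).
biomass <- c(10, 11, 50, 12, 13, 14, 30, 60, 120)

# Tukey running-median smoothing from base R (stats package).
smoothed <- as.numeric(smooth(biomass, kind = "3RS3R"))
```

Running medians remove isolated spikes while leaving sustained growth largely intact, which suits growth curves with occasional measurement glitches.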