extract {opm} | R Documentation |
Extract selected aggregated and/or discretised values into a common matrix or data frame. The extract data-frame method conducts normalisation and/or computes normalised point estimates and the respective confidence intervals for user-defined experimental groups. It is mainly a helper function for ci_plot.
extract_columns extracts only selected metadata entries for use as additional columns in a data frame or (after joining) as a character vector with labels.
## S4 method for signature 'MOPMX'
extract(object, as.labels, subset = opm_opt("curve.param"),
  ci = FALSE, trim = "full", dataframe = FALSE, as.groups = NULL,
  sep = " ", ...)

## S4 method for signature 'OPMS'
extract(object, as.labels, subset = opm_opt("curve.param"),
  ci = FALSE, trim = "full", dataframe = FALSE, as.groups = NULL,
  sep = " ", dups = "warn", exact = TRUE, strict = TRUE, full = TRUE,
  max = 10000L, ...)

## S4 method for signature 'data.frame'
extract(object, as.groups = TRUE,
  norm.per = c("row", "column", "none"), norm.by = TRUE,
  subtract = TRUE, direct = inherits(norm.by, "AsIs"),
  dups = c("warn", "error", "ignore"),
  split.at = param_names("split.at"))

## S4 method for signature 'WMD'
extract_columns(object, what, join = FALSE, sep = " ",
  dups = c("warn", "error", "ignore"), factors = TRUE,
  exact = TRUE, strict = TRUE)

## S4 method for signature 'WMDS'
extract_columns(object, what, join = FALSE, sep = " ",
  dups = c("warn", "error", "ignore"), factors = TRUE,
  exact = TRUE, strict = TRUE)

## S4 method for signature 'data.frame'
extract_columns(object, what, as.labels = NULL, as.groups = NULL,
  sep = opm_opt("comb.value.join"), factors = is.list(what),
  direct = is.list(what) || inherits(what, "AsIs"))
object |
|
as.labels |
List, character vector or formula
indicating the metadata to be joined and used as row
names (if If a |
subset |
Character vector. The parameter(s) to put
in the matrix. One of the values of
|
ci |
Logical scalar. Also return the confidence intervals? |
trim |
Character scalar. See
|
dataframe |
Logical scalar. Return data frame or
matrix? In the case of the |
as.groups |
For the If a If For the data-frame method, a logical, character or
numeric vector indicating according to which columns
(before the |
sep |
Character scalar. Used as separator between
the distinct metadata entries if these are to be pasted
together. |
dups |
Character scalar specifying what to do in the
case of duplicate labels: either ‘warn’,
‘error’ or ‘ignore’. Ignored unless
|
exact |
Logical scalar. Passed to
|
strict |
Logical scalar. Also passed to
|
full |
Logical scalar indicating whether full
substrate names shall be used. This is passed to
|
max |
Numeric scalar. Passed to
|
... |
Optional other arguments passed to
|
norm.per |
Character scalar indicating the presence and direction of a normalisation step.
This step can further be modified by the next three arguments. |
norm.by |
Vector indicating which wells (columns) or
plates (rows) are used to calculate means used for the
normalisation. By default, the mean is calculated over
all rows or columns if normalisation is requested using
|
direct |
Logical scalar. For |
subtract |
Logical scalar indicating whether normalisation (if any) is done by subtracting or dividing. |
split.at |
Character vector defining alternative names of the column at which the data frame shall be divided. Exactly one must match. |
what |
For the For the data-frame method, just the names of the columns
to extract, or their indexes, as vector, if In the ‘direct’ mode, |
join |
Logical scalar. Join each row together to yield a character vector? Otherwise it is just attempted to construct a data frame. |
factors |
Logical scalar determining whether strings should be converted to factors. Note that this would only affect newly created data-frame columns. |
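The normalisation arguments of the data-frame method (norm.per, norm.by, subtract) can be pictured with a small base-R sketch. It does not call opm and is not the package's implementation; the helper normalise() and the toy matrix are hypothetical, and the reading of norm.by as "which wells (or plates) enter the mean" is an assumption based on the argument descriptions above.

```r
# Toy stand-in for extracted values: plates in rows, wells in columns.
m <- matrix(c(10, 20, 30,
              40, 50, 60), nrow = 2, byrow = TRUE,
            dimnames = list(c("plate1", "plate2"), c("A01", "A02", "A03")))

# Hypothetical helper (not part of opm): normalise by row or column means.
# 'by' selects the columns (for per = "row") or rows (for per = "column")
# over which the means are calculated; TRUE means all of them.
normalise <- function(x, per = c("row", "column", "none"), by = TRUE,
                      subtract = TRUE) {
  per <- match.arg(per)
  if (per == "none")
    return(x)
  if (per == "row") {
    means <- rowMeans(x[, by, drop = FALSE])
    margin <- 1L
  } else {
    means <- colMeans(x[by, , drop = FALSE])
    margin <- 2L
  }
  # subtract = TRUE centres the data, subtract = FALSE scales them
  sweep(x, margin, means, FUN = if (subtract) "-" else "/")
}

by_plate <- normalise(m, per = "row")                      # minus plate means
by_well <- normalise(m, per = "column", subtract = FALSE)  # divided by well means
by_a01 <- normalise(m, per = "row", by = 1)                # minus the value of well 1
```

After subtraction the row means of by_plate are zero, and after division the column means of by_well are one, which is a quick way to check which direction a normalisation took.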
extract_columns is not normally called directly by an opm user because extract, which uses this function, is available. It can, however, be used for testing the applied metadata selections beforehand.
The extract_columns data-frame method is partially trivial (extract the selected columns and join them to form a character vector or new data-frame columns), partially more useful (extract columns with data of a specified class).
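The two modes of the data-frame method described above can be mimicked in base R. This is only a sketch of the idea, not the code of extract_columns; the data frame df and its column names are made up for illustration.

```r
# Hypothetical data frame with metadata-like and numeric columns.
df <- data.frame(id = 1:3,
                 species = c("E. coli", "E. coli", "P. aeruginosa"),
                 strain = c("K12", "B", "PAO1"),
                 value = c(0.5, 1.5, 2.5),
                 stringsAsFactors = FALSE)

# (i) The "trivial" part: join selected columns row-wise into labels.
labels <- do.call(paste, c(df[, c("species", "strain")], sep = " "))

# (ii) The "more useful" part: keep only columns holding data of one class,
# here integer columns.
int_cols <- df[, vapply(df, is.integer, logical(1L)), drop = FALSE]
```

Here labels is a character vector with one entry per row, and int_cols keeps only the id column.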
Not all MOPMX objects are suitable for extract. The call will be successful if only OPMS objects are contained, i.e. OPM objects are forbidden. But even if successful, it might result in NA values within the resulting matrix or data frame. This may cause methods that call extract to fail. NA values will not occur if the set of row names created using as.labels is equal between the distinct elements of object. This also holds if dataframe is TRUE, even though in that case row names are only temporarily created.
Duplicate combinations of row and column names currently cause the MOPMX methods to skip all of them except the last one if dataframe is FALSE. This should mainly affect substrates that occur in plates of distinct plate types.
Similarly, duplicate row names will cause the skipping of all but the last one. This can be circumvented by using an as.labels argument that yields unique row names. If as.labels is empty, the MOPMX method of extract will create potentially unique row names from the names if these are present but from the plate types if the ‘names’ attribute is NULL. This will not be done, and rows will neither be skipped nor reordered, if dataframe is TRUE.
Otherwise row names and names of substrate columns will be reordered (sorted). The created ‘row.groups’ attribute, if any, will be adapted accordingly. If dataframe is TRUE, the placement of the columns created by as.groups will also be as usual, but duplicates, if any, will be removed.
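Why duplicates leave only the last value, and why NA values can appear, is easy to see when values are placed into a matrix by row and column name. The following base-R sketch illustrates that general mechanism under assumed toy data; it is not taken from the opm sources, and the names vals and m are hypothetical.

```r
# Hypothetical long-format values collected from several plates; the
# combination ("strainA", "A01") occurs twice.
vals <- data.frame(row = c("strainA", "strainB", "strainA"),
                   col = c("A01", "A01", "A01"),
                   value = c(1, 2, 3),
                   stringsAsFactors = FALSE)

# Target matrix covering all row labels and all substrate columns.
m <- matrix(NA_real_, nrow = 2, ncol = 2,
            dimnames = list(c("strainA", "strainB"), c("A01", "B01")))

# Assignment by name: a later duplicate overwrites an earlier one.
for (i in seq_len(nrow(vals)))
  m[vals$row[i], vals$col[i]] <- vals$value[i]
```

In this toy case the cell for "strainA" and "A01" ends up holding the last value assigned, and the "B01" column stays NA because no input provided it, mirroring the situation where the elements of a MOPMX object do not share the same labels or substrates.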
Numeric matrix or data frame from extract; always a data frame for the data-frame method with the same column structure as object and, if grouping was used, a triplet structure of the rows, as indicated in the new split.at column: (i) group mean, (ii) lower and (iii) upper boundary of the group confidence interval. The data could then be visualised using ci_plot. See the examples.
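The triplet structure of the rows (group mean, lower boundary, upper boundary) can be pictured with the following base-R sketch. It is an illustration under stated assumptions only: the helper mean_ci() is hypothetical, the Kind column merely plays the role of the split.at column, and a t-based 95% interval stands in for whatever interval opm actually computes.

```r
# Hypothetical input: one grouping column and two well columns.
x <- data.frame(Species = c("E", "E", "E", "P", "P", "P"),
                A01 = c(1, 2, 3, 4, 6, 8),
                A02 = c(10, 10, 10, 1, 2, 3),
                stringsAsFactors = FALSE)

# Hypothetical helper: mean with a t-based confidence interval (assumption).
mean_ci <- function(v, level = 0.95) {
  half <- qt(1 - (1 - level) / 2, df = length(v) - 1L) * sd(v) / sqrt(length(v))
  c(mean = mean(v), lower = mean(v) - half, upper = mean(v) + half)
}

wells <- c("A01", "A02")

# One triplet of rows per group: mean, lower and upper boundary.
triplets <- do.call(rbind, lapply(split(x, x$Species), function(g) {
  est <- vapply(g[wells], mean_ci, numeric(3L))  # 3 x length(wells) matrix
  data.frame(Species = g$Species[1L], Kind = rownames(est), est,
             row.names = NULL, stringsAsFactors = FALSE)
}))
```

Each group contributes three consecutive rows, so a plotting function can recover point estimate and interval per well by reading the rows in blocks of three.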
For the OPMS method of extract_columns, a data frame or character vector, depending on the join argument. The data-frame method of extract_columns returns a character vector or a data frame, too, but depending on the what argument.
Lea A.I. Vaas, Markus Goeker
aggregated for the extraction of aggregated values from a single OPMA object.
boot::norm base::data.frame base::as.data.frame base::matrix base::as.matrix base::cbind
Other conversion-functions: as.data.frame, flatten, merge, oapply, opmx, plates, rep, rev, sort, split, to_yaml, unique
## 'OPMS' method
opm_opt("curve.param") # default parameter
## [1] "A"
# generate matrix (containing the parameter given above)
(x <- extract(vaas_4, as.labels = list("Species", "Strain")))[, 1:3]
## A01 (Negative Control) A02 (Dextrin)
## Escherichia coli DSM18039 57.66618 131.67996
## Escherichia coli DSM30083T 123.45581 248.18087
## Pseudomonas aeruginosa DSM1707 61.35526 75.10225
## Pseudomonas aeruginosa 429SC1 55.74738 66.05093
## A03 (D-Maltose)
## Escherichia coli DSM18039 42.45742
## Escherichia coli DSM30083T 284.09938
## Pseudomonas aeruginosa DSM1707 22.37216
## Pseudomonas aeruginosa 429SC1 49.63049
stopifnot(is.matrix(x), dim(x) == c(4, 96), is.numeric(x))
# using a formula also works
(y <- extract(vaas_4, as.labels = ~ Species + Strain))[, 1:3]
## A01 (Negative Control) A02 (Dextrin)
## Escherichia coli DSM18039 57.66618 131.67996
## Escherichia coli DSM30083T 123.45581 248.18087
## Pseudomonas aeruginosa DSM1707 61.35526 75.10225
## Pseudomonas aeruginosa 429SC1 55.74738 66.05093
## A03 (D-Maltose)
## Escherichia coli DSM18039 42.45742
## Escherichia coli DSM30083T 284.09938
## Pseudomonas aeruginosa DSM1707 22.37216
## Pseudomonas aeruginosa 429SC1 49.63049
stopifnot(identical(x, y))
# generate data frame
(x <- extract(vaas_4, as.labels = list("Species", "Strain"),
dataframe = TRUE))[, 1:3]
## Species Strain Parameter
## 1 Escherichia coli DSM18039 A
## 2 Escherichia coli DSM30083T A
## 3 Pseudomonas aeruginosa DSM1707 A
## 4 Pseudomonas aeruginosa 429SC1 A
stopifnot(is.data.frame(x), dim(x) == c(4, 99))
# using a formula
(y <- extract(vaas_4, as.labels = ~ Species + Strain,
dataframe = TRUE))[, 1:3]
## Species Strain Parameter
## 1 Escherichia coli DSM18039 A
## 2 Escherichia coli DSM30083T A
## 3 Pseudomonas aeruginosa DSM1707 A
## 4 Pseudomonas aeruginosa 429SC1 A
stopifnot(identical(x, y))
# using a formula, with joining into new columns
(y <- extract(vaas_4, as.labels = ~ J(Species + Strain),
dataframe = TRUE))[, 1:3]
## Species Strain Species.Strain
## 1 Escherichia coli DSM18039 Escherichia coli/DSM18039
## 2 Escherichia coli DSM30083T Escherichia coli/DSM30083T
## 3 Pseudomonas aeruginosa DSM1707 Pseudomonas aeruginosa/DSM1707
## 4 Pseudomonas aeruginosa 429SC1 Pseudomonas aeruginosa/429SC1
stopifnot(identical(x, y[, -3]))
# put all parameters in a single data frame
x <- lapply(param_names(), function(name) extract(vaas_4, subset = name,
as.labels = list("Species", "Strain"), dataframe = TRUE))
x <- do.call(rbind, x)
# get discretised data
(x <- extract(vaas_4, subset = param_names("disc.name"),
as.labels = list("Strain")))[, 1:3]
## A01 (Negative Control) A02 (Dextrin) A03 (D-Maltose)
## DSM18039 FALSE NA FALSE
## DSM30083T NA TRUE TRUE
## DSM1707 FALSE FALSE FALSE
## 429SC1 FALSE FALSE FALSE
stopifnot(is.matrix(x), identical(dim(x), c(4L, 96L)), is.logical(x))
## data-frame method
# extract data from OPMS-object as primary data frame
# second call to extract() then applied to this one
(x <- extract(vaas_4, as.labels = list("Species", "Strain"),
dataframe = TRUE))[, 1:3]
## Species Strain Parameter
## 1 Escherichia coli DSM18039 A
## 2 Escherichia coli DSM30083T A
## 3 Pseudomonas aeruginosa DSM1707 A
## 4 Pseudomonas aeruginosa 429SC1 A
# no normalisation, but grouping for 'Species'
y <- extract(x, as.groups = "Species", norm.per = "none")
# plotting using ci_plot()
ci_plot(y[, c(1:6, 12)], legend.field = NULL, x = 350, y = 1)
# normalisation by plate means
y <- extract(x, as.groups = "Species", norm.per = "row")
# plotting using ci_plot()
ci_plot(y[, c(1:6, 12)], legend.field = NULL, x = 130, y = 1)
# normalisation by well means
y <- extract(x, as.groups = "Species", norm.per = "column")
# plotting using ci_plot()
ci_plot(y[, c(1:6, 12)], legend.field = NULL, x = 20, y = 1)
# normalisation by subtraction of the well means of well A10 only
y <- extract(x, as.groups = "Species", norm.per = "row", norm.by = 10,
subtract = TRUE)
# plotting using ci_plot()
ci_plot(y[, c(1:6, 12)], legend.field = NULL, x = 0, y = 0)
## extract_columns()
# 'OPMS' method
# Create data frame
(x <- extract_columns(vaas_4, what = list("Species", "Strain")))
## Species Strain
## 1 Escherichia coli DSM18039
## 2 Escherichia coli DSM30083T
## 3 Pseudomonas aeruginosa DSM1707
## 4 Pseudomonas aeruginosa 429SC1
stopifnot(is.data.frame(x), dim(x) == c(4, 2))
(y <- extract_columns(vaas_4, what = ~ Species + Strain))
## Species Strain
## 1 Escherichia coli DSM18039
## 2 Escherichia coli DSM30083T
## 3 Pseudomonas aeruginosa DSM1707
## 4 Pseudomonas aeruginosa 429SC1
stopifnot(identical(x, y)) # same result using a formula
(y <- extract_columns(vaas_4, what = ~ J(Species + Strain)))
## Species Strain Species.Strain
## 1 Escherichia coli DSM18039 Escherichia coli/DSM18039
## 2 Escherichia coli DSM30083T Escherichia coli/DSM30083T
## 3 Pseudomonas aeruginosa DSM1707 Pseudomonas aeruginosa/DSM1707
## 4 Pseudomonas aeruginosa 429SC1 Pseudomonas aeruginosa/429SC1
stopifnot(is.data.frame(y), dim(y) == c(4, 3)) # additional column created
stopifnot(identical(x, y[, -3]))
(x <- extract_columns(vaas_4, what = TRUE)) # use logical scalar
## Group
## 1 1
## 2 1
## 3 1
## 4 1
stopifnot(is.data.frame(x), dim(x) == c(4, 1))
(y <- extract_columns(vaas_4, what = FALSE))
## Group
## 1 1
## 2 2
## 3 3
## 4 4
stopifnot(is.data.frame(y), dim(y) == c(4, 1), !all(y[, 1] == x[, 1]))
# Create a character vector
(x <- extract_columns(vaas_4, what = list("Species", "Strain"), join = TRUE))
## [1] "Escherichia coli DSM18039" "Escherichia coli DSM30083T"
## [3] "Pseudomonas aeruginosa DSM1707" "Pseudomonas aeruginosa 429SC1"
stopifnot(is.character(x), length(x) == 4L)
(x <- try(extract_columns(vaas_4, what = list("Species"), join = TRUE,
dups = "error"), silent = TRUE)) # duplicates yield error
## [1] "Error in .local(object, ...) : duplicated label: Escherichia coli\n"
## attr(,"class")
## [1] "try-error"
## attr(,"condition")
## <simpleError in .local(object, ...): duplicated label: Escherichia coli>
stopifnot(inherits(x, "try-error"))
(x <- try(extract_columns(vaas_4, what = list("Species"), join = TRUE,
dups = "warn"), silent = TRUE)) # duplicates yield warning only
## Warning in .local(object, ...): duplicated label: Escherichia coli
## [1] "Escherichia coli" "Escherichia coli"
## [3] "Pseudomonas aeruginosa" "Pseudomonas aeruginosa"
stopifnot(is.character(x), length(x) == 4L)
# data-frame method, 'direct' running mode
x <- data.frame(a = 1:26, b = letters, c = LETTERS)
(y <- extract_columns(x, I(c("a", "b")), sep = "-"))
## [1] "1-a" "2-b" "3-c" "4-d" "5-e" "6-f" "7-g" "8-h" "9-i" "10-j"
## [11] "11-k" "12-l" "13-m" "14-n" "15-o" "16-p" "17-q" "18-r" "19-s" "20-t"
## [21] "21-u" "22-v" "23-w" "24-x" "25-y" "26-z"
stopifnot(grepl("^\\s*\\d+-[a-z]$", y)) # pasted columns 'a' and 'b'
# data-frame method, using class name
(y <- extract_columns(x, as.labels = "b", what = "integer", as.groups = "c"))
## a
## a 1
## b 2
## c 3
## d 4
## e 5
## f 6
## g 7
## h 8
## i 9
## j 10
## k 11
## l 12
## m 13
## n 14
## o 15
## p 16
## q 17
## r 18
## s 19
## t 20
## u 21
## v 22
## w 23
## x 24
## y 25
## z 26
## attr(,"row.groups")
## [1] A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
## Levels: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
stopifnot(is.matrix(y), dim(y) == c(26, 1), rownames(y) == x$b)
stopifnot(identical(attr(y, "row.groups"), x$c))