MongoDB I/OMongoDB is a popular document-oriented database. The RMongo package can
be used in conjunction with opm to store and receive phenotype microarray
data in such a database. OPMX objects fit nicely to such a pattern because
they do not impose a certain structure on the metadata, much like MongoDB
is able to store any kinds of data structures. The same holds for the options
parts of the aggr_settings and disc_settings entries of OPMX objects.
So the only work that needs to be done is to convert between OPMX objects
and JSON strings.
See the documentation of MongoDB and RMongo for further details.
Author: Markus Goeker
library(RMongo)
## Loading required package: rJava
library(opm)
In our case, RMongo unfortunately returns a data frame with one plate per
row and raw JSON strings per field. But the following short function is
able to convert all such results. Note that opms also takes care of names
converted by to_yaml, if any (see below).
mongo2opm <- function(x) {
x <- split(x, seq_len(nrow(x)))
x <- rapply(x, rjson::fromJSON, "character", NULL, "replace")
opms(x, precomputed = FALSE, skip = FALSE, group = TRUE)
}
MongoDB connectionConnect to database test, which comes with MongoDB. We will use the
collection pmdata to store JSON representations of OPM objects.
conn <- mongoDbConnect("test", "localhost", 27017)
coll <- "pmdata"
We insert each plate separately to be able to query them separately. Note
that under these settings, to_yaml takes care of removing names with dots,
which are disallowed in MongoDB.
result <- vapply(plates(vaas_4), function(plate) {
dbInsertDocument(conn, coll, to_yaml(plate, json = TRUE, nodots = TRUE))
}, "")
stopifnot(result == "ok")
print(result)
## [1] "ok" "ok" "ok" "ok"
got <- dbGetQuery(conn, coll, '{"metadata.Species": "Escherichia coli"}')
stopifnot(is.data.frame(got), dim(got) > 0)
OPMX objectsConversion necessary. See comments to mongo2opm above.
got <- mongo2opm(got)
Yields list with one element per plate type (only one plate type here). Some checks:
stopifnot(is.list(got), length(got) == 1, names(got) == plate_type(vaas_4))
got <- got[[1]]
stopifnot(is(got, "OPMS"), has_disc(got), dim(got) == c(2, dim(vaas_4)[-1]))
print(got)
## 1
## Class OPMD
## From file ./E. coli DSM
## 18039_vim10_12B__1_28_PMX_0_8#30#2010_E_12B_5.csv
## Hours measured 95.75
## Number of wells 96
## Plate type Gen III
## Position 12-B
## Setup time 8/30/2010 1:19:11 PM
## Metadata 5
## Aggregated TRUE
## Discretized TRUE
##
## 2
## Class OPMD
## From file ./E. coli DSM
## 30083T_vim10_7B__1_28_PMX_0_8#30#2010_F_
## 7B_5.csv
## Hours measured 95.75
## Number of wells 96
## Plate type Gen III
## Position 7-B
## Setup time 8/30/2010 1:53:08 PM
## Metadata 5
## Aggregated TRUE
## Discretized TRUE
##
## => OPMS object with 2 plates (2 aggregated, 2 discretized) of type 'Gen III', 96 well(s) and about 384 time point(s).
result <- dbRemoveQuery(conn, coll,
'{$and: [{"csv_data": {$exists: true}}, {"measurements": {$exists: true}}]}')
stopifnot(result == "ok")
print(result)
## [1] "ok"
empty <- dbGetQuery(conn, coll, '{"metadata.Species": "Escherichia coli"}')
stopifnot(is.data.frame(empty), dim(empty) == 0)
dbDisconnect(conn)
rm(conn)
detach("package:RMongo", unload = TRUE)