javascript - Aggregation Unwind Document Keys as New Documents -
i having problems in altering schema using time series database have constructed using mongo db. currently, have records 1 shown below:
{ "_id" : 20, "name" : "bob, "location" : "london", "01/01/1993" : { "height" : "110cm", "weight" : "60kg", }, "02/01/1993" : { "height" : "112cm", "weight" : "61kg", }
}
i wish use aggregation framework create several records each "person", 1 each "time-value" subdocument in original record:
{ "_id" : 20, "name" : "bob, "date" : "01/01/1993" "location" : "london", "height" : "110cm", "weight" : "60kg", }, { "_id" : 20, "name" : "bob, "date" : "02/01/1993" "location" : "london", "height" : "112cm", "weight" : "61kg", }
the new scheme should more efficient when adding large number of time series values each record , shouldn't run max document size error!
any on how using mongo db aggregation pipeline appreciated!
whilst there functions in modern releases of aggregation framework can allow sort of thing, mileage may vary whether best solution this.
in essence can create array of entries comprised of document keys "which not include" other top level keys included in document. array can processed $unwind
, whole result reshaped new documents:
db.getcollection('input').aggregate([ { "$project": { "name": 1, "location": 1, "data": { "$filter": { "input": { "$objecttoarray": "$$root" }, "as": "d", "cond": { "$not": { "$in": [ "$$d.k", ["_id","name","location"] ] } } } } }}, { "$unwind": "$data" }, { "$replaceroot": { "newroot": { "$arraytoobject": { "$concatarrays": [ [{ "k": "id", "v": "$_id" }, { "k": "name", "v": "$name" }, { "k": "location", "v": "$location" }, { "k": "date", "v": "$data.k" }], { "$objecttoarray": "$data.v" } ] } } }}, { "$out": "output" } ])
or alternately reshaping in initial $project
within array elements produced:
db.getcollection('input').aggregate([ { "$project": { "_id": 0, "data": { "$map": { "input": { "$filter": { "input": { "$objecttoarray": "$$root" }, "as": "d", "cond": { "$not": { "$in": [ "$$d.k", ["_id", "name", "location"] ] } } } }, "as": "d", "in": { "$arraytoobject": { "$concatarrays": [ { "$filter": { "input": { "$objecttoarray": "$$root" }, "as": "r", "cond": { "$in": [ "$$r.k", ["_id", "name", "location"] ] } }}, [{ "k": "date", "v": "$$d.k" }], { "$objecttoarray": "$$d.v" } ] } } } } }}, { "$unwind": "$data" }, { "$replaceroot": { "newroot": "$data" } }, { "$out": "output" } ])
so use $objecttoarray
, $filter
in order make array keys contain data points each date.
after $unwind
apply $arraytoobject
on set of named keys in "array format" in order construct newroot
$replaceroot
, write new collection, 1 new document each data key using $out
.
that may part of way though, should change "date"
data bson date. takes lot less storage space, , easier query well.
var updates = []; db.getcollection('output').find().foreach( d => { updates.push({ "updateone": { "filter": { "_id": d._id }, "update": { "$set": { "date": new date( date.utc.apply(null, d.date.split('/') .reverse().map((e,i) => (i == 1) ? parseint(e)-1: parseint(e) ) ) ) } } } }); if ( updates.length >= 500 ) { db.getcollection('output').bulkwrite(updates); updates = []; } }) if ( updates.length != 0 ) { db.getcollection('output').bulkwrite(updates); updates = []; }
of course, if mongodb server lacks aggregation features better off writing output new collection iterating loop in first place:
var output = []; db.getcollection('input').find().foreach( d => { output = [ ...output, ...object.keys(d) .filter(k => ['_id','name','location'].indexof(k) === -1) .map(k => object.assign( { id: d._id, name: d.name, location: d.location, date: new date( date.utc.apply(null, k.split('/') .reverse().map((e,i) => (i == 1) ? parseint(e)-1: parseint(e) ) ) ) }, d[k] )) ]; if ( output.length >= 500 ) { db.getcollection('output').insertmany(output); output = []; } }) if ( output.length != 0 ) { db.getcollection('output').insertmany(output); output = []; }
in either of cases want apply date.utc
reversed string elements existing "string" based date , value can cast bson date.
the aggregation framework does not allow casting of types only solution part ( , necessary part ) loop , update, using forms @ least makes efficient loop , update.
either case gives same end output:
/* 1 */ { "_id" : objectid("599275b1e38f41729f1d64fe"), "id" : 20.0, "name" : "bob", "location" : "london", "date" : isodate("1993-01-01t00:00:00.000z"), "height" : "110cm", "weight" : "60kg" } /* 2 */ { "_id" : objectid("599275b1e38f41729f1d64ff"), "id" : 20.0, "name" : "bob", "location" : "london", "date" : isodate("1993-01-02t00:00:00.000z"), "height" : "112cm", "weight" : "61kg" }
Comments
Post a Comment