collections - How to tag documents in MongoDB?
I need to tag documents in a collection, let's call it 'contacts'.
My first idea was to create an attribute called "tags" in each document. In that case I would have something like:
{ _id:'1', contact_name:'asya kamsky', tags:['mongodb', 'maths', 'travels'] }
Now, let's suppose I have users who want to tag each document in 'contacts'.
If I keep my decision to save the tags as an attribute of each document, and the tags are personal, I need to store the userid with each set of tags. Our document would look like this (or not):
{ _id:'1', contact_name:'asya kamsky', tags:[ {userid:'alex',tags:['mongodb', 'maths', 'travels']}, {userid:'eric',tags:['databases', 'friends', 'japan']}, ] }
Now, let's complicate it a bit. Let's imagine I have a lot of users, and each one wants to tag the documents with personal tags.
How do I deal with that?
OK, I could create thousands of tag entries in each document:
{ _id:'1', contact_name:'asya kamsky', tags:[ {userid:'alex',tags:['mongodb', 'maths', 'travels']}, {userid:'eric',tags:['databases', 'friends', 'japan']}, {.....................................................} {.....................................................} {......................................................} ] }
But what if I have millions of users? In that case I run into the 16 MB size limit of each document, as you know...
At this point, worrying about the future growth of the application, I decided to create a nice separate collection called 'tags' that contains documents similar to:
{ "contact_name" : "asya kamsky", "userid" : "alex", "tags" : ['mongodb', 'maths', 'travels'], "timestamp" : "2017-08-08 14:33:28" }, { "contact_name" : "asya kamsky", "userid" : "eric", "tags" : ['databases', 'friends', 'japan'], "timestamp" : "2017-08-08 14:33:28" }
That is, I have separate documents that represent the tags of each user.
Cool and clean, right?
Well, in this case, I face 2 problems:
- Minor problem: I return to the SQL logic that I no longer accept in some cases.
- Big (for me) problem: how do I search for a contact by my personal tags? In this case I have a nice 'join' problem that MongoDB resolves using $lookup. It "resolves well" for 10000, 20000, or 500000 documents. But I want to ensure good performance in the future, think about 10000000 contacts. So, as I researched recently, $lookup works well for a "small part" of the universe and, even with indexes, this search will take a lot of time to be executed.
How can I resolve this challenge?
Thanks to all.
If your usage is such that the number of users x number/size of tags per contact (plus whatever other data is in the 'contacts' document) would bring you near the 16 MB document size limit, then storing tags in a separate collection seems valid. But before you go down that route, are you sure this is likely? Have you tried creating some contact documents in a bid to see how many tags and how many users per contact it takes to get near the 16 MB limit? If the answer implies a number of users and/or tags that you are unlikely to ever reach, then maybe your concerns are strictly theoretical and you could consider sticking with the simplest solution, which is to embed user-specific tags inside 'contacts'.
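One way to get a first feel for the numbers is sketched below. It is only a rough estimate: it uses the JSON byte length as a stand-in for the BSON size (an assumption, since BSON adds its own per-field overhead), and the user counts, tag names and helper names (buildContact, approxBytes) are made up for illustration. In the legacy mongo shell, Object.bsonsize(doc) should give you the exact figure for a real document.

```javascript
// Rough estimate only: JSON byte length stands in for BSON size (an assumption;
// BSON adds type and length overhead per field, so real sizes will differ).
const LIMIT_BYTES = 16 * 1024 * 1024; // MongoDB's maximum BSON document size

// Build a contact with `userCount` users, each holding `tagsPerUser` tags,
// in the embedded shape shown in the question.
function buildContact(userCount, tagsPerUser) {
  const tags = [];
  for (let u = 0; u < userCount; u++) {
    const userTags = [];
    for (let t = 0; t < tagsPerUser; t++) {
      userTags.push(`tag_${t}`);
    }
    tags.push({ userid: `user_${u}`, tags: userTags });
  }
  return { _id: '1', contact_name: 'asya kamsky', tags: tags };
}

// Approximate the stored size of a document in bytes.
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), 'utf8');
}

// Hypothetical workloads: change these to match your own expectations.
for (const users of [1000, 10000, 100000]) {
  const bytes = approxBytes(buildContact(users, 3));
  const pct = ((bytes / LIMIT_BYTES) * 100).toFixed(1);
  console.log(`${users} users x 3 tags: ~${bytes} bytes (${pct}% of 16 MB)`);
}
```

If the percentages stay small for the largest user count you can realistically imagine, the embedded design may be good enough.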
The rest of this answer assumes that you have made size estimates and that your knowledge of the number of tags and users per contact is such that the size constraints are valid. On that basis, you stated a specific concern about join performance ...
But I want to ensure good performance in the future, think about 10000000 contacts. So, as I researched recently, $lookup works well for a "small part" of the universe and, even with indexes, this search will take a lot of time to be executed.
Have you tried measuring the performance? You could generate some seed documents for 'contacts' and 'tags', persist variations of these, run queries using $lookup and measure the performance. A few benchmarks, for example:
- 1,000 contacts , 10,000 tags
- 100,000 contacts , 1,000,000 tags
- 1,000,000 contacts , 10,000,000 tags
- 10,000,000 contacts , 100,000,000 tags
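A seed generator for the smallest of these benchmarks could look roughly like the sketch below. The document shapes follow the examples in the question; the tag pool, the user ids, the ratio of 10 tag documents per contact and the helper names (makeContacts, makeTags) are assumptions made up for illustration.

```javascript
// Seed data shaped like the 'contacts' and 'tags' documents in the question.
// The tag pool and the generated names are illustrative values only.
const tagPool = ['mongodb', 'maths', 'travels', 'databases', 'friends', 'japan'];

// Create n contact documents with unique names.
function makeContacts(n) {
  return Array.from({ length: n }, (_, i) => ({
    _id: String(i + 1),
    contact_name: `contact_${i + 1}`
  }));
}

// Create one tag document per (contact, user) pair.
function makeTags(contacts, usersPerContact) {
  const docs = [];
  contacts.forEach((contact, i) => {
    for (let u = 0; u < usersPerContact; u++) {
      docs.push({
        contact_name: contact.contact_name,
        userid: `user_${u}`,
        // Pick three tags deterministically so that runs are repeatable.
        tags: [0, 1, 2].map((k) => tagPool[(i + u + k) % tagPool.length]),
        timestamp: new Date()
      });
    }
  });
  return docs;
}

const contacts = makeContacts(1000);
const tags = makeTags(contacts, 10); // 1,000 contacts and 10,000 tag documents
console.log(contacts.length, tags.length);

// Against a live database in the mongo shell these could then be persisted:
//   db.contacts.insertMany(contacts);
//   db.tags.insertMany(tags);
```

For the larger benchmarks you would probably want to generate and insert in batches rather than build everything in memory first.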
When running these benchmark tests you can additionally use explain() to understand what's going on inside MongoDB.
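As a sketch of what such a measured query might look like, the pipeline below starts from 'tags' and joins to 'contacts'. The query itself (contacts that user 'alex' tagged with 'mongodb') and the two indexes are my assumptions, not something stated in the question; the field names follow the question's 'tags' documents. The shell calls in the comments need a live database, so only the pipeline definition is executed here.

```javascript
// Hypothetical query: contacts that user 'alex' tagged with 'mongodb'.
const pipeline = [
  // Filter first, so that $lookup only runs for this user's matching tag documents.
  { $match: { userid: 'alex', tags: 'mongodb' } },
  // Join each remaining tag document to its contact by name.
  {
    $lookup: {
      from: 'contacts',
      localField: 'contact_name',
      foreignField: 'contact_name',
      as: 'contact'
    }
  },
  // Turn the one-element 'contact' array into a plain embedded document.
  { $unwind: '$contact' }
];

// Against a live database in the mongo shell (not run here):
//   db.tags.createIndex({ userid: 1, tags: 1 });   // assumed index for the $match
//   db.contacts.createIndex({ contact_name: 1 });  // assumed index for the $lookup
//   db.tags.explain('executionStats').aggregate(pipeline);
console.log(JSON.stringify(pipeline, null, 2));
```

The explain output should then show whether the $match stage uses an index scan or falls back to a collection scan, which is the first thing worth checking at each benchmark size.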
You might find that the performance is acceptable. Only you can know this, since you understand the expectations that the users of your system have with respect to performance.
One last point: if your use case here is that a given user wants to find all of their contacts and tags, then this could be handled by a 'client side join', i.e. 2 queries: (1) find the tags where "userid" : "...", and (2) find the contacts referenced by those tags. Depending on what your use cases are, this could be more performant than a server side join (aka $lookup).
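A minimal sketch of that two-query approach, written for the mongo shell, might look like this. It is a variation that also filters by one tag, since that is the search described in the question. The function name findContactsByTag is hypothetical, `db` is assumed to be the shell's database handle, and the collection and field names follow the question.

```javascript
// Client side join: two simple queries instead of one $lookup.
// `db` is assumed to be a mongo shell database handle.
function findContactsByTag(db, userid, tag) {
  // (1) Names of the contacts that this user has tagged with `tag`.
  const names = db.tags.distinct('contact_name', { userid: userid, tags: tag });
  if (names.length === 0) {
    return []; // nothing tagged, so skip the second query
  }
  // (2) The contacts referenced by those tag documents.
  return db.contacts.find({ contact_name: { $in: names } }).toArray();
}

// Usage in the shell (needs a live database):
//   findContactsByTag(db, 'alex', 'mongodb');
```

Both queries can be served by ordinary indexes (for example on userid and tags in 'tags', and on contact_name in 'contacts'), which is why this can hold up well as the collections grow. It is still worth benchmarking it next to the $lookup version.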