data structures - Is there an equivalent to the Resilient Distributed Dataset in native Clojure?
Apache Spark has the concept of a Resilient Distributed Dataset (RDD).
An RDD is:
It is an immutable distributed collection of objects. Each dataset in an RDD is divided into logical partitions, which may be computed on different nodes of the cluster.
Formally, an RDD is a read-only, partitioned collection of records. RDDs can be created through deterministic operations on either data on stable storage or other RDDs. An RDD is a fault-tolerant collection of elements that can be operated on in parallel.
Now, Clojure has immutable data structures and can run higher-order functions in parallel.
I'm aware of flambo and sparkling. I'm not looking for an interface, but for an equivalent data structure.
My question is: is there an equivalent to the Resilient Distributed Dataset in native Clojure?
Well, a normal Clojure map or vector can easily be processed in sub-partitions, in parallel on multiple cores, using clojure.core.reducers/fold.
Maps and vectors are immutable by default, so this setup seems equivalent to what RDDs are.
The difference is that fold computes on multiple cores, not on multiple machines. It is parallel, not distributed.
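To make the fold point concrete, here is a minimal sketch, assuming a plain vector of numbers held in memory. The names numbers and sum-even-squares are made up for illustration. fold splits the vector into chunks, reduces each chunk on a fork/join thread pool, and combines the partial results with the same function.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical sample data: an ordinary immutable vector.
;; Vectors are foldable, so fold can split them into chunks.
(def numbers (vec (range 1 100001)))

;; Sum the squares of the even numbers.
;; r/filter and r/map build a foldable recipe without intermediate
;; collections; r/fold then reduces chunks in parallel and combines
;; the partial sums with +. Calling (+) with no arguments returns 0,
;; which fold uses as the starting value for each chunk.
(defn sum-even-squares [coll]
  (r/fold + (r/map #(* % %) (r/filter even? coll))))

(sum-even-squares numbers)
```

The result matches a sequential reduce over the same data. All of this happens inside one JVM, so nothing here provides the cross-machine partitioning or fault tolerance of an RDD.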
Onyx and Storm are distributed computing frameworks implemented in Clojure, and they can do what Spark does. These are as close as it gets to an RDD on Spark.