spark dataframe - PySpark - WARN BisectingKMeans: The input RDD is not directly cached


I'm running bisecting k-means as follows:

    from pyspark.ml.clustering import BisectingKMeans

    bkm_test = BisectingKMeans().setK(5).setSeed(1)
    rdf.cache()
    assembled.cache()
    model_test = bkm_test.fit(assembled)

I cached both DataFrames, but I keep getting the warning; caching makes no difference. I found a similar question about KMeans. The warning and the executor message are below. Is this something inside the algorithm that I can't fix?

17/08/14 21:53:17 WARN BisectingKMeans: The input RDD 306 is not directly cached, which may hurt performance if its parent RDDs are also not cached.
17/08/14 21:53:17 WARN Executor: 1 block locks were not released by TID = 132: [rdd_302_0]
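
To rule out the cache() calls themselves, one option is to inspect the DataFrames before calling fit(). The sketch below is only a minimal check, not a fix: cache_status is a hypothetical helper name, and it relies only on the is_cached and storageLevel attributes that PySpark DataFrames expose. As far as I can tell, the warning refers to an RDD that fit() derives internally from the DataFrame (note that the warning names RDD 306 while the block lock refers to rdd_302), so a DataFrame can report as cached here and the warning can still appear.

```python
def cache_status(df):
    """Report whether a DataFrame is marked as cached and at which storage level.

    `df` is assumed to be a PySpark DataFrame (or any object exposing the
    same `is_cached` and `storageLevel` attributes).
    """
    # is_cached flips to True as soon as cache()/persist() has been called;
    # the data itself is only materialized by the first action.
    return df.is_cached, str(df.storageLevel)
```

With the DataFrames from the question this would be called as cache_status(rdf) and cache_status(assembled), after the cache() calls and before fit().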

