java - How to debug ClassCastException error in Spark? -


When trying a simple read in PySpark from Spark 2.1.1 to Elasticsearch 2.4 via the elasticsearch-spark connector 5.1.2 (es_read_field_exclude and es_read_field_as_array_include are environment variables; the remaining variables are passed as arguments to the reading function or contained in the self object):

from pyspark.sql.functions import monotonically_increasing_id

df = spark.read.format("org.elasticsearch.spark.sql") \
    .option("es.net.proxy.http.host", self.server) \
    .option("es.net.proxy.http.port", self.port) \
    .option("es.net.http.auth.user", self.username) \
    .option("es.net.http.auth.pass", self.password) \
    .option("es.net.proxy.http.user", self.username) \
    .option("es.net.proxy.http.pass", self.password) \
    .option("query", qparam) \
    .option("pushdown", "true") \
    .option("es.read.field.exclude", es_read_field_exclude) \
    .option("es.read.field.as.array.include", es_read_field_as_array_include) \
    .load(self.index) \
    .limit(limit) \
    .select(*fields) \
    .withColumn("id", monotonically_increasing_id())

I'm getting a ClassCastException error (from Double to Long):

WARN scheduler.TaskSetManager: Lost task 42.0 in stage ...: java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Long at scala.runtime.BoxesRunTime.unboxToLong(BoxesRunTime.java:105) ...

The strange thing is that sometimes it works and sometimes it doesn't. I suspect that reading data with null values, or documents that have no content for some fields, causes the problem, but it's only a hypothesis and I may be wrong. A sketch for testing it is shown below.
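One way to test that hypothesis is to count, directly in Elasticsearch, how many documents are missing a given field. A minimal sketch with the elasticsearch-py client, assuming hypothetical connection details, a hypothetical field name my_field, and the ES 2.x query syntax:

from elasticsearch import Elasticsearch

# Hypothetical host, credentials, alias, and field name; adapt to your setup.
es = Elasticsearch(["http://my-es-host:9200"], http_auth=("user", "pass"))

# Count documents where "my_field" is absent (exists query inside must_not).
missing = es.count(
    index="my_index_alias",
    body={"query": {"bool": {"must_not": {"exists": {"field": "my_field"}}}}},
)
print("docs missing my_field:", missing["count"])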

Is there a way to better trace this error? I don't know where to look.
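One simple way to get more detail out of Spark is to raise the log level before running the read, so the full executor stack traces show up in the driver output. A minimal sketch, assuming a SparkSession named spark as in the snippet above:

# Raise Spark's log verbosity; "INFO" is usually enough, "DEBUG" is noisier
# but shows the full task failure stack traces from the executors.
spark.sparkContext.setLogLevel("DEBUG")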

I found the problem. First of all, I used the latest dev build of the Spark Elasticsearch connector (6.0.0-beta1), hoping it would solve the problem. It did not, but this time the error message was more informative:

org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: incompatible types found in multi-mapping: field [my_problematic_field] has conflicting types of [long] and [double].

Now I understand the Long-to-Double class cast exception I got at the beginning. It is related to a field that is defined as long in one index and as double in another (I use a single index alias in ES that points to a series of indexes). The problem is that these fields were dynamically mapped by ES when a document was first inserted into each index: in one index the field was mapped as long (because the first value was, for example, 123) and in the other as double (because the first value was, for example, 123.0).
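You can confirm this kind of conflict by pulling the mapping of every index behind the alias and comparing the field's type. A minimal sketch with elasticsearch-py, assuming the hypothetical alias my_index_alias, a top-level field, and the ES 2.x mapping layout (index -> mappings -> doc type -> properties):

from elasticsearch import Elasticsearch

# Hypothetical connection details; adapt to your setup.
es = Elasticsearch(["http://my-es-host:9200"], http_auth=("user", "pass"))

# get_mapping on an alias returns the mappings of all indexes behind it.
mappings = es.indices.get_mapping(index="my_index_alias")
for index_name, index_mapping in mappings.items():
    for doc_type, type_mapping in index_mapping["mappings"].items():
        field = type_mapping.get("properties", {}).get("my_problematic_field")
        if field:
            print(index_name, doc_type, "->", field["type"])

If the loop prints long for some indexes and double for others, that is exactly the multi-mapping conflict the connector complains about.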

I don't know if there is a way around this problem without having to reindex the data (billions of documents!).
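At least for future indexes, the conflict can be prevented by pinning an explicit type for the field with an index template, so dynamic mapping can no longer guess long in one index and double in another. A minimal sketch under the same assumptions as above; the template name, index pattern, document type, and field name are all placeholders, and the "template" key is the ES 2.x form of the pattern:

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://my-es-host:9200"], http_auth=("user", "pass"))

# Force "my_problematic_field" to double in every newly created index
# whose name matches the pattern; existing indexes are not changed.
es.indices.put_template(
    name="fix_my_problematic_field",
    body={
        "template": "my_index_prefix-*",
        "mappings": {
            "my_doc_type": {
                "properties": {"my_problematic_field": {"type": "double"}}
            }
        },
    },
)

Note that this only fixes indexes created after the template is installed; the already-written indexes would still need a reindex to change the stored mapping.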

