java - How to debug ClassCastException error in Spark?


When I try a simple read in PySpark (Spark 2.1.1) from Elasticsearch 2.4 via the elasticsearch-spark connector 5.1.2 (es_read_field_exclude and es_read_field_as_array_include come from environment variables; the rest of the variables are passed as arguments to the reading function or contained in the self object):

from pyspark.sql.functions import monotonically_increasing_id

df = spark.read.format("org.elasticsearch.spark.sql") \
    .option("es.net.proxy.http.host", self.server) \
    .option("es.net.proxy.http.port", self.port) \
    .option("es.net.http.auth.user", self.username) \
    .option("es.net.http.auth.pass", self.password) \
    .option("es.net.proxy.http.user", self.username) \
    .option("es.net.proxy.http.pass", self.password) \
    .option("query", qparam) \
    .option("pushdown", "true") \
    .option("es.read.field.exclude", es_read_field_exclude) \
    .option("es.read.field.as.array.include", es_read_field_as_array_include) \
    .load(self.index) \
    .limit(limit) \
    .select(*fields) \
    .withColumn("id", monotonically_increasing_id())

I'm getting a ClassCastException (Double to Long):

WARN scheduler.TaskSetManager: Lost task 42.0 in stage ...: java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Long at scala.runtime.BoxesRunTime.unboxToLong(BoxesRunTime.java:105) ...

The strange thing is that sometimes it works and sometimes it does not. I suspect that reading data with null values, or documents that have no content for some fields, causes the problem, but that is only a hypothesis and I may be wrong.

Is there a way to better trace this error? I don't know where to look at this point.
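For what it's worth, one way to narrow this kind of error down (just a sketch; it assumes the df and fields from the snippet above) is to compare the schema the connector inferred with what actually deserializes, by forcing evaluation one field at a time until the failing one shows up:

df.printSchema()  # the types the connector inferred from the ES mapping

# Bisect by column: materialize a small slice of each field separately;
# the column whose collect() raises the ClassCastException is the suspect.
for f in fields:
    try:
        df.select(f).limit(100).collect()
    except Exception as e:
        print(f, "->", e)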

I found the problem. First, I used the latest dev build of the Spark Elasticsearch connector (6.0.0-beta1), hoping it would solve the problem. That was not the case, but this time the error message was more informative:

org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: incompatible types found in multi-mapping: field [my_problematic_field] has conflicting types of [long] and [double].

Now I understand the ClassCastException between Long and Double that I got at the beginning. It is related to a field defined as long in one index and as double in another (I use one index alias in ES that points to a series of indexes). The problem is that these fields were dynamically mapped by ES when first inserted into their respective index: in one index the field was mapped as long (because the first value was, for example, 123) and in the other as double (because the first value was, for example, 123.0).
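You can confirm which indexes behind the alias disagree by querying the field mapping API (a sketch; my_alias, the host, and the credentials are placeholders for the real values):

import requests

# GET <alias>/_mapping/field/<field> returns the mapping per concrete index,
# so conflicting long/double definitions become visible side by side.
resp = requests.get(
    "http://localhost:9200/my_alias/_mapping/field/my_problematic_field",
    auth=("username", "password"),
)
for index, body in resp.json().items():
    print(index, body)  # one index should report long, another double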

I don't know if there is a way around this problem without having to reindex the data (billions of documents!).
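If reindexing is out of the question, one partial workaround (an assumption on my part, not something I have verified at scale) is to drop the conflicting field at read time with the es.read.field.exclude option already used above, losing that column but keeping the rest of the data readable:

# Assumption: excluding the conflicting field keeps the connector from
# trying to reconcile its long/double mappings during the read.
df = spark.read.format("org.elasticsearch.spark.sql") \
    .option("es.read.field.exclude", "my_problematic_field") \
    .load(self.index)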

