java - Unable to extract array/list from DataFrame, AnalysisException: need struct type but got binary
I have a Dataset<String[]> and am struggling to extract columns out of it. Here's the code:
import static org.apache.spark.sql.functions.col;

// Read Parquet data
Dataset<Row> readerdf = spark.readStream().format("parquet").

List<String> columns = Arrays.asList("city", "country");

// I am interested in a field in the data, 'fieldmap', which is a Map<String, String>
Dataset<String[]> stringarrdf = readerdf.map((MapFunction<Row, String[]>) row -> {
    Map<String, String> fields = row.getJavaMap(row.fieldIndex("fieldmap"));
    List<String> columnlist = new ArrayList<>();
    columns.forEach(columnname -> {
        columnlist.add(fields.getOrDefault(columnname, ""));
    });
    return columnlist.toArray(new String[columns.size()]);
}, Encoders.kryo(String[].class));

// I am expecting to extract city here:
Dataset ds = stringarrdf.select(col("value").getItem(1).as("city"));
But it fails with the exception below.
Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from value#22;
How can I access a String[] or List field in a Dataset?
You are getting the error below.
Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from value#22: need struct type but got binary;
You are using Encoders.kryo(String[].class) while creating stringarrdf. If you check the documentation for Encoders.kryo, it says:
Creates an encoder that serializes objects of type T using Kryo. This encoder maps T into a single byte array (binary) field.
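One way to confirm this is to inspect the schema that the Kryo encoder reports. The sketch below is only an illustration: the class name KryoSchemaSketch and the helper kryoColumnType are made up for this example, and it assumes a Spark SQL dependency is on the classpath. It should show a single column of binary type, which is why getItem cannot index into it.

```java
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.types.DataType;

// Hypothetical helper class, not part of the original question or answer.
public class KryoSchemaSketch {

    // Returns the Spark SQL type of the single column produced by the Kryo encoder.
    static DataType kryoColumnType() {
        Encoder<String[]> kryo = Encoders.kryo(String[].class);
        // The encoder's schema has one field; look at its data type.
        return kryo.schema().fields()[0].dataType();
    }

    public static void main(String[] args) {
        // Print the whole schema as a tree for inspection.
        System.out.println(Encoders.kryo(String[].class).schema().treeString());
        System.out.println("Column type: " + kryoColumnType());
    }
}
```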
Use spark.implicits().newStringArrayEncoder() for encoding String[].
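A minimal sketch of that suggestion follows. It makes several assumptions: it uses a local batch SparkSession and a small in-memory list instead of the Parquet stream from the question, the sample values are invented, and the class name StringArrayEncoderSketch and method firstElements are hypothetical. With the array encoder, the value column is an array of strings, so getItem can select an element by index.

```java
import static org.apache.spark.sql.functions.col;

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical example class; names and sample data are illustrative only.
public class StringArrayEncoderSketch {

    // Builds a Dataset<String[]> with the array encoder and extracts element 0 of each row.
    static List<String> firstElements(SparkSession spark) {
        // Array-aware encoder, as suggested in the answer.
        Encoder<String[]> encoder = spark.implicits().newStringArrayEncoder();

        // Invented sample rows in (city, country) order.
        List<String[]> rows = Arrays.asList(
            new String[] {"Paris", "France"},
            new String[] {"Tokyo", "Japan"});

        Dataset<String[]> ds = spark.createDataset(rows, encoder);

        // "value" is now an array column, so indexing with getItem is allowed.
        Dataset<Row> cities = ds.select(col("value").getItem(0).as("city"));

        return cities.collectAsList().stream()
            .map(r -> r.getString(0))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .master("local[1]")
            .appName("string-array-encoder-sketch")
            .getOrCreate();
        try {
            System.out.println(firstElements(spark));
        } finally {
            spark.stop();
        }
    }
}
```

In the code from the question, the same encoder would take the place of Encoders.kryo(String[].class) as the second argument to map. Note that with columns listed as "city", "country", the city sits at index 0 of the array.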