Apache Spark SQL - Unable to load data from Parquet files into a Hive external table
I have written the Scala code below to create a Parquet file:
scala> case class Person(name: String, age: Int, sex: String)
defined class Person
scala> val data = Seq(Person("jack",25,"m"), Person("john",26,"m"), Person("anu",27,"f"))
data: Seq[Person] = List(Person(jack,25,m), Person(john,26,m), Person(anu,27,f))
scala> import sqlContext.implicits._
import sqlContext.implicits._
scala> import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.SaveMode
scala> val df = data.toDF()
scala> df.select("name","age","sex").write.format("parquet").mode("overwrite").save("sparksqloutput/person")
HDFS status:
[cloudera@quickstart ~]$ hadoop fs -ls sparksqloutput/person
Found 4 items
-rw-r--r-- 1 cloudera cloudera 0 2017-08-14 23:03 sparksqloutput/person/_SUCCESS
-rw-r--r-- 1 cloudera cloudera 394 2017-08-14 23:03 sparksqloutput/person/_common_metadata
-rw-r--r-- 1 cloudera cloudera 721 2017-08-14 23:03 sparksqloutput/person/_metadata
-rw-r--r-- 1 cloudera cloudera 773 2017-08-14 23:03 sparksqloutput/person/part-r-00000-2dd2f334-1985-42d6-9dbf-16b0a51e53a8.gz.parquet
Then I created an external Hive table using the command below:
hive> create external table person (name string, age int, sex string) stored as parquet location '/sparksqlouput/person/';
OK
Time taken: 0.174 seconds
hive> select * from person
    > ;
OK
Time taken: 0.125 seconds
But when I run the select query above, no rows are returned. Kindly help on this.
In general, the Hive SQL statement 'select * from <table>' simply locates the directory where the table data is expected to exist and dumps the contents of the files in that HDFS directory. In your case, select * returning no rows means that the table location is not correct.
Please note that the last statement in your Scala code contains save("sparksqloutput/person"), where "sparksqloutput/person" is a relative path. It expands to "/user/<logged in username>/sparksqloutput/person" (i.e. "/user/cloudera/sparksqloutput/person").

Hence, while creating the Hive table you should use "/user/cloudera/sparksqloutput/person" instead of "/sparksqloutput/person". In practice "/sparksqloutput/person" does not exist, which is why select * from person returned no output.
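The path expansion described above can be illustrated with a small sketch. The helper resolveHdfsPath is hypothetical (it is not part of Spark or Hadoop) and only mimics the rule under the assumption that the HDFS home directory follows the default "/user/<username>" layout; the real resolution is done by the Hadoop FileSystem client.

```scala
// Hypothetical helper, not a Spark or Hadoop API: mimics how a path without
// a leading slash is resolved against the user's HDFS home directory.
// Assumption: the home directory is /user/<username> (the usual default).
def resolveHdfsPath(path: String, user: String): String =
  if (path.startsWith("/")) path   // absolute paths are used as given
  else s"/user/$user/$path"        // relative paths land under the home directory

// Path passed to save(...) in the Scala code (relative)
val written = resolveHdfsPath("sparksqloutput/person", "cloudera")

// Path the answer says was used as the Hive table LOCATION (absolute)
val hiveLocation = resolveHdfsPath("/sparksqloutput/person", "cloudera")

println(s"Spark wrote to:   $written")
println(s"Hive looks under: $hiveLocation")
println(s"Same directory?   ${written == hiveLocation}")
```

Since the two strings differ, Hive is pointed at a directory other than the one Spark wrote to. Passing the absolute path to both save(...) and the table LOCATION avoids the ambiguity.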