python - How to use "sqlContext" in different notebooks when using one of them as a module (PySpark)


I have a notebook, a.pynb, which contains a function that reads a parquet file.

I import a.pynb into notebook b.pynb. In b.pynb, calling the function from a.pynb to read the parquet file and create a SQL table fails with:

    NameError: global name 'sqlContext' is not defined

even though sqlContext is defined in both notebooks.

The exact code:

a.pynb (utils):

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext.getOrCreate()
    sqlContext = SQLContext(sc)

    def parquet_read(file_name):
        df = sqlContext.read.parquet(file_name + "*.parquet")
        return df

In b.pynb I have used the function:

    import nbimporter
    import commonutils
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    reload(commonutils)

    sc = SparkContext.getOrCreate()
    sqlContext = SQLContext(sc)

    df2 = commonutils.parquet_read("abc")

It fails with the same error:

    NameError: global name 'sqlContext' is not defined

even though sqlContext is defined in both notebooks.

I would hesitate to use the approach you're following (i.e., importing notebooks as modules). I think you would be far better served by moving your utility code into a .py file rather than relying on import magic to load a notebook as a module.

Based on the documentation, it appears you have overlooked how that import magic works:

"here we only run code which either defines a function or a class"

It looks like your code sample defines sqlContext as a module-level variable, not a class or a function, so that assignment is never executed when the notebook is imported.
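To make that concrete, here is a rough sketch (my reconstruction, not code from the question) of what the imported commonutils module effectively contains once the non-definition cells are dropped:

    # Roughly what the notebook import leaves behind when only
    # function and class definitions are executed:

    def parquet_read(file_name):
        # sqlContext was assigned at module level in a.pynb, but that
        # cell is never run during import, so this lookup raises
        # "NameError: global name 'sqlContext' is not defined".
        df = sqlContext.read.parquet(file_name + "*.parquet")
        return df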

One approach is to reorganize your code as follows. Better still, I think, would be to move the code into a .py file:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    def parquet_read(file_name):
        sc = SparkContext.getOrCreate()
        sqlContext = SQLContext(sc)
        df = sqlContext.read.parquet(file_name + "*.parquet")
        return df
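If you take the .py route, a minimal sketch would look like the following (the file name commonutils.py is my assumption, mirroring the module name used in b.pynb):

    # commonutils.py -- a plain Python module, importable without nbimporter
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    def parquet_read(file_name):
        # Create (or reuse) the contexts inside the function so the
        # module works no matter how it is imported.
        sc = SparkContext.getOrCreate()
        sqlContext = SQLContext(sc)
        return sqlContext.read.parquet(file_name + "*.parquet")

Then an ordinary import commonutils in b.pynb is enough, and nbimporter is no longer needed.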
