python 3.x - How to apply this pandas.Series code to compare several files within a folder
I have code that finds the CSV files in a folder and reads them in:
    import os
    import pandas as pd

    # folderpath and columnname are defined elsewhere
    directory = os.fsencode(folderpath)
    os.chdir(directory)
    for file in os.listdir(directory):
        filename = os.fsdecode(file)
        if filename.endswith(".csv"):
            df1 = pd.read_csv(filename)[columnname]
I also have code that can find the rows that appear in every single CSV file given as input:
    match = pd.Series(list(set(file1.columnname) & set(file2.columnname) & set(file3.columnname) & set(file4.columnname)))
How can I merge the two pieces of code above to find the rows that appear in every single CSV file within the folder, and return the matches in a single pandas DataFrame?
I think you can create a list of Series first, and then dynamically find the matches with reduce:
    # data from previous answer
    vals = []
    directory = os.fsencode(folderpath)
    os.chdir(directory)
    for file in os.listdir(directory):
        filename = os.fsdecode(file)
        if filename.endswith(".csv"):
            df1 = pd.read_csv(filename)['name']
            vals.append(df1)

    from functools import reduce

    # intersect the values of all Series, two at a time
    a = reduce(lambda x, y: set(x) & set(y), vals)
    print (a)
    # {'ben', 'neil'}   (set order may vary)

    df = pd.DataFrame({'col':list(a)})
    print (df)
    #     col
    # 0   ben
    # 1  neil
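The same idea can be wrapped in one function that avoids `os.chdir` and works for any number of files. This is only a minimal sketch: `common_values` is a hypothetical helper name (not part of pandas), it uses `pathlib` in place of `os.listdir`, and it assumes the column holds values that can be sorted together (for example, all strings).

```python
from functools import reduce
from pathlib import Path

import pandas as pd


def common_values(folder, column):
    """Return a DataFrame of the values of `column` found in every CSV in `folder`.

    Hypothetical helper for illustration; assumes every CSV has `column`.
    """
    # Read only the requested column of each CSV and keep its unique values.
    value_sets = [
        set(pd.read_csv(path, usecols=[column])[column])
        for path in sorted(Path(folder).glob("*.csv"))
    ]
    if not value_sets:
        # No CSV files found: return an empty frame instead of failing in reduce().
        return pd.DataFrame({column: []})
    # Intersect the sets two at a time, as in the reduce() call above.
    common = reduce(set.intersection, value_sets)
    # Sets are unordered, so sort to get a reproducible row order.
    return pd.DataFrame({column: sorted(common)})
```

It would be called as `df = common_values(folderpath, 'name')`, where `folderpath` and `'name'` stand in for your own folder and column. If a folder contains a single CSV, the result is simply that file's unique values.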