How to do intersection match between 2 DataFrames in Pandas?
Assume there are 2 DataFrames, a and b, like the following.

a:

    a  a
    b  b
    c  c

b:

    1  2
    3  4
How can I produce a DataFrame c like this:

    a  a  1  2
    a  a  3  4
    b  b  1  2
    b  b  3  4
    c  c  1  2
    c  c  3  4

Is there a function in pandas that can do this operation?
First, the values have to be unique in each DataFrame.
I think you need product:

    from itertools import product
    import pandas as pd

    a = pd.DataFrame({'a': list('abc')})
    b = pd.DataFrame({'a': [1, 2]})

    # every pairing of a value from a with a value from b
    c = pd.DataFrame(list(product(a['a'], b['a'])))
    print(c)
       0  1
    0  a  1
    1  a  2
    2  b  1
    3  b  2
    4  c  1
    5  c  2
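Pure pandas solutions with MultiIndex.from_product:

    mux = pd.MultiIndex.from_product([a['a'], b['a']])

    # build the rows from the index tuples
    c = pd.DataFrame(mux.values.tolist())
    print(c)
       0  1
    0  a  1
    1  a  2
    2  b  1
    3  b  2
    4  c  1
    5  c  2

    # recent pandas versions infer both level names as 'a' from the Series,
    # so explicit column names are passed to avoid duplicate labels
    c = mux.to_frame(name=[0, 1]).reset_index(drop=True)
    print(c)
       0  1
    0  a  1
    1  a  2
    2  b  1
    3  b  2
    4  c  1
    5  c  2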
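Solution with a cross join by merge, on a helper column filled with the same scalar, added by assign:

    # tmp is 1 in every row, so merging on it pairs each row of a with each row of b
    df = pd.merge(a.assign(tmp=1), b.assign(tmp=1), on='tmp').drop('tmp', axis=1)
    df.columns = ['a', 'b']
    print(df)
       a  b
    0  a  1
    1  a  2
    2  b  1
    3  b  2
    4  c  1
    5  c  2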
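On newer pandas the helper column may not be needed. The sketch below assumes pandas 1.2 or later, where merge accepts how='cross'; the suffixes '_left' and '_right' are illustrative names only, chosen because both inputs have a column called 'a'.

```python
import pandas as pd

a = pd.DataFrame({'a': list('abc')})
b = pd.DataFrame({'a': [1, 2]})

# how='cross' (assumed: pandas >= 1.2) pairs every row of a with every row of b.
# Both frames have a column named 'a', so the suffixes keep the two apart.
df = a.merge(b, how='cross', suffixes=('_left', '_right'))

# rename to the same column labels used in the tmp-column version
df.columns = ['a', 'b']
print(df)
```

The result should have len(a) * len(b) rows, in the same order as the tmp-column merge.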
EDIT:

    a = pd.DataFrame({'a': list('abc'), 'b': list('abc')})
    b = pd.DataFrame({'a': [1, 3], 'c': [2, 4]})
    print(a)
       a  b
    0  a  a
    1  b  b
    2  c  c

    print(b)
       a  c
    0  1  2
    1  3  4

    # same cross join as above, now with two columns on each side
    c = pd.merge(a.assign(tmp=1), b.assign(tmp=1), on='tmp').drop('tmp', axis=1)
    c.columns = list('abcd')
    print(c)
       a  b  c  d
    0  a  a  1  2
    1  a  a  3  4
    2  b  b  1  2
    3  b  b  3  4
    4  c  c  1  2
    5  c  c  3  4
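The itertools.product idea also seems to carry over to multi-column frames if it is applied to whole rows instead of single values. This is only a sketch under that assumption: the rows list and the final column labels are illustrative, and for large frames the merge-based version is likely the better choice.

```python
from itertools import product

import pandas as pd

a = pd.DataFrame({'a': list('abc'), 'b': list('abc')})
b = pd.DataFrame({'a': [1, 3], 'c': [2, 4]})

# itertuples(index=False) yields one tuple per row without the index;
# product pairs every row of a with every row of b, and the two tuples
# are concatenated into one output row.
rows = [
    tuple(left) + tuple(right)
    for left, right in product(a.itertuples(index=False),
                               b.itertuples(index=False))
]

c = pd.DataFrame(rows, columns=list('abcd'))
print(c)
```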