I have the following code in R:
col1 = 'carrier'
col2 = 'mode'
a = tbl %>%
  select(sym(col1), sym(col2))
tbl2 <- a %>%
  select(sym(col1), sym(col2)) %>%
  expand(!!!syms(c(col1, col2)))
This is the content of tbl:
carrier mode
3 CRX ALL
4 GLS ALL
6 LSR ALL
7 TFRC ALL
8 UDS ALL
11 UPS GROUND
12 UPS AIR2
13 UPS AIR1
14 FEDEX GROUND
15 FEDEX AIR2
16 FEDEX AIR1
And this is the final content of tbl2:
carrier mode
CRX AIR1
CRX AIR2
CRX ALL
CRX GROUND
FEDEX AIR1
FEDEX AIR2
FEDEX ALL
FEDEX GROUND
GLS AIR1
GLS AIR2
GLS ALL
GLS GROUND
LSR AIR1
LSR AIR2
LSR ALL
LSR GROUND
TFRC AIR1
TFRC AIR2
TFRC ALL
TFRC GROUND
UDS AIR1
UDS AIR2
UDS ALL
UDS GROUND
UPS AIR1
UPS AIR2
UPS ALL
UPS GROUND
I can see clearly what R's expand() does (it generates every combination of the distinct values in the given columns), but I haven't found an equivalent in pandas.
CodePudding user response:
You can do this with merge, since it supports how='cross' (available since pandas 1.2). You only need to call drop_duplicates on each of the two columns before joining them:
new = df1[['carrier']].drop_duplicates().merge(df1[['mode']].drop_duplicates(),how='cross')
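For reference, here is a minimal self-contained sketch of the same cross-merge idea, using the sample data from the question. The final sort and index reset are my own additions, assuming you want the rows in the same order as the R output; they are not required for the cross join itself.

```python
import pandas as pd

# Sample data copied from the question's tbl
tbl = pd.DataFrame({
    "carrier": ["CRX", "GLS", "LSR", "TFRC", "UDS", "UPS",
                "UPS", "UPS", "FEDEX", "FEDEX", "FEDEX"],
    "mode": ["ALL", "ALL", "ALL", "ALL", "ALL", "GROUND",
             "AIR2", "AIR1", "GROUND", "AIR2", "AIR1"],
})

# Distinct values of each column, kept as one-column DataFrames
carriers = tbl[["carrier"]].drop_duplicates()
modes = tbl[["mode"]].drop_duplicates()

# Cross join pairs every carrier with every mode (needs pandas >= 1.2)
tbl2 = (
    carriers.merge(modes, how="cross")
    .sort_values(["carrier", "mode"])  # assumption: match R's sorted output
    .reset_index(drop=True)
)
print(tbl2)
```

With 7 distinct carriers and 4 distinct modes, this gives 28 rows, the same count as tbl2 in the question.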
CodePudding user response:
Are you looking for expand_grid from the pandas cookbook?
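import itertools
import pandas as pd

def expand_grid(data_dict):
    # Cartesian product of the value lists, one output column per key
    rows = itertools.product(*data_dict.values())
    return pd.DataFrame.from_records(rows, columns=data_dict.keys())

# Repeated values in tbl produce repeated rows, so drop them afterwards
expand_grid(tbl.to_dict(orient='list')).drop_duplicates()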
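A possible variation, sketched below, takes the unique values of each column first, so the product never contains duplicate rows and no drop_duplicates pass is needed. The helper name expand and the small sample frame are hypothetical, chosen only for illustration; the pandas calls used are Series.unique, pd.MultiIndex.from_product and MultiIndex.to_frame.

```python
import pandas as pd

def expand(df, *cols):
    """Hypothetical helper: all combinations of the unique values in cols."""
    # Deduplicate each column up front so the product has no repeated rows
    levels = [df[c].unique() for c in cols]
    index = pd.MultiIndex.from_product(levels, names=list(cols))
    # Turn the index levels into ordinary columns
    return index.to_frame(index=False)

# Small illustrative subset of the question's data
tbl = pd.DataFrame({
    "carrier": ["UPS", "UPS", "CRX"],
    "mode": ["GROUND", "AIR1", "ALL"],
})

result = expand(tbl, "carrier", "mode")
print(result.sort_values(["carrier", "mode"]).reset_index(drop=True))
```

Here the two distinct carriers and three distinct modes give six rows. On large frames this avoids building the full product of the raw columns before deduplicating, which is what the expand_grid call above does.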