python: Dictionary key as row index, values as column headers. How do I refer back and select specif

Time:06-06

I have a dataframe that looks like this:

import pandas as pd

a=['a','b','c','d']
b=['the','fox','the','then']
c=['quick','jumps','lazy','barks']
d=['brown','over','dog','loudly']
df=pd.DataFrame(zip(a,b,c,d),columns=['indexcol','col1','col2','col3'])

and a dictionary that looks like this:

keys=['a','b','c','d']
vals=[]
vals.append(['col1','col3'])
vals.append(['col1','col2'])
vals.append(['col1','col2','col3'])
vals.append(['col2','col3'])
newdict = {k: v for k, v in zip(keys, vals)}

I'm trying to create a new column in df that holds a constructed statement for each row. Taking the first row as an example, the sentence should look like this:

"col1 is 'the' | col3 is 'brown'"

Another example, using the 3rd row, to make the task crystal clear: "col1 is 'the' | col2 is 'lazy' | col3 is 'dog'"

Essentially, I want to use each dictionary key to find the row of df whose indexcol matches it, and then use that key's values as the names of the columns to look up in that row.

Thanks in advance.

CodePudding user response:

I'm not sure if I understand you correctly, but you can try:

df = df.set_index("indexcol")

for k, v in newdict.items():
    row = df.loc[k]
    df.loc[k, "new_column"] = " | ".join(f"{i} is '{row[i]}'" for i in v)

print(df.reset_index())

Prints:

  indexcol  col1   col2    col3                                      new_column
0        a   the  quick   brown                 col1 is 'the' | col3 is 'brown'
1        b   fox  jumps    over                 col1 is 'fox' | col2 is 'jumps'
2        c   the   lazy     dog  col1 is 'the' | col2 is 'lazy' | col3 is 'dog'
3        d  then  barks  loudly              col2 is 'barks' | col3 is 'loudly'
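
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea, including the setup from the question. It builds the whole column in one list comprehension instead of assigning cell by cell. The helper name `build_sentence` is made up for illustration, and using `newdict.get(key, [])` so that rows without a dictionary entry get an empty string is my own assumption about the desired behaviour, not something the question specifies.

```python
import pandas as pd

# Same data as in the question, written out as a dict of columns.
df = pd.DataFrame(
    {
        "indexcol": ["a", "b", "c", "d"],
        "col1": ["the", "fox", "the", "then"],
        "col2": ["quick", "jumps", "lazy", "barks"],
        "col3": ["brown", "over", "dog", "loudly"],
    }
)
newdict = {
    "a": ["col1", "col3"],
    "b": ["col1", "col2"],
    "c": ["col1", "col2", "col3"],
    "d": ["col2", "col3"],
}


def build_sentence(row, cols):
    # Hypothetical helper: join "<column> is '<value>'" for the chosen columns.
    return " | ".join(f"{c} is '{row[c]}'" for c in cols)


# Index a copy by indexcol so rows can be looked up by dictionary key.
indexed = df.set_index("indexcol")

# Rows whose key is missing from newdict get an empty string (assumption).
df["new_column"] = [
    build_sentence(indexed.loc[key], newdict.get(key, []))
    for key in df["indexcol"]
]

print(df)
```

This assumes the values in `indexcol` are unique; with duplicates, `indexed.loc[key]` would return several rows rather than a single one.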

CodePudding user response:

I guess this is what you're looking for:

def func(df_row):
    return ' | '.join(
        f'"{col}" is "{df_row[col]}"'
        for col in newdict[df_row['indexcol']]
    )

df['new col'] = df.apply(func, axis=1)

Prints:

  indexcol  col1   col2    col3                                               new col
0        a   the  quick   brown                   "col1" is "the" | "col3" is "brown"
1        b   fox  jumps    over                   "col1" is "fox" | "col2" is "jumps"
2        c   the   lazy     dog  "col1" is "the" | "col2" is "lazy" | "col3" is "dog"
3        d  then  barks  loudly                "col2" is "barks" | "col3" is "loudly"
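
A possible variation on the `apply` approach, sketched here with the question's data so it runs on its own: the dictionary is passed in through the `args` parameter of `DataFrame.apply` rather than read from a global. The function name `describe_row` and its `mapping` parameter are hypothetical, and falling back to an empty list for keys that are not in the dictionary is an assumption on my part.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "indexcol": ["a", "b", "c", "d"],
        "col1": ["the", "fox", "the", "then"],
        "col2": ["quick", "jumps", "lazy", "barks"],
        "col3": ["brown", "over", "dog", "loudly"],
    }
)
newdict = {
    "a": ["col1", "col3"],
    "b": ["col1", "col2"],
    "c": ["col1", "col2", "col3"],
    "d": ["col2", "col3"],
}


def describe_row(row, mapping):
    # Hypothetical helper: look up this row's columns by its indexcol value.
    cols = mapping.get(row["indexcol"], [])  # no entry -> empty sentence
    return " | ".join(f'"{col}" is "{row[col]}"' for col in cols)


# args forwards newdict as the second positional argument of describe_row.
df["new col"] = df.apply(describe_row, axis=1, args=(newdict,))

print(df)
```

Passing the dictionary explicitly keeps the function reusable with a different mapping, at the cost of a slightly longer `apply` call.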