How to create a column from a pandas DataFrame with the repeated values in dictionary format

Time:03-21

I'm very confused about how to do this (I'm still a beginner). I need to convert this DataFrame into a dictionary, with a column that counts how many times each value is repeated:

import pandas as pd
df = pd.DataFrame({'Name': [['John', 'hock'], ['John','pepe'],['Peter', 'wdw'],['Peter'],['John'], ['Stef'], ['John']],
                   'Age': [38, 47, 63, 28, 33, 45, 66]
                  })

and I need something like:

Name  Age  Repeated
John  38   4

Thanks!

CodePudding user response:

Use DataFrame.explode with GroupBy.size:

# Turn each list element in 'Name' into its own row, then count the rows per name
df = df.explode('Name').groupby('Name').size().reset_index(name='Repeated')
print(df)
    Name  Repeated
0   John         4
1  Peter         2
2   Stef         1
3   hock         1
4   pepe         1
5    wdw         1
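
The result above drops the Age column, while the question's expected output shows one. A minimal sketch of one way to keep it, assuming the age wanted is the one from the first row where each name appears (the question does not say which age to keep, so 'first' is a guess):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': [['John', 'hock'], ['John', 'pepe'], ['Peter', 'wdw'],
             ['Peter'], ['John'], ['Stef'], ['John']],
    'Age': [38, 47, 63, 28, 33, 45, 66],
})

# One row per name; the Age value is repeated for every name in the same list
exploded = df.explode('Name')

# Assumption: keep the Age of the first occurrence of each name
result = (
    exploded.groupby('Name', sort=False)
            .agg(Age=('Age', 'first'), Repeated=('Age', 'size'))
            .reset_index()
)
print(result)
```

With this data John gets Age 38 and Repeated 4, which matches the row shown in the question. Swap 'first' for 'last', 'min' or 'max' if a different age is wanted.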

CodePudding user response:

I can think of something like:

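# df is the original DataFrame from the question
resultDict = {}
for index, row in df.iterrows():
  for value in row["Name"]:
    # Start the count at 0 the first time a name is seen
    if value not in resultDict:
      resultDict[value] = 0
    resultDict[value] += 1
resultDict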

Output

{'John': 4, 'Peter': 2, 'Stef': 1, 'hock': 1, 'pepe': 1, 'wdw': 1}
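
The same counting can probably be done without the explicit loop. This is only a sketch using collections.Counter from the standard library, and it assumes every cell of the Name column is a list of strings, as in the question:

```python
from collections import Counter
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    'Name': [['John', 'hock'], ['John', 'pepe'], ['Peter', 'wdw'],
             ['Peter'], ['John'], ['Stef'], ['John']],
    'Age': [38, 47, 63, 28, 33, 45, 66],
})

# chain.from_iterable flattens the lists; Counter tallies each name
counts = dict(Counter(chain.from_iterable(df['Name'])))
print(counts)
```

The keys come out in order of first appearance, so the order may differ from the output shown above even though the counts are the same.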

If you want to have it as a dataframe and not a dictionary:

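# df is the original DataFrame from the question
resultDict = {}
for index, row in df.iterrows():
  for value in row["Name"]:
    # Start the count at 0 the first time a name is seen
    if value not in resultDict:
      resultDict[value] = 0
    resultDict[value] += 1
# Wrap keys and values in list() so pandas receives plain lists
pd.DataFrame({"Name": list(resultDict.keys()), "Repeated": list(resultDict.values())})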

Output

Name Repeated
John 4
hock 1
pepe 1
Peter 2
wdw 1
Stef 1
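
If the dictionary is not needed as an intermediate step, a pandas-only version might look like the sketch below. It relies on Series.explode and Series.value_counts; the variable name repeated is just for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': [['John', 'hock'], ['John', 'pepe'], ['Peter', 'wdw'],
             ['Peter'], ['John'], ['Stef'], ['John']],
    'Age': [38, 47, 63, 28, 33, 45, 66],
})

# Flatten the lists, count each name, then turn the counts into two columns
repeated = (
    df['Name'].explode()
              .value_counts()
              .rename_axis('Name')
              .reset_index(name='Repeated')
)
print(repeated)
```

value_counts sorts by count, so the most repeated name comes first; the order among names with the same count should not be relied on.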