Home > Blockchain >  Increment numbers in a Pandas column, based on existing highest value
Increment numbers in a Pandas column, based on existing highest value

Time:05-18

Grateful for your help. This has been driving me crazy as I'm completely stumped. I'm trying to concat two dataframes. dfA has incremental numbers in column 1, dfB does not. Starting from highest value in dfA, I would like to increment the numbers in the rows from dfB when they are concatenated in dfC. So it should be:

dfA:

Column 1 Column 2
1 John
2 Peter

dfB:

Column 1 Column 2
Sara
Paul

dfC (concat of dfA and dfB):

Column 1 Column 2
1 John
2 Peter
3 Sara
4 Paul

Thanks so much for your help!

CodePudding user response:

You could modify dfB like this -

dfB['Column 1'] = pd.Series(list(range(dfA['Column 1'].max(), dfA['Column 1'].max()   len(dfB.index))))

Then you could simply concatenate dfA and the updated version of dfB.

CodePudding user response:

You can first assign values starting from the next to last value of dfA's Column1 to Column1 of dfB and then concat :

dfB['Column1'] = pd.Series(list(range(1 dfA['Column1'].iloc[-1],len(dfB) len(dfA)))
dfC = pd.concat([dfA,dfB])

CodePudding user response:

You can take the highest value of dfA's Column 1, then replace dfB's Column 1 values. Just create a sequence from the highest value (from dfA) and from there using range() and replace, for example:

# Recreating data for example
dfA = {'Column 1': [1, 2], 'Column 2': ['John', 'Peter']}
dfA = pd.DataFrame(data=dfA)
dfB = {'Column 1': [0, 0], 'Column 2': ['John', 'Peter']}
dfB = pd.DataFrame(data=dfB)

# Specifying the range and replacing the values in Column 1
start = len(dfB['Column 1'])   1
stop = start   dfA['Column 1'].max()
dfB['Column 1'] = range(start, stop)

# Concatenate both df
dfC = pd.concat([dfA, dfB], axis=0)

Output should be the following:

Column 1    Column 2
    1         John
    2         Peter
    3         John
    4         Peter
  • Related