Using Python/Pandas (of whatever is best way in python):
What if you want to increment a new column with values such as A1,A2,A3...
in the rows (increasing on each row) based on if values are repeating in an existing column such that if the repeating in the existing column ends, column 2 jumps to B1,B2,B3...
for the next repeating value (note if next value is not repeating it would simply be B1
and then C1
, etc)?
In other words, I have the existing column below and would like to simply add the new column as such:
Existing Column | New Column |
---|---|
12 | A1 |
12 | A2 |
12 | A3 |
13 | B1 |
13 | B2 |
14 | C1 |
15 | D1 |
15 | D2 |
and so forth....
I tried stuff like this which is not really close because
a) it does not start and stop counting based on existing column's values - it simply goes to the end with a single count. b) there is another incrementing or keeping track involved because when one count is done A1,A2,A3 - the next count should start with B (B1, B2, B3) and this means the letter one is one needs to be kept track of for the new column.
list = []
nid = df['note_id']
for i in range(1, df.shape[0] 1): # gets the length of the DataFrame.
if nid[df[i]] == nid[df[i 1]]:
list.append(f'A{i}')
df['ID'] = list
CodePudding user response:
Update
Use get_column_letter
from openpyxl
package if you have more than 26 letters:
from openpyxl.utils import get_column_letter
df['New Column'] = df['Existing Column'].ne(df['Existing Column'].shift()) \
.cumsum().map(get_column_letter)
df['New Column'] = df.groupby('New Column').cumcount().add(1).astype(str)
print(df)
# Output
Existing Column New Column
0 12 A1
1 12 A2
2 12 A3
3 13 B1
4 13 B2
5 14 C1
6 15 D1
7 15 D2
Old answer
Quick and dirty but it works only if you have less than 26 different values in Existing Column
.
df['New Column'] = pd.Series(pd.factorize(df['Existing Column'])[0]).add(65).map(chr)
df['New Column'] = df.groupby('New Column').cumcount().add(1).astype(str)
print(df)
# Output
Existing Column New Column
0 12 A1
1 12 A2
2 12 A3
3 13 B1
4 13 B2
5 14 C1
6 15 D1
7 15 D2
CodePudding user response:
Have you tried looking at pandas rank? You could group by the existing column and then do a rank(method='first')
. Then your existing column and the ranked column together could serve as the new column you want.