Home > Software design >  Splitting a non delimited column and create an additional column to count which number value
Splitting a non delimited column and create an additional column to count which number value

Time:09-08

I have a problem in which I want to take Table 1 and turn it into Table 2 using Python.

Does anybody have any ideas? I've tried to split the Value column from table 1 but run into issues in that each value is a different length, hence I can't always define how much to split it.

Equally I have not been able to think through how to create a new column that counts the position that value was in the string.

Table 1, before:

ID Value
1 000000S
2 000FY

Table 2, after:

ID Position Value
1 1 0
1 2 0
1 3 0
1 4 0
1 5 0
1 6 0
1 7 S
2 1 0
2 2 0
2 3 0
2 4 F
2 5 Y

CodePudding user response:

You can split the string to individual characters and explode:

out = (df
  .assign(Value=df['Value'].apply(list))
  .explode('Value')
)

output:

   ID Value
0   1     0
0   1     0
0   1     0
0   1     0
0   1     0
0   1     0
0   1     S
1   2     0
1   2     0
1   2     0
1   2     F
1   2     Y

CodePudding user response:

Given:

   ID    Value
0   1  000000S
1   2    000FY

Doing:

df.Value = df.Value.apply(list)
df = df.explode('Value')
df['Position'] = df.groupby('ID').cumcount()   1

Output:

   ID Value  Position
0   1     0         1
0   1     0         2
0   1     0         3
0   1     0         4
0   1     0         5
0   1     0         6
0   1     S         7
1   2     0         1
1   2     0         2
1   2     0         3
1   2     F         4
1   2     Y         5
  • Related