Home > Enterprise >  Pandas / python to autogenerate new values through a range
Pandas / python to autogenerate new values through a range

Time:04-13

I have a dataframe read in from some CSV data. I have two numbers in seperate columns, representing a range and would like to do the following;

A) iterate through the data set B) At each new index, calculate the range C) create a total number of new rows based on the sum of that range (doesn't need to be in the existing dataframe) D) And for each new row, insert incremental values through that range, concatanating specified column data, and retaining the original additional details from the other columns.

Example:

value range1 range2 color
a 0 3 blue
b 4 6 yellow

Result:

output value colour
a1 a blue
a2 a blue
a3 a blue
b4 b yellow
b5 b yellow
b6 b yellow

CodePudding user response:

Here's one way to do it:

import pandas as pd
df = pd.DataFrame({
    'value' : ['a', 'b'],
    'range1' : [0, 4],
    'range2' : [3, 6],
    'color' : ['blue', 'yellow']
})
df['output'] = df.apply(lambda x: [x['value']   str(i   1) for i in range(x['range1'], x['range2'])], axis=1)
df = df.explode('output', ignore_index=True)[['output', 'value', 'color']]
print(df)

Output

  output value   color
0     a1     a    blue
1     a2     a    blue
2     a3     a    blue
3     b5     b  yellow
4     b6     b  yellow

CodePudding user response:

Apart from that it's not clear why range 0 is not counted,

but here is one way:

df = df.loc[df.index.repeat(df.range2 - df.range1 1)]
df['output'] = df.value   (df.range1   df.groupby('value').cumcount()).map(str)

output:

>>
  value  range1  range2   color output
0     a       0       3    blue     a0
0     a       0       3    blue     a1
0     a       0       3    blue     a2
0     a       0       3    blue     a3
1     b       4       6  yellow     b4
1     b       4       6  yellow     b5
1     b       4       6  yellow     b6
  • Related