Pandas: How do you fill an index according to its values

Time:11-14

I'm pretty new to Stack Overflow and to data munging altogether, so apologies if this is an overly simple or previously asked question.

Say I have data as below:

import pandas as pd

index = list('ABCDEF')
values = [1,2,3,4,5,6]
test = pd.Series(values, index = index)

A    1
B    2
C    3
D    4
E    5
F    6

and want to create something like the following, where each index label is repeated as many times as its value in the Series above:

0     A
1     B
2     B
3     C
4     C
5     C
6     D
7     D
8     D
9     D
10    E
11    E
12    E
13    E
14    E
15    F
16    F
17    F
18    F
19    F
20    F

I have written the following code, but feel that looping defeats the whole purpose of using pandas. If anyone knows of a simpler and more elegant solution, please share:

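aggr = pd.Series([], dtype=object)

for index, value in zip(test.index.values, test):
    # 'B' * 2 gives 'BB', and list('BB') gives ['B', 'B'] (single-character labels only)
    to_append = pd.Series(list(index*value))
    # Series.append was removed in pandas 2.0; pd.concat is the replacement
    aggr = pd.concat([aggr, to_append], ignore_index=True)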

Cheers

CodePudding user response:

You can use Index.repeat on the index, passing the Series values as the repeat counts:

pd.Series(test.index.repeat(test))

0     A
1     B
2     B
3     C
4     C
5     C
6     D
7     D
8     D
9     D
10    E
11    E
12    E
13    E
14    E
15    F
16    F
17    F
18    F
19    F
20    F
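
As a quick sanity check, here is a minimal sketch that rebuilds the `test` Series from the question and confirms that the repeated index contains each label as many times as its value. The name `result` is introduced here only for illustration.

```python
import pandas as pd

# Rebuild the Series from the question.
test = pd.Series([1, 2, 3, 4, 5, 6], index=list("ABCDEF"))

# Index.repeat takes one count per label; here the Series values are the counts.
result = pd.Series(test.index.repeat(test))

print(len(result))                                  # 21, the sum of the values
print(result.value_counts().sort_index().tolist())  # [1, 2, 3, 4, 5, 6]
```

Wrapping the repeated index in pd.Series gives the result a fresh 0..20 index, which matches the output asked for in the question.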

CodePudding user response:

Use Index.repeat.

You can convert the repeated index to a Series (to_series) or a DataFrame (to_frame), and name the result by passing name='...' to either method:

>>> test.index.repeat(test).to_series().reset_index(drop=True)
0     A
1     B
2     B
3     C
4     C
5     C
6     D
7     D
8     D
9     D
10    E
11    E
12    E
13    E
14    E
15    F
16    F
17    F
18    F
19    F
20    F
dtype: object
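
The naming option mentioned above might look like the following sketch. The column name "label" and the variable names are assumptions chosen for illustration. Note that to_series() reuses the labels as the index of the new Series, which is why reset_index(drop=True) is needed to get 0..20.

```python
import pandas as pd

test = pd.Series([1, 2, 3, 4, 5, 6], index=list("ABCDEF"))
repeated = test.index.repeat(test)

# Named Series; reset_index(drop=True) replaces the label index with 0..20.
as_series = repeated.to_series(name="label").reset_index(drop=True)

# One-column DataFrame; index=False gives a default integer index directly.
as_frame = repeated.to_frame(index=False, name="label")

print(as_series.name)             # label
print(as_frame.columns.tolist())  # ['label']
print(as_frame.shape)             # (21, 1)
```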

CodePudding user response:

In the general case, outside pandas, you can generate this with a list comprehension, which produces a nested list that you may then want to flatten. Since we are using pandas, explode() can do the flattening:

# pair each label with its count instead of using the value as a position
[[label]*count for label, count in zip(index, values)]

Outputs:

[['A'],
 ['B', 'B'],
 ['C', 'C', 'C'],
 ['D', 'D', 'D', 'D'],
 ['E', 'E', 'E', 'E', 'E'],
 ['F', 'F', 'F', 'F', 'F', 'F']]
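
For the "outside pandas" case, one way to flatten the nested list is itertools.chain.from_iterable from the standard library. This is only a sketch; the names `nested` and `flat` are illustrative.

```python
from itertools import chain

index = list("ABCDEF")
values = [1, 2, 3, 4, 5, 6]

# One inner list per label, repeated `count` times.
nested = [[label] * count for label, count in zip(index, values)]

# chain.from_iterable concatenates the inner lists into a single sequence.
flat = list(chain.from_iterable(nested))

print(flat[:6])   # ['A', 'B', 'B', 'C', 'C', 'C']
print(len(flat))  # 21
```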

Passing the nested list to pd.Series() and calling explode():

pd.Series([[label]*count for label, count in zip(index, values)]).explode()

Outputs:

0    A
1    B
1    B
2    C
2    C
2    C
3    D
3    D
3    D
3    D
4    E
4    E
4    E
4    E
4    E
5    F
5    F
5    F
5    F
5    F
5    F
dtype: object
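
Note that explode() keeps the row number of the list each element came from, so the index labels repeat (0, 1, 1, 2, ...). If a running 0..20 index is wanted, as in the question, a reset_index(drop=True) can be added. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

index = list("ABCDEF")
values = [1, 2, 3, 4, 5, 6]

nested = [[label] * count for label, count in zip(index, values)]
exploded = pd.Series(nested).explode()

# Each element carries the index of its originating list.
print(exploded.index.tolist()[:6])  # [0, 1, 1, 2, 2, 2]

# Drop the repeated labels and renumber from 0.
renumbered = exploded.reset_index(drop=True)
print(renumbered.index.tolist()[-1])  # 20
```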