I've been struggling to take a dictionary "d" made up of n dataframes and apply this to each of them:
idf = idf.iloc[idf.index.repeat(idf.iloc[:,0])]
This repeats each row of a dataframe as many times as the value in its column 0. Something like this:
BEFORE:          AFTER:
Index            Index
1290   2         1290   2
1320   3         1290   2
1400   4         1320   3
                 1320   3
                 1320   3
                 1400   4
                 1400   4
                 1400   4
                 1400   4
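The BEFORE/AFTER example above can be reproduced with a small sketch. The column name "count" is a made-up placeholder, and `.loc` is used here rather than `.iloc` because the repeated index holds labels (1290, 1320, 1400), not positions. It assumes those labels are unique.

```python
import pandas as pd

# Hypothetical dataframe matching the BEFORE table; "count" is an assumed column name.
before = pd.DataFrame({"count": [2, 3, 4]}, index=[1290, 1320, 1400])

# Repeat each index label as many times as the value in column 0,
# then select those labels to build the AFTER table.
repeated_labels = before.index.repeat(before.iloc[:, 0])
after = before.loc[repeated_labels]

print(after)
```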
So the dictionary "d" holds dataframes that look like the BEFORE column. I tried to apply the function this way:
for idf in d:
    d = idf.iloc[idf.index.repeat(idf.iloc[:,0])]
I was able to do it this way when I manually selected a column by name, but these dataframes have different column names (on purpose), so I need to select the column by position. However, I can't apply it in the loop because .iloc[ ] doesn't work on strings (which I found weird: the loop is not using the values of the dictionary but its string keys).
How can I get back the dictionary "d" with the function applied to every dataframe?
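A minimal sketch of why the loop fails, assuming a dictionary with the made-up keys "df1" and "df2" and made-up column names: iterating over a dictionary directly yields its keys, so `idf` is a string and has no `.iloc`.

```python
import pandas as pd

# Hypothetical dictionary of dataframes; keys and column names are placeholders.
d = {
    "df1": pd.DataFrame({"x": [2, 3]}, index=[1290, 1320]),
    "df2": pd.DataFrame({"y": [1, 4]}, index=[10, 20]),
}

# Iterating over a dict yields its keys, not its values.
loop_items = [idf for idf in d]
print(loop_items)  # the string keys

# The dataframes themselves are reached through d[key], d.values() or d.items().
value_types = [type(d[key]).__name__ for key in d]
print(value_types)
```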
Thanks!
Edits:
- Example picture of one of the dataframes inside the dictionary "d". Remember that the name of the first column [0] is different in each dataframe (and it shouldn't be changed, for data management reasons):
- I already know how to repeat rows n times; my question is how to apply this to a dictionary of dataframes.
CodePudding user response:
Does this do what you need?
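import pandas as pd
# Two example dataframes with different column names
df1 = pd.DataFrame({"a":[2, 3, 4, 3], "col1":[1, 2, 3, 4]})
df1.set_index("a", inplace=True)
df2 = pd.DataFrame({"b":[1, 2, 4], "col2":[3, 2, 1]})
df2.set_index("b", inplace=True)
d = {"df1": df1, "df2": df2}
# d.items() yields (key, dataframe) pairs, so this_df is the dataframe itself
for idf, this_df in d.items():
    # Repeat each index label by the value in the first column and store the result under the same key
    d[idf] = this_df.loc[this_df.index.repeat(this_df.iloc[:,0])]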
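The loop above can also be written as a dictionary comprehension. The sketch below is one possible variant, not the only way: it assumes numpy is available and repeats rows by position with `np.repeat`, which sidesteps the extra rows that label-based `.loc` lookup can return when an index contains duplicate labels (as the index "a" in df1 does). The helper name `repeat_rows` is made up for illustration.

```python
import numpy as np
import pandas as pd


def repeat_rows(df):
    """Repeat each row as many times as the value in the first column (hypothetical helper)."""
    counts = df.iloc[:, 0].to_numpy()
    # Build row positions, e.g. counts [2, 1] -> positions [0, 0, 1]
    positions = np.repeat(np.arange(len(df)), counts)
    return df.iloc[positions]


df1 = pd.DataFrame({"a": [2, 3, 4, 3], "col1": [1, 2, 3, 4]}).set_index("a")
df2 = pd.DataFrame({"b": [1, 2, 4], "col2": [3, 2, 1]}).set_index("b")
d = {"df1": df1, "df2": df2}

# Build a new dictionary with the same keys and the repeated dataframes.
d = {name: repeat_rows(df) for name, df in d.items()}

print(d["df1"])
```

Selecting by position keeps the original index labels and column names untouched, so each dataframe's first column can keep its own name.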