I am trying to create datasets from the column names of a dataframe, which has the columns ['NAME1', 'EMAIL1', 'NAME2', 'EMAIL2', 'NAME3', 'EMAIL3', etc].
I'm trying to split the dataframe on each 'EMAIL' column with a loop, but it's not working properly.
I need the result to be JSON, because the number of columns between one 'EMAILn' column and the next may differ.
I need this:
This is my code:
for i in df_entities.filter(regex=('^(EMAIL)' + str(i))).columns:
    df_groups = df_temp_1.groupby(i)
    df_detail = df_groups.get_group(i)
    display(df_detail)
What do you recommend I do?
Thank you very much in advance.
Regards
CodePudding user response:
filter returns a copy of your dataframe with only the matching columns, but you're trying to loop over just the column names. Just add .columns:
for i in df_entities.filter(regex=('^(EMAIL)' + str(i))).columns:
    ...                                               # ^^^^^^^^ important
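For illustration, here is a minimal sketch of looping over the matching column names. The sample data is made up (the real df_entities is not shown in the question), and pairing columns by their numeric suffix is an assumption about what the split should do:

```python
import pandas as pd

# Hypothetical sample data; the question does not show the real df_entities.
df_entities = pd.DataFrame({
    "NAME1": ["Ann", "Bob"],
    "EMAIL1": ["ann@example.com", "bob@example.com"],
    "NAME2": ["Cid", "Dee"],
    "EMAIL2": ["cid@example.com", "dee@example.com"],
})

# filter(regex=...) keeps only the columns whose name matches;
# .columns then gives the names themselves.
email_columns = list(df_entities.filter(regex=r"^EMAIL\d+$").columns)

# Assumption: every column ending in the same number belongs to that EMAILn.
groups = {}
for email_col in email_columns:
    suffix = email_col[len("EMAIL"):]
    groups[email_col] = df_entities.filter(regex=rf"^[A-Za-z_]+{suffix}$")
```

Each value in groups is a smaller dataframe holding the columns that share one suffix, so groups may have different numbers of columns.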
CodePudding user response:
From your input and desired output, simply call pandas.wide_to_long:
import pandas as pd

long_df = pd.wide_to_long(
    df_entities.reset_index(),
    stubnames=["NAME", "EMAIL"],
    i="index",
    j="version"
)
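Since the question asks for JSON, one possible follow-up is to group the long frame by the "version" suffix and serialise each group. This is only a sketch: the sample data is invented, and the output layout (one list of records per suffix) is an assumption, because the desired output is not shown in the question.

```python
import json

import pandas as pd

# Hypothetical sample data standing in for the real df_entities.
df_entities = pd.DataFrame({
    "NAME1": ["Ann", "Bob"],
    "EMAIL1": ["ann@example.com", "bob@example.com"],
    "NAME2": ["Cid", "Dee"],
    "EMAIL2": ["cid@example.com", "dee@example.com"],
})

long_df = pd.wide_to_long(
    df_entities.reset_index(),
    stubnames=["NAME", "EMAIL"],
    i="index",
    j="version",
)

# Assumed layout: one list of {"NAME": ..., "EMAIL": ...} records per suffix.
records_by_version = {
    int(version): group[["NAME", "EMAIL"]].to_dict(orient="records")
    for version, group in long_df.reset_index().groupby("version")
}

# json.dumps turns the integer keys into strings ("1", "2", ...).
as_json = json.dumps(records_by_version, indent=2)
```

If some 'EMAILn' group lacks one of the stub columns, wide_to_long fills the missing values with NaN, so you may want to drop those before serialising.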