I have a variable called columns_list that holds the list of dataframe columns I want: columns_list = ['col1','col2','col3']. How do I iterate through the dataframe dynamically?
Right now my code looks like this:
for i in range(len(df)):
    s = tuple(zip(df[columns_list[0]].str.split(",")[i], df[columns_list[1]].str.split(",")[i], df[columns_list[2]].str.split(",")[i]))
How can I make this work dynamically when columns_list keeps changing?
CodePudding user response:
Are you trying to do something like this?
for i in range(len(df)):
    s = []
    for col in columns_list:
        s.append(df.iloc[i][col].split(","))
    # Unpack with * so zip pairs up the pieces from every column
    s = tuple(zip(*s))
    ...
Or, more compactly, with a generator expression:
for i in range(len(df)):
    s = tuple(zip(*(df.iloc[i][col].split(",") for col in columns_list)))
    ...
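To make the loop above concrete, here is a minimal, self-contained sketch. The sample data is made up for illustration, and it assumes the goal is to pair up the i-th pieces of every listed column within each row, as in the question's code:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    {"col1": ["a,b", "c,d"], "col2": ["1,2", "3,4"], "col3": ["x,y", "z,w"]}
)
columns_list = ["col1", "col2", "col3"]

rows = []
for i in range(len(df)):
    # One list of pieces per column, for row i
    parts = [df.iloc[i][col].split(",") for col in columns_list]
    # zip(*parts) pairs the first pieces together, then the second pieces, ...
    rows.append(tuple(zip(*parts)))

print(rows[0])  # (('a', '1', 'x'), ('b', '2', 'y'))
print(rows[1])  # (('c', '3', 'z'), ('d', '4', 'w'))
```

Because the loop reads the column names from columns_list, changing that list is enough to change which columns are combined.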
CodePudding user response:
If you really want to iterate over the dataframe then I'd do something like:
for s in zip(*(df[c].str.split(",") for c in columns_list)):
    print(s)
Result for
df = pd.DataFrame(
{"col1": ["a,b,c", "d,e"], "col2": ["1,2", "3,4,5"], "col3": ["x,y,z", "v"]}
)
columns_list = ["col2", "col3"]
is
(['1', '2'], ['x', 'y', 'z'])
(['3', '4', '5'], ['v'])
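If you want the element-wise pairing from the question rather than one list per column, each row tuple can be unpacked into a second zip. This is only a sketch on the same sample frame; the names split_columns, paired and padded are made up for illustration. Note that zip stops at the shortest list, so itertools.zip_longest is shown as one possible way to keep the unmatched pieces:

```python
from itertools import zip_longest

import pandas as pd

df = pd.DataFrame(
    {"col1": ["a,b,c", "d,e"], "col2": ["1,2", "3,4,5"], "col3": ["x,y,z", "v"]}
)
columns_list = ["col2", "col3"]

# One Series of lists per requested column
split_columns = [df[c].str.split(",") for c in columns_list]

# Outer zip walks the rows; inner zip pairs the pieces within a row
paired = [tuple(zip(*row)) for row in zip(*split_columns)]
print(paired)  # [(('1', 'x'), ('2', 'y')), (('3', 'v'),)]

# zip_longest pads the shorter lists with None instead of truncating
padded = [tuple(zip_longest(*row)) for row in zip(*split_columns)]
print(padded[0])  # (('1', 'x'), ('2', 'y'), (None, 'z'))
```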
If you want to modify your original dataframe:
for c in columns_list:
    df[c] = df[c].str.split(",")
Result:
    col1       col2       col3
0  a,b,c     [1, 2]  [x, y, z]
1    d,e  [3, 4, 5]        [v]
Or if you want to build a new one from the split columns:
df_cols_splitted = pd.concat(
(df[c].str.split(",") for c in columns_list), axis="columns"
)
Result:
        col2       col3
0     [1, 2]  [x, y, z]
1  [3, 4, 5]        [v]
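As an alternative that is not part of the answer above, selecting the columns first and applying the split column by column should build the same kind of new frame. A minimal sketch, assuming every listed column holds strings; df_split is a made-up name:

```python
import pandas as pd

df = pd.DataFrame(
    {"col1": ["a,b,c", "d,e"], "col2": ["1,2", "3,4,5"], "col3": ["x,y,z", "v"]}
)
columns_list = ["col2", "col3"]

# apply passes each selected column to the lambda as a Series
df_split = df[columns_list].apply(lambda col: col.str.split(","))

print(df_split)
# The original frame is left unchanged
print(df["col2"].tolist())  # ['1,2', '3,4,5']
```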