I have the following lists:
dates = ['12/29/2020', '12/25/2020', '12/22/2020']
numbers = [ [1, 31, 35], [17, 23, 36], [29, 53, 56] ]
I used them to make a DataFrame:
import pandas as pd

df = pd.DataFrame(
    {
        'date': dates,
        'nums': numbers
    }
)
This gives me a DataFrame with two columns. I want to break out the elements in the list to create 3 columns (one for each number in the list) to end up with the following DataFrame:
           date  num1  num2  num3
0  '12/29/2020'     1    31    35
1  '12/25/2020'    17    23    36
2  '12/22/2020'    29    53    56
How can I do this?
CodePudding user response:
Create a new data frame from the nums column by converting it to a list first, and then concat it with the date column:
pd.concat([df.date, pd.DataFrame(df.nums.to_list()).add_prefix('num')], axis=1)
         date  num0  num1  num2
0  12/29/2020     1    31    35
1  12/25/2020    17    23    36
2  12/22/2020    29    53    56
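Note that add_prefix keeps the default 0-based labels, so the columns come out as num0..num2 rather than the num1..num3 asked for in the question. A minimal sketch of one way to shift the labels, assuming the same dates and numbers lists as in the question (the names expanded and result are only illustrative):

```python
import pandas as pd

dates = ['12/29/2020', '12/25/2020', '12/22/2020']
numbers = [[1, 31, 35], [17, 23, 36], [29, 53, 56]]
df = pd.DataFrame({'date': dates, 'nums': numbers})

# Expand each inner list into its own column; reuse df's index so that
# concat lines the rows up even if df does not have a default RangeIndex.
expanded = pd.DataFrame(df['nums'].to_list(), index=df.index)

# The default column labels are 0, 1, 2; relabel them as num1, num2, num3.
expanded.columns = [f'num{i + 1}' for i in range(expanded.shape[1])]

result = pd.concat([df['date'], expanded], axis=1)
print(result)
```

Passing index=df.index matters mainly when df has been filtered or re-indexed; with a fresh DataFrame like the one above it makes no difference.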
CodePudding user response:
Create a new dataframe and join it back:
>>> df[['date']].join(pd.DataFrame(df['nums'].tolist()).rename(lambda x: f'num{x + 1}', axis=1))
         date  num1  num2  num3
0  12/29/2020     1    31    35
1  12/25/2020    17    23    36
2  12/22/2020    29    53    56
Or just use add_prefix, which keeps the default 0-based column labels:
>>> df[['date']].join(pd.DataFrame(df['nums'].tolist()).add_prefix('num'))
         date  num0  num1  num2
0  12/29/2020     1    31    35
1  12/25/2020    17    23    36
2  12/22/2020    29    53    56
CodePudding user response:
The other answers cover the case where you need to fix an existing DataFrame. If you have the opportunity, though, it's much easier to reshape your data before creating the DataFrame:
In [1]: import pandas as pd
In [2]: dates = ['12/29/2020', '12/25/2020', '12/22/2020']
In [3]: numbers = [[1, 31, 35], [17, 23, 36], [29, 53, 56]]
In [4]: nums = {f"num{i}": n for i, n in enumerate(zip(*numbers), 1)}
In [5]: df = pd.DataFrame({"dates": dates, **nums})
In [6]: df
Out[6]:
        dates  num1  num2  num3
0  12/29/2020     1    31    35
1  12/25/2020    17    23    36
2  12/22/2020    29    53    56
Or, another way:
In [7]: data = [[date, *nums] for date, nums in zip(dates, numbers)]
In [8]: pd.DataFrame(data, columns=["dates", "num1", "num2", "num3"])
Out[8]:
        dates  num1  num2  num3
0  12/29/2020     1    31    35
1  12/25/2020    17    23    36
2  12/22/2020    29    53    56
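The second variant hardcodes the column names. If the inner lists could have a different length, the names can be derived from the data instead. This is only a sketch, and it assumes every inner list has the same length (the names width and columns are illustrative):

```python
import pandas as pd

dates = ['12/29/2020', '12/25/2020', '12/22/2020']
numbers = [[1, 31, 35], [17, 23, 36], [29, 53, 56]]

# Assumption: all inner lists have the same length as the first one.
width = len(numbers[0])
columns = ["dates"] + [f"num{i}" for i in range(1, width + 1)]

# Flatten each (date, [n1, n2, n3]) pair into a single row.
data = [[date, *nums] for date, nums in zip(dates, numbers)]
df = pd.DataFrame(data, columns=columns)
print(df)
```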
CodePudding user response:
You can use a dataframe constructor like this:
# One column label per value in each inner list
pd.DataFrame(numbers,
             index=dates,
             columns=[f'num{i + 1}' for i in range(len(numbers[0]))])\
  .rename_axis('dates').reset_index()
Output:
        dates  num1  num2  num3
0  12/29/2020     1    31    35
1  12/25/2020    17    23    36
2  12/22/2020    29    53    56
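One thing to watch with this approach: the number of column labels has to match the number of values per row, not the number of rows. The example data happens to be 3 by 3, so the two are easy to confuse. A small sketch with a non-square input (two dates, three numbers each; the shortened data is made up for illustration):

```python
import pandas as pd

dates = ['12/29/2020', '12/25/2020']
numbers = [[1, 31, 35], [17, 23, 36]]  # 2 rows, 3 values per row

# Size the labels from an inner list, assuming all inner lists are equally long.
columns = [f'num{i + 1}' for i in range(len(numbers[0]))]

df = (
    pd.DataFrame(numbers, index=dates, columns=columns)
    .rename_axis('dates')   # name the index so reset_index yields a 'dates' column
    .reset_index()
)
print(df)
```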