I have a list of lists that I want to convert to a pandas DataFrame.
Each inner list represents a column of the DataFrame, not a row.
import pandas as pd
ColumnNames = ['Col1', 'Col2']
MyData = [[1, 2, 3], ['Foo', 'Bar', 'Baz']]
If I create a DataFrame directly from this data, pandas treats each inner list as a row, which is not what I want. It also means I cannot pass the column names when creating the DataFrame.
The approach I have seen is to create the DataFrame row-wise, transpose it, and then add the column names:
MyDf = pd.DataFrame(MyData)
MyDf = MyDf.transpose()
MyDf.columns = ColumnNames
This works, but it feels a bit hacky to need three statements to create a DataFrame. I am also unsure how efficient it is to build the DataFrame the wrong way round and then transpose it, although I imagine that transposing does not actually move any data.
Is there a better way?
CodePudding user response:
Use zip:
MyDf = pd.DataFrame(zip(*MyData), columns=ColumnNames)
# OR
MyDf = pd.DataFrame({k: v for k, v in zip(ColumnNames, MyData)})
Output:
>>> MyDf
   Col1 Col2
0     1  Foo
1     2  Bar
2     3  Baz
Details about the transformation:
>>> list(zip(*MyData))
[(1, 'Foo'), (2, 'Bar'), (3, 'Baz')]
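As a quick check, the sketch below builds the frame both ways and compares the results. Here `dict(zip(ColumnNames, MyData))` is just a shorter spelling of the dict comprehension, and the names `df_rows` and `df_cols` are made up for illustration.

```python
import pandas as pd

ColumnNames = ['Col1', 'Col2']
MyData = [[1, 2, 3], ['Foo', 'Bar', 'Baz']]

# Route 1: turn the column lists into row tuples, then name the columns.
df_rows = pd.DataFrame(list(zip(*MyData)), columns=ColumnNames)

# Route 2: map each column name to its list of values.
df_cols = pd.DataFrame(dict(zip(ColumnNames, MyData)))

# Both routes should describe the same frame.
print(df_rows.equals(df_cols))
```

One difference worth knowing: `zip` stops at the shortest input, so if the column lists have unequal lengths the first route silently drops the extra values, whereas the dict route raises a `ValueError`.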
CodePudding user response:
You can create the DataFrame with your column names as the index, then transpose:
ColumnNames = ['Col1', 'Col2']
MyData = [[1, 2, 3], ['Foo', 'Bar', 'Baz']]
df = pd.DataFrame(MyData, index=ColumnNames).T
Output:
  Col1 Col2
0    1  Foo
1    2  Bar
2    3  Baz
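One thing to watch with the transpose approach: each row of the intermediate frame mixes numbers and strings, so the transposed columns may come out with `object` dtype rather than a numeric one. A minimal sketch of checking this and, if needed, correcting it with `infer_objects()`:

```python
import pandas as pd

ColumnNames = ['Col1', 'Col2']
MyData = [[1, 2, 3], ['Foo', 'Bar', 'Baz']]

df = pd.DataFrame(MyData, index=ColumnNames).T
print(df.dtypes)  # Col1 is likely object here, not an integer dtype

# Ask pandas to re-infer a more specific dtype for each object column.
df = df.infer_objects()
print(df.dtypes)
```

The zip and dict approaches do not have this issue, since each column is built from its own list of values.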