I have a 2D array that I want to convert to a pandas DataFrame. I will then display this DataFrame in an Excel spreadsheet.
I created my DataFrame like this: df = pd.DataFrame(twoDArray)
The 2D array I am transforming into a DataFrame has length 8, and I named all the columns using the following code:
df.columns = ["column1", "column2", "column3", "column4", "column5", "column6", "column7", "column8"]
The nested arrays are very long and not always the same length. I want the nested array at each index to become an entire column of the DataFrame, so one row would be lst[0][0], lst[1][0], lst[2][0].
Example: by default, pandas seems to turn each nested list into a row:
lst = [["hello", 1],["World", 3], ["Goodbye" , 5]]
df = pd.DataFrame(lst)
output:
   column1  column2
1  hello    1
2  World    3
3  Goodbye  5
but I want:
lst = [["hello", 1, 2], ["World", 3], ["Goodbye", 5,6,7,"test"]]
df = pd.DataFrame(lst)
output:
   column1  column2  column3
1  hello    World    Goodbye
2  1        3        5
3  2        -        6
4  -        -        7
5  -        -        test
Is this possible to do?
Thanks for the help.
CodePudding user response:
The following code does what you are asking for:
import pandas as pd
lst = [["hello", 1, 2], ["World", 3], ["Goodbye", 5, 6, 7, "test"]]
df = pd.DataFrame(lst).T
This works because .T transposes the DataFrame, which is what you were trying to do: your lists become columns instead of rows, and the positions where a shorter list has no value are filled with NaN.
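As a minimal sketch of the transposition approach, the snippet below also names the columns and swaps the missing values for "-" to resemble the desired output in the question. The column names and the "-" placeholder are taken from the question; the variable name df_filled is only illustrative. Note that pandas may store some of the padded numeric values as floats before the transpose.

```python
import pandas as pd

# Ragged nested lists: each inner list should end up as one column.
lst = [["hello", 1, 2], ["World", 3], ["Goodbye", 5, 6, 7, "test"]]

# pandas pads the shorter lists with missing values when building rows;
# .T then flips rows and columns so each inner list becomes a column.
df = pd.DataFrame(lst).T

# Name the columns after transposing (names follow the question's example).
df.columns = ["column1", "column2", "column3"]

# Optional: show "-" instead of missing values (df_filled is an illustrative name).
df_filled = df.fillna("-")

print(df_filled)
```

From here, something like df_filled.to_excel("output.xlsx", index=False) should write the result to a spreadsheet, assuming an Excel writer engine such as openpyxl is installed; the file name is a placeholder.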