I have a numpy array, data. Here is its shape:
data.shape
(223,12,437)
I want to make a DataFrame out of this array. The DataFrame should have:
223 rows
1 column, where each element is an np.array with shape (12, 437)
When I run:
pd.DataFrame(data)
I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-55-cd168f76d566> in <module>
----> 1 pd.DataFrame(data)
~/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
676 dtype=dtype,
677 copy=copy,
--> 678 typ=manager,
679 )
680
~/anaconda3/lib/python3.7/site-packages/pandas/core/internals/construction.py in ndarray_to_mgr(values, index, columns, dtype, copy, typ)
302 # by definition an array here
303 # the dtypes will be coerced to a single dtype
--> 304 values = _prep_ndarray(values, copy=copy)
305
306 if dtype is not None and not is_dtype_equal(values.dtype, dtype):
~/anaconda3/lib/python3.7/site-packages/pandas/core/internals/construction.py in _prep_ndarray(values, copy)
553 values = values.reshape((values.shape[0], 1))
554 elif values.ndim != 2:
--> 555 raise ValueError(f"Must pass 2-d input. shape={values.shape}")
556
557 return values
ValueError: Must pass 2-d input. shape=(223, 12, 437)
Why is this? Doesn't the 0th element of the shape tuple, 223, tell pandas how many rows to make in the DataFrame?
What should I do instead?
CodePudding user response:
Add a new axis to turn the array from shape (223, 12, 247) into (223, 1, 12, 247). This example uses 247 for the last dimension; the same approach applies to your 437:
import numpy as np
import pandas as pd
data = np.random.random((223, 12, 247))  # random example data
df = pd.DataFrame.from_records(data[:, None])  # None is the same as np.newaxis
>>> df
0
0 [[0.8545012287346487, 0.094059810377082, 0.470...
1 [[0.6645975722200621, 0.567675394564319, 0.459...
2 [[0.4474745169474814, 0.4823023696009986, 0.15...
3 [[0.31251548689763453, 0.7357607646976804, 0.2...
4 [[0.4522848739922676, 0.042101609272210516, 0....
.. ...
218 [[0.0085917543787426, 0.9542525347386845, 0.37...
219 [[0.03667682481611034, 0.14416094093914922, 0....
220 [[0.05475820484458771, 0.8582654934659678, 0.0...
221 [[0.641864950301045, 0.9591725641855815, 0.103...
222 [[0.5027508533463017, 0.15570208093984983, 0.4...
>>> df.shape
(223, 1)
>>> df.iloc[0, 0].shape
(12, 247)
Note: the DataFrame shape (223, 1) together with the shape of each cell (12, 247) accounts for the full (223, 1, 12, 247) array.
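As for why the original call fails: the traceback shows pandas refusing anything that is not 2-d, so it never gets as far as reading the row count from the first dimension. If you would rather not rely on from_records, here is a rough alternative sketch that fills a 1-d object array by hand and uses it as the single column. The column name "arr" and the random data are assumptions made for illustration; the shape is the one from the question.

```python
import numpy as np
import pandas as pd

# Example array with the shape from the question (random values, illustration only).
data = np.random.random((223, 12, 437))

# Passing the 3-d array directly is rejected, as in the traceback above.
try:
    pd.DataFrame(data)
    direct_ok = True
except ValueError:
    direct_ok = False

# Build a 1-d object array whose elements are the (12, 437) sub-arrays.
cells = np.empty(len(data), dtype=object)
for i, block in enumerate(data):
    cells[i] = block

# "arr" is a hypothetical column name chosen for this sketch.
df = pd.DataFrame({"arr": cells})

# Stack the cells back together to recover the original 3-d array.
restored = np.stack(df["arr"].tolist())
```

Filling the object array element by element keeps numpy from trying to interpret the sub-arrays as extra dimensions. Keep in mind that a column of arrays has object dtype, so most vectorized pandas operations will not work on it; np.stack gets you back to a regular array when you need one.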