I have two NumPy arrays and a DataFrame, as given below:
import numpy as np
import pandas as pd
val = np.array([0.501, 0.32])
values = np.arange(24).reshape((2, 3, 4))
input_df = pd.DataFrame(columns=['colname_' + str(i) for i in range(4)])
I would like to:
a) Create a new dataframe (dummy) with 3 columns: ROW_ID, FEATURE NAME, and Contribution.
b) Populate the dummy dataframe using the np.array values above and the column names from input_df.
c) Use the input_df column names under the Feature Name column.
d) Populate val[0] as a contribution in the dummy dataframe, and also use each element of values[0][1] to populate the Contribution column.
I tried the code below:
pd.DataFrame({
    "Feature Name": ["Base value"] + [f"{col}" for col in input_df.columns.tolist()],
    "Contribution": (val[0].tolist()) + list(values[0][1])
})
But I get this error message:
TypeError: unsupported operand type(s) for +: 'float' and 'list'
I also receive another error:
ValueError: All arrays must be of the same length
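I expect the output to contain a "Base value" row holding val[0], followed by one row per input_df column holding the matching element of values[0][1].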
CodePudding user response:
Try:
pd.DataFrame({
    "Feature Name": ["Base value"] + [f"{col}" for col in input_df.columns.tolist()],
    "Contribution": (val[:1].tolist()) + list(values[0][1])
    #                   ^^^^
})
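val[0] is a scalar, and it stays a scalar even after .tolist(), so it cannot be concatenated with a list using +. The slice val[:1] is a one-element array, so val[:1].tolist() returns a one-element list instead.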
>>> type(val[0].tolist())
<class 'float'>
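
Putting it together, here is a minimal runnable sketch using the sample data from the question. The ROW_ID column is an assumption: the question asks for it but does not say what it should contain, so the sketch fills it with 0 to indicate that every row comes from values[0]. Adjust that to whatever identifier your real data uses.

```python
import numpy as np
import pandas as pd

val = np.array([0.501, 0.32])
values = np.arange(24).reshape((2, 3, 4))
input_df = pd.DataFrame(columns=["colname_" + str(i) for i in range(4)])

# Slicing keeps the array dimension, so .tolist() returns a list, not a float.
base = val[:1].tolist()                # [0.501]
contributions = values[0][1].tolist()  # [4, 5, 6, 7]

# Both columns end up with 1 + 4 = 5 entries, so the lengths match.
dummy = pd.DataFrame({
    "Feature Name": ["Base value"] + input_df.columns.tolist(),
    "Contribution": base + contributions,
})

# Assumption: ROW_ID is a constant identifier for the sample (index 0 here).
dummy.insert(0, "ROW_ID", 0)

print(dummy)
```

If "Feature Name" and "Contribution" end up with different lengths (for example, when the real dataframe has more columns than values[0][1] has elements), pandas raises the "All arrays must be of the same length" error, so it is worth comparing len(input_df.columns) with len(values[0][1]) first.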