adding a new column to existing dataframe and fill with numpy array [duplicate]

Time:09-17

This question is probably very simple, but I am having trouble creating a new column in a DataFrame and filling it with a NumPy array. I have an array, e.g. [0,0,0,1,0,1,1], and a DataFrame with the same number of rows as the array has elements. I want to add a column, and I have been doing this:

df['new_col'] = array

However, I get the following warning:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

I tried df.loc[:,'new_col'] = array but got the same warning. I also tried:

df.loc['new_col'] = pd.Series(array, index = df.index)

based on an answer to a different user's question. Does anyone know a "better" way to code this? Or should I just ignore the warning messages?

CodePudding user response:

Code from https://www.geeksforgeeks.org/adding-new-column-to-existing-dataframe-in-pandas/

# Import pandas package
import pandas as pd

# Define a dictionary containing data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2]}

# Convert the dictionary into a DataFrame
original_df = pd.DataFrame(data)

# Use 'Qualification' as the column name and assign the list to it
altered_df = original_df.assign(Qualification=['Msc', 'MA', 'Msc', 'Msc'])

# Observe the result
altered_df
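Applied to the situation in the question, the same assign() approach might look like the sketch below. The DataFrame contents and the "value" column are made up for illustration; only the array values and the new_col name come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the asker's DataFrame
df = pd.DataFrame({"value": [10, 20, 30, 40, 50, 60, 70]})

# The array from the question, one element per row of df
array = np.array([0, 0, 0, 1, 0, 1, 1])

# assign() returns a new DataFrame with the extra column;
# df itself is left unchanged
df_with_col = df.assign(new_col=array)

print(df_with_col)
```

Because assign() builds a new object rather than writing into df, it does not modify whatever df was originally sliced from.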

CodePudding user response:

A DataFrame column can be thought of as a dictionary entry, with the column name as the key and a list of values as the value, so one option is to pass a list rather than a NumPy array.

Try this using the tolist() method on the numpy array:

df['new_col'] = array.tolist()
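For comparison, here is a minimal sketch that assigns both the array and its tolist() version and checks that the resulting columns hold the same values. It also assumes, as the warning text suggests, that df was created by slicing another DataFrame. In that case taking an explicit .copy() of the slice before adding columns is a common way to avoid the warning. The names parent, from_array and from_list are hypothetical and not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical parent DataFrame that df is sliced from
parent = pd.DataFrame({"value": [1, 2, 3, 4, 5, 6, 7, 8, 9]})

# An explicit copy makes df an independent DataFrame,
# so adding columns to it is unambiguous
df = parent[parent["value"] <= 7].copy()

array = np.array([0, 0, 0, 1, 0, 1, 1])

# Assign the NumPy array directly, and its list form for comparison
df["from_array"] = array
df["from_list"] = array.tolist()

# Element-wise comparison of the two new columns
same = (df["from_array"] == df["from_list"]).all()
print(same)
print(df)
```

If the two columns match, converting to a list changes nothing about the stored values, which suggests the warning depends on how df was created rather than on the type of the value being assigned.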
