Initialize dataframe without column name

Time:09-16

I have data that looks like this.

             state sex salary
jordan        CA    m    100
lebron        NY    m    200

There are 4 columns, but the first one does not have a column name. The other 3 columns are state, sex, and salary. How do I initialize a DataFrame with the above data?

I tried the following.

import pandas as pd
data = [['jordan','CA','m',100], ['lebron','NY','m',200]]
df = pd.DataFrame(data, columns = ['','state','sex','Age'])  

When I run df.columns, I see

Index(['', 'state', 'sex', 'Age'], dtype='object')

However, I expect df.columns to return Index(['state', 'sex', 'Age'], dtype='object').

So I am wondering how I can initialize the DataFrame so that the column holding the names jordan and lebron is not actually a column.

CodePudding user response:

import pandas as pd
data = [['CA','m',100], ['NY','m',200]]

df = pd.DataFrame(data,columns= ['state','sex','Age'], index=['jordan', 'lebron'])

Or you can convert your existing DataFrame as below:

import pandas as pd
data = [['jordan','CA','m',100], ['lebron','NY','m',200]]
df = pd.DataFrame(data, columns = ['','state','sex','Age']) 

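# Move the unnamed column into the index; set_index removes it from the columns
df = df.set_index('')
df.index.name = None  # optional: clear the leftover '' label on the index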
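
As a quick check, here is a minimal sketch that builds the frame both ways and prints the resulting labels. The variable names (rows, direct, converted) and the placeholder column name 'name' are only for illustration, and the last column is called salary to match the table in the question rather than the 'Age' label used in the snippets.

```python
import pandas as pd

rows = [['jordan', 'CA', 'm', 100], ['lebron', 'NY', 'm', 200]]

# Approach 1: pass the names as the index when constructing the frame.
direct = pd.DataFrame(
    [row[1:] for row in rows],
    columns=['state', 'sex', 'salary'],
    index=[row[0] for row in rows],
)

# Approach 2: build with a placeholder column, then move it into the index.
converted = pd.DataFrame(rows, columns=['name', 'state', 'sex', 'salary'])
converted = converted.set_index('name')
converted.index.name = None  # leave the index unlabeled, as in approach 1

# In both frames the names are row labels, not a column.
print(list(direct.columns))
print(list(converted.columns))
print(list(converted.index))
```

With the names in the index, a single value can be looked up by label, for example direct.loc['lebron', 'salary'].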

CodePudding user response:

To add the case of loading from a CSV file: you can use index_col to specify which column to use as the index.

Assume the data is in a file named temp.csv that looks like this:

,state,sex,salary
jordan,CA,m,100
lebron,NY,m,200

You can read in the data with:

import pandas as pd
df = pd.read_csv("temp.csv", index_col=0)

Then you get:

df.index # Index(['jordan', 'lebron'], dtype='object')
df.columns # Index(['state', 'sex', 'salary'], dtype='object')
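
If you want to try this without creating temp.csv, the sketch below feeds the same text to read_csv through io.StringIO. The csv_text variable is just a stand-in for the file contents.

```python
import io

import pandas as pd

# Same contents as temp.csv; the leading comma leaves the first header cell empty.
csv_text = """,state,sex,salary
jordan,CA,m,100
lebron,NY,m,200
"""

# index_col=0 tells read_csv to use the first column as the row labels.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(df.index.tolist())
print(df.columns.tolist())
print(df.loc['jordan', 'salary'])
```

Because the names end up in the index, df.loc['jordan'] returns that person's row.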
