Converting data types in a pandas DataFrame

Time:11-25

   House Number  Street        First Name  Surname   Age  Relationship to Head of House  Marital Status  Gender  Occupation                         Infirmity  Religion
0  1             Smith Radial  Grace       Patel     46   Head                           Widowed         Female  Petroleum engineer                 None       Catholic
1  1             Smith Radial  Ian         Nixon     24   Lodger                         Single          Male    Publishing rights manager          None       Christian
2  2             Smith Radial  Frederick   Read      87   Head                           Divorced        Male    Retired TEFL teacher               None       Catholic
3  3             Smith Radial  Daniel      Adams     58   Head                           Divorced        Male    Therapist, music                   None       Catholic
4  3             Smith Radial  Matthew     Hall      13   Grandson                       NaN             Male    Student                            None       NaN
5  3             Smith Radial  Steven      Fletcher  9    Grandson                       NaN             Male    Student                            None       NaN
6  4             Smith Radial  Alison      Jenkins   38   Head                           Single          Female  Physiotherapist                    None       Catholic
7  4             Smith Radial  Kelly       Jenkins   12   Daughter                       NaN             Female  Student                            None       NaN
8  5             Smith Radial  Kim         Browne    69   Head                           Married         Female  Retired Estate manager/land agent  None       Christian
9  5             Smith Radial  Oliver      Browne    69   Husband                        Married         Male    Retired Merchandiser, retail       None       None

I have the dataset shown above. I want to convert its columns from object to integers and strings.

df = pd.read_csv('user-data.csv')
df[['Street','Relationship to Head of House','Marital Status','Gender','Occupation','Infirmity','Religion']] = df[['Street','Relationship to Head of House','Marital Status','Gender','Occupation','Infirmity','Religion']].astype('str') 
df[['House Number','Age']] = df[['House Number','Age']].astype('int') 

I tried two different ways, but the whole dataset was gone after these operations.

df = df['Street'].astype(str)
df = df['Relationship to Head of House'].astype(str)
df = df['Marital Status'].astype(str)
df = df['Gender'].astype(str)
df = df['Occupation'].astype(str)
df = df['Infirmity'].astype(str)
df = df['Religion'].astype(str)
df = df['Gender'].astype(str)

Could you help me convert the columns? Thanks.

I still get the same dtypes:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10610 entries, 0 to 10609
Data columns (total 11 columns):
 #   Column                         Non-Null Count  Dtype 
---  ------                         --------------  ----- 
 0   House Number                   10610 non-null  int64 
 1   Street                         10610 non-null  object
 2   First Name                     10610 non-null  object
 3   Surname                        10610 non-null  object
 4   Age                            10610 non-null  object
 5   Relationship to Head of House  10610 non-null  object
 6   Marital Status                 7995 non-null   object
 7   Gender                         10610 non-null  object
 8   Occupation                     10610 non-null  object
 9   Infirmity                      10610 non-null  object
 10  Religion                       7928 non-null   object
dtypes: int64(1), object(10)
memory usage: 911.9  KB

The columns are still object instead of int or string. Could you help me fix that?

CodePudding user response:

You need df['Street'] on the left side of the assignment, not df. Writing df = df['Street'].astype(str) replaces the whole DataFrame with that single column.

df['Street'] = df['Street'].astype(str)
df['Relationship to Head of House'] = df['Relationship to Head of House'].astype(str)
df['Marital Status'] = df['Marital Status'].astype(str)
df['Gender'] = df['Gender'].astype(str)
df['Occupation'] = df['Occupation'].astype(str)
df['Infirmity'] = df['Infirmity'].astype(str)
df['Religion'] = df['Religion'].astype(str)
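
To make the difference concrete, here is a minimal sketch on a made-up two-row frame (the sample values are only a stand-in for user-data.csv). It also shows the "string" dtype, which may be what you are after: depending on the pandas version, a column converted with astype(str) can still be reported as object by df.info(), whereas astype("string") gives a dedicated string dtype.

```python
import pandas as pd

# Hypothetical two-row sample standing in for user-data.csv.
df = pd.DataFrame({
    "House Number": [1, 2],
    "Street": ["Smith Radial", "Smith Radial"],
    "Marital Status": ["Widowed", None],
})

# Selecting one column returns a Series; assigning that result to the
# name df (as in the question) would throw the other columns away.
lost = df["Street"].astype(str)

# Assigning back to the column keeps the DataFrame intact.
df["Street"] = df["Street"].astype(str)

# The dedicated "string" dtype keeps missing values as missing (pd.NA).
as_string = df["Marital Status"].astype("string")

print(type(lost).__name__)   # the type of a single selected column
print(df.dtypes)
print(as_string.dtype)
```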

or

# Note: this converts every column to str, including House Number and Age
for column in df.columns:
    df[column] = df[column].astype(str)
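
If only some columns should change, DataFrame.astype also accepts a dict mapping column names to dtypes, so the loop is not needed. A small sketch, again on invented sample data (the column subset and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample; the house numbers are stored as text on purpose.
df = pd.DataFrame({
    "House Number": ["1", "2"],
    "Street": ["Smith Radial", "Smith Radial"],
    "Gender": ["Female", "Male"],
})

# One call converts each listed column to its own target dtype
# and returns a new DataFrame.
converted = df.astype({
    "House Number": "int64",
    "Street": "string",
    "Gender": "string",
})

print(converted.dtypes)
```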

or

In pd.read_csv you can pass the dtype parameter, either a single type for all columns or a dict mapping column names to types, e.g. dtype={'Street': str, 'Gender': str}.
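
A rough sketch of the read_csv route follows. The CSV text is invented, and the "unknown" Age value is an assumption: Age showing up as object in your df.info() output suggests that some entries are not plain integers, in which case astype('int') would fail and pd.to_numeric with errors="coerce" is one way around it.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for user-data.csv.
# The "unknown" Age value is invented to mimic a non-numeric entry.
csv_text = """House Number,Street,Age,Religion
1,Smith Radial,46,Catholic
1,Smith Radial,unknown,
2,Smith Radial,87,Catholic
"""

# dtype takes a dict of column name -> dtype.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"House Number": "int64", "Street": "string", "Religion": "string"},
)

# Turn unparseable ages into missing values, then use the nullable
# integer dtype "Int64" so the column can hold both ints and <NA>.
df["Age"] = pd.to_numeric(df["Age"], errors="coerce").astype("Int64")

print(df.dtypes)
```

Whether to coerce bad ages to missing values or to inspect and clean them by hand depends on your data; coercion silently drops whatever text was there.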