I have a Pandas dataframe df:
import pandas as pd

foo = {
'Code' : [200, 101, 308, 393],
'City' : ['New York', 'Los Angeles', 'Miami', 'Houston'],
'State' : ['New York', 'California', 'Florida', 'Texas'],
'Country' : ['United States', 'United States', 'United States', 'United States'],
'Sales' : [100, 200, 300, 400]
}
df = pd.DataFrame(foo)
df
Code City State Country Sales
0 200 New York New York United States 100
1 101 Los Angeles California United States 200
2 308 Miami Florida United States 300
3 393 Houston Texas United States 400
To get the data types, I call:
df.dtypes
Code int64
City object
State object
Country object
Sales int64
dtype: object
I would like to convert the names of these data types to different names so that they can be used in a database schema. To do so, I use the following:
new_types = df.dtypes.map({'int64': 'int', 'object': 'text', 'float64': 'int'})
This returns:
new_types
Code NaN
City NaN
State NaN
Country NaN
Sales NaN
dtype: object
What is causing the NaN values when converting using this approach? Is there a more elegant way to do this conversion?
Thanks!
CodePudding user response:
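df.dtypes returns a Series where each value is a numpy.dtype object, not a string. The keys in your dictionary are strings, so none of them match and map returns NaN for every column. To get the dtype names as strings and map them, cast the values to strings with .astype: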
dt = df.dtypes
# Confirm the type of these values
print(type(dt.iloc[0]))
# Result (exact repr depends on the NumPy version):
# <class 'numpy.dtype[int64]'>
new_types = dt.astype(str).map({'int64': 'int',
'object': 'text',
'float64': 'int'})
print(new_types)
# Result:
# Code int
# City text
# State text
# Country text
# Sales int
# dtype: object
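To see the mismatch directly, here is a minimal sketch that checks the type of the values before and after the cast. The small frame and the names values_are_dtypes, values_are_strings, and as_text are made up for illustration; City is forced to object dtype so the result does not depend on the pandas version's default string dtype.

```python
import numpy as np
import pandas as pd

# Small stand-in frame (illustrative, not the original data).
df = pd.DataFrame({
    'Code': [200, 101],
    'City': pd.Series(['New York', 'Miami'], dtype=object),
    'Sales': [100.0, 200.0],
})
mapping = {'int64': 'int', 'object': 'text', 'float64': 'int'}

raw = df.dtypes
# Each value is a dtype object, not a string, so string keys never match.
values_are_dtypes = all(isinstance(v, np.dtype) for v in raw)
values_are_strings = any(isinstance(v, str) for v in raw)

# After the cast, the values are plain strings and the lookup succeeds.
as_text = raw.astype(str)
new_types = as_text.map(mapping)

print(values_are_dtypes, values_are_strings)
print(new_types.to_dict())
```

The same check on your own frame should tell you quickly whether a dictionary keyed by strings can match at all.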
CodePudding user response:
I solved this by casting the types to str (which I should have done to begin with!):
types = df.dtypes.astype('str')
new_types = types.map({'int64': 'int', 'object': 'text', 'float64': 'int'})
Code int
City text
State text
Country text
Sales int
dtype: object
If there is a more elegant way to do this, I'm all ears. Thanks!
CodePudding user response:
You can look up each dtype's name attribute instead:
d = {'int64': 'int', 'object': 'text', 'float64': 'int'}
df.dtypes.map(lambda x : d.get(x.name))
Out[62]:
Code int
City text
State text
Country text
Sales int
dtype: object
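One thing to watch with any of these approaches: a dtype that is missing from the dictionary (for example a datetime column) comes back as None or NaN. A possible sketch, assuming you would rather fall back to a default type, is below. The helper map_dtypes and its default parameter are hypothetical names, not part of pandas.

```python
import pandas as pd

def map_dtypes(df, mapping, default='text'):
    """Return {column: mapped type}, using `default` for unmapped dtypes.

    Hypothetical helper for illustration only.
    """
    return {col: mapping.get(dtype.name, default)
            for col, dtype in df.dtypes.items()}

# Illustrative frame with a datetime column that the mapping does not cover.
df = pd.DataFrame({
    'Code': [200, 101],
    'Sales': [1.5, 2.5],
    'When': pd.to_datetime(['2020-01-01', '2020-01-02']),
})
mapping = {'int64': 'int', 'object': 'text', 'float64': 'int'}

schema_types = map_dtypes(df, mapping)
print(schema_types)
```

Returning a plain dict keeps the result easy to feed into whatever builds the schema. Whether 'text' is a sensible fallback depends on the target database.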