I am working with a document where the values in column A are numbers. When I import it, Python reads the column as a dtype: object, however, I cannot transform the column to Integer of Float because there are some values that are understood as datetime.datetime and not object data type.
For example, the number 19.1 when imported in Python, it is read as datetime.datetime(1900, 1, 19, 0, 0)
.
I already spotted those cases where the data type is datetime.datetime and I know how to find them using the following code:
data.loc[data['A'] == datetime.datetime(1900, 1, 19, 0, 0)]
Nevertheless, I don't know how to access those specific cases to either delete these wrong values or change them for other values, in order to transform later the entire column to dtype: Int.
I tried to do so by using the following codes, but none worked:
data['A']=data['A'].replace(datetime.datetime(1900, 1, 19, 0, 0),'0')
data['A']=data['A'].replace(datetime(1900, 1, 19, 0, 0),'0').
This is the data frame in the original Excel file:
a
1 10
2 5
3 19.1
4 15
5 3.8
column dtype: number
This is the data frame imported by python using Pandas:
a
1 10
2 5
3 1900-01-19 00:00:00
4 15
5 2021-08-03 00:00:00
column dtype: object
The two cases (3 and 5) data type:
[datetime.datetime(1900, 1, 19, 0, 0),
datetime.datetime(2021, 8, 3, 0, 0),]
CodePudding user response:
You can use drop to remove and where to replace as in:
import pandas as pd
import datetime
df = pd.DataFrame.from_dict({'a': [1,2,3,datetime.datetime(1900, 1, 19, 0, 0),4,5,datetime.datetime(1900, 1, 19, 0, 0),6]})
print("Original:")
print(df)
df_drop = df.drop(df.loc[df['a'] == datetime.datetime(1900, 1, 19, 0, 0)].index)
print("Rows dropped:")
print(df_drop)
df_replace = df.copy()
df_replace['a'] = df_replace['a'].where(df_replace['a']!=datetime.datetime(1900, 1, 19, 0, 0),-1)
print("Rows replaced:")
print(df_replace)
which results in:
Original:
a
0 1
1 2
2 3
3 1900-01-19 00:00:00
4 4
5 5
6 1900-01-19 00:00:00
7 6
Rows dropped:
a
0 1
1 2
2 3
4 4
5 5
7 6
Rows replaced:
a
0 1
1 2
2 3
3 -1
4 4
5 5
6 -1
7 6