I have a dataframe in which some columns contain floating-point numbers with 6 decimal places, while other columns have only 1 or 2. I want to delete all columns with fewer than 6 decimal places. I tried filling the columns that have fewer than 6 decimals, but this did not turn out well.
CodePudding user response:
Try this:
import numpy as np
import pandas as pd

data_preprocessed1 = pd.DataFrame({
    "Value": [2675.39881, 62.2320980, 9.3409093, 3.343434443],
    "Landed weight": [10.0, 5.0, 10.0, 10.10],
    "Extra": [3.5728, 3.8263, 3.827264, 3.257],
})
def delete_after_n_number(df, column, n):
    """
    Drop a column if any of its values has fewer than n decimal places.

    Parameters
    ----------
    df : pd.DataFrame
        Initial dataframe.
    column : str
        Name of the column to check.
    n : int
        Minimum number of decimal places required to keep the column.

    Returns
    -------
    dataframe : pd.DataFrame
        Copy of df, without `column` if it failed the check.
    """
    lst_col = df.columns.to_list()
    # Count the digits after the decimal point in each value's string form.
    # Using a separate Series avoids adding helper columns to the caller's df.
    decimals = df[column].astype(str).str.split('.', n=1, expand=True)[1].str.len()
    # Remove the column as soon as one value has fewer than n decimals.
    if (decimals < n).any():
        lst_col.remove(column)
    print(lst_col)
    return df[lst_col].copy()
for col in data_preprocessed1.select_dtypes(include=[np.float64]).columns:
    data_preprocessed1 = delete_after_n_number(data_preprocessed1, col, 6)
data_preprocessed1
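A more compact variant of the same idea, as a rough sketch only. The names `decimal_places` and `drop_short_decimal_columns` are made up for illustration, and the small `demo` frame is hypothetical data. I'm assuming, as in the function above, that a column should be dropped as soon as one of its values has fewer than `n` decimals. Note that a float does not store trailing zeros, so `10.10` counts as 1 decimal here too.

```python
from decimal import Decimal

import pandas as pd


def decimal_places(value):
    """Digits after the decimal point in the shortest repr of a float."""
    exponent = Decimal(repr(float(value))).as_tuple().exponent
    # NaN and infinities have a non-integer exponent; count them as 0 decimals.
    return max(0, -exponent) if isinstance(exponent, int) else 0


def drop_short_decimal_columns(df, n):
    """Return df without float64 columns where any value has fewer than n decimals."""
    float_cols = df.select_dtypes(include="float64").columns
    to_drop = [col for col in float_cols if df[col].map(decimal_places).min() < n]
    return df.drop(columns=to_drop)


# Hypothetical example data: "a" has 6 decimals everywhere, "b" does not.
demo = pd.DataFrame({"a": [1.123456, 2.654321], "b": [10.0, 5.5], "c": ["x", "y"]})
result = drop_short_decimal_columns(demo, 6)
print(list(result.columns))
```

Going through `Decimal` instead of splitting the string on `'.'` also copes with values that print in scientific notation (for example `1e-07`), which have no `'.'` to split on.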
CodePudding user response:
This is the output when I run the above code on the preprocessed dataset.