The .replace() method doesn't work during the process of replacing blank spaces of a column while I'm creating an .xlsx file from a DataFrame in Pandas. I also tried .str.strip(), it deletes the blank spaces but it also deletes all the cells of the column. I also used regex=True into the .replace() method but it still doesn't work. Am I doing something wrong? here's the code I'm using: import pandas as pd
from openpyxl import Workbook
book = Workbook()
operacional_1100 = book.active
maestro = pd.read_excel("2021 Gastos Ortodontik.xlsx", sheet_name="MAESTRO TR")
df_ordenar = maestro.iloc[:, [0,1,2,3,4]]
df_ordenar2 = df_ordenar['Monto'].replace(' ', '')
escrito = pd.ExcelWriter('prueba.xlsx')
df_ordenar2.to_excel(escrito)
escrito.save()
CodePudding user response:
df_ordenar2 = df_ordenar['Monto'].str.replace(' ', '')
CodePudding user response:
If you replace " " (whitespace) with "" (empty string) you would definitely end up with a blank cell. So it's recommended to use NaN
instead.
I tried creating a dummy dataset with white spaces and empty strings and doing the process below helped me to replace them with an NaN
value.
I can make NaN
values using the NumPy module.
df_dict = {"First": [" ", 2, 3, 4], "Second": [1, 2, "", 4]}
test_df = pd.DataFrame(df_dict)
for column in test_df.columns:
test_df[column].replace(to_replace = [" ", ""], value = numpy.nan, inplace = True)
test_df