Problem while deleting blank spaces of a column of a DataFrame in Pandas-CodePudding

The .replace() method doesn't work during the process of replacing blank spaces of a column while I'm creating an .xlsx file from a DataFrame in Pandas. I also tried .str.strip(), it deletes the blank spaces but it also deletes all the cells of the column. I also used regex=True into the .replace() method but it still doesn't work. Am I doing something wrong? here's the code I'm using: import pandas as pd

from openpyxl import Workbook

book = Workbook()

operacional_1100 = book.active

maestro = pd.read_excel("2021 Gastos Ortodontik.xlsx", sheet_name="MAESTRO TR")

df_ordenar = maestro.iloc[:, [0,1,2,3,4]]

df_ordenar2 = df_ordenar['Monto'].replace(' ', '')

escrito = pd.ExcelWriter('prueba.xlsx')

df_ordenar2.to_excel(escrito)

escrito.save()

CodePudding user response：

df_ordenar2 = df_ordenar['Monto'].str.replace(' ', '')

CodePudding user response：

If you replace " " (whitespace) with "" (empty string) you would definitely end up with a blank cell. So it's recommended to use NaN instead.

I tried creating a dummy dataset with white spaces and empty strings and doing the process below helped me to replace them with an NaN value.

I can make NaN values using the NumPy module.

df_dict = {"First": [" ", 2, 3, 4], "Second": [1, 2, "", 4]}
test_df = pd.DataFrame(df_dict)

for column in test_df.columns:
  test_df[column].replace(to_replace = [" ", ""], value = numpy.nan, inplace = True)

test_df