Why does strip() not remove all whitespace in Python?

Time:10-24

I have this code:

with open("db.csv", "r") as file:
    for row in file:
        row = row.strip()
        print("Stripped:", row)
        task = tuple(row.split(";"))
        print("Splitted:", task)

It opens this CSV file:

asdf; [(datetime.datetime(2021, 10, 23, 14, 8, 8, 198986), 'asdf')]; 2012-12-12 12:12:00; 0
adsf; [(datetime.datetime(2021, 10, 23, 14, 45, 30, 806811), 'asdf')]; None; 0

The output is this:

Stripped: asdf; [(datetime.datetime(2021, 10, 23, 14, 8, 8, 198986), 'asdf')]; 2012-12-12 12:12:00; 0
Splitted: ('asdf', " [(datetime.datetime(2021, 10, 23, 14, 8, 8, 198986), 'asdf')]", ' 2012-12-12 12:12:00', ' 0')
Stripped: adsf; [(datetime.datetime(2021, 10, 23, 14, 45, 30, 806811), 'asdf')]; None; 0
Splitted: ('adsf', " [(datetime.datetime(2021, 10, 23, 14, 45, 30, 806811), 'asdf')]", ' None', ' 0')

Why does row.strip() remove only the trailing "\n", but not the spaces at the beginning of each item in the split list?

EDIT: I added row = row.replace(" ", "") before splitting, which works as a workaround, but I would still like to know why strip() did not work as I expected.

CodePudding user response:

In your code, you called strip() on the full line, so if your string were:

" a b c "

it would turn into:

"a b c"

Only the spaces at the edges of the string are removed; the ones in the middle are left alone.

So, to remove the spaces around each item, you have to split first and then call strip() on every item, with a for loop or a list comprehension such as:

task = tuple([item.strip() for item in row.split(";")])
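
To make the difference visible, here is a minimal sketch that compares both approaches. The sample row is a shortened stand-in made up for illustration, not the actual content of db.csv.

```python
# Shortened sample row, assumed for illustration only.
row = "asdf; None; 0\n"

# strip() on the whole line removes only leading/trailing whitespace (here the "\n").
stripped = row.strip()

# Splitting afterwards keeps the space that follows each semicolon.
naive = tuple(stripped.split(";"))

# Stripping every item after the split removes those spaces as well.
task = tuple(item.strip() for item in row.split(";"))

print(naive)  # ('asdf', ' None', ' 0')
print(task)   # ('asdf', 'None', '0')
```

Since strip() is applied to each item, the separate row.strip() call is no longer needed: the trailing newline is stripped from the last item.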

CodePudding user response:

The reason is that strip() is defined that way:

Return a copy of the string with the leading and trailing characters removed.

Source: Python docs, emphasis mine.

How should strip() know that you will split the string afterwards and don't want the spaces that end up at the start of each item? Therefore, you need to split() first and then call strip() on all items.

In contrast to your workaround:

Return a copy of the string with all occurrences of substring old replaced by new.

Source: Python docs, emphasis mine.
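
One consequence of "all occurrences" is worth checking against the data in the question: the date field itself contains a space. The following sketch, which uses a shortened row built from that field, suggests that the replace() workaround also removes the space inside the timestamp, while a per-item strip() leaves it alone.

```python
# Shortened row; the date field is taken from the sample CSV in the question.
row = "asdf; 2012-12-12 12:12:00; 0"

# replace() removes every space, including the one inside the timestamp.
with_replace = tuple(row.replace(" ", "").split(";"))

# Per-item strip() only touches the edges of each field.
with_strip = tuple(item.strip() for item in row.split(";"))

print(with_replace)  # ('asdf', '2012-12-1212:12:00', '0')
print(with_strip)    # ('asdf', '2012-12-12 12:12:00', '0')
```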

A note of warning: your current CSV data might be so simple that you can just split it at the semicolons. However, it is possible to have the semicolon as part of the data, like

data1;"data;2";data3

Make it a habit to use the right tool for the job, in this case a CSV reader. It will handle the special cases for you.
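
As a rough sketch of what that could look like with the standard library's csv module: the example below reads from an in-memory string instead of a real file, and the sample line is the one from above with a space added after each semicolon (an assumption, to mirror the layout of the file in the question). The skipinitialspace option tells the reader to ignore whitespace directly after a delimiter.

```python
import csv
import io

# In-memory stand-in for a file; the second field contains a semicolon.
sample = io.StringIO('data1; "data;2"; data3\n')

# delimiter=";" matches the file format; skipinitialspace=True drops
# the whitespace that immediately follows each delimiter.
reader = csv.reader(sample, delimiter=";", skipinitialspace=True)
rows = [tuple(row) for row in reader]

print(rows)  # [('data1', 'data;2', 'data3')]
```

With a real file, the csv documentation recommends opening it with newline="", for example open("db.csv", newline=""). Note that skipinitialspace only affects whitespace after a delimiter; trailing whitespace inside a field would still need strip().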
