Home > Blockchain >  How to trim a string inside a query string passed to the Pandas query function?
How to trim a string inside a query string passed to the Pandas query function?

Time:11-04

I need to trim the 'My Pet' column inside the query string. How can that be done?

# Note that one cat literal has a trailing space.
testDF = pd.DataFrame([{"My Pet":"cat ",   "Cost":"$10 ", "Weight":"10 pounds", "Name":"Violet"},
                       {"My Pet":"cat",    "Cost":"$10 ", "Weight":"15 pounds", "Name":"Sirius"},
                       {"My Pet":"dog",    "Cost":"$0 ",  "Weight":"50 pounds", "Name":"Sam"},
                       {"My Pet":"turtle", "Cost":"$5 ",  "Weight":"20 ounces", "Name":"Tommy"},
                      ])
# We try to filter on cat. 
catDF = testDF.query("`My Pet` == 'cat'")  # This yields only one row because one cat cell has a trailing space
catDF.head()

Output is only one row but I would like to get both rows with cat in them

    My Pet  Cost    Weight  Name
1   cat     $10     15 pounds   Sirius

CodePudding user response:

Use Series.str.strip, but need engine="python":

catDF = testDF.query("`My Pet`.str.strip() == 'cat'", engine="python")
print (catDF)
  My Pet  Cost     Weight    Name
0   cat   $10   10 pounds  Violet
1    cat  $10   15 pounds  Sirius

CodePudding user response:

You can access string methods on string series. This means that you can use the Series.str.strip function to clean remove whitespace.

testDF["My Pet"] = testDF["My Pet"].str.strip()

If you want to only clean trailing spaces then use rstrip (and lstrip for leading spaces)

You can optionally specify the set of characters you want to remove explicitly. E.g. to remove only spaces and tabs:

testDF["My Pet"] = testDF["My Pet"].str.strip(" \t")
  • Related