Remove everything after a space - Pyspark

Time:05-18

I have a dataframe df as follows:

    A         B
    21k2 b    1
    2412 9    p

Both A and B are strings.

I would like the elements of column A to be trimmed as follows:

    A       B
    21k2    1
    2412    p

Extra thank you points if you can also show how to remove anything before a space.

CodePudding user response:

You can use the split function and the getItem method: split column A on the space and keep the first element (index 0).

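    from pyspark.sql import functions as F

    # Split A on a space and keep the part before the first space
    df = df.select(F.split('A', ' ').getItem(0).alias('A'), 'B')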