I have a column in my dataframe that contains text like:
Sunny, with a high near 82. Light and variable wind becoming northwest 5 to 7 mph in the afternoon.
but sometimes contains text like:
A 50 percent chance of showers. Partly sunny, with a high near 61.
I want to manipulate it so that only the temperature value (i.e., the 82 or 61) is retained and all other information is removed, so it would become "82" or "61". I cannot do this with a fixed index, since the length of each dataframe entry varies, as does the number of digits in the temperature.
I want to use phrases like "high near", "low near", etc. to parse the string and find the temperature value. Is there a pythonically pleasing way of accomplishing this?
CodePudding user response:
Try this:
df['temperature'] = df['text'].str.extract(r'(?:high|low) near (\d+)')[0]
Output:
>>> df
                                                text  temperature
0  Sunny, with a high near 82. Light and variable...           82
1   A 50 percent chance of showers. Partly sunny,...           61
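For anyone who wants to run this end to end, here is a minimal sketch using the two sample sentences from the question. It assumes the column is named 'text', as in the snippet above, and that the pattern uses `\d+` (one or more digits). The `expand=False` argument and the `pd.to_numeric` step are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question; the column name 'text' is an assumption.
df = pd.DataFrame({
    "text": [
        "Sunny, with a high near 82. Light and variable wind becoming northwest 5 to 7 mph in the afternoon.",
        "A 50 percent chance of showers. Partly sunny, with a high near 61.",
    ]
})

# With a single capture group, expand=False returns a Series instead of a DataFrame.
temps = df["text"].str.extract(r"(?:high|low) near (\d+)", expand=False)

# str.extract yields strings (NaN where nothing matched); convert to numbers if needed.
df["temperature"] = pd.to_numeric(temps)
```

The non-capturing group `(?:high|low)` keeps the match tied to the forecast phrase, so other numbers in the sentence, such as "50 percent" or "5 to 7 mph", are ignored.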
CodePudding user response:
You could use a regex with pandas, such as near (\d+), which matches the digits following "near".
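To see what this shorter pattern does on a single string, here is a rough sketch with the standard `re` module. The helper name `extract_temperature` and the "Mostly cloudy." input are made up for illustration; only the pattern comes from this answer.

```python
import re
from typing import Optional

# Pattern from the answer: one or more digits right after "near ".
NEAR_NUMBER = re.compile(r"near (\d+)")

def extract_temperature(text: str) -> Optional[int]:
    """Return the first number that follows 'near ', or None if there is none."""
    match = NEAR_NUMBER.search(text)
    return int(match.group(1)) if match else None

# Second sample sentence from the question.
print(extract_temperature("A 50 percent chance of showers. Partly sunny, with a high near 61."))
# Hypothetical input with no temperature in it.
print(extract_temperature("Mostly cloudy."))
```

Note that this pattern matches any "near" followed by digits, not only "high near" or "low near", so it is a little looser than spelling out the phrases.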