I'm trying to drop rows which contain strings that are wrapped in a column. I want to drop all values that contain the strings '[removed]', '[deleted]'. My df looks like this:
Comments
1 The main thing is the price appreciation of the token (this determines the gains or losses more
than anything). Followed by the ecosystem for the liquid staking asset, the more opportunities
and protocols that accept the asset as collateral, the better. Finally, the yield for staking
comes into play.
2 [deleted]
3 [removed]
4 I could be totally wrong, but sounds like destroying an asset and claiming a loss, which I
believe is fraudulent. Like someone else said, get a tax guy - for this year anyway and then
you'll know for sure. Peace of mind has value too.
I have tried df[df["Comments"].str.contains("removed")==False]
But when i try to save the dataframe, it is still not removed.
EDIT: My full code
import pandas as pd
sol2020 = pd.read_csv("Solana_2020_Comments_Time_Adjusted.csv")
sol2021 = pd.read_csv("Solana_2021_Comments_Time_Adjusted.csv")
df = pd.concat([sol2021, sol2020], ignore_index=True, sort=False)
df[df["Comments"].str.contains("deleted")==False]
df[df["Comments"].str.contains("removed")==False]
CodePudding user response:
Try this
I have created a data frame for comments column and used my own comments but it should work for you
import pandas as pd
sample_data = { 'Comments': ['first comment whatever','[deleted]','[removed]','last comments whatever']}
df = pd.DataFrame(sample_data)
data = df[df["Comments"].str.contains("deleted|removed")==False]
print(data)
output I got
Comments
0 first comment whatever
3 last comments whatever
CodePudding user response:
You can do it like this:
new_df = df[~(df['Comments'].str.startswith('[') & df['Comments'].str.endswith(']'))].reset_index(drop=True)
Output:
>>> new_df
Comments
0 The main thing is the price appreciation of th...
3 I could be totally wrong, but sounds like dest...
That will remove all rows where the value of the Comments
column for that row starts with [
and ends with ]
.