I am a beginner in Python and I have a pandas DataFrame that I want to change as follows:
10% of the rows in column "review" must be changed by adding a prefix, and the other 90% must stay unchanged.
To change all rows of "review" I can use:
X_test["modified_review"] = " abc " + X_test["review"]
And to select 10% of the rows I can use:
X_test.sample(frac=0.1)
But I don't know how to combine the two snippets so that only the selected rows are modified.
Please help!
CodePudding user response:
You can sample 10% of the index labels at random and update only the corresponding rows:
df["modified_review"] = df["review"]
rand_ids = df.index.to_series().sample(frac=0.1)
df.loc[rand_ids, "modified_review"] = " abc " + df.loc[rand_ids, "modified_review"]
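
For reference, here is a minimal self-contained sketch of the same idea using the X_test name from the question. The sample data and the random_state value are assumptions added only so the example runs and is reproducible; they are not part of the original question.

```python
import pandas as pd

# Hypothetical data standing in for the real X_test (assumption for illustration)
X_test = pd.DataFrame({"review": [f"review {i}" for i in range(20)]})

# Start with an unchanged copy of the column
X_test["modified_review"] = X_test["review"]

# Pick the index labels of a random 10% of the rows.
# random_state is optional; it is set here only to make the run repeatable.
rand_ids = X_test.sample(frac=0.1, random_state=0).index

# Add the prefix to the sampled rows only
X_test.loc[rand_ids, "modified_review"] = " abc " + X_test.loc[rand_ids, "review"]

# Boolean mask of the rows that were actually changed
changed = X_test["modified_review"] != X_test["review"]
print(changed.sum(), "of", len(X_test), "rows were prefixed")
```

This relies on the index labels being unique. If the DataFrame has duplicate index labels, .loc will update every row that shares a sampled label, so calling reset_index(drop=True) first may be safer.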