MongoDB - using regex in Python to replace part of a string

Time:09-11

I am working with Python, regex, and MongoDB. A field named 'page_url' stores a single URL such as

https://baike.baidu.hk/item/黃金分割率/24137816

or

https://baike.baidu.hk/item/物理光學/61334055#viewPageContent

I want to update the whole collection so that #viewPageContent is removed from every URL that contains it.

Thanks.

CodePudding user response:

old_url = "https://baike.baidu.hk/item/物理光學/61334055#viewPageContent"

# str.replace returns a new string; old_url itself is unchanged
new_url = old_url.replace("#viewPageContent", "")

print(old_url)
# https://baike.baidu.hk/item/物理光學/61334055#viewPageContent

print(new_url)
# https://baike.baidu.hk/item/物理光學/61334055

CodePudding user response:

import re
a = "https://baike.baidu.hk/item/物理光學/61334055#viewPageContent"
print(re.sub(r"#viewPageContent", '', a))

output: https://baike.baidu.hk/item/物理光學/61334055

Hope I could help you!
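
As a small variation on the regex above, the pattern can be anchored with `$` so that only a trailing #viewPageContent is removed and the same text elsewhere in a URL is left alone. This is only a sketch; the helper name `strip_fragment` is made up for illustration.

```python
import re

# Anchored pattern: matches "#viewPageContent" only at the end of the string
# (or just before a final newline, which is how `$` behaves in Python).
TRAILING_FRAGMENT = re.compile(r"#viewPageContent$")

def strip_fragment(url: str) -> str:
    """Return url without a trailing '#viewPageContent' (hypothetical helper)."""
    return TRAILING_FRAGMENT.sub("", url)

urls = [
    "https://baike.baidu.hk/item/黃金分割率/24137816",
    "https://baike.baidu.hk/item/物理光學/61334055#viewPageContent",
]
cleaned = [strip_fragment(u) for u in urls]
print(cleaned)
```

URLs without the fragment pass through unchanged, so the helper can be applied to every value without checking first.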

CodePudding user response:

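# Assumes `db` is a pymongo Database and the server is MongoDB 4.4+
# ($replaceOne inside an update pipeline needs that version).
db.baike_items.update_many(
  # Match only documents whose URL contains the fragment
  { "page_url": { "$regex": "#viewPageContent" } },
  # Pipeline-style update: rewrite page_url from its own current value
  [{
    "$set": { "page_url": {
      "$replaceOne": { "input": "$page_url", "find": "#viewPageContent", "replacement": "" }
    }}
  }]
)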
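
If the server is older than MongoDB 4.4, `$replaceOne` is not available. One possible fallback, sketched here under assumptions, is to compute the new value in Python and write it back per document. The helper names `build_updates` and `apply_updates` are hypothetical, and `collection` is assumed to be a pymongo collection such as `db.baike_items`.

```python
SUFFIX = "#viewPageContent"

def build_updates(docs):
    """Yield (filter, update) pairs for documents whose page_url contains SUFFIX.

    Hypothetical helper: `docs` is any iterable of dicts with '_id' and 'page_url'.
    """
    for doc in docs:
        url = doc.get("page_url")
        if isinstance(url, str) and SUFFIX in url:
            # Mirror $replaceOne: replace only the first occurrence.
            new_url = url.replace(SUFFIX, "", 1)
            yield {"_id": doc["_id"]}, {"$set": {"page_url": new_url}}

def apply_updates(collection):
    """Assumes `collection` is a pymongo collection (e.g. db.baike_items)."""
    # Fetch only the fields needed; _id is returned by default.
    cursor = collection.find({"page_url": {"$regex": SUFFIX}}, {"page_url": 1})
    for flt, update in build_updates(cursor):
        collection.update_one(flt, update)

# Dry run on plain dicts, no database needed:
sample = [
    {"_id": 1, "page_url": "https://baike.baidu.hk/item/黃金分割率/24137816"},
    {"_id": 2, "page_url": "https://baike.baidu.hk/item/物理光學/61334055#viewPageContent"},
]
updates = list(build_updates(sample))
print(updates)
```

This issues one round trip per matching document, so the server-side pipeline update is preferable whenever the server version allows it.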