Add suffix on duplicates in pandas dataframe Python

Time:11-16

I am writing a script to download images. I'm reading an Excel file into a pandas dataframe: column A holds the URL links (url) and column B holds the names (name).

Each downloaded image is saved under its name, for example "A.jpeg". Column B (name) contains duplicates, and in that case I would like to add a suffix to the image name, so the output would be A.jpeg, A-1.jpeg, ...

import requests
import pandas as pd


df = pd.read_excel(r'C:\Users\exdata1.xlsx')

for index, row in df.iterrows():
    url = row['url']
    r = requests.get(url)

    # duplicate names overwrite each other here
    file_name = row['name'] + ".jpeg"

    if r.status_code == 200:
        with open(file_name, "wb") as f:
            f.write(r.content)
            print(file_name)

I have been trying cumcount but can't seem to get it to work. I appreciate any help I can get.

CodePudding user response:

You can try:

import requests
import pandas as pd

df = pd.read_excel(r"C:\Users\exdata1.xlsx")
cnt = {}

for index, row in df.iterrows():
    name = row["name"]
    if name not in cnt:
        cnt[name] = 0
        name = f"{name}.jpeg"
    else:
        cnt[name] += 1
        name = f"{name}-{cnt[name]}.jpeg"

    url = row["url"]
    r = requests.get(url)
    
    if r.status_code == 200:
        with open(name, "wb") as f:
            f.write(r.content)
            print(name)

This will download the files as A.jpeg, A-1.jpeg, A-2.jpeg, ...
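Since the question mentions cumcount, here is a minimal sketch of that route as an alternative to the counter dictionary. It assumes the columns are called "url" and "name" as in the code above. The helper add_file_names, the new "file_name" column and the sample URLs are made up for illustration.

```python
import pandas as pd


def add_file_names(df, name_col="name", ext=".jpeg"):
    """Return a copy of df with a 'file_name' column (hypothetical helper)."""
    # cumcount() numbers the rows within each group of equal names: 0, 1, 2, ...
    occurrence = df.groupby(name_col).cumcount()
    # The first occurrence gets no suffix, later ones get "-1", "-2", ...
    suffix = occurrence.map(lambda k: "" if k == 0 else f"-{k}")
    out = df.copy()
    out["file_name"] = df[name_col].astype(str) + suffix + ext
    return out


# Made-up sample data standing in for the Excel sheet
sample = pd.DataFrame(
    {
        "url": [
            "https://example.com/1",
            "https://example.com/2",
            "https://example.com/3",
            "https://example.com/4",
        ],
        "name": ["A", "B", "A", "A"],
    }
)

result = add_file_names(sample)
print(result["file_name"].tolist())
# ['A.jpeg', 'B.jpeg', 'A-1.jpeg', 'A-2.jpeg']
```

With that column in place, the download loop can open row["file_name"] directly instead of building the name inside the loop.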
