No such file or directory while web scraping although the folder does exist

Time: 10-07

I want to scrape all the .csv files from the URL list in my code, like this:

import os

import requests
from bs4 import BeautifulSoup

os.makedirs("Project Data ISPU SPKU DKI JAKARTA 2010 - 2021", exist_ok=True)

change_directory = r"C:\Users\EVOSYS\Documents\PROJECT-ISPU-DKI-JAKARTA-main\Project Data ISPU SPKU DKI JAKARTA 2010 - 2021"

os.chdir(change_directory)
print("Current Working directory has been changed to :", os.getcwd())

URLS = [
        'https://data.jakarta.go.id/dataset/indeks-standar-pencemaran-udara-ispu-tahun-2020',
        'https://data.jakarta.go.id/dataset/indeks-standar-pencemaran-udara-ispu-tahun-2021',
        'https://data.jakarta.go.id/dataset/data-indeks-standar-pencemar-udara-ispu-di-provinsi-dki-jakarta-tahun-2019'
       ]


for url in URLS:
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    folder = url.split("/")[-1]
    os.makedirs(folder, exist_ok=True)

    for a in soup.select('a[href$=".csv"]'):
        file_name = a["href"].split("/")[-1]
        
        path = os.path.join(folder, file_name)

        print("Downloading {} ...".format(path), end=" ")
        
        with open(path, "wb") as f_out:
            f_out.write(requests.get(a["href"]).content)
        print("OK.")

But for the 2019 URL it raises an error:

FileNotFoundError: [Errno 2] No such file or directory: 'data-indeks-standar-pencemar-udara-ispu-di-provinsi-dki-jakarta-tahun-2019\\Indeks-Standar-Pencemar-Udara-di-Provinsi-DKI-Jakarta-Bulan-Januari-Tahun-2019.csv'

I have already checked that the folder for the 2019 data exists, but it still shows an error saying the folder does not exist. All the URLs use the same tag (href) to get the .csv files.
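For debugging, it can help to print the absolute path the script is about to write to, together with its length (a minimal sketch; the helper name `describe_target` is hypothetical and not part of the original code). On Windows, an absolute path longer than about 260 characters can trigger `FileNotFoundError` even when every folder in it exists:

```python
import os

def describe_target(folder: str, file_name: str) -> str:
    # Hypothetical helper: build the download path and report it.
    path = os.path.join(folder, file_name)
    abs_path = os.path.abspath(path)
    # Windows' legacy MAX_PATH limit (~260 chars) can make open() fail
    # with [Errno 2] even though the folder itself exists.
    print("target: {} ({} chars)".format(abs_path, len(abs_path)))
    return abs_path

describe_target(
    "data-indeks-standar-pencemar-udara-ispu-di-provinsi-dki-jakarta-tahun-2019",
    "Indeks-Standar-Pencemar-Udara-di-Provinsi-DKI-Jakarta-Bulan-Januari-Tahun-2019.csv",
)
```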

CodePudding user response:

You may need to run your program as admin (in the terminal, open Python with elevated privileges, e.g. sudo on Unix-like systems).

You likely need to run it as an admin, as the code seems to be writing under the C:\ drive. Furthermore, I think you should take the username out of the file path and edit this discussion.
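Independent of permissions, the `os.chdir` call can be avoided entirely by building absolute paths up front (a stdlib-only sketch, not the accepted fix; the `target_path` helper is hypothetical):

```python
from pathlib import Path

def target_path(base: Path, dataset_url: str, file_name: str) -> Path:
    # Hypothetical helper: derive the per-dataset folder from the URL,
    # create it if needed, and return the full download path --
    # all without changing the working directory.
    folder = base / dataset_url.split("/")[-1]
    folder.mkdir(parents=True, exist_ok=True)
    return folder / file_name

base = Path.cwd() / "Project Data ISPU SPKU DKI JAKARTA 2010 - 2021"
p = target_path(
    base,
    "https://data.jakarta.go.id/dataset/indeks-standar-pencemaran-udara-ispu-tahun-2020",
    "example.csv",
)
print(p)
```

Each file is then opened via its full path, so the loop no longer depends on what the current working directory happens to be.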
