Python Loop read the same CSV Data

Time:12-07

I need to take the first line of the CSV file, process it, delete it, and then continue with the next line.

I'm trying to build a login system that takes the accounts from the CSV file and logs in with them one by one.

The problem is that every time the loop runs it takes the same account. How can I fix it?


import pandas as pd
import pyperclip
import selenium
import random
from selenium import webdriver
import undetected_chromedriver as uc
from selenium.webdriver.common.by import By
import time
from selenium.webdriver.common.keys import Keys
import names


df = pd.read_csv('/Users/giuseppeleonardi/Downloads/scraping2.csv')
def instagram_login():

    df2=df.at[0,'ID'] #Find the first row id
    pyperclip.copy(df2) #Copy the first row id to the clipboard
    print(pyperclip.paste()) #Print the first row id
    #open the site
    driver.get('https://www.instagram.com/')
    driver.maximize_window() #full screen
    time.sleep(2)
    try:
        consent= driver.find_element(By.XPATH,"/html/body/div[2]/div/div/div/div[2]/div/div/div[1]/div/div[2]/div/div/div/div/div[2]/div/button[2]").click() #click the consent button
    except:
        pass    
    time.sleep(5)
    put_username = driver.find_element(By.NAME,("username")).send_keys(pyperclip.paste()) #enter the username
    df2=df.at[0,'PASSWORD'] #find the password
    pyperclip.copy(df2) #copy the password
    put_password = driver.find_element(By.NAME,("password")).send_keys(pyperclip.paste()) #enter the password
    time.sleep(2)
    login = driver.find_element(By.XPATH,"//div[contains(text(),'Accedi')]").click() #Click login
    time.sleep(6)
    #here is where the first row got deleted and saved on csv
    df= pd.read_csv('/Users/giuseppeleonardi/Downloads/scraping2.csv').drop(0, axis=0)
    df.to_csv(r'/Users/giuseppeleonardi/Downloads/scraping2.csv', index=False)
    

    #this is the loop that takes the same line of the file every time, even though that line is deleted at the end of each run:

for line in len(df):
    instagram_login()
    time.sleep(5)
    driver.delete_all_cookies()


I've googled a lot but cannot figure it out. I've read that a file handle reads the file only once; I need the loop to reload the list every time and take the first value. How can I do it?

Sorry, but I'm still learning.

CodePudding user response:

Read up on local and global variables. You are assigning to df inside a function, which creates a local df and does not change the 'global' df. You either need to return your df from the function or declare it as a global variable at the top of the function.

First option:

df = pd.read_csv('/Users/giuseppeleonardi/Downloads/scraping2.csv')
def instagram_login(df): #take df as a parameter; assigning to it below would otherwise make it an unbound local

    df2=df.at[0,'ID'] #Find the first row id
    .....
    #here is where the first row got deleted and saved on csv
    df= pd.read_csv('/Users/giuseppeleonardi/Downloads/scraping2.csv').drop(0, axis=0).reset_index(drop=True) #reset the index so the next row becomes row 0
    df.to_csv(r'/Users/giuseppeleonardi/Downloads/scraping2.csv', index=False)
    return df


for _ in range(len(df)): #len(df) is an int, so iterate over range(len(df))
    df = instagram_login(df)
    time.sleep(5)
    driver.delete_all_cookies()

Second option:

df = pd.read_csv('/Users/giuseppeleonardi/Downloads/scraping2.csv')
def instagram_login():
    global df #must be declared before df is first used in the function

    df2=df.at[0,'ID'] #Find the first row id
    .....
    #here is where the first row got deleted and saved on csv
    df = pd.read_csv('/Users/giuseppeleonardi/Downloads/scraping2.csv').drop(0, axis=0).reset_index(drop=True) #reset the index so the next row becomes row 0
    df.to_csv(r'/Users/giuseppeleonardi/Downloads/scraping2.csv', index=False)


for _ in range(len(df)): #len(df) is an int, so iterate over range(len(df))
    instagram_login()
    time.sleep(5)
    driver.delete_all_cookies()
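
To see the difference between the two options without pandas or Selenium, here is a minimal sketch of the scoping rule itself. The names counter, bump_local, bump_global and bump_returned are made up for illustration and are not part of the question's code.

```python
counter = 0  # module-level ("global") name


def bump_local():
    # Assigning to `counter` here creates a new local variable;
    # the module-level `counter` is left untouched.
    counter = 100
    return counter


def bump_global():
    global counter  # must come before any use of the name in this function
    counter += 1


def bump_returned(value):
    # No global state: take the value in, hand the new value back.
    return value + 1


bump_local()
after_local = counter             # the global was not changed by bump_local

bump_global()
after_global = counter            # the global was changed by bump_global

counter = bump_returned(counter)  # the caller stores the returned value
```

Returning the value keeps the function free of hidden state, which is usually easier to follow than a global.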

CodePudding user response:

Assigning to df inside the function doesn't change the df outside it. So pass the DataFrame in, return the updated one, and assign it back to the outside variable.

data_frame= pd.read_csv('/Users/giuseppeleonardi/Downloads/scraping2.csv')
def instagram_login(df):
    ......
    #here is where the first row got deleted and saved on csv
    df= pd.read_csv('/Users/giuseppeleonardi/Downloads/scraping2.csv').drop(0, axis=0).reset_index(drop=True) #reset the index so the next row becomes row 0
    df.to_csv(r'/Users/giuseppeleonardi/Downloads/scraping2.csv', index=False)
    return df


#the loop now passes the updated DataFrame back in on every iteration:

for _ in range(len(data_frame)): #len() returns an int, so iterate over range(len(...))
    data_frame = instagram_login(data_frame)
    time.sleep(5)
    driver.delete_all_cookies()
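
As a rough sketch of the same "take the first row, then delete it" idea with the browser part left out: the helper pop_first_row, the temporary file, and the sample accounts below are all made up for illustration. The sketch assumes the CSV has ID and PASSWORD columns as in the question, and it needs Python 3.8+ for the := operator.

```python
import os
import tempfile

import pandas as pd


def pop_first_row(csv_path):
    """Return the first data row as a dict and rewrite the CSV without it.

    Returns None when the file has no data rows left.
    """
    df = pd.read_csv(csv_path)
    if df.empty:
        return None
    first = df.iloc[0].to_dict()               # positional access, independent of index labels
    df.iloc[1:].to_csv(csv_path, index=False)  # save everything except the first row
    return first


# Demo on a throwaway file with made-up accounts.
demo_path = os.path.join(tempfile.mkdtemp(), "accounts.csv")
pd.DataFrame(
    {"ID": ["user_a", "user_b", "user_c"], "PASSWORD": ["pw1", "pw2", "pw3"]}
).to_csv(demo_path, index=False)

processed = []
while (row := pop_first_row(demo_path)) is not None:
    processed.append(row["ID"])  # stand-in for the real login step
```

Because the file is re-read on every call, no DataFrame has to be shared between the function and the loop, and the loop ends by itself once the file holds only the header.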