Home > Enterprise >  how to loop through different columns in a excel file after reading it in pandas
how to loop through different columns in a excel file after reading it in pandas

Time:03-01

Hi I have to loops through my list of column to plot graph against each column on y axis and a harcoded column on x axis. This is a Python pandas problem.

Basically what is happening here is I'm reading a specific spreadsheet from an excel file and then skipping some column to read as they are not required to be plotted and dropping rows that have values other than numerical value or timestamp and then I have to plot a graph against the timestamp column on x axis and rest on y axis.

the excel sheet before any of that happen should look like this:

 experimentID   features    timestamp in UTC           epochs   seed    loss    accuracy
   1           job,degree,  2022-02-19T06:33:13.582610  150        3    rmse    98.70
   2          degree,gender 2022-02-19T06:33:27.737695  150        3    rmse    98.70
   3          salary,gender 2022-02-19T06:33:35.754507  150        3    rmse    98.70

The column of experimentID and feature are not needed for plot so i dropped them and the plot is supposed to be between timestamp and the column after it and these column names are not fixed and may change so i have to loop them dynamically

y = list(modelingData['seed']) this is the line that's not comming straight for me 'seed' is hardcoded but i need to generate it dynamically

def plot(self, value, filename=None):

    modelingData = None
    self._dataSourcing = value.columns

    if (os.path.isfile(filename)):
        modelingData = pandas.read_excel(filename, sheet_name='modelling', index_col=[], skip_col=['experimentID', 'features'])
    
    # excluding numeric, datetime type
    numeric_columns = modelingData.select_dtypes(exclude=['number', 'datetime'])
    modelingData.drop(numeric_columns, axis=0, inplace=True)
    x = list(modelingData['timestamp in UTC'])
    y = list(modelingData['seed'])

    fig,ax=plt.subplots()
    ax.plot(x,y)

    workbook = xlsxwriter.Workbook(filename)
    wks1=workbook.add_worksheet('performancePlots')
    wks1.write(0,0,'plot')

    imgdata=io.BytesIO()
    fig.savefig(imgdata, format='png')
    wks1.insert_image(2,2, '', {'image_data': imgdata})

    workbook.close()

CodePudding user response:

Since you're using modellingData data frame you can just do the for-loop before y and for every value in the loop make the new image with a fixed x-axis, see the code below:

def plot(self, value, filename=None):

    modelingData = None
    self._dataSourcing = value.columns

    if (os.path.isfile(filename)):
        modelingData = pandas.read_excel(filename, sheet_name='modelling', index_col=[], skip_col=['experimentID', 'features'])
    
    # excluding numeric, datetime type
    numeric_columns = modelingData.select_dtypes(exclude=['number', 'datetime'])
    modelingData.drop(numeric_columns, axis=0, inplace=True)
    x = list(modelingData['timestamp in UTC'])
    for columns in modelingData.columns:
        y = list(modelingData[columns])

        fig,ax=plt.subplots()
        ax.plot(x,y)

        workbook = xlsxwriter.Workbook(filename)
        wks1=workbook.add_worksheet('performancePlots')
        wks1.write(0,0,'plot')

        imgdata=io.BytesIO()
        fig.savefig(imgdata, format='png')
        wks1.insert_image(2,2, '', {'image_data': imgdata})

        workbook.close()
  • Related