Home > database >  Find matching strings and populate data using openpyxl
Find matching strings and populate data using openpyxl

Time:11-17

I have a dictionary and data in Excel. I would like to iterate through all the rows in one column in the excel file and see if it matches the string from my 'key' in my dictionary.

Once matched I would like then map the value to a different column but same row.

For example - if the key matches at A2 then populate C2 with the dictionary value or if it matches at A8 then populate C8.

Dictionary example:

my_dict = {'data1':'A','data2':'B','data3':'C'}

Excel spreadsheet column:

    A   | B  |  C |
1  col1 |col2|col3|
2  data1|    |    |
3  apple|    |    |
4  pear||    |    |
5  data2|    |    |
6  data3|    |    |

Desired output in excel:

    A   | B  |  C |
1  col1 |col2|col3|
2  data1|    |A   |
3  apple|    |    |
4  pear||    |    |
5  data2|    |B   |
6  data3|    |C   |

What I have tried so far:

from openpyxl import load_workbook
wb = load_workbook('input.xlsx')
ws = wb['sheet1']

#iterate through column
for rownum in range(1, sheet.max_row   1):
    cell = ws.cell(row=rownum, column=1).value
    #iterate through dictionary
    for k,v in my_dict.items():
        if cell == k:
            #find location
            print(cell.value,cell.row,cell.column,cell)
            print(cell  'match')
            #append value to correct cell

Please could someone point me in the right direction?

CodePudding user response:

Openpyxl has a built-in function for iterating through rows, iter_rows(). Docs for it can be found here. Using this, the solution is quite easy. Iterating through the rows, test to see if the value of your first column matches a key in your dictionary. If it does, write the corresponding value into the third column. The code below does this.

import openpyxl as op

wb = op.load_workbook("test.xlsx")
ws = wb["Sheet"]

my_dict = {'data1':'A', 'data2':'B', 'data3':'C'}

for row in ws.iter_rows(min_row=2, max_col=3):
    if row[0].value in my_dict:
        row[2].value = my_dict[row[0].value]

wb.save("text.xlsx")

Note that there is no need for you to iterate through each item in your dictionary (which is your second for loop). Using if X in my_dict directly loops over the keys in the dictionary.

  • Related