Home > Software design >  Python - Openpyxl - Searching for a string value in one column, multiple sheets. How to detect for N
Python - Openpyxl - Searching for a string value in one column, multiple sheets. How to detect for N

Time:12-10

Searching Excel File For String Using Openpyxl - (Python 3.10)

I'm searching for a string, in column A, within four worksheets. The script works if the cell value is found within the worksheet.

I want to know how to determine if the cell doesn't exist? (so I can the search for using an alternative string).

Script (I've attached a photo of the results)

from openpyxl import Workbook
from openpyxl import load_workbook
from openpyxl.styles import Color, PatternFill, Font, Border
from openpyxl.styles import colors
from openpyxl.cell import Cell
import colorama
from colorama import init
init()
from colorama import Fore, Style, Back
import os
import shutil
import math
from decimal import Decimal
from openpyxl.styles import Alignment
from inspect import currentframe
import time

########################################
#########################################

src_path ="/Users/data/"
xlfileName = "test.xlsx"
xlfile = src_path   xlfileName
wb = Workbook()
wb = load_workbook(xlfile)

#########################################
#########################################

def get_linenumber():
    cf = currentframe()
    return cf.f_back.f_lineno

#########################################
#########################################

#########################################
#########################################

for xws in wb.sheetnames:
    worksheet = wb[xws]
    print(Style.RESET_ALL)
    print(Fore.BLUE, "Looping Through Worksheet Name -- ", worksheet, "Script line is -- ", get_linenumber())
    print(Style.RESET_ALL)
    for cell in worksheet['A']:
        search_string33 = "Current debt"  #
        search_string34 = "test"  #
        try:
            if search_string33 == cell.value:
                offset_colB_CD = cell.offset(row=0, column=1)
                print(Fore.RED, ( "Line 52 -- worksheet -- ", worksheet, "Searching for --", search_string33, "isinstance --", isinstance(search_string33, type(None))))
                if(offset_colB_CD.coordinate is None):
                    print("Line 54 -- null value detected in -- ", worksheet)
                elif(offset_colB_CD.coordinate is not None):
                    print(Fore.LIGHTRED_EX, "Line 56 -- worksheet -- ", worksheet," -- cell cordinate found --", offset_colB_CD.coordinate, " -- cell value found --", offset_colB_CD.value )
                    print(Style.RESET_ALL)
                else:
                    print("Line 59 if statement passed by on else")
                time.sleep(5)
        except (AttributeError, TypeError):
            continue


string search openpyxl

CodePudding user response:

A quick solution I can think about is iterating through a list of the strings in the order you want to search for them.

Example:

search_strings = ["Current debt", "test"]

for xws in wb.sheetnames:
    worksheet = wb[xws]
    print(Style.RESET_ALL)
    print(Fore.BLUE, "Looping Through Worksheet Name -- ", worksheet, "Script line is -- ", get_linenumber())
    print(Style.RESET_ALL)

    current_search_index = 0
    found_cell = False
    for current_string in search_strings:
        for cell in worksheet['A']:
            try:
                if current_string == cell.value:
                    offset_colB_CD = cell.offset(row=0, column=1)
                    print(Fore.RED, ( "Line 52 -- worksheet -- ", worksheet, "Searching for --", current_string, "isinstance --", isinstance(current_string, type(None))))
                    if(offset_colB_CD.coordinate is None):
                        print("Line 54 -- null value detected in -- ", worksheet)
                    elif(offset_colB_CD.coordinate is not None):
                        print(Fore.LIGHTRED_EX, "Line 56 -- worksheet -- ", worksheet," -- cell cordinate found --", offset_colB_CD.coordinate, " -- cell value found --", offset_colB_CD.value )
                        print(Style.RESET_ALL)
                        found_cell = True
                    else:
                        print("Line 59 if statement passed by on else")
                    time.sleep(5)
            except (AttributeError, TypeError):
                continue
        if found_cell == False:
            current_search_index  = 1
        else:
            break
    if found_cell == False:
        print(Fore.RED, "Line 67 -- ", current_string, "not found in -- ", worksheet)
        print(Style.RESET_ALL)

CodePudding user response:

worksheet.values will return a list of tuples. At which point we can reduce our list using list comprehension and filtering.

matching_strs = ['Current_debt', 'test']
matches = [{'ws':ws, 'row_num':row_num, 'row':row} for ws in wb.worksheets for row_num, row in enumerate(ws.values) if row[0] in matching_strs]

This will return a list of dicts, with the attributes of ws, row_num, row. Which should provide all of the info that you would need to print out the results.

No need for the ( ) around the if tests.

To print the results just loop and test


for match in matches:
    # Return position [1] is the equivalent of column B
    # since indexing starts at 0.
    if match['row'][1]:
        print( Match found... )

CodePudding user response:

Answer To Question

In my initial question, I failed to view the number of elements being directly queried in the column. As such if a direct hit was not attained, it moved on. I could not use 'None', because, it had other elements present.

The best option was to build a list and then filter through the list to see if it contained the string. If it does contain the string invoke first choice command, and if not invoke alternative command.

from openpyxl import Workbook
from openpyxl import load_workbook
from openpyxl.styles import Color, PatternFill, Font, Border
from openpyxl.styles import colors
from openpyxl.cell import Cell
import colorama
from colorama import init
init()
from colorama import Fore, Style, Back
import os
import shutil
import math
from decimal import Decimal
from openpyxl.styles import Alignment
from inspect import currentframe
import time

########################################
#########################################

src_path ="/Users/data/"
xlfileName = "test.xlsx"
xlfile = src_path   xlfileName
wb = Workbook()
wb = load_workbook(xlfile)

#########################################
#########################################

def get_linenumber():
    cf = currentframe()
    return cf.f_back.f_lineno

#########################################
#########################################

#########################################
#########################################

for xws in wb.sheetnames:
    worksheet = wb[xws]
    print(Style.RESET_ALL)
    print(Fore.BLUE, "Looping Through Worksheet Name -- ", worksheet, "Script line is -- ", get_linenumber())
    print(Style.RESET_ALL)
    search_string33 = "Current debt"  #
    search_string34 = "Total current liabilities"
    cell_list = []
    for cell in worksheet['A']:
        cell_list.append(cell.value)
    number_of_elements = len(cell_list)
    print("Number of elements in the list: ", number_of_elements)
    if search_string33 in cell_list :
        print(Fore.GREEN, "Yes -- ", search_string33)
        print(Style.RESET_ALL)
    elif search_string33 not in cell_list and search_string34 in cell_list:
        print(Fore.LIGHTMAGENTA_EX, "No -- ", search_string33, Fore.GREEN, "Yes -- ", search_string34)
        print(Style.RESET_ALL)
    elif search_string33 not in cell_list and search_string34 not in cell_list:
        print(Fore.RED, "Mother of Holy Crap", search_string33)
        print(Style.RESET_ALL)


string match excel via python

  • Related