'expected string or bytes-like object' in dictionary

Time:11-08

I am trying to show the contents of a dictionary. The method should return this output:

'Watermeloenen': 466, 'Appels': 688, 'Sinaasappels': 803

So I have this method:

    def total_fruit_per_sort(self, file_content):
        file_contents = self.extractingText.extract_text_from_image(
            file_content)
        number_found = re.findall(self.total_amount_fruit_regex(), file_contents)       
        fruit_dict = {}
        for n, f in number_found:
            fruit_dict[f] = fruit_dict.get(f, 0) + int(n)
        return str({value: key for value, key in fruit_dict.items()}).replace("{", "").replace("}", "")

This is the regex:

    def total_amount_fruit_regex(self):
        return r"(\d*(?:\.\d+)*)\s*W+({self.fruit_list()})"

And the input string (file_contents) is this:

"[' \n\na)\n\n \n\nFactuur\nVerdi Import Schoolfruit\nFactuur nr. : 71201 Koopliedenweg 33\nDeb. nr. : 108636 2991 LN BARENDRECHT\nYour VAT nr. : NL851703884B01 Nederland\nFactuur datum : 10-12-21\nAantal Omschrijving Prijs Bedrag\nOrder number : 77553 Loading date : 09-12-21 Incoterm: : FOT\nYour ref. : SCHOOLFRUIT Delivery date :\nWK50\nD.C. Schoolfruit\n16 Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,70 € 123,20\n360 Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,70 € 2.772,00\n6 Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,/0 € 46,20\n75  Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,70 € 577,50\n9 Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,70 € 69,30\n688 Appels Royal Gala 13kg 60/65 Generica PL I € 5,07 € 3.488,16\n22  Sinaasappels Valencias 15kg 105 Elara ZAI € 6,25 € 137,50\n80 Sinaasappels Valencias 15kg 105 Elara ZAI € 6,25 € 500,00\n160 Sinaasappels Valencias 15kg 105 FVC ZAI € 6,25 € 1.000,00\n320 Sinaasappels Valencias 15kg 105 Generica ZAI € 6,25 € 2.000,00\n160 Sinaasappels Valencias 15kg 105 Noordhoek ZA I € 6,25 € 1.000,00\n61  Sinaasappels Valencias 15kg 105 Noordhoek ZA I € 6,25 € 381,25\nTotaal Colli Totaal Netto Btw Btw Bedrag Totaal Bedrag\n€ 12.095,11 1.088,56\nBetaling binnen 30 dagen\nAchterstand wordt gemeld bij de kredietverzekeringsmaatschappij\nVerDi Import BV ING Bank NV. Rotterdam IBAN number: NL17INGB0006959173 ~~\n\n \n\nKoopliedenweg 38, 2991 LN Barendrecht, The Netherlands SWIFT/BIC: INGBNL2A, VAT number: NL851703884B01 i\nTel,  31 (0}1 80 61 88 11, Fax  31 (0)1 8061 88 25 Chamber of Commerce Rotterdam no. 55424309 VerDi\n\nE-mail: [email protected], www.verdiimport.nl Dutch law shall apply. The Rotterdam District Court shall have exclusive jurisdiction.\n\nrut ard wegetables\n\x0c']"

And this is the fruit_list:

    self.list_fruit = ['Appels', 'Ananas', 'Peen Waspeen',
                       'Tomaten Cherry', 'Sinaasappels',
                       'Watermeloenen', 'Rettich', 'Peren', 'Peen',
                       'Mandarijnen', 'Meloenen', 'Grapefruit', 'Rettich']

But when I run the function total_fruit_per_sort, I get this error:

expected string or bytes-like object

Request Method:     POST
Request URL:    http://127.0.0.1:8000/
Django Version:     4.1.1
Exception Type:     TypeError
Exception Value:    

expected string or bytes-like object

Exception Location:     C:\Python310\lib\re.py, line 240, in findall
Raised during:  main.views.ReadingFile
Python Executable:  C:\Python310\python.exe

But I already convert the dictionary to a string, so I don't know how to tackle this.

The stack trace points to this line:

    number_found = re.findall(
        self.total_amount_fruit_regex(), file_contents)

This is the output of print(file_contents):

[' \n\na)\n\n \n\nFactuur\nVerdi Import Schoolfruit\nFactuur nr. : 71201 Koopliedenweg 33\nDeb. nr. : 108636 2991 LN BARENDRECHT\nYour VAT nr. : NL851703884B01 Nederland\nFactuur datum : 10-12-21\nAantal Omschrijving Prijs Bedrag\nOrder number : 77553 Loading date : 09-12-21 Incoterm: : FOT\nYour ref. : SCHOOLFRUIT Delivery date :\nWK50\nD.C. Schoolfruit\n16 Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,70 € 123,20\n360 Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,70 € 2.772,00\n6 Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,/0 € 46,20\n75  Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,70 € 577,50\n9 Watermeloenen Quetzali 16kg 4 IMPERIAL BR I € 7,70 € 69,30\n688 Appels Royal Gala 
13kg 60/65 Generica PL I € 5,07 € 3.488,16\n22  Sinaasappels Valencias 15kg 105 Elara ZAI € 6,25 € 137,50\n80 Sinaasappels Valencias 15kg 105 Elara ZAI € 6,25 € 500,00\n160 Sinaasappels Valencias 15kg 105 FVC ZAI € 6,25 € 1.000,00\n320 Sinaasappels Valencias 15kg 105 Generica ZAI € 6,25 € 2.000,00\n160 Sinaasappels Valencias 15kg 105 Noordhoek ZA I € 6,25 € 1.000,00\n61  Sinaasappels Valencias 15kg 105 Noordhoek ZA I € 6,25 € 381,25\nTotaal Colli Totaal Netto Btw Btw Bedrag Totaal Bedrag\n€ 12.095,11 1.088,56\nBetaling binnen 30 dagen\nAchterstand wordt 
gemeld bij de kredietverzekeringsmaatschappij\nVerDi Import BV ING Bank NV. Rotterdam IBAN number: NL17INGB0006959173 ~~\n\n \n\nKoopliedenweg 38, 2991 LN Barendrecht, The Netherlands SWIFT/BIC: INGBNL2A, VAT number: NL851703884B01 i\nTel,  31 (0}1 80 61 88 11, Fax  31 (0)1 8061 
88 25 Chamber of Commerce Rotterdam no. 55424309 VerDi\n\nE-mail: [email protected], www.verdiimport.nl Dutch law shall apply. The Rotterdam District Court shall have exclusive jurisdiction.\n\nrut ard wegetables\n\x0c']

CodePudding user response:

Check that the result of

file_contents = self.extractingText.extract_text_from_image(file_content)

is actually a string or bytes-like object. You'll get this error from re.findall(...) when the second parameter is not a string or bytes-like object. For example: re.findall("somestring", None).

When I run your code with that line changed to just file_contents = file_content and then call print(total_fruit_per_sort(input_str)), I get an empty string but no errors.
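The error can be reproduced in isolation. The printed file_contents in the question starts with [ and ends with ], which suggests it may be a list holding one string rather than a string. The sketch below assumes exactly that; the sample text and variable names are made up for illustration.

```python
import re

# Hypothetical stand-in for the OCR result: a list holding one string.
file_contents = ["16 Watermeloenen Quetzali 16kg\n688 Appels Royal Gala 13kg\n"]

try:
    # Passing the list itself raises TypeError, because re.findall
    # only accepts a str or a bytes-like object as its second argument.
    re.findall(r"\d+", file_contents)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Passing the string inside the list works.
numbers = re.findall(r"\d+", file_contents[0])
print(numbers)
```

A quick print(type(file_contents)) before the re.findall call shows which case applies.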

A second thing to note, which is probably why I get an empty string: the raw (r) string returned by total_amount_fruit_regex is not an f-string, so {self.fruit_list()} stays in the pattern as literal text instead of being replaced by the fruit names as you probably expect. You can fix this by adding an f prefix. A string can be both raw and formatted (rf"..."), or a plain f-string may be enough here, depending on how you want to handle escaping backslashes. If fruit_list() returns a list, join the names with | first, because interpolating the list directly would insert its repr, brackets and quotes included.
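A minimal sketch of the rf-string approach follows. The fruit list is shortened and the pattern is a simplified stand-in, not the exact regex from the question; both are assumptions made for illustration.

```python
import re

# Shortened, hypothetical version of the fruit list from the question.
fruit_list = ["Appels", "Sinaasappels", "Watermeloenen"]

# Join the escaped names into one alternation: Appels|Sinaasappels|Watermeloenen
alternation = "|".join(re.escape(name) for name in fruit_list)

# rf"..." is both raw (backslashes are kept) and formatted ({...} is interpolated).
pattern = rf"(\d+)\s+({alternation})"

text = "688 Appels Royal Gala 13kg\n22  Sinaasappels Valencias 15kg"
matches = re.findall(pattern, text)
print(matches)
```

One thing to keep in mind with this style: literal braces in the regex, such as a {2,3} quantifier, have to be doubled ({{2,3}}) inside an f-string.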

CodePudding user response:

Ah, OK. This fixed the issue:

    def total_amount_fruit_regex(self):
        return r"(\d*(?:\.\d+)*)\s*(" + '|'.join(re.escape(word)
                                                 for word in self.extractingText.list_fruit) + ')'

    number_found = re.findall(
        self.total_amount_fruit_regex(), self.extractingText.text_factuur_verdi[0])
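For reference, the whole flow can be sketched as standalone functions. This is only an illustration under assumptions: the fruit list is shortened, the OCR result is a made-up list with one string, and a dot inside an amount is assumed to be a thousands separator.

```python
import re

# Shortened, hypothetical fruit list for the example.
FRUIT_LIST = ["Appels", "Sinaasappels", "Watermeloenen"]


def total_amount_fruit_regex(fruit_list):
    # Build "(amount)\s*(Appels|Sinaasappels|...)" from the escaped names.
    alternation = "|".join(re.escape(word) for word in fruit_list)
    return r"(\d*(?:\.\d+)*)\s*(" + alternation + ")"


def total_fruit_per_sort(text, fruit_list=FRUIT_LIST):
    totals = {}
    for amount, fruit in re.findall(total_amount_fruit_regex(fruit_list), text):
        if not amount:
            # \d* can match an empty string, so skip hits without a number.
            continue
        # Assumption: "." is a thousands separator, so it is dropped.
        totals[fruit] = totals.get(fruit, 0) + int(amount.replace(".", ""))
    return totals


# Made-up OCR result: a list with one string, so index [0] is passed on.
ocr_result = [
    "16 Watermeloenen Quetzali 16kg\n"
    "360 Watermeloenen Quetzali 16kg\n"
    "688 Appels Royal Gala 13kg\n"
    "22  Sinaasappels Valencias 15kg\n"
]
totals = total_fruit_per_sort(ocr_result[0])
print(totals)
```

Returning the dictionary itself, rather than a string built from it, keeps the totals usable by the caller; formatting can happen at the point where the result is displayed.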