AWS lambda response ERROR: string indices must be integers (Textract)

Time:03-19

I'm new to this and have spent 3 weeks on this error. I'm trying to get the response from a Lambda function containing the text extracted from an image in an S3 bucket. I successfully upload the image to the bucket from my Expo project using this code:

try {
                const response = await fetch(data.uri);
                const blob = await response.blob();
                const assetType = 'Mobile Device';
                await Storage.put(`${username}/${assetType}/${UUID}`, blob, {
                    contentType: 'image/jpeg',
                    progressCallback,
                });
            } catch (err) {
                console.log("Error uploading file:", err);
            } 

This invokes the Lambda function that extracts the needed text:

import json
import boto3
from urllib.parse import unquote_plus
import re

def lambda_handler(event, context):
    bad_chars = [';', ':', '!', "*", ']', "[" ,'"', "{" , "}" , "'",","]
    
    file_obj = event["Records"][0]
    bucketname = str(file_obj["s3"]["bucket"]["name"])
    filename = unquote_plus(str(file_obj["s3"]["object"]["key"], encoding='utf-8'))
    
    textract = boto3.client('textract', region_name='ap-south-1')
    
    print(f"Bucket: {bucketname} ::: Key: {filename}")

    response = textract.detect_document_text(
        Document={
            'S3Object': {
                "Bucket": bucketname,
                "Name": filename,
            }
        }
    )
    
    result=[]
    processedResult=""
    for item in response["Blocks"]:
        if item["BlockType"] == "WORD":
            result.append(item["Text"])
            element = item["Text"] + " "
            processedResult += element
        
    print (processedResult)    
    res=[]
    imei=""
    
    #Extracting imei number and removing the bad characters
    str_IMEI= str(re.findall('(?i)imei*?[1]*?[:]\s*\d{15}',processedResult))
    imei= str(re.findall('[^\]]+',str_IMEI))
    
    imei = str(imei.split(' ')[-1])
    
    for i in bad_chars :
        imei = imei.replace(i, '')

    if len(imei)>0 :
        res.append(str(imei))
    else:
        res.append('')
    
    #Extracting imei2 number and removing the bad characters
    str_IMEI2= str(re.findall('(?i)imei*?[2]*?[:]\s*\d{15}',processedResult))
    imei2= str(re.findall('[^\]]+',str_IMEI2))
    
    imei2 = str(imei2.split(' ')[-1])
    
    for i in bad_chars :
        imei2 = imei2.replace(i, '')

   
    if len(imei2)>0 :
        res.append(str(imei2))
    else:
        res.append('')

    #Extracting model number and removing the bad characters
    
    str_Model = str(re.findall('(?i)model*?[:]\s*[\w|\d|_,-|A-Za-z]+\s*[\w|\d|_,-|A-Za-z]+',processedResult))
    result = str(str_Model.split(' ')[-1])
    
    for bad_char in bad_chars :
        result = result.replace(bad_char, '')

    if len(result) >0 :
        res.append(result)
    else:
        res.append('')
    
    print(res)
  
    return {
        'statusCode': 200,
        #'body': JSON.stringify(result)
        'body': res
    }

I then use a REST API to get the response from the Lambda:

try{
                const targetImage = UUID + '.jpg';
                const response = await fetch(
                    'https://6mpfri01z0.execute-api.eu-west-2.amazonaws.com/dev',
                    {
                        method: 'POST',
                        headers: {
                            Accept: "application/json",
                            "Content-Type": "application/json"                            
                        },
                        body: JSON.stringify(targetImage)
                    }
                )
                const OCRBody = await response.json();
                console.log('OCRBody', OCRBody);
            } catch (err) {
                console.log("Error extracting details:", err);
            }   

but I keep getting this error:

OCRBody Object {
  "errorMessage": "string indices must be integers",
  "errorType": "TypeError",
  "requestId": "8042d6f7-44de-4a54-8209-fe93ca8570ec",
  "stackTrace": Array [
    "  File \"/var/task/lambda_function.py\", line 10, in lambda_handler
    file_obj = event[\"Records\"][0]
",
  ],
}

PLEASE HELP.

CodePudding user response:

As the error says, you are trying to index a string with some kind of key, as if it were a dict:

>>> str_name = 'Jhon Doe'
>>> str_name['name']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string indices must be integers
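Applied to the question, the traceback points at event["Records"], so event itself must be a string at that point. The following is a minimal sketch of that situation, assuming (the question does not confirm this) that the REST call hands the JSON-encoded file name to the handler as the event. The file name, the bucket name and the helper try_first_record are hypothetical.

```python
# Hypothetical values, for illustration only.
string_event = "example-uuid.jpg"   # e.g. a JSON string body passed through as the event
dict_event = {"Records": [{"s3": {"bucket": {"name": "example-bucket"}}}]}


def try_first_record(event):
    """Return the first record, or the TypeError message if event cannot be indexed by key."""
    try:
        return event["Records"][0]   # same operation as in lambda_handler
    except TypeError as err:
        return f"TypeError: {err}"


print(try_first_record(string_event))   # reports the "string indices must be integers" error
print(try_first_record(dict_event))     # prints the first record of the dict-shaped event
```

The same lookup succeeds or fails depending only on the type of the event, which is why the type is the first thing worth checking.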

CodePudding user response:

Normally, this error means that the Lambda does not receive the data it expects. In all of my Lambda functions, I use something like this:

bucket = event['Records'][0]['s3']['bucket']['name']

Most of the time, event is a valid dict in Python, so you have two options here:

Option 1

Before proceeding, do something like

if 'Records' in event:
  if 's3' in event['Records'][0]:
    if 'bucket' in ...
      if 'name' in ...

Proceed only if all conditions are true.
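The nested checks above can be written out in full as a small helper. This is only a sketch: the name get_bucket_and_key is hypothetical, and it assumes the standard S3 notification layout that the question's code already relies on (Records, s3, bucket name, object key).

```python
def get_bucket_and_key(event):
    """Return (bucket, key) from an S3 notification event, or None if the shape is wrong."""
    if not isinstance(event, dict):
        return None                      # e.g. a plain string body
    records = event.get("Records")
    if not isinstance(records, list) or not records:
        return None
    first = records[0]
    s3 = first.get("s3") if isinstance(first, dict) else None
    if not isinstance(s3, dict):
        return None
    bucket = s3.get("bucket")
    obj = s3.get("object")
    if not isinstance(bucket, dict) or not isinstance(obj, dict):
        return None
    if "name" not in bucket or "key" not in obj:
        return None
    return bucket["name"], obj["key"]
```

In the handler, a None result would be the signal to return an error response instead of calling Textract.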

Option 2

Do a simple

print(event)

and check what the event contains when your Lambda fails.
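A slightly more informative variant of that print also logs the type of the event, since a str and a dict can look alike in the log. This is a debugging sketch only; describe_event is a hypothetical helper, and the handler below does nothing except log the event and report its type.

```python
import json


def describe_event(event):
    """Build a one-line description of the incoming event for the Lambda log."""
    # default=str keeps json.dumps from failing on values that are not JSON-serializable
    return f"event type={type(event).__name__} value={json.dumps(event, default=str)}"


def lambda_handler(event, context):
    print(describe_event(event))   # printed output ends up in CloudWatch Logs
    return {
        "statusCode": 200,
        "body": json.dumps({"eventType": type(event).__name__}),
    }
```

If the logged type is str rather than dict, the caller is sending something other than an S3 notification.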

The issue here could be that the Lambda is triggered before the S3 object is available.

What you can do is add an S3 trigger, which invokes the Lambda automatically as soon as the S3 object has been uploaded. See:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/notification-how-to-event-types-and-destinations.html
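For reference, such a trigger can also be set up from code. The sketch below builds the notification configuration and passes it to boto3's put_bucket_notification_configuration. The helper names, the bucket name and the function ARN are hypothetical, and it assumes that S3 has already been granted permission to invoke the function.

```python
def build_notification_config(lambda_arn, prefix=""):
    """Build an S3 notification configuration that invokes a Lambda on object creation."""
    config = {
        "LambdaFunctionArn": lambda_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optional: only fire for keys that start with the given prefix
        config["Filter"] = {"Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}}
    return {"LambdaFunctionConfigurations": [config]}


def attach_trigger(s3_client, bucket, lambda_arn, prefix=""):
    """Apply the configuration; s3_client is expected to be boto3.client('s3')."""
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn, prefix),
    )
```

A call would look like attach_trigger(boto3.client('s3'), 'example-bucket', 'arn:aws:lambda:...'), with the real bucket name and function ARN substituted. With this in place, the Lambda receives a dict-shaped event containing Records for each upload.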
