AWS Batch S3 lambda job response produces error

Time:10-11

I have a problem running an S3 Batch job. I already contacted AWS support about this issue, but they didn't answer anything meaningful, so Stack Overflow is my only hope to resolve this :)

I need to copy a large number of files (up to 10,000) from one location to another. Initially I was going to use the S3 Batch Copy operation, but the problem is that it doesn't let me change the output path, and this is critical, as I'm gathering files from different locations into one for further processing in Athena.

So after a while of googling I ended up using the S3 Batch Lambda operation, where I just make an S3 copy object request for each entry. It works nicely and fast, but I'm receiving an error in the report for each entry because of incorrect JSON returned by the Lambda. I'm building the response JSON based on this article. It's not for C#, but that's the only reference I was able to find. So my JSON looks like this:

{
    "invocationSchemaVersion": "1.0",
    "invocationId": "AAAAAAAAAAGQaOzzJcNEbxZje2JpMRjdxh4CsFXUr0 Hdc gVHtCRyXFH9oR5nxKv7uNh4jTCj8Gb7f/QUZmk2DIagbWXK1Km73jlZ2qKWQnJT5Avg4hVRixjjW aOrhLj1GQPYmeuVXoiT5L6TgPwrOrRasNLwRTUrwoEiCCuE3zHsOPi8Jm1ai4tLDpzLIzx2Rd5Ye8as1 W2d7h/oviX1Nc1 FsJqjn RLXChbDdqYcRvqrOfn7UU9zTeoCZFEgJW54xPf3 bVcjfXEOyHkzps/leo0CS4C6KVTsoC sFPUen4s7S3oxBlbVfZvGntX2XTfLnhLy9y8m3OCJGaDvZu9US22sFaTniIcNOdyDGpcmVkolYZwlLoC8dA8EK80aXdWYgy x4Nmrlbkjt8c/xWl5lfgTtL4HDcGXcobpsdio9eC4pgGia16 FzoH2CZQam1qO0PT2LTDO8/rioJsyuHvD3f7R9wTGN1JRhuOurbrK/ribHs09eryBrJXaSIeY4e2nk4wFRZuElk4fPlojpCtR5EMVJvl kA6ZnHom2lFUFaPdeswkUXV1E3MfPqDjUF39IWR2A15im/9JMDkFW1m2129tiL5QyFEUMXkmwZuUEdAQUaBwQzExOgukoLAFxXlsy/b/FfoAATfqqb5q3z3lSN3ioC1uHfSFpTRxPBJ6 HuGYItoYSRXlKPJEHWbaIyP63pdVQJQ2Jq7dcNX7ptm1hbgQ JAMc0239/MGIY bRVz6TpH1blu3BDIna7OBJweOZBj3qHH7tDolHInOytmKavoBF6vbunOAjQvEXBEj 6rMuiiiyyw0V5zvinof7YxwmdtyjIetD IEVqxiXtqc1Pyb3sVEsRuRFLbYawJ9uTLClRSJCUFY4vLQHfSzL7bWDH D3lJ04w5ujXdPMnG0z4aKy2UioP4xfUku0afWe4pTurZbeTnKKZ358iIxI/1qG0y8FEYv7jrMjpo3dh2ivcCQrmEBgMlDQTS5shNsnUj02M6RNwZlNC5NWSNKBpSsexe9B LrP5vUBwj8eQxIrYVn9hETgqb8u SGKZ8H8q2IWhocdps7JdMtP WP29pH5kPh3VKQEjykDJr8s1Zr  Y3oWWhxCuWUZhrqYje0Bx/D8HW/dVqqTHClTeWC1yg5ff jKnZ0RKUnO1XzEFt87UXIDbd56p RgfrEIZovCftoBnHOi1XzHtvlfKhplvHLsGfQN/jgDZV4ooge0QrM8plsRPfuaEA1uHY5UOKcaqKj9Jpp2l6ctJ4OWvQnUzxEyuD2SCg7mFWFCgjsqn783Tib7tZMo qjLGgRWc4riuAW7SKb YAAI0rZxH3Z1RuK7KbFCdszMl9xNSptMF6kJfUl/ocRK0AUn3jbguSGPdSe08D4Tx1pqhBiJHpCE1T04PsZEdykte9sei5CZnxcZ5XdEpbADNHe2y9u6vQwxh3QrpJasQJRBhJuKHXntjab5IgZsDzc2Ez0iQlfiHVz9Gp9zkM8NTgC4Auv3T9iYi7rxff3LEoOZKEU1tizGIEP/gThfE8thMytWsLW3d2hgw/B74grX4IzwrM2tyvR2oa3alOLEkDwpKFSDWFYXFMokeMKLTk14re8TpD9cJ0Jf4vN8PNQeTCWFyEp/XISn6f4T8jm6rf1nf7A2OsfZA6gdvrWc7u61F8CMBeBDc8lUp1Rv8zDM5K80xpQt39x5PaBMj66gnM/01onH7g9O3xZN50eiiuT5LVkARbPtiTJi4Dxruu UVf2e86CjGVQRu1cDtCqlQZ2Y ylTQXepg/rxK/34KZ3olTXvo0L0yfymbO8e5IX0eXlMJ1OlilBJbssET3ohCTqjMrJDR3IF5raz1xeDoJMTvuZAbIC84ksNl90JJDz1zjkrN2TiUMtD19M0B 
TjbyLn/4Tw/bjHrBkpmx2c7DIk0e7L7wG9rThcBT7hjPnHbEk7XNpysssgrtp3m/LR3X TY 7P8ALMSUAkVMymhyZ2Ur6zmgQyP5VGzJzbx66VZPCPM4 hfBLr1teivvEgAompM2uMxrOx0nv3/29LL/4hlbXSwQO2dk85KshJqzCnWdTbGDUzZIvRMlTo3MgAFyH1vXu/FM/oP8gyBZh sbcA69UVXMFGSnpouP3wa0t2tBE7hOe9jIHYnb p Ng9cMwyh13WqNIstvp5qHb8GDiQGzzb/Z4p7PHALL5JWIhC zhlnp2TElKC1oii2zRF1/FGre1qVpj4AZOWPmzt qTfNk9DbdNq66334vMi3tKoDrTuKRmz9pr5ZlQ==",
    "treatMissingKeysAs": "PermanentFailure",
    "results": [
        {
            "taskId": "AAAAAAAAAAGWxTheY PQE987jdKDGe9EI4x7Njgb9 4ogcLMgjqtAXk7YuyLCZx138AOKrT9C8huBaHNg4kzvhEJnAIFtvkBUZ6uZ6igTYN0vnsl1DitqineddsGvD6Nt1jldgrshZuEe fFMt/ak/56ESDW1hE8652hcxrZS6icV/uRtwtj2k3Xmc5IoNZugtx6mw5BhPqErv9yjZMBYzbZ8vVUQo7ccalfVVzGlv2oO8nEVjamBbIdpu xrzZuD/x4X4epphXqCzuNQpdRUXkL465K yVs9J0y bktcQ8au85FCPM 6Ts7fAeUm2DSFKVJRXE9XDeHXdH6wvPDrcU9oC99vyouCGn5lkPwdug6N32Y5/pE9vMlRTSpnAOOWBO698ci/34OVgLxcuLl9rm1Z5e4B2 zZoAJSVmR7 /masL/brpJakUM9tU5hdkp4kAkNR7sxH4z2DwMaM9qfKPPmNhybfQqXVMmqEOtRAAJdUwZNgKtwR0fjH7uGAxxcha8cpCVAMBjU6VJw/bKBjmjFxsoi3oC Dj2toaxvc8LT3WegdCPKOfZcvZLTq3kajfXQGK0aiBpLL A2q2IzobafpBUEaApqa4m90LLxOP9TbKreaLOBqbyQKEVEskoYBXlbi6nvCxN4qIIwfXM4QsiMT2spPuen/odWo5HajyiaI/7WSsupY/je2y5opufO4Em/6pw8PXZMwcP3ati ocTyYIjSX3gpuS9GFUT9Sov8jAGQI71H4pPbBIxopK20BKeb6DT9VRlONECNfQaDFr6ovedTiBMPAjZeeX27VmvRpBMgAY/1fjtX6gEkmrkU7V8vzdvX/T3mInk1TzoI/K2eIE24asmrEsIkECgUB b54SiXZaZYCVmaiaDZgEnLqq4q IYo8l6e6f0cZNCbLD/IKslRU7QRJufvCHBMw/YeUv7XG1St170fqgzqfc/gvxlSYxBtewXRdCLk47CtE wFAfYNXjDT3PRG6DAOl3wyDjjCuJRijoIOkAArnsx5AdoKjhI2JTazl6jdwhk2bmFdC1D2gujNJbqCLjoGQinb/aJMw qYe8DPSV3YhAFsHQDAUnew5M2/qgIIv7Kps3Z9X6FspzQUFi AG2yu qrhJzzVCPPmRY4P8FuRLG8BuA1AhIPTRK1hZsXi/p 4n3S3w199I6TFY/BxDCV9lIu7coHIEv6lNQHh7wS7EWS9or0wz/02lMzf1 j04nOdokwW49IXFTXStXej3ZhdA==",
            "resultCode": "Succeeded",
            "resultString": "asdasdasd"
        }
    ]
}

And the error I'm receiving in the report is:

Invalid JSON returned in Lambda payload: "{"invocationSchemaVersion":"1.0","invocationId":"AAAAAAAAAAG5pvT5OiQJUWXIYVazaz4A04maxTVgE ByVOSrNfLYyVakaNnKgyxewmpNGtixsr3SV9Aw txterQjFisOmpnq5BJ/v/TjMW0/L svjPFW8Vl7N8Ey50HR66rx6h4BGak91Tapr7Xvfk3bPY7T8sECPmcpk5oD2UwTd64uYej5WrZBnlrOIiv1lKSAugx1kxCESMae2ukD31ZzMVZyYknG5vnfoJETiO4F /iq4peMGVkwzcbuJ0KNbkyEvBbuOkbt3EOuE/721mbB0qcbJIBFpXgc45yw7IsJUyWllYls5l5uL33IBtFastnPHUMqUtv/Y3q6/xx8yjEvyNktxfKyhJf4CgnuC1BkbmaQ0xoruiR4uLkV/zgW2WoanYZEyWtBDGCqKSdGBhsfPTRvLkp8dJapNm/SYEVJ1MuL2NKZWSIaFYCBGa4gc2SSYT8Tp4MFG zughNUx6yBh7QxKO8 0ChtHjnirJggdxtcdVYaLyerdVI9miEWQKTue9DCLsE0fCzLDFrcgQxi/P87yDDlEH3Ij5L/Go1QsqG LhAjhZh7170wySWNoiC5L8zytly7QjfdJsT9I4P/kSL4j/Ap8qq9L9cUazLbuGBZBNpZRmNyBYFB3emIotwpTNcFII6xXZHadskMxal4xuDuGmU1GzUtf1KpdHjWTbCoYPgh5ZHTS4mvNgGLUhG 6X5SoX9uL/7vqK7b/yNaaKjP2rdz4EJdfmLKtAn6KCfyeds2yAZKxwKe643jXb0YtJuPGi9C50kb8rokt1Wcgo9W0j4kunADdMWxT3Ta67/yQ5s6kxewsVBHkcgAJiTU0OnelSI5pCPPOfQBx4BxwiXp7Avam9aaruWnx4Q3IlD9YuutEjdM9b4cHgaUhplxDW2Y0ujiRqze9eZsoNCXS5ok98

The first thing that caught my eye is that the invocation id and task id are very long, much longer than in the example, and they even get truncated in the error message. But here's the object structure I'm receiving as the Lambda input, and I'm using its invocation id and task id in the response accordingly:

using System.Collections.Generic;
using Newtonsoft.Json;

public class S3BatchEvent
{
    [JsonProperty("invocationSchemaVersion")]
    public string InvocationSchemaVersion { get; set; }

    [JsonProperty("invocationId")]
    public string InvocationId { get; set; }

    [JsonProperty("job")]
    public Job Job { get; set; }

    [JsonProperty("tasks")]
    public List<Task> Tasks { get; set; }
}

public class Task
{
    [JsonProperty("taskId")]
    public string TaskId { get; set; }

    [JsonProperty("s3BucketArn")]
    public string S3BucketArn { get; set; }

    [JsonProperty("s3Key")]
    public string S3Key { get; set; }

    [JsonProperty("s3VersionId")]
    public object S3VersionId { get; set; }
}

So I actually have several questions about that:

  1. What am I doing wrong in forming the response?
  2. I had to log the incoming JSON and create the corresponding types myself, and I'm also forming the response manually as an anonymous object. Are there existing types in the .NET SDK for these? I couldn't find any reference.

Also, maybe there is some other way of doing what I need? Maybe there's a way to run an S3 batch copy with a dynamic output path? I was also thinking about running a bunch of Lambdas manually, but I'm not sure how to balance the number of records processed by each the way a batch job does. To be clear, what's crucial for my task is timing (with an S3 Batch Copy job it takes a couple of seconds to copy 7k files of 10-50 MB each) and that the tasks run remotely on the S3 side, because the whole process is initiated from a machine with very limited resources, and that many tasks either literally kills it or slows the whole operation down to as much as 10 minutes if I throttle the number of concurrent tasks.

Any help will be greatly appreciated, thanks in advance.

CodePudding user response:

OK, so through blood, sweat and tears I figured out what was wrong. It turns out these huge ids are totally fine; you just have to return an object from the handler instead of a string. So, in case somebody runs into the same problem, the final attempt that worked is:

    return new
    {
        invocationSchemaVersion = s3Event.InvocationSchemaVersion,
        invocationId = s3Event.InvocationId,
        treatMissingKeysAs = "PermanentFailure",
        results = new[]
        {
            new
            {
                taskId = s3Event.Tasks[0].TaskId,
                resultCode = "Succeeded",
                resultString = "Succeeded"
            }
        }
    };
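
One plausible reading of why returning a string fails (this is an assumption, not something confirmed in this thread): the Lambda serializer JSON-encodes whatever the handler returns, so a string that already contains JSON gets encoded a second time and arrives as a quoted string rather than an object. That would match the error in the question, where the payload starts with "{ instead of {. A minimal sketch of the effect, using Newtonsoft.Json as in the question's types, with placeholder values:

```csharp
using System;
using Newtonsoft.Json;

public static class Program
{
    public static void Main()
    {
        // Placeholder values; a real response echoes the ids from the event.
        var response = new
        {
            invocationSchemaVersion = "1.0",
            invocationId = "example-invocation-id"
        };

        // Serialized once: the text of a JSON object.
        string once = JsonConvert.SerializeObject(response);

        // Serialized twice: what returning an already-serialized string would
        // amount to under the assumption above. The result is a JSON string literal.
        string twice = JsonConvert.SerializeObject(once);

        Console.WriteLine(once[0] == '{');   // True
        Console.WriteLine(twice[0] == '"');  // True
    }
}
```

If that reading is right, the fix is exactly what the snippet above does: hand the object to the runtime and let it do the only serialization pass.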

Also make sure the property names are in camelCase (as in the snippet above); it will fail otherwise.
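
As an alternative to the anonymous object, the response can be modelled with small typed classes in the same style as the question's input types. This is only a sketch: S3BatchResponse, S3BatchResult and ResponseBuilder are made-up names, not types from an AWS package, and it assumes the function uses a Newtonsoft-based Lambda serializer so that the [JsonProperty] names are honored. It also returns one result per task rather than only Tasks[0], in case an invocation ever carries more than one task.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical response types; the names are illustrative, not from an AWS SDK.
public class S3BatchResult
{
    [JsonProperty("taskId")]
    public string TaskId { get; set; }

    [JsonProperty("resultCode")]
    public string ResultCode { get; set; }

    [JsonProperty("resultString")]
    public string ResultString { get; set; }
}

public class S3BatchResponse
{
    [JsonProperty("invocationSchemaVersion")]
    public string InvocationSchemaVersion { get; set; }

    [JsonProperty("invocationId")]
    public string InvocationId { get; set; }

    [JsonProperty("treatMissingKeysAs")]
    public string TreatMissingKeysAs { get; set; }

    [JsonProperty("results")]
    public List<S3BatchResult> Results { get; set; }
}

public static class ResponseBuilder
{
    // Echoes the ids from the incoming event unchanged and reports every task as succeeded.
    public static S3BatchResponse Build(
        string schemaVersion, string invocationId, IEnumerable<string> taskIds)
    {
        return new S3BatchResponse
        {
            InvocationSchemaVersion = schemaVersion,
            InvocationId = invocationId,
            TreatMissingKeysAs = "PermanentFailure",
            Results = taskIds
                .Select(id => new S3BatchResult
                {
                    TaskId = id,
                    ResultCode = "Succeeded",
                    ResultString = "Succeeded"
                })
                .ToList()
        };
    }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder ids; in the handler these come from s3Event.
        var response = ResponseBuilder.Build(
            "1.0", "example-invocation-id", new[] { "example-task-id" });

        // Printed here only to inspect the camelCase property names;
        // the handler itself would return the object, not this string.
        Console.WriteLine(JsonConvert.SerializeObject(response, Formatting.Indented));
    }
}
```

In the handler this would be called as ResponseBuilder.Build(s3Event.InvocationSchemaVersion, s3Event.InvocationId, s3Event.Tasks.Select(t => t.TaskId)), with the result returned directly.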
