Strapi - how to switch and migrate from Cloudinary to S3 in production

Given the rather steep cost of Cloudinary as a multimedia hosting service (images and videos), our client decided to switch to AWS S3 for file hosting.

The problem is that there are already a lot of files (thousands of images and videos) in the app, so merely switching the provider is not enough - we also need to migrate all the files and make it look like nothing changed for the end user.

This topic is partly covered on the Strapi forum: https://forum.strapi.io/t/switch-from-cloudinary-to-s3/15285, but no solution is posted there besides a vaguely described procedure.

Is there a way to reliably perform the migration without losing any data and without changing anything on the client side (apps that communicate with Strapi through the REST/GraphQL API)?

CodePudding user response：

There are three steps to perform the migration:

switch provider from Cloudinary to S3 in Strapi
migrate files from Cloudinary to S3
perform database update to reroute Strapi from Cloudinary to S3

Switching provider

This is the only step that is actually well documented, so I will be brief here.

First, you need to uninstall the Cloudinary upload provider by running yarn remove @strapi/provider-upload-cloudinary and install the S3 upload provider by running yarn add @strapi/provider-upload-aws-s3.

After you do that, you need to create your AWS infrastructure (an S3 bucket and an IAM user with sufficient permissions). Please follow the official Strapi S3 provider documentation https://market.strapi.io/providers/@strapi-provider-upload-aws-s3 and this guide https://dev.to/kevinadhiguna/how-to-setup-amazon-s3-upload-provider-in-your-strapi-app-1opc for the steps to follow.

Check that you've done everything correctly by logging in to your Strapi Admin Panel and opening the Media Library. If everything went well, all images should be missing (you will see all metadata such as sizes and extensions, but not the actual images). Try to upload a new image by clicking the 'Add new assets' button. This image should upload successfully and also appear in your S3 bucket.

Once everything works as described above, proceed to the actual data migration.

Files migration

The simplest (and most error-resistant) way to migrate files from Cloudinary to S3 is to download them locally, then use the AWS Console to upload them. If you have only hundreds (or low thousands) of files to migrate, you might actually use the Cloudinary Web UI to download them all (the Cloudinary Web App limits you to downloading 1000 files at once).

If this is not suitable for you, there is a CLI available that can easily download all files using your terminal:

pip3 install cloudinary-cli (installs the CLI)

cld config -url {CLOUDINARY_API_ENV} (the API env can be found on the first page you see when you log in to Cloudinary)

cld -C {CLOUD_NAME} sync --pull . / (This step begins the download. Depending on how many files you have, it might take a while. Run this command from the directory you want to download the files into. {CLOUD_NAME} can be found just above {CLOUDINARY_API_ENV} on the Cloudinary dashboard; you should also see it in your terminal after running the second command. For me, this command failed several times in the middle of the download, but you can just run it again and it will continue without any problem.)

After you have downloaded the files to your computer, simply use the drag-and-drop upload in the S3 console to put them into your S3 bucket.
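If drag and drop is impractical for thousands of files, the upload can also be scripted. The following is only a minimal Python sketch and not part of the procedure I tested: it assumes the rewritten URLs will point at {S3_BUCKET_LINK}/{FILE}, so every file has to land at the bucket root under its bare file name, and it assumes the third-party boto3 package and AWS credentials are available. The function names plan_uploads and upload_all are made up for illustration.

```python
import mimetypes
from pathlib import Path
from typing import Dict


def plan_uploads(download_dir: str) -> Dict[str, Path]:
    """Map each S3 key (the bare file name) to the local file it comes from."""
    plan: Dict[str, Path] = {}
    for path in sorted(Path(download_dir).rglob("*")):
        if not path.is_file():
            continue
        key = path.name  # flat key, matching the {S3_BUCKET_LINK}/{FILE} layout
        if key in plan:
            # Two downloaded files would overwrite each other in the bucket.
            raise ValueError(f"name collision: {plan[key]} and {path}")
        plan[key] = path
    return plan


def upload_all(download_dir: str, bucket: str) -> None:
    """Upload every planned file to the bucket root (assumes boto3 is installed)."""
    import boto3  # third-party; imported lazily so planning works without it

    s3 = boto3.client("s3")
    for key, path in plan_uploads(download_dir).items():
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        s3.upload_file(str(path), bucket, key, ExtraArgs={"ContentType": content_type})
```

Running plan_uploads on the download directory first is a cheap way to spot duplicate file names before anything is sent to S3. Whether the uploaded objects are publicly readable depends on your bucket settings, which this sketch does not touch.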

Update database

Strapi saves links to all files in the database. This means that even though you switched your provider to S3 and copied all the files, Strapi still doesn't know where to find them, because the links in the database still point to the Cloudinary server.

You need to update three columns in the Strapi database (this approach was tested on a Postgres database; minor changes might be needed for other databases). Look into the 'files' table, where you should find the url, formats and provider columns.

The provider column is trivial: just replace cloudinary with aws-s3.

The url and formats columns are harder, as you need to replace only part of the string. To be more precise, Cloudinary stores URLs in the {CLOUDINARY_LINK}/{VERSION}/{FILE} format, while S3 uses the {S3_BUCKET_LINK}/{FILE} format.

My friend and colleague came up with the following SQL query to perform the update:

UPDATE files SET
    formats = REGEXP_REPLACE(formats::TEXT, '\"https:\/\/res\.cloudinary\.com\/{CLOUDINARY_PROJECT}\/((image)|(video))\/upload\/v\d{10}\/([\w\.]+)\"', '"https://{BUCKET_NAME}.s3.{REGION}/\4"', 'g')::JSONB,
    url = REGEXP_REPLACE(url, 'https:\/\/res\.cloudinary\.com\/{CLOUDINARY_PROJECT}\/((image)|(video))\/upload\/v\d{10}\/([\w\.]+)', 'https://{BUCKET_NAME}.s3.{REGION}/\4', 'g');

Don't forget to replace {CLOUDINARY_PROJECT}, {BUCKET_NAME} and {REGION} with the correct strings. The easiest way to find those values is to access the database, go to the files table, and compare one of the old URLs with the URL of the file you uploaded at the end of the Switching provider step.

Also, before running the query, don't forget to back up your database! Even better, make a copy of the production database and run the query on it before you touch production.
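To sanity-check the pattern before running the query, even against a database copy, the same rewrite can be tried on a few sample URLs. This is a minimal Python sketch that approximates the SQL regex above; it is not what we ran in production. The project name demo-project, the bucket link and the file names are hypothetical, so use values copied from your own files table instead.

```python
import re


def rewrite_url(text: str, cloudinary_project: str, s3_base: str) -> str:
    """Replace every Cloudinary URL in `text` with {s3_base}/{FILE}."""
    pattern = re.compile(
        r"https://res\.cloudinary\.com/"
        + re.escape(cloudinary_project)
        + r"/(?:image|video)/upload/v\d{10}/([\w.]+)"
    )
    # A function replacement avoids backslash surprises in s3_base.
    return pattern.sub(lambda m: f"{s3_base}/{m.group(1)}", text)


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    old = "https://res.cloudinary.com/demo-project/image/upload/v1650000000/photo_abc123.jpg"
    print(rewrite_url(old, "demo-project", "https://my-bucket.s3.eu-central-1.amazonaws.com"))
```

Because the function replaces every match in the string, it can also be fed the text of a formats value, which holds several URLs at once. A URL that comes back unchanged means the pattern did not match it, and the SQL query will most likely skip it too.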

And that's all! Strapi is now uploading files to the S3 bucket and you also have access to all the data you previously had on Cloudinary.