Home > Software engineering >  Costs associated with Data Analysis (data cleaning) on the cloud
Costs associated with Data Analysis (data cleaning) on the cloud

Time:10-20

I am data analyst. My company is moving all data science to a cloud provider (it could be Azure, GCP,AWS). All the data science programming tools like Jupyter notebook will be installed on the cloud environment (there will be no local installations of Python, or Jupyter Notebooks on the laptop).

For most of my work, I will be reading/ingesting relational database tables directly from an on-premise Database. Also most of my data analysis work does not require any GPU instances for data processing. Sometimes, I also do simple research or experimentation data analysis programming such as data cleaning using Jupyter notebooks without the need for usage of GPU instances.

I would like to find out if it would be possible to do such activities without incurring any pay-per-use costs or unnecessary expenses for my company on their data science cloud computing platform given that none of my tasks utilize GPUs? Please advise, thank you.

CodePudding user response:

Jupyter Notebook can be installed in the cloud, but also on prem and on your workstation. You pay either resource in the cloud, on prem, or your worstation.

Of course, if you add large disk, GPUs, CPUs, memory, it costs more! The problem isn't the cost, it is more where do you want to run your notebook?


I think, there is a bad alternative. With Colab you have free Jupyter Notebook instance. But, AFAIK, it's not private, it's public instances and if you work for your company, you can have data leakage. (Not sure, to validate, but it's not a recommended solution in any case)

  • Related