I am new to Docker and confused about the concept of containerization. What is the key difference between a Docker container and a Binder project? Here are the definitions I found through a Google search:
- A Docker container is a standardized, encapsulated environment that runs applications.
- A Binder or "Binder-ready repository" is a code repository that contains both code and content to run, and configuration files for the environment needed to run it.
Can anyone elaborate on this a bit? Thanks!
CodePudding user response:
Your confusion is understandable. Docker itself is a lot to follow, and adding Binder makes it even more complex if you look behind the curtain.
One big point to be aware of is that, for a typical user, much of the purpose of MyBinder.org is to eliminate the need to learn about Docker, Dockerfile syntax, the concept of a container, and so on. The idea of the configuration files that you include to make your repository 'Binder-ready' is to make it easier to produce a container without writing a Dockerfile. You can more or less simply list the packages you need in requirements.txt or environment.yml and skip the Dockerfile while still getting those dependencies installed in the container you end up working in. environment.yml is a step above requirements.txt in complexity, as the .yml file has a syntax to it, while requirements.txt at its most basic is simply a list. The running container the user gets at the end of the launch is not readily apparent to the typical user. Typically, they go from launching a session to having the environment they specified in an active JupyterHub session on a remote computer.
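To make the difference between the two configuration files concrete, here is a minimal sketch in Python that writes one or the other into a repository folder. The package names (numpy, pandas), the environment name, the conda-forge channel, and the helper `write_binder_config` are illustrative assumptions, not anything Binder requires.

```python
from pathlib import Path

# requirements.txt at its most basic: one pip package per line.
# (numpy and pandas are placeholder packages for illustration.)
REQUIREMENTS_TXT = """\
numpy
pandas
"""

# environment.yml is a YAML file read by conda, so it has keys and nesting.
# (The name and channel below are illustrative choices.)
ENVIRONMENT_YML = """\
name: example-env
channels:
  - conda-forge
dependencies:
  - numpy
  - pandas
"""


def write_binder_config(repo_dir, use_conda=False):
    """Write one of the two config files into repo_dir and return its path.

    Hypothetical helper: it only shows what the files look like on disk.
    """
    repo_dir = Path(repo_dir)
    repo_dir.mkdir(parents=True, exist_ok=True)
    if use_conda:
        target = repo_dir / "environment.yml"
        target.write_text(ENVIRONMENT_YML)
    else:
        target = repo_dir / "requirements.txt"
        target.write_text(REQUIREMENTS_TXT)
    return target
```

Either file, committed at the top level of the repository, is enough to describe these dependencies; you would normally pick one rather than keep both.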
Binder combines quite a bit of tech, including Docker, to make things like MyBinder.org work. MyBinder.org is a public BinderHub, and a BinderHub is essentially a specialized JupyterHub running on a cluster that uses images to serve up containers to users.
You can point MyBinder.org at a repo and it will spin up a JupyterHub session with that content and an environment based on any configuration files in the repository. If there aren't any configuration files, you'll still get the content, but only with a default Python stack.
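"Pointing" MyBinder.org at a repo usually just means opening a launch URL. As a rough sketch, assuming a repository hosted on GitHub and the `v2/gh` URL layout that the MyBinder.org launch form generates, the URL could be built like this. The function name `mybinder_launch_url` is made up for illustration.

```python
from urllib.parse import quote


def mybinder_launch_url(owner, repo, ref="HEAD"):
    """Build a MyBinder.org launch URL for a GitHub repository.

    owner/repo identify the GitHub repository; ref is a branch, tag,
    or commit. Assumes the https://mybinder.org/v2/gh/<owner>/<repo>/<ref>
    layout; a slash inside ref is percent-encoded so it stays one segment.
    """
    return (
        "https://mybinder.org/v2/gh/"
        f"{quote(owner, safe='')}/{quote(repo, safe='')}/{quote(ref, safe='')}"
    )


# Example with the public binder-examples/requirements demo repository.
print(mybinder_launch_url("binder-examples", "requirements"))
```

Opening the printed URL in a browser starts the build (or reuses an already built image) and then drops you into the session.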
Binder uses repo2docker to take a repository and build a Docker image from it. You can run repo2docker yourself locally on your own machine and use the produced image to spawn a running container if you want.
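If you want to try the local route, a minimal sketch of driving repo2docker from Python might look like the following. It assumes Docker is running and that repo2docker is installed (for example with `pip install jupyter-repo2docker`); the helper names and the image name in the comment are hypothetical.

```python
import subprocess


def repo2docker_build_command(repo, image_name):
    """Return the command line that builds an image without starting it.

    --no-run tells repo2docker to stop after the build, and
    --image-name sets the name of the resulting Docker image.
    repo can be a local path or a repository URL.
    """
    return ["jupyter-repo2docker", "--no-run", "--image-name", image_name, repo]


def build_image(repo, image_name):
    """Run the build; raises CalledProcessError if repo2docker fails."""
    subprocess.run(repo2docker_build_command(repo, image_name), check=True)


# Example call (not executed here because it needs Docker and network access):
# build_image("https://github.com/binder-examples/requirements", "my-binder-image")
```

If you leave out `--no-run`, repo2docker also starts a container from the freshly built image and prints a URL you can open in your browser, which is close to what MyBinder.org does for you remotely.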
The built image specifies the environment backing the JupyterHub session you get from MyBinder.org. In fact, the session you get served from MyBinder.org is a Docker container running on a Kubernetes cluster.