CodePudding user response:
Azure Synapse Analytics supports two development models:
- Synapse live development: The user develops/debugs code in Synapse Studio and then publishes it to save/execute it. Synapse Studio authors directly against the Synapse service.
- Git-enabled development: The user develops/debugs code in Synapse Studio and commits changes to a working branch of a Git repository. The Synapse workspace can be connected to either an Azure DevOps or a GitHub Git repository.
Links:
- https://docs.microsoft.com/en-us/azure/synapse-analytics/cicd/source-control
- https://docs.microsoft.com/en-us/azure/synapse-analytics/security/synapse-workspace-access-control-overview#develop-and-execute-code-in-azure-synapse
CodePudding user response:
I find this to be one of the more confusing topics in Synapse. It also applies to Azure Data Factory (ADF). The short answer to your question is that it gets Published to the Live Synapse service. The longer version is below.
Azure Synapse has two modes: Synapse Live and (optionally) Git connected.
Live mode
Live mode is the "production" version. It contains all the artifacts (scripts, notebooks, pipelines, and others) that can be accessed by your users (assuming proper security, etc.). It is also what surfaces artifacts that can be executed externally, like Pipelines. When you execute a pipeline externally (say from a Logic App), it is the Live version that executes. [Again, the same is true in ADF.]
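To make "executed externally" a bit more concrete, here is a minimal sketch of how an outside caller might build the request URL that starts a pipeline run. It assumes the Synapse data-plane REST endpoint `https://<workspace>.dev.azuresynapse.net` and the `createRun` operation with api-version `2020-12-01`; check both against the current REST reference. The workspace and pipeline names are made up, and the Azure AD bearer token that the actual POST needs is not shown.

```python
from urllib.parse import quote

# Assumption: api-version taken from the Synapse REST reference; verify before use.
API_VERSION = "2020-12-01"


def create_run_url(workspace_name: str, pipeline_name: str) -> str:
    """Build the URL an external caller would POST to in order to start a run.

    The run always executes the published (Live) definition of the pipeline,
    never an unpublished copy sitting in a Git branch.
    """
    endpoint = f"https://{workspace_name}.dev.azuresynapse.net"
    # Percent-encode the pipeline name so spaces and slashes are URL-safe.
    encoded_name = quote(pipeline_name, safe="")
    return f"{endpoint}/pipelines/{encoded_name}/createRun?api-version={API_VERSION}"


if __name__ == "__main__":
    # Hypothetical workspace and pipeline names, for illustration only.
    print(create_run_url("myworkspace", "Nightly Load"))
```

Note that the URL has no branch parameter: an external trigger like this can only reach what has been Published.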
Whether you work in the workspace directly (as your image implies) or in Git branches (more on this below), you can think of those as "development" versions. "Publish" promotes the artifacts from development to production.
In Live mode, the ONLY way to save the artifacts is to Publish, so in a way you are working directly in Production: your saved version is ALWAYS the Published version. For any real work involving teams, this can be troublesome. It is highly recommended that you connect your Workspace to a Git repository.
Git mode
When your workspace is connected to Git, you work in a branch. By default, this will most likely be the "main" branch. The main branch is your trunk (Synapse calls it the collaboration branch), and you can only Publish from it. But you can work for a very long time in main without ever publishing, so it really becomes a true development environment.
In Git mode, you Commit (save) your artifact changes to your Git branch. At some point in the future, when you are ready to move the artifacts to production, you Publish main. Publishing in this case also updates a separate branch in Git, named "workspace_publish" by default in Synapse (the ADF equivalent is "adf_publish"). This is a branch you should basically never touch or try to work in directly, as it holds the deployment templates that Synapse generates on Publish. [It is a personal wish list item for me to be able to auto-publish whenever main gets merged.]
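If the Commit vs. Publish distinction is still fuzzy, a toy model may help. This is purely a conceptual sketch in Python, not a Synapse API: the class, its method names, and the artifact names are all invented for illustration. It only captures the two rules described above, namely that Commit changes a branch without touching Live, and that Publish copies the trunk (the branch you publish from) into Live.

```python
class ToyWorkspace:
    """Conceptual model only; none of these names exist in Synapse."""

    def __init__(self, collaboration_branch: str = "main"):
        self.collaboration_branch = collaboration_branch
        # Each branch maps artifact name -> content.
        self.branches = {collaboration_branch: {}}
        # What external callers (and Live-mode users) actually see.
        self.live = {}

    def commit(self, branch: str, name: str, content: str) -> None:
        """Save an artifact to a Git branch. Live is not affected."""
        # Simplification: an unknown branch starts empty instead of
        # being forked from an existing one.
        self.branches.setdefault(branch, {})[name] = content

    def publish(self, branch: str) -> None:
        """Promote the trunk to Live; other branches are rejected."""
        if branch != self.collaboration_branch:
            raise ValueError("can only publish from the collaboration branch")
        self.live = dict(self.branches[branch])


ws = ToyWorkspace()
ws.commit("main", "pipeline1", "v1")
print(ws.live)   # {} : committed, but nothing published yet
ws.publish("main")
ws.commit("main", "pipeline1", "v2")
print(ws.live)   # {'pipeline1': 'v1'} : Live still has the published version
```

The second print is the point: after another Commit, main has moved on to "v2" while Live keeps serving "v1" until the next Publish.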
Some Git advice: if you have a Team of people (meaning more than one) working in the repository, you should set up your Git repository to block direct commits to main. [In fact, even if it IS just you, I would do it this way regardless.] Individuals should always work from a different branch and use Pull Requests to merge code back into main. I can tell you from experience that multiple people working directly in main makes it possible to screw up your repo to the point it won't Publish, which is no fun to correct.
Back to Live mode
Even when you are Git connected, Live mode is still present. You can always switch back to it from the drop-down. When you do, it is like a protected mode, because while you can write and execute scripts and notebooks, you can't save them to the Workspace. You can also have users who only operate in Live mode, so they are consumers but not creators. When in Live mode, you will not be able to see or interact with the Git repo or branches. When you are ready to edit again, you can use the drop-down to easily go back to Git mode.