What is the difference between these two imports in python?

Time:01-29

I assume the operation is the same, but why are there two imports of the same class? Are there specific situations in which to use the first syntax rather than the second? The current documentation (v2.1.x) shows the second way.

from itsdangerous import URLSafeTimedSerializer

from itsdangerous.url_safe import URLSafeTimedSerializer

CodePudding user response:

In the general case, the two are distinct; one imports a symbol from the parent package, and the other from the child package.

In practice, itsdangerous takes care to provide the same symbol via the parent package for convenience, so in this case, the two are equivalent.

More generally, you would expect one of them to raise an ImportError for any package where this convenience mechanism is not present.

In pathological cases, it would be possible for the parent and the child to have classes or functions with the same name, but completely different contents.
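The same behaviour can be checked without assuming that itsdangerous is installed. A minimal sketch, using the standard-library unittest package as a stand-in: unittest re-exports TestCase from unittest.case, but it does not re-export MagicMock from unittest.mock, so only the child import works for the latter. The variable names are chosen for illustration only.

```python
# Stand-in for itsdangerous: the standard-library unittest package.
from unittest import TestCase as parent_testcase
from unittest.case import TestCase as child_testcase

# The parent package re-exports the class, so both names refer to one object.
same_class = parent_testcase is child_testcase

# MagicMock is defined in unittest.mock and is not re-exported by the parent.
from unittest.mock import MagicMock

try:
    from unittest import MagicMock as parent_magicmock
    parent_import_ok = True
except ImportError:
    # No convenience alias exists, so the shorter import fails.
    parent_import_ok = False

print(same_class, parent_import_ok)
```

Here same_class should be True and parent_import_ok should be False, which matches the "one of them fails" case described above.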

CodePudding user response:

from itsdangerous import URLSafeTimedSerializer

Using the above import means you are importing from the top-level 'itsdangerous' package in your Python project.

from itsdangerous.url_safe import URLSafeTimedSerializer

Using the above import, by contrast, means you are importing from the "url_safe" module within the 'itsdangerous' library.

Since you are importing only one class, URLSafeTimedSerializer, it makes no difference which import statement you use, because the class is defined in the 'url_safe' module and the package makes it available at the top level as well. The second form does not save the interpreter any work: Python loads the 'itsdangerous' package first in both cases.

It does, however, help the reader understand which module contains the class.

CodePudding user response:

Summary

In this specific case, the itsdangerous library implements an alias, so that these two import lines do the same thing. The alias from itsdangerous import URLSafeTimedSerializer is meant for convenience; the class is actually defined in the itsdangerous.url_safe module.

Many real-world libraries use this technique so that users can choose whether to write the shorter line or to be explicit about the package structure. But by using the from ... import syntax, the class will be called URLSafeTimedSerializer (without any prefix) in the code anyway.

Some other real-world libraries use this technique with "internal" modules, which have names prefixed with _. The idea is that the user is not intended to import those modules (or sub-packages) directly, but their contents are still available directly from the package. Instead of writing one large module, making this sort of package allows for splitting up the implementation across multiple files.

In general, from X import Z means to take Z from X and use it. This can only work if X actually has Z in it. from X.Y import Z means to take Z from X.Y and use it. This can only work if X.Y has Z in it. Even if both sources contain a Z, it isn't necessarily the same Z. However, a library author can arrange so that X directly contains the same Z that was defined inside X.Y.
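To illustrate the "not necessarily the same Z" case, here is a minimal sketch that builds a throwaway package in a temporary directory. The names demo_x, y, Z and origin are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: demo_x/__init__.py and demo_x/y.py.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_x"
pkg.mkdir()
# The parent and the child each define their own, unrelated Z.
(pkg / "__init__.py").write_text("class Z:\n    origin = 'parent'\n")
(pkg / "y.py").write_text("class Z:\n    origin = 'child'\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_x import Z as parent_z
from demo_x.y import Z as child_z

print(parent_z is child_z, parent_z.origin, child_z.origin)
```

Both imports succeed, but they bind two different classes. If demo_x/__init__.py instead contained from .y import Z, the two names would refer to the same object.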

How from ... import works

from X import Y can work in three ways:

  1. X is a package, and Y is a module. The package will be loaded if needed, then the module is loaded if needed. Then the module is assigned to Y in your code.

  2. X is a package, and Y is a class. The package will be loaded if needed. Assuming there is no error, Y is already an attribute of X; that will be looked up, and assigned to Y in your code.

  3. X is a module, and Y is a class. If X is inside a package (this depends on the syntax used for X, not on the folder structure), that package (and any parent packages) will be loaded if needed. Assuming there is no error, the Y class is found inside the X module, and is assigned to the name Y in your code.

The above is a bit imprecise because, from Python's point of view, a package is a kind of module - so everything above should say "non-package module" rather than just "module".
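The three cases can be seen side by side with the standard-library logging package, used here purely as a convenient example. This is a sketch; the is_package helper is an illustrative name, relying on the fact that packages are module objects with a __path__ attribute.

```python
import inspect
import logging

from logging import handlers                      # case 1: package -> module
from logging import Handler                       # case 2: package -> class
from logging.handlers import RotatingFileHandler  # case 3: module -> class


def is_package(module):
    # A package is a module object that additionally carries __path__.
    return hasattr(module, "__path__")


# Both are module objects, but only logging is a package.
print(inspect.ismodule(logging), is_package(logging))
print(inspect.ismodule(handlers), is_package(handlers))
# The other two names are bound to classes, not modules.
print(inspect.isclass(Handler), inspect.isclass(RotatingFileHandler))
```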

Loading a package does not necessarily load any modules (including sub-packages) that it contains, but the package's __init__.py (if present) can explicitly import these things to load them. Loading a module that is part of a package does necessarily attach it as an attribute of its package. (It also necessarily loads the package; otherwise, there would be nothing to attach to.)

Everything that is loaded is cached by name; trying to load it again with the same name will give the cached module object back.
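A small experiment can make these loading rules concrete. The sketch below creates a throwaway package with an empty __init__.py; the names demo_lazy, child and VALUE are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package whose __init__.py imports nothing.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_lazy"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "child.py").write_text("VALUE = 1\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_lazy

# Loading the package alone did not load the contained module.
attached_before = hasattr(demo_lazy, "child")

import demo_lazy.child

# Loading the submodule attached it to the package as an attribute.
attached_after = hasattr(demo_lazy, "child")

# A repeated import returns the cached module object from sys.modules.
again = importlib.import_module("demo_lazy.child")
is_cached = again is sys.modules["demo_lazy.child"]

print(attached_before, attached_after, is_cached)
```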

How do classes become part of packages and other modules?

Notice that only packages and modules are "loaded" (i.e. imported), not classes. A module object is something that represents all the global variables in a module file's source code, after all its top-level code has run.

For ordinary modules, this is straightforward. For packages, the "top-level code" may be contained in a special file named __init__.py.

How can the top-level package alias a class defined in one of its modules?

Simple: it just explicitly imports the class from the module, using the same from ... import syntax. Remember, the imports are cached, so this doesn't cause a conflict or waste time; and it assigns the class name as a global variable within the package's code - which means that, when the package is loaded, it will be an attribute of the package.

Again, loading a package doesn't automatically load its contained modules; but loading them explicitly (using __init__.py) allows the package to alias the contents of its modules after loading them.

We can see this in the source code:

from .url_safe import URLSafeTimedSerializer as URLSafeTimedSerializer

(At runtime, the use of as here is redundant, since the class is not actually renamed; the X as X form is a convention that type checkers treat as an explicit re-export. Sometimes these aliases do rename something in order to avoid a naming conflict.)

Following along: when the itsdangerous package (which, being a package, is a module object) is loaded, it will explicitly load its contained url_safe module. It takes the URLSafeTimedSerializer attribute from url_safe (which is also a module), binds it to the same name, URLSafeTimedSerializer, and then that is a global variable within the code of itsdangerous/__init__.py. Because it's a global there, when the itsdangerous object is created (and stored in the module cache), it will have a URLSafeTimedSerializer attribute, which is the class. That, in turn, allows the user's code to write from itsdangerous import URLSafeTimedSerializer, even though URLSafeTimedSerializer is not defined there.
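The mechanism can be reproduced in miniature. This is a sketch that mimics the layout described above, not the actual itsdangerous source; the package name demo_reexport and the class name Serializer are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package laid out like the description above.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_reexport"
pkg.mkdir()
(pkg / "url_safe.py").write_text("class Serializer:\n    pass\n")
# The package's top-level code aliases the class defined in its module.
(pkg / "__init__.py").write_text(
    "from .url_safe import Serializer as Serializer\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_reexport import Serializer as via_package

# Importing the package already loaded the submodule, because
# __init__.py imported from it explicitly.
submodule_loaded = "demo_reexport.url_safe" in sys.modules

from demo_reexport.url_safe import Serializer as via_module

print(submodule_loaded, via_package is via_module, via_package.__module__)
```

The class reports demo_reexport.url_safe as its __module__ regardless of which import line was used, because that is where the class statement actually ran.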

CodePudding user response:

In both instances, you are importing the same class URLSafeTimedSerializer defined in itsdangerous.url_safe.

The first one: from itsdangerous import URLSafeTimedSerializer works the same as the second one: from itsdangerous.url_safe import URLSafeTimedSerializer because the itsdangerous package makes the class available under the same name, and there are no other artefacts with conflicting names within the itsdangerous module.

I would also like to state that thinking the second import doesn't load the complete itsdangerous is not technically correct. In both cases the whole of itsdangerous gets loaded into sys.modules, and in both instances the name URLSafeTimedSerializer gets bound to the same class object, sys.modules['itsdangerous.url_safe'].URLSafeTimedSerializer. Check out this answer for more information on this front. Performance-wise they are also similar, since the itsdangerous module gets loaded in both cases.
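This is easy to verify with any nested package. A minimal sketch using the standard-library xml.etree.ElementTree module as a stand-in for itsdangerous.url_safe:

```python
import sys

# Import a name from a deeply nested module only.
from xml.etree.ElementTree import Element

# Every parent package was loaded along the way and is now cached.
parents_loaded = all(
    name in sys.modules
    for name in ("xml", "xml.etree", "xml.etree.ElementTree")
)

# The local name is bound to the attribute of the defining module.
same_object = Element is sys.modules["xml.etree.ElementTree"].Element

print(parents_loaded, same_object)
```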

One advantage of the second import over the first is that it helps with readability. If someone wants to look into the definition of URLSafeTimedSerializer (without access to an IDE tool that automatically finds references), they can do so easily, knowing that they have to look in url_safe.
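Even when code uses the shorter import, the defining module can be recovered at runtime. A minimal sketch, using json.JSONDecoder from the standard library as a stand-in for a re-exported class:

```python
import inspect

# Imported via the parent package, which re-exports the class.
from json import JSONDecoder

# __module__ records where the class statement was executed.
defining_module = JSONDecoder.__module__

# inspect can usually point at the source file as well; this may be
# None in environments that ship without source files.
source_file = inspect.getsourcefile(JSONDecoder)

print(defining_module)
print(source_file)
```

The first line printed should be json.decoder, which tells the reader where to look even though the import line mentions only json.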

Another advantage is added resilience to your code. If for some reason some newer version of itsdangerous has some other definition of URLSafeTimedSerializer outside of url_safe (which is honestly bad coding practice, but hey, it is entirely possible :) ), and your package manager installs this newer version of the module, then from itsdangerous import URLSafeTimedSerializer will start running into problems.
