Modules #
If you quit from the Python interpreter and enter it again, the definitions you have made (functions and variables) are lost.
Therefore, if you want to write a somewhat longer program, you are better off using a text editor to prepare the input for the interpreter and running it with that file as input instead.
This is known as creating a script.
WHAT is this? #
My version: #
A module in Python is a file that contains Python code, including functions, classes, and variables. It helps you organize your code into separate files.
WHY is this important? #
My version: #
- Organize code: They keep your code structured and easy to read;
- Promote reusability: You can use the same module in different programs, saving time; and
- Avoid conflicts: Modules create separate spaces for your code, preventing naming issues.
WHY should I learn this? #
My version: #
- Build bigger programs: They help you manage larger projects more easily;
- Save time: You can reuse code instead of writing it from scratch; and
- Use libraries: Many useful tools in Python are available as modules.
WHEN will I need this? #
My version: #
- Working on large projects: Organizing your code into modules makes it easier to handle;
- Collaborating with others: Modules help keep code organized when working in teams; and
- Creating reusable code: If you want to share your code or use it in different projects, modules are essential.
HOW does this work? #
A module is a file containing Python definitions and statements.
The file name is the module name with the suffix .py appended.
Within a module, the module’s name (as a string) is available as the value of the global variable __name__.
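As a quick illustration of __name__, the sketch below uses the standard-library json module as a stand-in; any imported module should behave the same way.

```python
import json

# Every imported module carries its own name as a string.
module_name = json.__name__
print(module_name)  # json

# The file being run directly (or an interactive session) is named "__main__";
# inside an imported module this would print that module's name instead.
print(__name__)
```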
For instance, use your favorite text editor to create a file called fibo.py in the current directory with the following contents:
def fib(n):  # write Fibonacci series up to n
    a, b = 0, 1
    while a < n:
        print(a, end=" ")
        a, b = b, a + b
    print()

def fib2(n):  # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a + b
    return result
Now enter the Python interpreter and import this module with the following command:
>>> import fibo
>>> fibo.fib(1000)
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
>>> fibo.fib2(100)
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
>>> fib2 = fibo.fib2
>>> fib2(500)
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
more on modules #
A module can contain executable statements as well as function definitions.
These statements are intended to initialize the module.
They are executed only the first time the module name is encountered in an import statement.
Each module has its own private namespace, which is used as the global namespace by all functions defined in the module.
Thus, the author of a module can use global variables in the module without worrying about accidental clashes with a user’s global variables.
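The two points above (initialization runs once, and each module has a private global namespace) can be checked with a small experiment. This is only a sketch: the module name demo_counter and its contents are made up for the illustration, and the file is written to a temporary directory so nothing in your project is touched.

```python
import sys
import tempfile
from pathlib import Path

# Write a throwaway module (hypothetical name: demo_counter) to a temp directory.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_counter.py").write_text(
    "print('initializing demo_counter')\n"
    "count = 0\n"
    "def bump():\n"
    "    global count\n"
    "    count += 1\n"
    "    return count\n"
)
sys.path.insert(0, tmp_dir)

import demo_counter  # runs the module body, so the message is printed
import demo_counter  # already loaded: the body is not executed again

count = 100          # our own global, unrelated to demo_counter.count
demo_counter.bump()  # modifies the module's global, not ours
print(count, demo_counter.count)  # 100 1
```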
There is a variant of the import statement that imports names from a module directly into the importing module’s namespace. For example:
>>> from fibo import fib, fib2
>>> fib(500)
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377
There is even a variant to import all names that a module defines:
>>> from fibo import *
>>> fib(500)
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377
This imports all names except those beginning with an underscore (_).
In most cases Python programmers do not use this facility since it introduces an unknown set of names into the interpreter, possibly hiding some things you have already defined.
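To see the underscore rule in action, here is a minimal sketch. The module demo_names is hypothetical and created on the fly; exec is used only so the imported names land in a dictionary that can be inspected, which is not something you would normally do in real code.

```python
import sys
import tempfile
from pathlib import Path

# Create a throwaway module with one public and one underscore-prefixed name.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_names.py").write_text(
    "public_value = 1\n"
    "_private_value = 2\n"
)
sys.path.insert(0, tmp_dir)

# Run the star import inside a separate namespace so we can inspect the result.
namespace = {}
exec("from demo_names import *", namespace)

print("public_value" in namespace)    # True
print("_private_value" in namespace)  # False
```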
If the module name is followed by as, then the name following as is bound directly to the imported module.
>>> import fibo as fib
>>> fib.fib(500)
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377
It can also be used when utilising from with similar effects:
>>> from fibo import fib as fibonacci
>>> fibonacci(500)
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377
executing modules as scripts #
When you run a Python module with
python fibo.py <arguments>
the code in the module will be executed, just as if you imported it, but with the __name__ set to "__main__". That means that by adding this code at the end of your module:
if __name__ == "__main__":
    import sys
    fib(int(sys.argv[1]))
you can make the file usable as a script as well as an importable module, because the code that parses the command line only runs if the module is executed as the “main” file:
python fibo.py 500
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377
If the module is imported, the code is not run:
>>> import fibo
>>>
This is often used either to provide a convenient user interface to a module, or for testing purposes (running the module as a script executes a test suite).
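The difference between importing a file and running it can also be observed from inside a single program. The sketch below assumes a hypothetical file demo_guard.py and uses the standard-library runpy module to approximate what python demo_guard.py does.

```python
import runpy
import sys
import tempfile
from pathlib import Path

# A throwaway module whose guard flips a flag only when run as the main file.
tmp_dir = tempfile.mkdtemp()
script = Path(tmp_dir, "demo_guard.py")
script.write_text(
    "ran_as_script = False\n"
    "if __name__ == '__main__':\n"
    "    ran_as_script = True\n"
)
sys.path.insert(0, tmp_dir)

import demo_guard
print(demo_guard.ran_as_script)         # False: imported, so the guard is skipped

# run_path executes the file with __name__ set to the given run_name
# and returns the resulting module globals as a dictionary.
script_globals = runpy.run_path(str(script), run_name="__main__")
print(script_globals["ran_as_script"])  # True
```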
the module search path #
When a module named fibo is imported, the interpreter first searches for a built-in module with that name.
These module names are listed in sys.builtin_module_names.
If not found, it then searches for a file named fibo.py in a list of directories given by the variable sys.path.
sys.path is initialized from these locations:
- The directory containing the input script (or the current directory when no file is specified).
- PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
- The installation-dependent default (by convention including a site-packages directory, handled by the site module).
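The search order described above can be inspected from a running interpreter. The following is a small sketch using only the standard library; the exact directories printed depend on your installation, so no output is shown for them.

```python
import importlib.util
import sys

# Built-in modules are compiled into the interpreter and are looked up first.
print("sys" in sys.builtin_module_names)  # True

# sys.path is an ordinary list of directory strings, searched in order.
for entry in sys.path:
    print(repr(entry))

# find_spec reports where a top-level module would be loaded from.
spec = importlib.util.find_spec("json")
print(spec.origin)
```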
“compiled” python files #
To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where the version encodes the format of the compiled file.
For example, in CPython release 3.3 the compiled version of fibo.py would be cached as __pycache__/fibo.cpython-33.pyc.
This naming convention allows compiled modules from different releases and different versions of Python to coexist.
Python does not check the cache in two circumstances.
- First, it always recompiles and does not store the result for the module that’s loaded directly from the command line.
- Second, it does not check the cache if there is no source module.
Some tips for experts:
- You can use the -O or -OO switches on the Python command to reduce the size of a compiled module.
  - The -O switch removes assert statements, the -OO switch removes both assert statements and __doc__ strings.
  - Since some programs may rely on having these available, you should only use this option if you know what you’re doing.
  - “Optimized” modules have an opt- tag and are usually smaller.
  - Future releases may change the effects of optimization.
- A program doesn’t run any faster when it is read from a .pyc file than when it is read from a .py file; the only thing that’s faster about .pyc files is the speed with which they are loaded.
- The module compileall can create .pyc files for all modules in a directory.
- There is more detail on this process, including a flow chart of the decisions, in [PEP 3147](https://peps.python.org/pep-3147/).
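If you are curious where the cache file for a given source file would go, the standard library can tell you. This sketch uses importlib.util.cache_from_source and py_compile; the file demo_cache.py is a made-up example, and the version tag in the result depends on the interpreter you run it with.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path

# Where would the cached bytecode for fibo.py live?
# The tag reflects the running interpreter, e.g. cpython-312 on CPython 3.12.
print(importlib.util.cache_from_source("fibo.py"))

# Compile a throwaway source file explicitly and confirm the .pyc appears.
tmp_dir = tempfile.mkdtemp()
source = Path(tmp_dir, "demo_cache.py")
source.write_text("value = 42\n")
compiled_path = py_compile.compile(str(source))
print(Path(compiled_path).exists())  # True
```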
standard modules #
Python comes with a library of standard modules.
Some modules are built into the interpreter; these provide access to operations that are not part of the core of the language but are nevertheless built in, either for efficiency or to provide access to operating system primitives such as system calls.
The set of such modules is a configuration option which also depends on the underlying platform.
the dir() function #
The built-in function dir() is used to find out which names a module defines. It returns a sorted list of strings:
>>> import fibo, sys
>>> dir(fibo)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'fib', 'fib2']
>>> dir(sys)
['__breakpointhook__', '__displayhook__', '__doc__', '__excepthook__', '__interactivehook__', '__loader__', '__name__', '__package__', '__spec__', '__stderr__', '__stdin__', '__stdout__', '__unraisablehook__', '_base_executable', '_clear_type_cache', '_current_exceptions', '_current_frames', '_debugmallocstats', '_framework', '_getframe', '_getframemodulename', '_git', '_home', '_setprofileallthreads', '_settraceallthreads', '_stdlib_dir', '_xoptions', 'abiflags', 'activate_stack_trampoline', 'addaudithook', 'api_version', 'argv', 'audit', 'base_exec_prefix', 'base_prefix', 'breakpointhook', 'builtin_module_names', 'byteorder', 'call_tracing', 'copyright', 'deactivate_stack_trampoline', 'displayhook', 'dont_write_bytecode', 'exc_info', 'excepthook', 'exception', 'exec_prefix', 'executable', 'exit', 'flags', 'float_info', 'float_repr_style', 'get_asyncgen_hooks', 'get_coroutine_origin_tracking_depth', 'get_int_max_str_digits', 'getallocatedblocks', 'getdefaultencoding', 'getdlopenflags', 'getfilesystemencodeerrors', 'getfilesystemencoding', 'getprofile', 'getrecursionlimit', 'getrefcount', 'getsizeof', 'getswitchinterval', 'gettrace', 'getunicodeinternedsize', 'hash_info', 'hexversion', 'implementation', 'int_info', 'intern', 'is_finalizing', 'is_stack_trampoline_active', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'monitoring', 'orig_argv', 'path', 'path_hooks', 'path_importer_cache', 'platform', 'platlibdir', 'prefix', 'ps1', 'ps2', 'pycache_prefix', 'set_asyncgen_hooks', 'set_coroutine_origin_tracking_depth', 'set_int_max_str_digits', 'setdlopenflags', 'setprofile', 'setrecursionlimit', 'setswitchinterval', 'settrace', 'stderr', 'stdin', 'stdlib_module_names', 'stdout', 'thread_info', 'unraisablehook', 'version', 'version_info', 'warnoptions']
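Lists like the one above are long, so in practice it often helps to filter them. A minimal sketch, using the standard-library math module as an arbitrary example:

```python
import math

# dir() returns a sorted list of attribute names as strings.
names = dir(math)

# Drop the underscore-prefixed names to see only the public ones.
public_names = [name for name in names if not name.startswith("_")]
print(public_names[:5])
print("sqrt" in public_names)  # True
```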
packages #
Packages are a way of structuring Python’s module namespace by using “dotted module names”.
For example, the module name A.B designates a submodule named B in a package named A.
Just like the use of modules saves the authors of different modules from having to worry about each other’s global variable names, the use of dotted module names saves the authors of multi-module packages like NumPy or Pillow from having to worry about each other’s module names.
A package for handling sound files might be laid out like this:
sound/                  Top-level package
    __init__.py         Initialize the sound package
    formats/            Subpackage for file format conversions
        __init__.py
        wavread.py
        wavwrite.py
        aiffread.py
        aiffwrite.py
        auread.py
        auwrite.py
        ...
    effects/            Subpackage for sound effects
        __init__.py
        echo.py
        surround.py
        reverse.py
        ...
    filters/            Subpackage for filters
        __init__.py
        equalizer.py
        vocoder.py
        karaoke.py
        ...
The __init__.py files are required to make Python treat directories containing the file as packages (unless using a namespace package).
This prevents directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path.
In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable.
Users of the package can import individual modules from the package, for example:
import sound.effects.echo
An alternative way of importing the submodule is:
from sound.effects import echo
Yet another variation is to import the desired function or variable directly:
from sound.effects.echo import echofilter
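The three import forms differ only in which name ends up bound in your namespace. The sketch below builds a tiny package in a temporary directory to show this; the package name demo_sound and the body of echofilter are invented for the illustration and are not part of any real library.

```python
import sys
import tempfile
from pathlib import Path

# Build a tiny package on disk: demo_sound/effects/echo.py (hypothetical names).
root = Path(tempfile.mkdtemp())
effects_dir = root / "demo_sound" / "effects"
effects_dir.mkdir(parents=True)
(root / "demo_sound" / "__init__.py").write_text("")
(effects_dir / "__init__.py").write_text("")
(effects_dir / "echo.py").write_text(
    "def echofilter(samples, delay=1):\n"
    "    return samples + samples[:delay]\n"
)
sys.path.insert(0, str(root))

import demo_sound.effects.echo                  # must be used with its full name
from demo_sound.effects import echo             # binds the submodule as echo
from demo_sound.effects.echo import echofilter  # binds the function directly

print(demo_sound.effects.echo.echofilter([1, 2, 3]))  # [1, 2, 3, 1]
print(echo.echofilter([1, 2, 3]))                     # [1, 2, 3, 1]
print(echofilter([1, 2, 3], delay=2))                 # [1, 2, 3, 1, 2]
```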
importing * from a package #
When a user writes from sound.effects import *, Python does not automatically search the filesystem for all submodules in the package.
Instead, it relies on the package author to define which submodules should be imported using a list named __all__ in the package’s __init__.py file.
Explicit index with __all__:
If __all__ is defined in __init__.py, it specifies the submodules to import when using from package import *.
__all__ = ["echo", "surround", "reverse"]
This means from sound.effects import * will import the echo, surround, and reverse submodules.
Name shadowing:
If a function or variable in __init__.py has the same name as a submodule, it will shadow that submodule.
def reverse(msg: str):
    return msg[::-1]
In this case, from sound.effects import * would import only the echo and surround submodules. The name reverse would refer to the locally defined reverse function, not the reverse submodule, because the function shadows it.
Behavior without __all__:
If __all__ is not defined, from sound.effects import * will not import all submodules.
It will only import names defined in the package, including any submodules that were explicitly loaded by previous import statements.
Best practices:
Using from package import specific_submodule is recommended over import * to avoid confusion and maintain clarity in the code.
Importing specific submodules helps prevent naming conflicts and makes the code easier to read and maintain.
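The effect of __all__ can be checked with a small experiment. This is a sketch under stated assumptions: the package demo_effects is hypothetical and lists only two of its three submodules, and exec is used purely to capture the star-imported names in a dictionary for inspection.

```python
import sys
import tempfile
from pathlib import Path

# A throwaway package with three submodules, only two of which are in __all__.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_effects"
pkg.mkdir()
(pkg / "__init__.py").write_text('__all__ = ["echo", "surround"]\n')
(pkg / "echo.py").write_text("NAME = 'echo'\n")
(pkg / "surround.py").write_text("NAME = 'surround'\n")
(pkg / "reverse.py").write_text("NAME = 'reverse'\n")
sys.path.insert(0, str(root))

# Run the star import in its own namespace and list what it brought in.
namespace = {}
exec("from demo_effects import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['echo', 'surround']
```

The reverse submodule exists on disk but is never loaded, because the star import consults __all__ rather than the directory contents.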
intra-package references #
When working with packages and subpackages in Python, you can use both absolute and relative imports to reference submodules.
Absolute imports:
You can refer to submodules using their full path. For example, if the sound.filters.vocoder module needs to use the echo module from the sound.effects package, it can use:
from sound.effects import echo
Relative imports:
Relative imports use leading dots to indicate the current and parent packages. For example, within the surround module, you might write:
from . import echo # Imports echo from the current package
from .. import formats # Imports formats from the parent package
from ..filters import equalizer # Imports equalizer from the sibling filters package
Main module consideration:
Relative imports depend on the name of the current module.
Since the main module is always named __main__, modules intended to be run as the main program should always use absolute imports to avoid issues.
In summary, absolute imports provide a clear path to submodules, while relative imports allow for concise references within a package structure. However, when creating a module intended to be executed as the main program, always use absolute imports.
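Relative imports are resolved against the importing module's own dotted name, which a short experiment can make visible. The package demo_audio below is a hypothetical, cut-down version of the sound layout, created in a temporary directory.

```python
import sys
import tempfile
from pathlib import Path

# Build demo_audio/ with effects/ and filters/ subpackages (hypothetical names).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_audio"
(pkg / "effects").mkdir(parents=True)
(pkg / "filters").mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "effects" / "__init__.py").write_text("")
(pkg / "filters" / "__init__.py").write_text("")
(pkg / "filters" / "equalizer.py").write_text("LABEL = 'equalizer'\n")
(pkg / "effects" / "echo.py").write_text("LABEL = 'echo'\n")
(pkg / "effects" / "surround.py").write_text(
    "from . import echo\n"                # sibling module in the same package
    "from ..filters import equalizer\n"   # module in a sibling subpackage
    "labels = [echo.LABEL, equalizer.LABEL]\n"
)
sys.path.insert(0, str(root))

from demo_audio.effects import surround
print(surround.labels)    # ['echo', 'equalizer']
print(surround.__name__)  # demo_audio.effects.surround
```

Because surround was imported as part of the package, its __name__ carries the full dotted path that the leading dots are resolved against. Running surround.py directly would set __name__ to "__main__" and the relative imports would fail.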
packages in multiple directories #
In Python, packages have a special attribute called __path__, which is a sequence of strings that contains the directory name holding the package’s __init__.py file before the code in that file is executed.
- Initialization: The __path__ attribute is initialized to the directory of the package when the package is imported.
- Modification: You can modify the __path__ attribute to change the directories that Python searches for modules and subpackages within the package.
- Use case: While not commonly needed, modifying __path__ can be useful for extending the set of modules found in a package, allowing for more flexible package structures across multiple directories.
In summary, the __path__ attribute allows for customization of module search paths within packages, enabling the inclusion of modules from different directories.
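A minimal sketch of extending __path__ follows. The names demo_pkg, extra_modules, and plugin are all hypothetical, and the directories are created in a temporary location purely for the demonstration.

```python
import sys
import tempfile
from pathlib import Path

# One directory holds the package, a second holds an extra module for it.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
extra_dir = root / "extra_modules"
pkg_dir.mkdir()
extra_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(extra_dir / "plugin.py").write_text("VALUE = 'found via __path__'\n")
sys.path.insert(0, str(root))

import demo_pkg
print(demo_pkg.__path__)     # a single entry: the package's own directory

# Add the second directory to the package's search path.
demo_pkg.__path__.append(str(extra_dir))
from demo_pkg import plugin  # located in extra_dir, not in demo_pkg/
print(plugin.VALUE)          # found via __path__
```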