Modules#

If your Python program gets longer, you may want to split it into several files for easier maintenance. To support this, Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module.

Run the cell below to create a file named fibo.py with several functions inside:

%%file fibo.py
""" Simple module with
    two functions to compute Fibonacci series """

def fib1(n):
    """ write Fibonacci series up to n """
    a, b = 0, 1
    while b < n:
        print(b, end=', ')
        a, b = b, a+b

def fib2(n):
    """ return Fibonacci series up to n """
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a+b
    return result

if __name__ == "__main__":
    import sys
    fib1(int(sys.argv[1]))
Overwriting fibo.py

You can use the functions fib1 and fib2 by importing fibo, which is the name of the file without the .py extension.

import fibo
print(fibo.__name__)
print(fibo.__file__)
fibo.fib1(1000)
fibo
/Users/navaro/PycharmProjects/python-notebooks/fibo.py
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 
%run fibo.py 1000
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 
help(fibo)
Help on module fibo:

NAME
    fibo

DESCRIPTION
    Simple module with
    two functions to compute Fibonacci series

FUNCTIONS
    fib1(n)
        write Fibonacci series up to n
    
    fib2(n)
        return Fibonacci series up to n

FILE
    /Users/navaro/PycharmProjects/python-notebooks/fibo.py

Executing modules as scripts#

When you run a Python module with

$ python fibo.py <arguments>

the code in the module will be executed, just as if you imported it, but with __name__ set to "__main__". The following code, placed at the end of fibo.py, is therefore executed only in this case and not when the module is imported.

if __name__ == "__main__":
    import sys
    fib1(int(sys.argv[1]))
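
As a minimal sketch of this behaviour, the snippet below writes a throwaway module and loads it twice: once with a normal import, and once as a script through the standard runpy module. The module name name_demo and its contents are hypothetical and are not part of fibo.py.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical one-line module: it records how it was loaded.
SOURCE = 'ROLE = "script" if __name__ == "__main__" else "module"\n'

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "name_demo.py"
    path.write_text(SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Regular import: __name__ is the module name.
        imported = importlib.import_module("name_demo")
        name_when_imported = imported.__name__
        role_when_imported = imported.ROLE

        # Script-style execution: __name__ is "__main__".
        script_globals = runpy.run_path(str(path), run_name="__main__")
        role_when_run = script_globals["ROLE"]
    finally:
        # Leave the interpreter state as we found it.
        sys.path.remove(tmp)
        sys.modules.pop("name_demo", None)

print(name_when_imported, role_when_imported)  # name_demo module
print(role_when_run)                           # script
```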

In a Jupyter notebook, you can run the fibo.py script with the %run magic command.

%run fibo.py 1000
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 

%run also loads the names defined in the script into the interactive namespace, so fib1 can then be called directly:

fib1(1000)
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 

Different ways to import a module#

import fibo
import fibo as f
from fibo import fib1, fib2
from fibo import *
  • The last form, with ‘*’, imports all names except those beginning with an underscore (_). In most cases, do not use this facility, since it introduces an unknown set of names into the interpreter, possibly hiding some things you have already defined.

  • If a function with the same name exists in several imported modules, the one imported last replaces the previous ones.

from numpy import sqrt
from scipy import sqrt
sqrt(-1)
<ipython-input-7-f3f47bc91153>:3: DeprecationWarning: scipy.sqrt is deprecated and will be removed in SciPy 2.0.0, use numpy.lib.scimath.sqrt instead
  sqrt(-1)
1j
from scipy import sqrt
from numpy import sqrt
sqrt(-1)
<ipython-input-8-8a25f477b688>:3: RuntimeWarning: invalid value encountered in sqrt
  sqrt(-1)
nan
import numpy as np
import scipy as sp

print(np.sqrt(-1+0j), sp.sqrt(-1))
1j 1j
<ipython-input-9-235de4d5ffbb>:4: DeprecationWarning: scipy.sqrt is deprecated and will be removed in SciPy 2.0.0, use numpy.lib.scimath.sqrt instead
  print(np.sqrt(-1+0j), sp.sqrt(-1))
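
The same effect can be reproduced with the standard library alone, which may be useful where scipy.sqrt is no longer available. This is only a sketch: it swaps in math.sqrt and cmath.sqrt for the numpy and scipy functions used above, so the failure mode differs (an exception instead of nan).

```python
import cmath
import math

from math import sqrt
from cmath import sqrt  # rebinds the name sqrt to cmath.sqrt

complex_root = sqrt(-1)
print(complex_root)  # 1j

from cmath import sqrt
from math import sqrt  # now sqrt is math.sqrt again

try:
    sqrt(-1)
    outcome = "no error"
except ValueError:
    # math.sqrt rejects negative arguments
    outcome = "ValueError"
print(outcome)  # ValueError

# With qualified names there is no ambiguity about which function runs.
print(cmath.sqrt(-1), math.sqrt(4))  # 1j 2.0
```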
  • For efficiency reasons, each module is imported only once per interpreter session. Therefore, if you change a module, you must restart the interpreter. If you really want to keep testing interactively, for example after a long run, reload the module explicitly:

import importlib
importlib.reload(modulename)
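
A small sketch of the difference between a repeated import and a reload, using a temporary module. The module name reload_demo and its VALUE variable are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    module_file = Path(tmp) / "reload_demo.py"
    module_file.write_text("VALUE = 1\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import reload_demo
        first = reload_demo.VALUE

        # Edit the module on disk. The new source has a different length,
        # so a cached bytecode file cannot be mistaken for it.
        module_file.write_text("VALUE = 'changed'\n")

        # A second import statement does nothing: the module is cached.
        import reload_demo
        after_second_import = reload_demo.VALUE

        # reload re-executes the source in the existing module object.
        importlib.reload(reload_demo)
        after_reload = reload_demo.VALUE
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo", None)

print(first, after_second_import, after_reload)  # 1 1 changed
```

Note that names obtained earlier with from reload_demo import VALUE would keep pointing at the old object; only attribute access through the module sees the reloaded value.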

The Module Search Path#

When a module is imported, the interpreter searches for a file named module.py in a list of directories given by the variable sys.path.

  • A Python program can modify sys.path at run time.

  • You can also set the PYTHONPATH environment variable to add directories to it on your system.

import sys
sys.path
['/Users/navaro/PycharmProjects/python-notebooks',
 '/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python38.zip',
 '/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8',
 '/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/lib-dynload',
 '',
 '/usr/local/lib/python3.8/site-packages',
 '/usr/local/lib/python3.8/site-packages/IPython/extensions',
 '/Users/navaro/.ipython']
import collections
collections.__path__
['/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/collections']

sys.path is a list and you can append some directories:

sys.path.append("/Users/navaro/python-notebooks/")
print(sys.path)
['/Users/navaro/PycharmProjects/python-notebooks', '/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python38.zip', '/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8', '/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/lib-dynload', '', '/usr/local/lib/python3.8/site-packages', '/usr/local/lib/python3.8/site-packages/IPython/extensions', '/Users/navaro/.ipython', '/Users/navaro/python-notebooks/']
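
To see that appending to sys.path really changes what can be imported, here is a hedged sketch with a temporary directory instead of a personal path. The module name path_demo and its GREETING variable are hypothetical.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "path_demo.py").write_text("GREETING = 'hello'\n")

    # The directory is not on sys.path yet, so the module cannot be found.
    found_before = importlib.util.find_spec("path_demo") is not None

    sys.path.append(tmp)
    importlib.invalidate_caches()
    try:
        found_after = importlib.util.find_spec("path_demo") is not None
        import path_demo
        greeting = path_demo.GREETING
    finally:
        # Undo the change so later imports are not affected.
        sys.path.remove(tmp)
        sys.modules.pop("path_demo", None)

print(found_before, found_after, greeting)  # False True hello
```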

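When you import a module foo, CPython checks each directory of sys.path in turn and, within a directory, looks for the following in this order:

  • foo/__init__.py (a package directory)

  • an extension module such as foo.so (foo.pyd on Windows)

  • foo.py

  • foo.pyc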
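
The file suffixes that the running interpreter recognizes can be inspected through importlib.machinery. The values are platform dependent, so this sketch prints them rather than assuming a particular result.

```python
import importlib.machinery as machinery

# Compiled extension modules (platform and ABI specific).
extension_suffixes = machinery.EXTENSION_SUFFIXES
# Python source files.
source_suffixes = machinery.SOURCE_SUFFIXES
# Cached bytecode files.
bytecode_suffixes = machinery.BYTECODE_SUFFIXES

print("extension:", extension_suffixes)
print("source:   ", source_suffixes)
print("bytecode: ", bytecode_suffixes)
```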

Packages#

  • A package is a directory containing Python module files.

  • A regular package directory contains a file named __init__.py, which is executed when the package is imported.

sklearn
├── base.py
├── calibration.py
├── cluster
│   ├── __init__.py
│   ├── _kmeans.py
│   ├── _mean_shift.py
├── ensemble
│   ├── __init__.py
│   ├── _bagging.py
│   ├── _forest.py

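The file cluster/__init__.py imports selected names from the private submodules, so that they can be used directly from the cluster package: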

from ._mean_shift import mean_shift, MeanShift
from ._kmeans import k_means, KMeans, MiniBatchKMeans
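
The same pattern can be tried on a tiny throwaway package. In this sketch the package name shapes_demo, the private module _square and the function area are all hypothetical; only the layout mirrors the one shown above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "shapes_demo"
    pkg.mkdir()
    # Private implementation module.
    (pkg / "_square.py").write_text("def area(side):\n    return side * side\n")
    # __init__.py re-exports the public name at package level.
    (pkg / "__init__.py").write_text("from ._square import area\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import shapes_demo
        result = shapes_demo.area(3)
        defined_in = shapes_demo.area.__module__
    finally:
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n.split(".")[0] == "shapes_demo"]:
            del sys.modules[name]

print(result)      # 9
print(defined_in)  # shapes_demo._square
```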

Relative imports#

These imports use leading dots to indicate the current and parent packages involved in the relative import. In a module of the cluster package, such as _kmeans.py, you can use:

from . import _mean_shift      # import a module from the same package
from .. import base            # import a module from the parent package
from ..ensemble import _forest # import a module from a sibling package
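
Relative imports only work inside a package, so a runnable sketch has to build one. The package below is laid out like the tree above, but its name relpkg_demo and the contents of every file are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# File layout of the throwaway package (relative path -> source code).
FILES = {
    "__init__.py": "",
    "base.py": "NAME = 'base'\n",
    "cluster/__init__.py": "",
    "cluster/_mean_shift.py": "NAME = 'mean_shift'\n",
    "cluster/_kmeans.py": (
        "from . import _mean_shift\n"       # same package
        "from .. import base\n"             # parent package
        "from ..ensemble import _forest\n"  # sibling package
        "SEEN = [_mean_shift.NAME, base.NAME, _forest.NAME]\n"
    ),
    "ensemble/__init__.py": "",
    "ensemble/_forest.py": "NAME = 'forest'\n",
}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "relpkg_demo"
    for relative, source in FILES.items():
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        kmeans = importlib.import_module("relpkg_demo.cluster._kmeans")
        seen = kmeans.SEEN
    finally:
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n.split(".")[0] == "relpkg_demo"]:
            del sys.modules[name]

print(seen)  # ['mean_shift', 'base', 'forest']
```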

Reminder#

Don’t forget that importing * is not recommended: the imported names can silently replace built-ins such as sum.

sum(range(5),-1)
9
from numpy import *
sum(range(5),-1)
10
del sum # delete imported sum function from numpy 
help(sum)
Help on built-in function sum in module builtins:

sum(iterable, /, start=0)
    Return the sum of a 'start' value (default: 0) plus an iterable of numbers
    
    When the iterable is empty, return the start value.
    This function is intended specifically for use with numeric values and may
    reject non-numeric types.
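
numpy is not the only module with this problem. As a sketch using only the standard library, the star import of os replaces the built-in open with the low-level os.open. The import is executed in a separate dictionary (a choice made for this illustration) so that the current namespace is left untouched.

```python
import builtins
import os

# Run the star import in an isolated namespace instead of the current one.
namespace = {}
exec("from os import *", namespace)

# The name open now refers to os.open, not to the built-in function.
open_is_os_open = namespace["open"] is os.open
open_is_builtin = namespace["open"] is builtins.open

print(open_is_os_open, open_is_builtin)  # True False
```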
import numpy as np
help(np.sum)
Help on function sum in module numpy:

sum(a, axis=None, dtype=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
    Sum of array elements over a given axis.
    
    Parameters
    ----------
    a : array_like
        Elements to sum.
    axis : None or int or tuple of ints, optional
        Axis or axes along which a sum is performed.  The default,
        axis=None, will sum all of the elements of the input array.  If
        axis is negative it counts from the last to the first axis.
    
        .. versionadded:: 1.7.0
    
        If axis is a tuple of ints, a sum is performed on all of the axes
        specified in the tuple instead of a single axis or all the axes as
        before.
    dtype : dtype, optional
        The type of the returned array and of the accumulator in which the
        elements are summed.  The dtype of `a` is used by default unless `a`
        has an integer dtype of less precision than the default platform
        integer.  In that case, if `a` is signed then the platform integer
        is used while if `a` is unsigned then an unsigned integer of the
        same precision as the platform integer is used.
    out : ndarray, optional
        Alternative output array in which to place the result. It must have
        the same shape as the expected output, but the type of the output
        values will be cast if necessary.
    keepdims : bool, optional
        If this is set to True, the axes which are reduced are left
        in the result as dimensions with size one. With this option,
        the result will broadcast correctly against the input array.
    
        If the default value is passed, then `keepdims` will not be
        passed through to the `sum` method of sub-classes of
        `ndarray`, however any non-default value will be.  If the
        sub-class' method does not implement `keepdims` any
        exceptions will be raised.
    initial : scalar, optional
        Starting value for the sum. See `~numpy.ufunc.reduce` for details.
    
        .. versionadded:: 1.15.0
    
    where : array_like of bool, optional
        Elements to include in the sum. See `~numpy.ufunc.reduce` for details.
    
        .. versionadded:: 1.17.0
    
    Returns
    -------
    sum_along_axis : ndarray
        An array with the same shape as `a`, with the specified
        axis removed.   If `a` is a 0-d array, or if `axis` is None, a scalar
        is returned.  If an output array is specified, a reference to
        `out` is returned.
    
    See Also
    --------
    ndarray.sum : Equivalent method.
    
    add.reduce : Equivalent functionality of `add`.
    
    cumsum : Cumulative sum of array elements.
    
    trapz : Integration of array values using the composite trapezoidal rule.
    
    mean, average
    
    Notes
    -----
    Arithmetic is modular when using integer types, and no error is
    raised on overflow.
    
    The sum of an empty array is the neutral element 0:
    
    >>> np.sum([])
    0.0
    
    For floating point numbers the numerical precision of sum (and
    ``np.add.reduce``) is in general limited by directly adding each number
    individually to the result causing rounding errors in every step.
    However, often numpy will use a  numerically better approach (partial
    pairwise summation) leading to improved precision in many use-cases.
    This improved precision is always provided when no ``axis`` is given.
    When ``axis`` is given, it will depend on which axis is summed.
    Technically, to provide the best speed possible, the improved precision
    is only used when the summation is along the fast axis in memory.
    Note that the exact precision may vary depending on other parameters.
    In contrast to NumPy, Python's ``math.fsum`` function uses a slower but
    more precise approach to summation.
    Especially when summing a large number of lower precision floating point
    numbers, such as ``float32``, numerical errors can become significant.
    In such cases it can be advisable to use `dtype="float64"` to use a higher
    precision for the output.
    
    Examples
    --------
    >>> np.sum([0.5, 1.5])
    2.0
    >>> np.sum([0.5, 0.7, 0.2, 1.5], dtype=np.int32)
    1
    >>> np.sum([[0, 1], [0, 5]])
    6
    >>> np.sum([[0, 1], [0, 5]], axis=0)
    array([0, 6])
    >>> np.sum([[0, 1], [0, 5]], axis=1)
    array([1, 5])
    >>> np.sum([[0, 1], [np.nan, 5]], where=[False, True], axis=1)
    array([1., 5.])
    
    If the accumulator is too small, overflow occurs:
    
    >>> np.ones(128, dtype=np.int8).sum(dtype=np.int8)
    -128
    
    You can also start the sum with a value other than zero:
    
    >>> np.sum([10], initial=5)
    15