modulegraph.modulegraph
— Find modules used by a script¶
This module defines ModuleGraph
, which is used to find
the dependencies of scripts using bytecode analysis.
A number of APIs in this module refer to filesystem paths. Those paths can refer to
files inside zipfiles (for example when there are zipped egg files on sys.path
).
Filenames referring to entries in a zipfile are not marked in any way. That is, if "somepath.zip"
refers to a zipfile, "somepath.zip/embedded/file"
will be used to refer to
embedded/file
inside the zipfile.
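A minimal sketch of how such an unmarked path could be split back into an archive and an entry name, assuming the archive exists on disk. The helper name split_zip_path is hypothetical and not part of modulegraph; it only illustrates the naming convention described above.

```python
import os
import zipfile


def split_zip_path(path):
    """Return (archive, entry) if a prefix of path is a zipfile, else (None, path)."""
    head = path
    parts = []
    # Walk from the full path towards the root, one component at a time.
    while head and head != os.path.dirname(head):
        if os.path.isfile(head) and zipfile.is_zipfile(head):
            # Entry names inside a zipfile always use forward slashes.
            return head, "/".join(reversed(parts))
        head, tail = os.path.split(head)
        parts.append(tail)
    return None, path
```

Calling the helper with a path such as "somepath.zip/embedded/file" yields the archive path and "embedded/file" when "somepath.zip" is a real zipfile; any other path is returned unchanged with None as the archive.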
The actual graph¶
-
class
modulegraph.modulegraph.
ModuleGraph
([path[, excludes[, replace_paths[, implies[, graph[, debug]]]]]])¶ Create a new ModuleGraph object. Use the
run_script()
method to add scripts and their dependencies to the graph.
Parameters:
- path – Python search path to use, defaults to
sys.path
- excludes – Iterable with module names that should not be included as a dependency
- replace_paths – List of pathname rewrites
(old, new)
. When this argument is supplied, the
co_filename
attributes of code objects get rewritten before scanning them for dependencies.
- implies – Implied module dependencies, a mapping from a module name to the list
of modules it depends on. Use this to tell modulegraph about dependencies that cannot
be found by code inspection (such as imports from C code or using the
__import__()
function).
- graph – A precreated
Graph
object to use; the default is to create a new one.
- debug – The
ObjectGraph
debug level.
-
run_script
(pathname[, caller])¶ Create, and return, a node by path (not module name). The pathname should refer to a Python source file and will be scanned for dependencies.
The optional argument caller is the node that calls this script, and is used to add a reference in the graph.
-
import_hook
(name[, caller[, fromlist[, level[, attr]]]])¶ Import a module and analyse its dependencies.
Parameters:
- name – The module name
- caller – The node that caused the import to happen
- fromlist – The list of names to import; this is an empty list for
import name
and a list of names for
from name import a, b, c
.
- level – The import level. The value should be
-1
for classical Python 2 imports,
0
for absolute imports and a positive number for relative imports (where the value is the number of leading dots in the imported name).
- attr – Attributes for the graph edge.
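The level values for absolute and relative imports can be illustrated with the standard library alone. In this sketch, absolute_module_name is a hypothetical helper (not a modulegraph API) and package stands for the dotted name of the package that contains the importing module. The Python 2 value -1 is deliberately not handled.

```python
import importlib.util


def absolute_module_name(name, level, package):
    """Resolve an imported name to an absolute module name for level >= 0."""
    if level < 0:
        # level -1 means Python 2 style implicit relative imports,
        # which this sketch does not try to emulate.
        raise ValueError("implicit relative imports are not handled here")
    if level == 0:
        # Absolute import: the name is already fully qualified.
        return name
    # Relative import: level is the number of leading dots.
    return importlib.util.resolve_name("." * level + name, package)
```

With package "pkg.sub", a level of 1 resolves "b" inside "pkg.sub", while a level of 2 resolves it one package higher, inside "pkg".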
-
implyNodeReference
(node, other, edgeData=None)¶ Explicitly mark that node depends on other. Other is either a
node
or the name of a module that will be searched for as if it were an absolute import.
-
createReference
(fromnode, tonode[, edge_data])¶ Create a reference from fromnode to tonode, with optional edge data.
The default for edge_data is
"direct"
.
-
getReferences
(fromnode)¶ Yield all nodes that fromnode refers to. That is, all modules imported by fromnode.
Node
None
is the root of the graph, and refers to all nodes that were explicitly imported by
run_script()
or
import_hook()
, unless you use an explicit parent with those methods.
New in version 0.11.
-
getReferers
(tonode, collapse_missing_modules=True)¶ Yield all nodes that refer to tonode. That is, all modules that import tonode.
If collapse_missing_modules is false this includes references from
MissingModule
nodes, otherwise
MissingModule
nodes are replaced by the “real” nodes that reference this missing node.
New in version 0.12.
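The effect of collapse_missing_modules can be sketched on a toy graph. This is an illustration of the documented behaviour only, not the library's implementation: edges is assumed to be a list of (from, to) name pairs, missing a set of names standing in for MissingModule nodes, and get_referers a hypothetical helper.

```python
def get_referers(edges, missing, tonode, collapse_missing_modules=True):
    """Yield the names that refer to tonode in a list of (src, dst) edges."""
    for src, dst in edges:
        if dst != tonode:
            continue
        if collapse_missing_modules and src in missing:
            # Replace the missing node by the nodes that refer to it.
            yield from get_referers(edges, missing, src, False)
        else:
            yield src
```

Given the edges app -> ghost, ghost -> target and lib -> target, with ghost marked as missing, the collapsed result for target names app and lib, while the uncollapsed result names ghost and lib.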
-
foldReferences
(pkgnode)¶ Hide all submodule nodes for package pkgnode and add ingoing and outgoing edges to pkgnode based on the edges from the submodule nodes.
This can be used to simplify a module graph: after folding ‘email’ all references to modules in the ‘email’ package are references to the package.
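Folding can be pictured as renaming every submodule of a package to the package itself and dropping the edges that become self-references. The sketch below works on a plain set of (from, to) name pairs. The fold_references helper is hypothetical and does not reflect how modulegraph stores its graph.

```python
def fold_references(edges, package):
    """Return a new edge set in which submodules of package are merged into it."""
    prefix = package + "."

    def fold(name):
        # Any module inside the package is represented by the package node.
        return package if name.startswith(prefix) else name

    folded = set()
    for src, dst in edges:
        src, dst = fold(src), fold(dst)
        if src != dst:  # edges internal to the package disappear
            folded.add((src, dst))
    return folded
```

After folding "email", an edge from an application module to email.mime.text becomes an edge to email, and edges between two email submodules vanish.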
-
findNode
(name)¶ Find a node by identifier. If a node by that identifier exists, it will be returned.
If a lazy node exists by that identifier with no dependencies (excluded), it will be instantiated and returned.
If a lazy node exists by that identifier with dependencies, it and its dependencies will be instantiated and scanned for additional dependencies.
-
create_xref
([out])¶ Write an HTML file to the out stream (defaulting to
sys.stdout
).
The HTML file contains a textual description of the dependency graph.
-
graphreport
([fileobj[, flatpackages]])¶ Todo
To be documented
-
report
()¶ Print a report to stdout, listing the found modules with their paths, as well as modules that are missing, or seem to be missing.
Mostly internal methods¶
The methods in this section should be considered methods for subclassing at best. Please let us know if you need these methods in your code, as they are on track to be made private before the 1.0 release.
Warning
The methods in this section will be refactored in a future release; the current architecture makes it unnecessarily hard to write proper tests.
-
class
modulegraph.modulegraph.
ModuleGraph
-
_determine_parent
(caller)¶ Returns the node of the package root for caller. If caller is a package this is the node itself; if caller is a module in a package this is the node for the package; otherwise the caller is not in a package and the result is
None
.
-
_find_head_package
(parent, name[, level])¶ Todo
To be documented
-
_load_tail
(mod, tail)¶ This method is called to load the rest of a dotted name after loading the root of a package. This will import all intermediate modules as well (using
import_module()
), and returns the module
node
for the requested node.
Note
When tail is empty this will just return mod.
Parameters:
- mod – A start module (instance of
Node
)
- tail – The rest of a dotted name, can be empty
Raises: ImportError – When the requested module (or one of its parents) cannot be found
Returns: the requested module
-
_ensure_fromlist
(m, fromlist)¶ Yield all submodules that would be imported when importing fromlist from m (using
from m import fromlist...
).
m must be a package and not a regular module.
-
_find_all_submodules
(m)¶ Yield the filenames for submodules in the same package as m.
-
_import_module
(partname, fqname, parent)¶ Perform import of the module with basename partname (for example
path
) and full name fqname (for example
os.path
). The import is performed by parent.
This will create a reference from the parent node to the module node and will load the module node when it is not already loaded.
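The relationship between partname and fqname, and the way the rest of a dotted name is walked one component at a time, can be sketched without the library. The generator iter_tail_names below is hypothetical; it only produces the (partname, fqname) pairs that such a walk would visit and performs no imports.

```python
def iter_tail_names(head, tail):
    """Yield (partname, fqname) for each component of a dotted tail."""
    fqname = head
    if tail:
        for partname in tail.split("."):
            # Each step extends the fully qualified name by one component.
            fqname = f"{fqname}.{partname}"
            yield partname, fqname
```

For the head "os" and the tail "path" this produces the single pair ("path", "os.path"); an empty tail produces nothing, matching the note that an empty tail just returns the start module.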
-
_load_module
(fqname, fp, pathname, (suffix, mode, type))¶ Load the module named fqname from the given pathname. The argument fp is either
None
, or a stream from which the code for the Python module can be loaded (either byte-code or the source code). The (suffix, mode, type) tuple contains the suffix of the source file, the open mode for the file and the type of module.
Creates a node of the right class and processes the dependencies of the
node
by scanning the byte-code for the node.
Returns the resulting
node
.
-
_scan_code
(code, m)¶ Scan the code object for module m and update the dependencies of m using the import statements found in the code.
This will automatically scan the code for nested functions, generator expressions and list comprehensions as well.
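The general idea of finding imports in byte-code, including code nested in functions, can be sketched with the standard dis module. This is not modulegraph's scanner: iter_imported_names is a hypothetical helper that only collects the names used by IMPORT_NAME instructions and ignores from-lists and import levels.

```python
import dis
import types


def iter_imported_names(code):
    """Yield module names imported anywhere in a code object."""
    for instruction in dis.get_instructions(code):
        if instruction.opname == "IMPORT_NAME":
            yield instruction.argval
    # Nested functions, comprehensions and classes are stored as
    # code objects in the constants of the enclosing code object.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_imported_names(const)
```

Compiling a source string with compile(source, "<example>", "exec") and passing the result to the helper lists both module-level imports and imports made inside function bodies.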
-
_load_package
(fqname, pathname)¶ Load a package directory.
-
_find_module
(name, path[, parent])¶ Locates a module named name that is not yet part of the graph. This method will raise
ImportError
when the module cannot be found or when it is already part of the graph. The name cannot be a dotted name.
The path is the search path used, or
None
to use the default path.
When the parent is specified, name refers to a subpackage of parent, and path should be the search path of the parent.
Returns the result of the global function
find_module
.
-
itergraphreport
([name[, flatpackages]])¶ Todo
To be documented
-
_replace_paths_in_code
(co)¶ Replace the filenames in code object co using the replace_paths value that was passed to the constructor. Returns the rewritten code object.
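A rough sketch of this kind of rewrite, assuming Python 3.8 or later for the replace() method of code objects. The function replace_paths_in_code is a hypothetical stand-in, and the prefix matching rule used here (first matching old prefix wins) is an assumption rather than the library's documented behaviour.

```python
import types


def replace_paths_in_code(code, replace_paths):
    """Return a copy of code with co_filename rewritten by (old, new) pairs."""
    new_filename = code.co_filename
    for old, new in replace_paths:
        if new_filename.startswith(old):
            new_filename = new + new_filename[len(old):]
            break
    # Nested code objects carry their own co_filename, so rewrite them too.
    new_consts = tuple(
        replace_paths_in_code(const, replace_paths)
        if isinstance(const, types.CodeType)
        else const
        for const in code.co_consts
    )
    return code.replace(co_filename=new_filename, co_consts=new_consts)
```

The original code object is left untouched; the rewritten copy and every nested code object report the new filename.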
-
_calc_setuptools_nspackages
()¶ Returns a mapping from package name to a list of paths where that package can be found in
--single-version-externally-managed
form.
This method is used to be able to find those packages: these use a magic
.pth
file to ensure that the package is added to
sys.path
, as they do not contain an
__init__.py
file.
Packages in this form are used by system packages and the “pip” installer.
-
Graph nodes¶
The ModuleGraph
contains nodes that represent the various types of modules.
-
class
modulegraph.modulegraph.
Alias
(value)¶ This is a subclass of string that is used to mark module aliases.
-
class
modulegraph.modulegraph.
Node
(identifier)¶ Base class for nodes, which provides the common functionality.
Nodes can be used as mappings for storing arbitrary data in the node.
Nodes are compared by comparing their identifier.
-
debug
¶ Debug level (integer)
-
graphident
¶ The node identifier, this is the value of the identifier argument to the constructor.
-
identifier
¶ The node identifier, this is the value of the identifier argument to the constructor.
-
filename
¶ The filename associated with this node.
-
packagepath
¶ The value of
__path__
for this node.
-
code
¶ The
code object
associated with this node
-
globalnames
¶ The set of global names that are assigned to in this module. This includes those names imported through star imports of Python modules.
-
startimports
¶ The set of star imports this module did that could not be resolved, i.e. a star import from a non-Python module.
-
__contains__
(name)¶ Return whether there is a value associated with name.
This method is usually accessed as
name in aNode
.
-
__setitem__
(name, value)¶ Set the value of name to value.
This method is usually accessed as
aNode[name] = value
.
-
__getitem__
(name)¶ Returns the value of name, raises
KeyError
when it cannot be found.
This method is usually accessed as
value = aNode[name]
.
-
-
class
modulegraph.modulegraph.
AliasNode
(name, node)¶ A node that represents an alias from a name to another node.
The value of attribute graphident for this node will be the value of name, the other
Node
attributes are references to those attributes in node.
-
class
modulegraph.modulegraph.
BadModule
(identifier)¶ Base class for nodes that should be ignored for some reason
-
class
modulegraph.modulegraph.
ExcludedModule
(identifier)¶ A module that is explicitly excluded.
-
class
modulegraph.modulegraph.
MissingModule
(identifier)¶ A module that is imported but cannot be located.
-
class
modulegraph.modulegraph.
InvalidRelativeImport
(relative_path, from_name)¶ A module that was imported using a relative import statement that references a file outside of a toplevel package.
-
class
modulegraph.modulegraph.
Script
(filename)¶ A Python script.
-
filename
¶ The filename for the script
-
-
class
modulegraph.modulegraph.
BaseModule
(name[, filename[, path]])¶ The base class for actual modules. The name is the possibly dotted module name, filename is the filesystem path to the module and path is the value of
__path__
for the module.
-
graphident
¶ The name of the module
-
filename
¶ The filesystem path to the module.
-
path
¶ The value of
__path__
for this module.
-
-
class
modulegraph.modulegraph.
BuiltinModule
(name)¶ A built-in module (one in
sys.builtin_module_names
).
-
class
modulegraph.modulegraph.
SourceModule
(name)¶ A module for which the Python source code is available.
-
class
modulegraph.modulegraph.
InvalidSourceModule
(name)¶ A module for which the Python source code is available, but where that source code cannot be compiled (due to syntax errors).
This is a subclass of
SourceModule
.
New in version 0.12.
-
class
modulegraph.modulegraph.
CompiledModule
(name)¶ A module for which only byte-code is available.
-
class
modulegraph.modulegraph.
Package
(name)¶ Represents a Python package.
-
class
modulegraph.modulegraph.
NamespacePackage
(name)¶ Represents a Python namespace package.
This is a subclass of
Package
.
-
class
modulegraph.modulegraph.
Extension
(name)¶ A native extension
Warning
A number of other node types are defined in the module. Those node types aren’t used by modulegraph and will be removed in a future version.
Edge data¶
The edges in a module graph by default contain information about the edge, represented
by an instance of DependencyInfo
.
-
class
modulegraph.modulegraph.
DependencyInfo
(conditional, function, tryexcept, fromlist)¶ This class is a
namedtuple
for representing the information on a dependency between two modules.
All attributes can be used to deduce if a dependency is essential or not, and are particularly useful when reporting on missing modules (dependencies on
MissingModule
).
-
fromlist
¶ A boolean that is true iff the target of the edge is named in the “import” list of a “from” import (“from package import module”).
When the target module is imported multiple times this attribute is false unless all imports are in the “import” list of a “from” import.
-
function
¶ A boolean that is true iff the import is done inside a function definition, and is false for imports in module scope (or class scope for classes that aren’t defined in a function).
-
tryexcept
¶ A boolean that is true iff the import is done in the “try” or “except” block of a try statement (but not in the “else” block).
-
conditional
¶ A boolean that is true iff the import is done in either block of an “if” statement.
When the target of the edge is imported multiple times the
function
,
tryexcept
and
conditional
attributes of all imports are merged: when there is an import for which all these attributes are false, the merged attributes are all false; otherwise each attribute is set to true if it is true for at least one of the imports.
For example, when a module is imported both in a try-except statement and furthermore is imported in a function (in two separate statements), both
tryexcept
and
function
will be true. But if there is a third unconditional toplevel import for that module as well, all three attributes are false.
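The merge rule can be written down as a small function. The tuple below reuses the four documented field names but is a local stand-in (SketchDependencyInfo), and merge_two and merge_all are hypothetical helpers; this is a sketch of the rule as described, not the library's code.

```python
from collections import namedtuple
from functools import reduce

# Local stand-in with the same field names as the documented namedtuple.
SketchDependencyInfo = namedtuple(
    "SketchDependencyInfo", ["conditional", "function", "tryexcept", "fromlist"]
)


def merge_two(a, b):
    """Merge the edge information of two imports of the same target."""
    # fromlist stays true only when every import came from a "from" list.
    fromlist = a.fromlist and b.fromlist
    a_plain = not (a.conditional or a.function or a.tryexcept)
    b_plain = not (b.conditional or b.function or b.tryexcept)
    if a_plain or b_plain:
        # One unconditional toplevel import makes the dependency unconditional.
        return SketchDependencyInfo(False, False, False, fromlist)
    return SketchDependencyInfo(
        a.conditional or b.conditional,
        a.function or b.function,
        a.tryexcept or b.tryexcept,
        fromlist,
    )


def merge_all(infos):
    """Merge the edge information of any number of imports."""
    return reduce(merge_two, infos)
```

Merging a try-except import with a function-level import leaves both flags set, and adding a plain toplevel import to the list clears all three, as in the example above.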
Utility functions¶
-
modulegraph.modulegraph.
find_module
(name[, path])¶ A version of
imp.find_module()
that works with zipped packages (and other PEP 302 importers).
-
modulegraph.modulegraph.
moduleInfoForPath
(path)¶ Return the module name, readmode and type for the file at path, or None if it doesn’t seem to be a valid module (based on its name).
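A name-based classification of this kind can be approximated with the suffix lists from importlib.machinery. The helper module_info_for_path below is hypothetical, and the string labels "source", "bytecode" and "extension" are assumptions made for illustration; the values returned by the real function may differ.

```python
import importlib.machinery
import os


def module_info_for_path(path):
    """Return (module name, read mode, kind) based on the filename, or None."""
    filename = os.path.basename(path)
    candidates = (
        [(s, "rb", "extension") for s in importlib.machinery.EXTENSION_SUFFIXES]
        + [(s, "r", "source") for s in importlib.machinery.SOURCE_SUFFIXES]
        + [(s, "rb", "bytecode") for s in importlib.machinery.BYTECODE_SUFFIXES]
    )
    # Try the longest suffixes first so that the most specific match wins.
    for suffix, mode, kind in sorted(candidates, key=lambda c: -len(c[0])):
        if filename.endswith(suffix) and len(filename) > len(suffix):
            return filename[: -len(suffix)], mode, kind
    return None
```

Only the filename is inspected, so the file does not need to exist; a name with an unknown suffix yields None.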
-
modulegraph.modulegraph.
addPackagePath
(packagename, path)¶ Add path to the value of
__path__
for the package named packagename.
-
modulegraph.modulegraph.
replacePackage
(oldname, newname)¶ Rename oldname to newname when it is found by the module finder. This is used as a workaround for the hack that the
_xmlplus
package uses to inject itself in the
xml
namespace.