Module systems in programming languages

Up: Programming language design

Modern programming languages have module systems and package managers, for easy structuring of large applications and reusing source code. There are plenty different ways in which such module systems can work, though, and this note attempts to describe some of the ideas.

Some definitions:

Note that the terms “module” and “package” are not always used in the way described above. For example, Java (with its new module system) has swapped the two.

Locations

A nontrivial module will load reusable source code from three places:

The local package refers to the product that’s being developed, and that contains the module that does the importing.

In my opinion, it is important to be clear where an import statement loads modules from:

What are modules?

Modules can correspond one-to-one to files. This is the approach used in Python, Ruby, JavaScript, and others. Modules can correspond one-to-one to directories. This is the approach that Go uses. Modules can correspond to a combination of files and directories as well; this is the case with Rust, Java, and others.

A module name can be implicitly derived from the filename (usually when modules correspond to files, e.g. in Python or JavaScript), explicitly given (e.g. with a package mypkgname declaration, like in Java or Go).

If modules and files correspond one-to-one, then it is useful to have a main module (file) that re-exports declarations from other modules inside the package. For example, a math package might have dozens of modules, but its main module would export some constants and functions defined in submodules: math.trig.sin might be re-exported as math.sin for convenience.

A main module could be a regular module (file), or it could have a special filename such as __init__ (which is what Python uses), which the interpreter/compiler looks for. This is useful for simplifying the directory structure. Compare with __init__:

src/
  math/
    __init__.xx
    trig.xx
    vec.xx

… and without __init__:

src/
  math.xx
  math/
    trig.xx
    vec.xx

Open question: If modules correspond to directories rather than files, are re-exports still useful?

Visibility

Visibility determines whether a symbol is accessible to other modules. Typically there is private (not accessible to other module), and public (accessible to other modules). More granular options are possible, e.g. package-private (visible to modules in the current package).

To do: Private by default, with export keyword? Go-style public/private? Underscore for private?

To do: Re-exports: how do those work?

Visibility is not a property of the symbol

The visibility of a symbol is a property of the symbol’s relationship to the module, rather than a property of the symbol itself. This is relevant in languages that determine visibility based on the symbol name. For example:

import math

println(math.Cos(3.14))

In the example above, Cos is exported from math and is public, because it uses the Go-style convention of public symbols starting with an uppercase letter. There is no issue in this case.

It gets complicated when importing symbols directly:

import Cos from math

println(Cos(3.14))

In this case, the Cos symbol is imported from the math module, where Cos is public. However, in the example above, Cos appears to be public with regard to the current module too, which might not be intentional or desired.

In other words, if visibility is determined by the symbol name (e.g. through the use of an initial uppercase letter, or the presence of a leading underscore), then importing symbols directly can lead to confusion of what visibility rules apply.

Non-source imports

JavaScript (or at least Webpack) allows importing non-source files, such as CSS files.

This is interesting, because it provides a mechanism to bundle resources into a single output file, which makes distribution easier.

With compile-time function execution, it would even be possible to import a non-source file and process it ahead of time. For example, it might allow converting Sass to CSS ahead of time. (Though perhaps not directly: Sass can include other files, which means that a non-source import would need to be able to access other files as well.)

Within the JavaScript ecosystem, usually the non-source imports are handled by Webpack. Unfortunately, the Webpack configuration to make that work is too opaque; a language with built-in support for this (with zero configuration) would be nice to have.

Non-source imports only work when modules map one-to-one to files. It might be necessary to have an alternative import-file instruction apart from the regular import-module instruction.

Miscellaneous

Note last edited August 2024.
Incoming links: Programming language design.