Commit e2e3422

stubgen: support for pattern files
This commit adds a last-resort option for further customization of generated stubs by providing a *pattern file* of replacement rules.
1 parent 9536543 commit e2e3422

10 files changed

+464
-108
lines changed

cmake/nanobind-config.cmake

Lines changed: 7 additions & 3 deletions
@@ -360,7 +360,7 @@ function(nanobind_add_module name)
 endfunction()

 function (nanobind_add_stub name)
-  cmake_parse_arguments(PARSE_ARGV 1 ARG "VERBOSE;INCLUDE_PRIVATE;EXCLUDE_DOCSTRINGS;INSTALL_TIME;EXCLUDE_FROM_ALL" "MODULE;OUTPUT;MARKER_FILE;COMPONENT" "PYTHON_PATH;DEPENDS")
+  cmake_parse_arguments(PARSE_ARGV 1 ARG "VERBOSE;INCLUDE_PRIVATE;EXCLUDE_DOCSTRINGS;INSTALL_TIME;EXCLUDE_FROM_ALL" "MODULE;OUTPUT;MARKER_FILE;COMPONENT;PATTERN_FILE" "PYTHON_PATH;DEPENDS")

   if (EXISTS ${NB_DIR}/src/stubgen.py)
     set(NB_STUBGEN "${NB_DIR}/src/stubgen.py")
@@ -388,7 +388,11 @@ function (nanobind_add_stub name)
     list(APPEND NB_STUBGEN_ARGS -i "${TMP}")
   endforeach()

-  if (ARG_MARKER_FILE)
+  if (ARG_PATTERN_FILE)
+    list(APPEND NB_STUBGEN_ARGS -p "${ARG_PATTERN_FILE}")
+  endif()
+
+  if (ARG_MARKER_FILE)
     list(APPEND NB_STUBGEN_ARGS -M "${ARG_MARKER_FILE}")
     list(APPEND NB_STUBGEN_OUTPUTS "${ARG_MARKER_FILE}")
   endif()
@@ -413,7 +417,7 @@ function (nanobind_add_stub name)
       OUTPUT ${NB_STUBGEN_OUTPUTS}
       COMMAND ${NB_STUBGEN_CMD}
       WORKING_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}"
-      DEPENDS ${ARG_DEPENDS} "${NB_STUBGEN}"
+      DEPENDS ${ARG_DEPENDS} "${NB_STUBGEN}" "${ARG_PATTERN_FILE}"
       ${NB_STUBGEN_EXTRA}
     )
     add_custom_target(${name} ALL DEPENDS ${NB_STUBGEN_OUTPUTS})
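
The hunks above forward the new ``PATTERN_FILE`` argument to stubgen as ``-p``, next to the existing ``-i`` and ``-M`` handling. As a rough illustration only, the same conditional assembly can be sketched in Python. The helper name ``build_stubgen_args`` and its parameters are hypothetical and not part of nanobind; only the flags themselves (``-m``, ``-i``, ``-p``, ``-M``) are taken from this commit, and the order in which they are emitted here is arbitrary.

```python
from typing import List, Optional, Sequence


def build_stubgen_args(
    module: str,
    import_paths: Sequence[str] = (),
    pattern_file: Optional[str] = None,
    marker_file: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper mirroring the argument assembly in nanobind_add_stub."""
    args: List[str] = ["-m", module]

    # One '-i' per import path, like the foreach() loop in the CMake function.
    for path in import_paths:
        args += ["-i", path]

    # Optional arguments are appended only when they were actually provided.
    if pattern_file:
        args += ["-p", pattern_file]
    if marker_file:
        args += ["-M", marker_file]
    return args


args = build_stubgen_args("my_ext", ["build"], pattern_file="my_ext.pat")
```

Each optional argument is guarded by its own variable, which is the property the CMake code relies on: the marker branch has to test the same variable that it then passes to ``-M``.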

docs/api_cmake.rst

Lines changed: 3 additions & 0 deletions
@@ -445,6 +445,9 @@ Nanobind's CMake tooling includes a convenience command to interface with the
        empty file named ``py.typed`` in each module directory. When this
        parameter is specified, :cmake:command:`nanobind_add_stub` will
        automatically generate such an empty file as well.
+   * - ``PATTERN_FILE``
+     - Specify a pattern file used to replace declarations in the stub. The
+       syntax is described in the section on :ref:`stub generation <stubs>`.
    * - ``COMPONENT``
      - Specify a component when ``INSTALL_TIME`` stub generation is used.
        This is analogous to ``install(..., COMPONENT [name])`` in other

docs/changelog.rst

Lines changed: 5 additions & 0 deletions
@@ -121,6 +121,11 @@ This required work on three fronts:
   and docstrings. This provides more targeted control compared to overriding
   the entire function signature.

+* Finally, nanobind's stub generator supports :ref:`pattern files
+  <pattern_files>` containing custom stub replacement rules. This catch-all
+  solution addresses the needs of advanced binding projects, for which the
+  above list of features may still not be sufficient.
+
 Most importantly, it was possible to support these improvements with minimal
 changes to the core parts of nanobind.


docs/typing.rst

Lines changed: 147 additions & 23 deletions
@@ -5,14 +5,17 @@
 Typing
 ======

-This section covers two broad typing-related topics:
+This section covers three broad typing-related topics:

 1. How to create rich type annotations in C++ bindings so that projects
    using them can be effectively type-checked.

 2. How to :ref:`automatically generate stub files <stubs>` that are needed to
    enable static type checking and autocompletion in Python IDEs.

+3. How to write :ref:`pattern files <pattern_files>` to handle advanced use
+   cases requiring significant stub customization.
+
 Signature customization
 -----------------------

@@ -72,9 +75,10 @@ decorators.
 The modified signature is shown in generated stubs, docstrings, and error
 messages (e.g., when a function receives incompatible arguments).

-Besides :cpp:class:`nb::sig <sig>`, nanobind also provides a lighter-weight
-alternative to only modify the signature of a specific function default
-argument via :cpp:func:`nb::arg("name").sig("signature") <arg::sig>`.
+In cases where a custom signature is only needed to tweak how nanobind renders
+the signature of a default argument, the more targeted
+:cpp:func:`nb::arg("name").sig("signature") <arg::sig>` annotation is
+preferable to :cpp:class:`nb::sig <sig>`.

 .. _typing_signature_classes:

@@ -184,8 +188,8 @@ Creating generic types
 Python types inheriting from `typing.Generic
 <https://docs.python.org/3/library/typing.html#typing.Generic>`__ can be
 *parameterized* by other types including generic `type variables
-<https://docs.python.org/3/library/typing.html#typing.TypeVar>`__ that act as a
-placeholder. Such constructions enable more effective static type checking. In
+<https://docs.python.org/3/library/typing.html#typing.TypeVar>`__ that act as
+placeholders. Such constructions enable more effective static type checking. In
 the snippet below, tools like `MyPy <https://github.com/python/mypy>`__ or
 `PyRight <https://github.com/microsoft/pyright>`__ can infer that ``x`` and
 ``y`` have types ``Wrapper[int]`` and ``int``, respectively.
@@ -207,12 +211,16 @@ the snippet below, tools like `MyPy <https://github.com/python/mypy>`__ or
        def get(self) -> T:
            return self.value

-   x = Wrapper(3)
-   y = x.get()
+   # Based on the typed constructor, MyPy knows that 'x' has type 'Wrapper[int]'
+   x = Wrapper(3)
+
+   # Based on the typed 'Wrapper.get' method, 'y' is inferred to have type 'int'
+   y = x.get()

-Note that generic type parameterization doesn't change the underlying type and is not to
-be confused with C++ template instantiation. The feature mainly enables
-propagating more fine-grained type information through extension code.
+Note that parameterization of a generic type doesn't generate new code or
+modify its functionality. It is not to be confused with C++ template
+instantiation. The feature only exists to propagate fine-grained type
+information and thereby aid static type checking.

 Similar functionality can also be supported in nanobind-based binding projects.
 This looks as follows:
@@ -456,18 +464,20 @@ The program has the following command line options:
    Generate stubs for nanobind-based extensions.

    options:
-     -h, --help                  show this help message and exit
-     -o FILE, --output-file FILE write generated stubs to the specified file
-     -O PATH, --output-dir PATH  write generated stubs to the specified directory
-     -i PATH, --import PATH      add the directory to the Python import path (can
-                                 specify multiple times)
-     -m MODULE, --module MODULE  generate a stub for the specified module (can
-                                 specify multiple times)
-     -M FILE, --marker FILE      generate a marker file (usually named 'py.typed')
-     -P, --include-private       include private members (with single leading or
-                                 trailing underscore)
-     -D, --exclude-docstrings    exclude docstrings from the generated stub
-     -q, --quiet                 do not generate any output in the absence of failures
+     -h, --help                   show this help message and exit
+     -o FILE, --output-file FILE  write generated stubs to the specified file
+     -O PATH, --output-dir PATH   write generated stubs to the specified directory
+     -i PATH, --import PATH       add the directory to the Python import path (can
+                                  specify multiple times)
+     -m MODULE, --module MODULE   generate a stub for the specified module (can
+                                  specify multiple times)
+     -M FILE, --marker-file FILE  generate a marker file (usually named 'py.typed')
+     -p FILE, --pattern-file FILE apply the given patterns to the generated stub
+                                  (see the docs for syntax)
+     -P, --include-private        include private members (with single leading or
+                                  trailing underscore)
+     -D, --exclude-docstrings     exclude docstrings from the generated stub
+     -q, --quiet                  do not generate any output in the absence of failures


 Python interface
@@ -494,3 +504,117 @@ containing the stub declarations.
 Note that for now, the ``nanobind.stubgen.StubGen`` API is considered
 experimental and not subject to the semantic versioning policy used by the
 nanobind project.
+
+.. _pattern_files:
+
+Pattern files
+-------------
+
+In complex binding projects requiring static type checking, the previously
+discussed mechanisms for controlling typed signatures (:cpp:class:`nb::sig
+<sig>`, :cpp:class:`nb::typed <typed>`) may be insufficient. Two common reasons
+are as follows:
+
+- The ``@typing.overload`` chain associated with a function may sometimes
+  require significant deviations from the actual overloads present on the C++
+  side.
+
+- Some members of a module could be inherited from existing Python packages or
+  extension libraries, in which case patching their signature via
+  :cpp:class:`nb::sig <sig>` is not even an option.
+
+``stubgen`` supports *pattern files* as a last-resort solution to handle such
+advanced needs. These are files written in a *domain-specific language* (DSL)
+that specifies replacement patterns to dynamically rewrite stubs during
+generation. To use one, simply add it to the :cmake:command:`nanobind_add_stub`
+command.
+
+.. code-block:: cmake
+
+   nanobind_add_stub(
+       ...
+       PATTERN_FILE <PATH>
+       ...
+   )
+
+A pattern file contains a sequence of patterns. Each pattern consists of a query
+and an (arbitrarily) indented replacement block to be applied when the query
+matches.
+
+.. code-block:: text
+
+   # This is the first pattern
+   query 1:
+       replacement 1
+
+   # And this is the second one
+   query 2:
+       replacement 2
+
+Empty lines and lines beginning with ``#`` are ignored.
+
+When the stub generator traverses the module, it computes the *fully qualified
+name* of every type, function, property, etc. (for example:
+``"my_ext.MyClass.my_function"``). The queries in a pattern file are checked
+against these qualified names one by one until the first one matches.
+
+For example, suppose that we had the following lackluster stub entry:
+
+.. code-block:: python
+
+   class MyClass:
+       def my_function(arg: object) -> object: ...
+
+The pattern below matches this function stub and inserts an alternative with
+two typed overloads.
+
+.. code-block:: text
+
+   my_ext.MyClass.my_function:
+       @overload
+       def my_function(arg: int) -> int:
+           """A helpful docstring"""
+
+       @overload
+       def my_function(arg: str) -> str: ...
+
+Patterns can also *remove* entries, by simply not specifying a replacement
+block. Also, queries don't have to match the entire qualified name. For
+example, the following pattern deletes all occurrences of anything containing
+the string ``secret`` somewhere in its name:
+
+.. code-block:: text
+
+   secret:
+
+In fact (you may have guessed it), the queries are *regular expressions*! The
+query supports all features of Python's builtin `re
+<https://docs.python.org/3/library/re.html>`__ library.
+
+When the query uses *groups*, the replacement block may access the contents of
+each numbered group using the syntax ``\1``, ``\2``, etc. This permits
+writing generic patterns that can be applied to a number of stub entries at
+once:
+
+.. code-block:: text
+
+   __(eq|ne)__:
+       def __\1__(self, arg, /) -> bool: ...
+
+Named groups are also supported:
+
+.. code-block:: text
+
+   __(?P<op>eq|ne)__:
+       def __\op__(self, arg, /) -> bool: ...
+
+Finally, sometimes, it is desirable to rewrite only the signature of a function
+in a stub but to keep its docstring so that it doesn't have to be copied into
+the pattern file. The special escape code ``\doc`` references the previously
+existing docstring.
+
+.. code-block:: text
+
+   my_ext.lookup:
+       def lookup(array: Array[T], index: int) -> T:
+           \doc
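
The pattern-file rules documented in this hunk (queries ending in ``:``, indented replacement blocks, first match wins, ``\1`` / named-group / ``\doc`` substitution) can be illustrated with a small standalone sketch. This is not nanobind's implementation: the function names ``parse_pattern_file`` and ``apply_patterns`` are hypothetical, and the use of ``re.search``, the order of the substitutions, and the way ``\doc`` is rendered as a triple-quoted string are assumptions made for the example.

```python
import re
import textwrap
from typing import List, Optional, Tuple

# One pattern: a compiled query plus the raw lines of its replacement block.
Pattern = Tuple[re.Pattern, List[str]]


def parse_pattern_file(text: str) -> List[Pattern]:
    """Split pattern-file text into (query, replacement lines) pairs."""
    patterns: List[Pattern] = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # empty lines and '#' comment lines are ignored
        if line[0].isspace():
            # Indented line: part of the replacement block of the last query.
            if not patterns:
                raise ValueError("replacement line without a preceding query")
            patterns[-1][1].append(line)
        else:
            query = line.rstrip()
            if not query.endswith(":"):
                raise ValueError(f"query must end with ':': {line!r}")
            patterns.append((re.compile(query[:-1]), []))
    return patterns


def apply_patterns(patterns: List[Pattern], qualified_name: str,
                   docstring: str = "") -> Optional[str]:
    """Return the replacement of the first matching query, or None.

    An empty string means "matched, but remove the entry".
    """
    for query, block in patterns:
        match = query.search(qualified_name)  # assumption: partial matches count
        if match is None:
            continue
        result = textwrap.dedent("\n".join(block))
        for name, value in match.groupdict().items():
            result = result.replace("\\" + name, value or "")
        # Substitute higher group numbers first so that \1 cannot clobber \10.
        for index in range(len(match.groups()), 0, -1):
            result = result.replace("\\" + str(index), match.group(index) or "")
        # Assumption: \doc expands to the old docstring in triple quotes.
        return result.replace("\\doc", '"""' + docstring + '"""')
    return None


EXAMPLE = r'''
# Replace both comparison operators with a typed signature
__(?P<op>eq|ne)__:
    def __\op__(self, arg, /) -> bool: ...

# Remove anything whose name contains "secret"
secret:

# Rewrite the signature but keep the existing docstring
my_ext\.lookup:
    def lookup(array: Array[T], index: int) -> T:
        \doc
'''

patterns = parse_pattern_file(EXAMPLE)
eq_stub = apply_patterns(patterns, "my_ext.MyClass.__eq__")
removed = apply_patterns(patterns, "my_ext.secret_key")
lookup_stub = apply_patterns(patterns, "my_ext.lookup", "Return one element.")
```

Returning ``None`` for "no query matched" and an empty string for "matched with an empty replacement block" keeps the two cases apart, so a caller could fall back to the automatically generated stub in the first case and drop the entry in the second.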

include/nanobind/nb_types.h

Lines changed: 1 addition & 1 deletion
@@ -314,7 +314,7 @@ inline void delattr(handle h, handle key) {

 class module_ : public object {
 public:
-    NB_OBJECT(module_, object, "module", PyModule_CheckExact);
+    NB_OBJECT(module_, object, "ModuleType", PyModule_CheckExact);

     template <typename Func, typename... Extra>
     module_ &def(const char *name_, Func &&f, const Extra &...extra);
