Description
This is based on #42117, but it is a somewhat different proposal that might be breaking for Julia 1.x and rests on different considerations, so I decided to open a separate issue to make it easier to read.
In short, I'd like a similar macro/keyword that marks an object:
- public: accessible via <module>.<name>, or via <name> after being explicitly imported with import/using
- unmarked names are considered private, which means one cannot access them at all from outside
The syntax may vary due to compatibility concerns, e.g.:
Keep things non-breaking
Because we currently do not hide things by default, we will need a marker for the things we want to make private:
@public <something>
@private <something>
Breaking but simpler
If we choose something more breaking, it could be a public/export keyword combined with #39235, where export/public would simply mark the publicly accessible APIs, and everything else would not be directly accessible from outside the module.
On the other hand, names brought in by using XXX and import XXX need to be private by default unless marked with export/public, so that we can prevent someone from using a long module path to access deep dependencies of a package (see point 2 in Why).
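To make the concern concrete, here is a minimal sketch of the behavior Julia has today. The module and function names (Inner, Outer, secret, api) are hypothetical and exist only for this illustration.

```julia
# Hypothetical modules, for illustration only.
module Inner
    secret() = 42          # not exported; meant as an internal helper
end

module Outer
    using ..Inner          # brings the name `Inner` into `Outer`
    api() = Inner.secret() + 1
end

# Nothing stops a user from reaching the internal helper through `Outer`:
Outer.Inner.secret()
```

Under the proposal, the last line would presumably be rejected unless Outer explicitly marked Inner as public.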
Additionally
I'd like a macro that marks the stability status of certain APIs. This is something I find quite nice in the Rust community, where they have things like:
#[cfg_attr(not(test), rustc_diagnostic_item = "IpAddr")]
#[stable(feature = "ip_addr", since = "1.7.0")]
#[derive(Copy, Clone, Eq, PartialEq, Hash, PartialOrd, Ord)]
pub enum IpAddr {
    /// An IPv4 address.
    #[stable(feature = "ip_addr", since = "1.7.0")]
    V4(#[stable(feature = "ip_addr", since = "1.7.0")] Ipv4Addr),
    /// An IPv6 address.
    #[stable(feature = "ip_addr", since = "1.7.0")]
    V6(#[stable(feature = "ip_addr", since = "1.7.0")] Ipv6Addr),
}
This marks an item's stability programmatically, right where the item is defined, so that a linter can warn users based on their current toolchain version.
In Julia, we only have a docstring saying "use at your own risk", which I think this could improve. Adding this shouldn't be breaking: it could be a single macro that marks the availability and stability of struct fields and function arguments. This could make experimental features easier to provide and to play with downstream.
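As a rough sketch of what such a marker could look like, the following records a status next to each definition so that a tool can query it later. Everything here (@stability, REGISTRY, register!, is_stable, and the example functions) is hypothetical and not part of Julia or any package. It also only handles whole function and struct definitions with unqualified names, not individual fields or arguments.

```julia
# Hypothetical sketch: `@stability`, `REGISTRY`, `register!`, and `is_stable`
# are invented for illustration and do not exist in Julia or any package.

# Maps a definition's name to its declared status and the version it applies from.
const REGISTRY = Dict{Symbol,NamedTuple{(:status, :since),Tuple{Symbol,VersionNumber}}}()

register!(name::Symbol, status::Symbol, since::AbstractString) =
    REGISTRY[name] = (status = status, since = VersionNumber(since))

# Find the name being defined by a function or struct definition expression.
function defname(def::Expr)
    target = def.head === :struct ? def.args[2] : def.args[1]
    # Peel off wrappers such as `f(x)`, `f(x) where T`, `f(x)::T`, `S{T} <: A`.
    while target isa Expr
        target = target.args[1]
    end
    return target::Symbol
end

# Evaluate the definition unchanged, then record its stability status.
macro stability(status::Symbol, since::String, def::Expr)
    name = QuoteNode(defname(def))
    return quote
        $(esc(def))
        register!($name, $(QuoteNode(status)), $since)
        nothing
    end
end

# What a linter might ask: is `name` declared stable as of version `current`?
function is_stable(name::Symbol, current::VersionNumber)
    entry = get(REGISTRY, name, nothing)
    return entry !== nothing && entry.status === :stable && entry.since <= current
end

@stability stable "1.7.0" parse_addr(s::AbstractString) = parse.(Int, split(s, '.'))

@stability experimental "1.9.0" function fast_parse(s::AbstractString)
    return parse_addr(s)
end
```

A linter could then call is_stable(:fast_parse, VERSION) and warn when it returns false. A real design would more likely attach this information to bindings or docstrings than to a global dictionary.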
Why?
#42117 overlaps with this proposal, so the reasons @DilumAluthge listed also apply here. I'd like to provide a few other motivations, though:
- Potential reduction in pkgimage/cache/sysimg size. Many functions in a package are used only by the package itself rather than by downstream users, so the methods of these functions that downstream code does not use could in principle be deleted from the downstream package cache/sysimg. We currently cannot do this because users are allowed to access them through a deep chain of module paths (e.g. A.B.C.<a private function>). I think quite a few AOT languages have a similar mechanism to mark things private so that they can be tree-shaken away. I'm not an expert on package caches or system images, but I think this might be low-hanging fruit that becomes reachable by changing the semantics a bit, so please correct me if this is not something that can be improved in Julia's case.
A demonstration of this would be automatically resolving the issue that https://expronicon.rogerluo.dev/intro/bootstrap and https://github.com/thautwarm/DevOnly.jl are trying to solve:
MLStyle is an extreme example of this: what @match generates depends only on Base, and the only reason downstream packages still load MLStyle is that users are allowed to access MLStyle's @match via AAA.BBB.CCC.@match if CCC contains using MLStyle: @match. One can get a rough estimate of the loading-time improvement in this extreme case:
julia> @time using MLStyle
0.041998 seconds (148.07 k allocations: 10.553 MiB)
julia> @time using ExproniconLite
0.020682 seconds (53.83 k allocations: 3.991 MiB)
Here you can see that, even though Expronicon depends on MLStyle, removing MLStyle from loading entirely makes loading roughly twice as fast as when MLStyle has to be loaded.
- Prevent downstream users from hacking on unstable things. This is more or less an effect of point 1, but I want to argue that it has certain advantages. One example is the non-public APIs in Base: we currently have a very vague way of distinguishing them, namely whether the function has a docstring or not. IMO even functions made only for developers deserve a docstring in the dev docs. If we had a marker to distinguish such APIs, the reference pages for the manual and the dev docs could be generated automatically for the corresponding functions. And for APIs like Base.print_matrix, it would be clearer to people whether they should use them and whether they may break in future versions.
But I'd like to mention one real-world example, which is the usage of DiffEq. Most of the time one only uses a single ODE solver from that giant package, so in principle a downstream package should not be loading the whole thing, only the code for that ODE solver. But because you are currently allowed to access other solvers via something like MyPackage.DiffEq.OrdinaryDiffEq.Vern8, we have to load the whole thing, which is super slow.
Ideally, the compiler should cache the corresponding solver code into the downstream package image and only load that piece when using MyPackage without an explicit using DiffEq. But I think this is not allowed because users can technically do MyPackage.DiffEq to access anything inside DiffEq.