Compared to the v9.1 revision, this revision (v9.a)
- removes the final value concept, so this revision doesn't need the final keyword and becomes (almost) Go 1 compatible.
- removes the restriction that sending to and receiving from read-only channels are disallowed.
Any criticisms and improvement ideas are welcome, for
- I don't have much compiler-related knowledge, so the following designs may have flaws.
- I haven't found a perfect syntax notation set for this proposal yet.
The problems this proposal tries to solve:
- there is no way to declare package-level exported immutable non-basic values.
- there is no way to declare read-only function parameters and results.
- many inefficiencies are caused by the lack of immutable and read-only values.
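To make the first two problems concrete, here is a minimal sketch in today's Go. The names ErrNotFound, Primes and Sum are hypothetical and chosen only for illustration; they stand in for exported values and functions of some package.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound and Primes stand in for exported package-level values
// (hypothetical names). Today's Go has no way to protect them.
var ErrNotFound = errors.New("not found")
var Primes = []int{2, 3, 5}

// Sum only needs to read xs, but its signature cannot say so.
func Sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	xs[0] = 0 // compiles: nothing stops a callee from writing
	return total
}

func main() {
	fmt.Println(Sum(Primes)) // 10
	fmt.Println(Primes)      // [0 3 5]
	ErrNotFound = nil        // compiles: any importer could do this
	fmt.Println(ErrNotFound) // <nil>
}
```

Both the element write inside Sum and the reassignment of ErrNotFound are accepted by the compiler, which is the gap the rest of this proposal tries to close.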
Detailed rationales are listed at the end of this proposal. Some solutions for the drawbacks mentioned by Russ are also listed there.
Basically, this proposal can be viewed as a combination of issue#6386 and issue#22876, plus several new ideas and many more details. The new ideas and additional details make this proposal more practical to implement.
This proposal also shares some ideas with the evaluation of read-only slices written by Russ.
This proposal is not intended to be accepted, at least not soon. It is more of a document showing the problems and possible solutions in supporting immutable and read-only values in Go. I hope it can help others get a good understanding of these problems and the current solutions, so that they can get inspired and improve the current read-only value proposals or make better ones of their own.
This proposal introduces a value role concept.
There are four kinds of value roles:
- reader role. The values directly referenced by a reader value are all read-only values (from the view of the reader value).
- read-only role. A read-only value may not be modified (through the read-only value itself). If the read-only value can reference some values, then the read-only value is also a reader value, so the values directly referenced by it are all read-only values too (from the view of the read-only value).
- writer role. The values directly referenced by a writer value are all writable values (from the view of the writer value).
- writable role. A writable value may be modified (through the writable value itself). If the writable value can reference some values, then the writable value is also a writer value, so the values directly referenced by it are all writable values too (from the view of the writable value).
Please note,
- A reader value (as a left-hand-side value), if it is not a read-only value, might be modifiable.
- A writer value (as a left-hand-side value), if it is not a writable value, might not be modifiable. (However, non-writable writer values, such as function pointer results, will never be used as left-hand-side values in the current version of this proposal.)
- When a value is used as a right-hand-side value (R-value), the reader and read-only roles are indistinguishable for it. The same holds for the writer and writable roles.
Also note, values of some types never reference any values, or they reference some internal values which can't be modified in any way in user code. Such types are called no-references types below. The reader and writer roles are meaningless for values of no-references types. Some examples of no-references types:
- basic types
- function types
- struct types whose field types are all no-references types
- array types with no-references element types
- channel types with no-references element types
Values of no-references types have adaptive roles when they are used as R-values. In other words,
- Values of no-references types are viewed as writer values when they are assigned to writer or writable values.
- Values of no-references types are viewed as reader values when they are assigned to reader or read-only values.
We can think of values of no-references types as unroled values when they are used as R-values.
Literal and constant values are all unroled values.
As a special case, the predeclared nil is also an unroled value.
The following rules apply only where the assignment or binding would already be valid without considering value roles.
- Reader or read-only values can be bound to read-only values. Read-only values can't be re-assigned later.
- Reader or read-only values can be assigned to reader values.
- Writer or writable values can be explicitly converted to read-only values, but not vice versa. (Here, explicitly means the :rr suffix is required. See below for details.) The conversion results can be assigned to reader values or bound to read-only values. (For why writer or writable values can't be implicitly converted to reader or read-only values, please read the problem mentioned in v9.)
- Writer or writable values can be assigned to writer or writable values.
- Unroled values can be assigned/bound to values with any roles.
A suffix notation of the form :role is introduced to indicate the roles of some values, where role can only be rr (reader) or ro (read-only).
- We can use var:ro to declare read-only variables. (:ro can only be used in var:ro.) A declared read-only variable can be bound to an initial value.
- We can use var:rr to declare reader variables.
- We can use v:rr to explicitly convert a writer or writable value v to a reader value.
For example,
// x is a read-only (and reader) value.
// The elements of x are practically immutable values.
var:ro x = []int{1, 2, 3}
x = nil // error
// y is a reader value.
// The elements of y are practically immutable values,
// but y itself is modifiable.
var:rr y = []int{1, 2, 3}
y = nil // ok
// w is a writable (and writer) value.
var w = []int{1, 2, 3}
// z is a reader value.
// It and w share elements.
// The elements are read-only from the view of z,
// but they are writable from the view of w.
// So the elements are not immutable values.
var:rr z = w:rr
z[0] = 9 // error
w[0] = 9 // ok
Please note, in the above example, the two :rr suffixes in the var:rr z = w:rr line are both required.
The reason is that w is a roled value.
On the other hand, literal and constant values are all unroled values.
And again, implicit role conversions are disallowed here.
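The relationship between w and z above can be loosely emulated in today's Go with a wrapper type. This is only a sketch under stated assumptions: IntsReader and AsReader are hypothetical names, and the protection relies on an unexported field rather than on compiler-checked roles.

```go
package main

import "fmt"

// IntsReader is a hypothetical read-only view over a []int. It loosely
// emulates a reader slice: elements can be read through it but not written.
type IntsReader struct{ s []int }

// AsReader plays the part of the explicit w:rr conversion.
func AsReader(s []int) IntsReader { return IntsReader{s} }

func (r IntsReader) Len() int     { return len(r.s) }
func (r IntsReader) At(i int) int { return r.s[i] }

func main() {
	w := []int{1, 2, 3}
	z := AsReader(w) // z and w share elements

	w[0] = 9             // ok: the elements are writable through w
	fmt.Println(z.At(0)) // 9: the write is visible through z
	// Writing through z is only prevented outside this package
	// (unexported field); the proposal would reject z[0] = 9 everywhere.
}
```

As in the proposal's example, the shared elements are read-only from the view of z but not immutable, since w can still change them.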
There is no way to declare read-only variables in short declarations:
var w = []int{1, 2, 3}
{
	// u is a writer value.
	// v is a reader value.
	u, v := w, w:rr
	... // use u and v
}
Although roles are properties of values, to make code (function prototype literals specifically) look compact, we can think of them as properties of types as well.
The notation T:role is introduced to represent a type T with a specified role.
And to ease the syntax design, only :rr is allowed to be used as a type suffix.
However, please note,
- The notation T:rr is not allowed to appear in type declarations to declare reader types.
- The notation T:rr can only be used to specify function parameter and result types. It may not be used to declare package-level variables and struct fields.
- In the T:rr notation, the :rr portion must be the last portion. For example, the notation []*chan T:rr can only mean ([]*chan T):rr, whereas []*chan (T:rr), []*((chan T):rr) and []((*chan T):rr) are all invalid notations. As further examples, it is not allowed to be used to
  - specify the roles of struct field types.
  - specify the roles of array/slice/map/channel element types.
  - specify the roles of map key types.
  - specify the roles of pointer base types.
Although it is meaningless, to avoid const-poisoning-like problems, the :rr suffix is allowed to follow no-references types.
One detail related to the T:rr notation needs to be noted: as long as one result of a function is specified with a reader role, the result list part of the function prototype literal must be enclosed in a pair of ().
For example, the notations func() (T:rr) and func() T:rr are different.
The former denotes a result type with the reader role, but the latter denotes a function type with the reader role (a meaningless role, for function types are no-references types).
To avoid the function declaration splitting problem, a parameterized role concept is introduced.
The notation ::r (two colons) produces a parameterized role r.
For example, we can declare the Split function in the bytes standard package as
// Here, the two "q" must be consistent.
func Split(v []byte::q, sep []byte:rr) ([][]byte::q) {...}
to avoid declaring it as two functions
func SplitReader(v []byte:rr, sep []byte:rr) ([][]byte:rr) {...}
func SplitWriter(v []byte, sep []byte:rr) [][]byte {...}
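Generics in today's Go (1.18+) offer a loose analogy for avoiding this kind of duplication, although they parameterize the type rather than the role. The sketch below is not part of this proposal; CountByte is a hypothetical helper showing one declaration that serves both immutable strings and mutable byte slices.

```go
package main

import "fmt"

// CountByte counts occurrences of c in s. The constraint lets one
// declaration serve both immutable strings and mutable byte slices.
// (CountByte is a hypothetical helper, not a standard library function.)
func CountByte[T ~string | ~[]byte](s T, c byte) int {
	n := 0
	for i := 0; i < len(s); i++ {
		if s[i] == c {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(CountByte("a,b,c", ','))         // 2
	fmt.Println(CountByte([]byte("a,b,c"), ',')) // 2
}
```

Unlike the ::q role parameter, this approach cannot express that a result shares the role of an argument of the same type, which is what the Split declaration above needs.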
An example containing some function declarations and calls:
func Double(x []byte) {
	for i, v := range x {
		x[i] = v+v
	}
}
func DoubleDup(x []byte:rr) []byte {
	y := make([]byte, len(x))
	for i, v := range x {
		y[i] = v+v
	}
	return y
}
var w = []byte{2, 3, 5}
Double(w)
fmt.Println(w) // [4 6 10]
var v = DoubleDup(w:rr)
fmt.Println(w) // [4 6 10]
fmt.Println(v) // [8 12 20]
// We can use strings as reader byte slices.
// This makes the current append([]byte, string) and
// copy([]byte, string) become not syntax exceptions.
v = DoubleDup("hello")
fmt.Println(v) // [208 202 216 216 222]
- Dereferencing a reader pointer results in a read-only value.
- Dereferencing a writer pointer results in a writable value.
- Taking the address of an addressable reader or read-only value results in a reader pointer.
- Taking the address of an addressable writer value results in a writer pointer.
Example:
var:ro x = []int{1, 2, 3}
func foo() {
	y := &x // y is a reader pointer variable
	z := *y // z is deduced as a reader variable
	w := x // w is deduced as a reader variable
	z[0] = 9 // error: z[0] is read-only.
	u := &z // u is like y, a reader pointer
	p1 := &x[1] // p1 is a reader pointer variable.
	p2 := &z[1] // p2 is a reader pointer variable.
	...
}
Note: taking the address of a string element results in a read-only pointer value. Example:
var s = "hello word"
var:rr p1 = &s[6] // ok
var p2 = &s[6] // error
- Reader pointers and writer pointers can be both converted to unsafe pointers. This means the read-only rules built by this proposal can be broken by the unsafe mechanism. (This is important for reflection implementation.)
Example:
func mut(x []int:rr) []int {
	return *((*[]int)(unsafe.Pointer(&x)))
}
- Fields of reader struct values are also reader values.
- Fields of read-only struct values are also read-only values.
- Fields of writer struct values are also writer values.
- Fields of writable struct values are also writable values.
- Elements of reader array values are also reader values.
- Elements of read-only array values are also read-only values.
- Elements of writer array values are also writer values.
- Elements of writable array values are also writable values.
- Elements of reader or read-only slice values are read-only values.
- Elements of writer or writable slice values are writer values.
- We can append elements to writer or writable slice values, but can't append elements to reader or read-only slice values.
- Subslice:
  - The subslice result of a reader or read-only slice is a read-only slice.
  - The subslice result of a writer or writable slice is still a writer slice.
  - The subslice result of a reader or read-only array is a read-only slice.
Note: converting a string to a byte slice results in a read-only byte slice value. Example:
var s = "hello word"
// "bs" is a reader byte slice.
// A clever compiler will not allocate a
// duplicate underlying byte sequence here.
var:rr bs = []byte(s)
Note, internally, the cap field of a reader byte slice is set to -1 if the byte slice is converted from a string, so that the Go runtime knows its elements are immutable. Converting such a reader byte slice to a string doesn't need to duplicate the underlying bytes.
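For contrast, the following sketch shows the behavior in today's Go that this note aims to improve on: because the result of []byte(s) is writable, the conversion has to yield bytes that are independent of the string. The function name firstByteReplaced is hypothetical.

```go
package main

import "fmt"

// firstByteReplaced converts s to a byte slice, overwrites the first
// byte, and returns the result as a new string (hypothetical helper).
func firstByteReplaced(s string, b byte) string {
	bs := []byte(s) // in general, today's Go must copy the bytes here
	bs[0] = b       // allowed only because bs does not alias s
	return string(bs)
}

func main() {
	s := "hello"
	fmt.Println(firstByteReplaced(s, 'j')) // jello
	fmt.Println(s)                         // hello
}
```

With a reader byte slice, the write to bs[0] would be rejected at compile time, which is what would make sharing the string's bytes safe.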
- Elements of reader or read-only map values are read-only values.
- Elements of writer or writable map values are writer values. (Each writable map element must be modified as a whole.)
- Keys (exposed in for-range) of reader or read-only map values are read-only values.
- Keys (exposed in for-range) of writer or writable map values are writer values.
- We can't append new entries to (or replace entries of, or delete old entries from) reader or read-only map values.
- Send
  - We can only send reader or read-only values to a reader or read-only channel.
  - We can only send writer or writable values to a writer channel.
- Receive
  - Receiving from a reader channel results in a read-only value.
  - Receiving from a writer or writable channel results in a writer value.
Function parameters and results can be declared as reader variables.
In the following function prototype, parameter x and result w are declared as reader variables.
func fa(x Tx:rr, y Ty) (z Tz, w Tw:rr) {...}
A func()(T) value is assignable to a func()(T:rr) value, but not vice versa.
A func(T:rr) value is assignable to a func(T) value, but not vice versa.
A declared func(Tx::q) (Ty::q) function can be used as an R-value and assigned to func(Tx:rr) (Ty:rr) or func(Tx) Ty values.
But it can't be used as an L-value, for it doesn't have a definite type.
(Yes, it is an untyped value.)
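The assignability rules above mean results may gain the reader role and parameters may lose it. Today's Go has no such variance for function types, so the same idea needs explicit adapters. The sketch below uses channel directions purely as a loose analogy (a receive-only channel is not the same thing as this proposal's reader channel); all function names are hypothetical.

```go
package main

import "fmt"

// produce returns a bidirectional channel (loosely, a "writer" result).
func produce() chan int {
	ch := make(chan int, 1)
	ch <- 42
	return ch
}

// consume accepts a receive-only channel (loosely, a "reader" parameter).
func consume(ch <-chan int) int { return <-ch }

// asReceiveOnly adapts a func() chan int to a func() <-chan int.
// Today's Go rejects the direct assignment, so a closure is needed.
func asReceiveOnly(f func() chan int) func() <-chan int {
	return func() <-chan int { return f() }
}

// acceptBidirectional adapts a func(<-chan int) int to a func(chan int) int.
func acceptBidirectional(f func(<-chan int) int) func(chan int) int {
	return func(ch chan int) int { return f(ch) }
}

func main() {
	var get func() <-chan int = asReceiveOnly(produce)
	var use func(chan int) int = acceptBidirectional(consume)

	ch := make(chan int, 1)
	ch <- 7
	fmt.Println(<-get(), use(ch)) // 42 7
}
```

Under the proposed rules, the two adapters would be unnecessary for the reader role: the assignments would be accepted directly.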
We can declare methods with receivers of reader types.
When a method M is explicitly declared for reader type T:rr, a corresponding method with the same name must be declared for writer type T by compilers if no method with the same name is explicitly declared for writer type T. The rule ensures that the method set of reader type T:rr is always a subset of that of writer type T.
func (T:rr) Mx() {} // explicitly declared. (A read-only method)
func (T) My() {}
/*
func (t T) Mx() {t:rr.Mx()} // implicitly declared. (A writer method)
*/
var t T
t.Mx() // <=> t:rr.Mx()
In the above code snippet, the method set of reader type T:rr contains one method, Mx, whereas the method set of type T contains two methods, Mx and My.
For types T and *T, if methods can be declared for them (either explicitly or implicitly), the method set of type T:rr is a subset of that of type *T:rr.
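The subset relation between the method sets of T:rr and T resembles a pattern already available in today's Go: a narrow interface listing only non-mutating methods, embedded in a wider one. The sketch below is an emulation under that assumption, with hypothetical names (CounterReader, CounterWriter, counter); it does not use any syntax from this proposal.

```go
package main

import "fmt"

// CounterReader lists only the non-mutating methods (the "reader" view).
type CounterReader interface {
	Value() int
}

// CounterWriter adds mutating methods; its method set is a superset.
type CounterWriter interface {
	CounterReader
	Inc()
}

// counter is a hypothetical concrete type used for illustration.
type counter struct{ n int }

func (c *counter) Value() int { return c.n }
func (c *counter) Inc()       { c.n++ }

func main() {
	var w CounterWriter = &counter{}
	w.Inc()

	var r CounterReader = w // narrowing to the reader view is always allowed
	fmt.Println(r.Value())  // 1
	// r.Inc() would not compile: Inc is not in CounterReader's method set.
}
```

The difference is that the proposal derives the reader view from the type itself, so no second interface has to be written and kept in sync by hand.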
When the receiver type is specified with a parameterized role in a method declaration, two methods are actually declared: one for the writer receiver type and one for the reader receiver type. For example,
func (T::q) M(Tx) (Ty::q) {...}
is equivalent to
func (T) M(Tx) Ty {...}
func (T:rr) M(Tx) (Ty:rr) {...}
An interface type can specify some read-only methods. For example:
type I interface {
	M0(Ta) Tb // a writer method
	rr M2(Tx) Ty // a reader method (exported)
}
Similar to non-interface types, there is an implicit method M2 specified for the writer version of the interface type shown above.
The method set specified by type I actually contains two methods, M0 and M2.
The method set specified by type I:rr contains only one method, M2.
In the following code snippet, the type T1 implements the interface I shown in the above code snippet, but the type T2 doesn't. The reason is that type T2:rr doesn't have an M2 method.
type T1 struct{}
func (T1) M0(Ta) Tb {var b Tb; return b}
func (T1:rr) M2(Tx) Ty {var y Ty; return y} // the receiver type is a reader type.
type T2 struct{}
func (T2) M0(Ta) Tb {var b Tb; return b}
func (T2) M2(Tx) Ty {var y Ty; return y} // the receiver type is a writer type.
Please note, the type T3 in the following code snippet also implements I.
Please read the above function section for the reasons.
type T3 struct{}
func (T3) M0(Ta:rr) Tb {var b Tb; return b}
func (T3:rr) M2(Tx:rr) Ty {var y Ty; return y}
If a writer type T implements a writer interface type I, then the reader type T:rr also implements the reader interface type I:rr.
Boxing and assertion rules:
- The value boxing rules are like the assignment and conversion rules mentioned above.
- Assertion rules:
  - A type assertion on a reader or read-only interface value results in a read-only value.
  - A type assertion on a writer or writable interface value results in a writer value.
Many function and method implementations in the reflect package should be modified accordingly.
The reflect.Value type should have a reader property, and the result of an Elem method call should inherit the reader property from the receiver argument.
The current reflect.Value.CanSet method will report whether or not a value can be modified.
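For reference, this is how CanSet already behaves in today's reflect package, where settability depends on addressability. The sketch only demonstrates existing behavior; how a reader property would feed into CanSet is left to the text above. The helper name settable is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// settable reports CanSet for a copy of x and for x reached via a pointer.
func settable() (byValue, viaPointer bool) {
	x := 1
	byValue = reflect.ValueOf(x).CanSet()            // false: holds a copy
	viaPointer = reflect.ValueOf(&x).Elem().CanSet() // true: addressable
	return
}

func main() {
	fmt.Println(settable()) // false true
}
```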
A reflect.ReaderValueOf function is needed to create reflect.Value values representing reader Go values.
Its prototype is
func ReaderValueOf(i interface{}:rr) Value
For the standard Go compiler, in implementation, one bit can be borrowed from the 23+ bits method number to represent the reader property.
All parameters of type reflect.Value of the functions and methods in the reflect package, including receiver parameters, should be declared as reader variables.
However, the reflect.Value return results should be declared as writers.
A reflect.Value.ToReader method is needed to make a reflect.Value value represent a reader Go value.
A reflect.Value.ReaderInterface method is needed; it returns a reader interface value.
The old Interface method panics on reader values.
A method reflect.Type.Reader is needed to get the reader version of a writer type.
A method reflect.Type.Writer is needed to get the writer version of a reader type.
The method set of reader type T:rr is a subset of that of the writer type T.
Their other respective properties should be identical.
A method reflect.Type.Role is needed; it may return Reader or Writer (two constants).
I'm not familiar with compiler development. It is just my feeling, from my experience, that most of the rules mentioned in this proposal can be enforced by the compiler without big technical obstacles.
At compile time, the compiler should maintain two bits to represent a value role, so that it can make decisions according to the rules mentioned above.
Except the changes mentioned in the above reflection section, the impact on runtime made by this proposal is not large.
Each internal method value should maintain a reader property.
This information is useful when boxing a reader value into an interface.
As mentioned above, the cap field of a reader byte slice should be set to -1 if its byte elements are immutable.
There are two Go 1 incompatible cases.
As mentioned above, when a writer or writable value is passed to a reader parameter, the writer or writable value must be explicitly converted to a reader or read-only value to act as a legal reader argument. This will break much user code which contains calls to functions in standard packages, such as fmt.Print, for the prototype of the fmt.Print function is expected to change to func(...interface{}:rr).
Solution 1: To avoid such incompatibilities, the prototype of fmt.Print can be changed to func(...interface{}::q) instead.
This solution can only apply to standard packages. (Similarly, should the unroled predeclared nil be declared with var::q?)
Solution 2: Use go fix to fix the calls in user code.
Solution 3: Publish v2 versions of some standard modules.
Many exported error values declared (with var now) in standard packages are expected to be changed to read-only values (declared with var:ro later). However, there might be some user code in which these error values are assigned to some variables. By the rules mentioned above, such assignments will become illegal later.
Solution 1: Use go fix to fix these assignments in user code.
Solution 2: Publish v2 versions of some standard modules.
In evaluation of read-only slices, Russ mentions some inefficiencies caused by the lack of read-only values.
- Now, the type of the only parameter of the io.Writer.Write method is []byte. In fact, it should be a read-only parameter, for two reasons:
  - so that a Write call can take a string value argument without duplicating the underlying bytes of the string argument. (More efficient)
  - the current method prototype doesn't prevent a custom Writer from modifying the elements of the passed []byte argument. (More secure)
- By specifying a parameter of a function as read-only, users of the function will clearly know that the corresponding passed arguments will not be modified in this function. (Better code as documentation)
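The "more secure" point can be shown in today's Go. The io.Writer documentation says that Write must not modify the slice data, but the compiler cannot enforce it. In the sketch below, shoutWriter and writeAll are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"io"
)

// shoutWriter is a hypothetical misbehaving io.Writer: it upper-cases
// the caller's buffer in place instead of treating p as read-only.
type shoutWriter struct{}

func (shoutWriter) Write(p []byte) (int, error) {
	for i, b := range p {
		if 'a' <= b && b <= 'z' {
			p[i] = b - 'a' + 'A'
		}
	}
	return len(p), nil
}

// writeAll writes buf to w and returns buf's contents afterwards.
func writeAll(w io.Writer, buf []byte) string {
	_, _ = w.Write(buf)
	return string(buf)
}

func main() {
	fmt.Println(writeAll(shoutWriter{}, []byte("hello"))) // HELLO
}
```

If the parameter of Write were declared as a reader value, the assignment to p[i] would be a compile-time error instead of a silently broken contract.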
Besides these improvements, immutable values (which this proposal supports) can raise the security of Go code.
For example, by changing many exported error values in standard packages to immutable values, the security of Go programs is improved.
Immutable slice values can also let compilers do more BCE (bounds check elimination) optimizations for them.
The "Strengths of This Proposal" section in @jba's proposal also makes a good summary of the benefits of read-only values.
It is more compact than var:rr T. I think func (Ta:rr) (Tx:rr) has better readability than func (var:rr Ta)(var:rr Tx).
In evaluation of read-only slices, Russ mentions some problems of read-only values. This proposal provides solutions to avoid each of these drawbacks.
- function duplication
  Solved by the role parameter. Please see the end of the function section.
- immutability and memory allocation
  Solved by setting the capacities of immutable byte slices to -1 at run time. Please see the end of the slice section.
- loss of generality
  Solved by letting interface {M(T:rr)} implement interface {M(T)} and letting interface {M() T} implement interface {M() T:rr}. Please see the interface section for details.
Sometimes, people may need partial read-only for struct values.
type Counter struct {
	n uint64
	// mu will be always writable,
	// even if its containing struct
	// value is a read-only value.
	mu sync.Mutex:writable
}
func (c *Counter:rr) Value() uint64 {
	// ok. c.mu will be modified,
	// which is allowed, even if
	// *c is a read-only value.
	c.mu.Lock()
	defer c.mu.Unlock() // ok
	return c.n
}
Partial read-only will make this proposal much more complex, so I decided not to support it.
The current proposal determines whether or not a value is referencing other values by checking whether or not its type is a no-references type. The check happens at compile time. However, many run-time values can reference other values but, at a certain time, are not referencing any values. Such values are called no-references values. In theory, it is not a problem to assign such no-references values with the reader role to writer values. But this can't be determined at compile time, so such an assignment, if invalid, would have to panic at run time. For simplicity, the current proposal adopts no-references types instead of no-references values.