Configurable and composable storage #40
1. The API clearly communicates that the underlying storage might be uninitialized, which can cause certain operations, e.g., `StorageBox::read()`, to fail. _All_ operations that can fail must have the `try_` prefix and return an `Option`.
1. Every `try_` operation has its "non-try" counterpart that returns the value and reverts if the operation fails internally. The counterpart is always named the same as the `try_` operation, but without the `try_` prefix. E.g., `read()` for `try_read()`.
1. Uninitialized storage instances are always safe to use. This way we can avoid paying for default initialization and still safely use the storage instance. Every storage type will provide its own semantics for the uninitialized storage state. E.g., an uninitialized [`StorageVec`](../files/0013-configurable-and-composable-storage/api-design/storage_vec.sw) has the same semantics as an empty `StorageVec`.
1. Thus, the failing of `try_` methods can always be seen as a semantic failure. E.g., `StorageVec::try_get()` can fail if the `StorageVec` is not initialized or because the requested element index is out of bounds. Since in the first case we consider the vector to be empty, it is also a semantic error.
1. Certain operations might have a version with a "special behavior". E.g., in the case of the `StorageVec`, we want to provide a possibility to `get()` an element without the expensive length check. In such cases, the method will be named by the original name and the suffix indicating the "special behavior". E.g., `StorageVec::get_unchecked()`.
1. When the special behavior consists of doing deep clear via `DeepClearStorage` or deep read via `DeepReadStorage`, the suffix will always be `_deep_clear` and `_deep_read`, respectively.

Thus, every storage type operation might come in all or some of these flavours:
- `fn <operation>() -> T`: reverts in case of uninitialized storage or other error.
- `fn try_<operation>() -> Option<T>`
- `fn <operation>_<special_behavior>() -> T`: reverts in case of uninitialized storage or other error.
- `fn try_<operation>_<special_behavior>() -> Option<T>`
We discussed changing the naming in the API previously here: FuelLabs/sway#4740
I think we should go with the opposite of what's said in this paragraph.
Most of the time, you want an operation to return `Option`. I'd much rather we standardize on `read` + `read_unchecked` than `try_read` + `read`. Reverts should be the exception, not the rule, and it should be clear when an operation potentially reverts.
TL;DR
The proposal (for the next discussion round 😄) is:
- `fn <operation>() -> Option<T>`: `None` in case of uninitialized storage or other, semantic, error.
- `fn <operation>_or_revert() -> T`
- `fn <operation>_<special_behavior>() -> Option<T>`: `None` in case of uninitialized storage or other, semantic, error.
- `fn <operation>_<special_behavior>_or_revert() -> T`
- `fn <operation>() -> T`: operation never fails.
- `fn <operation>_<special_behavior>() -> T`: operation with special behavior never fails.

Looking at the storage types we have, there will never be an explosion of those variants per operation, and we still have clear guidelines that capture the overall complexity (explained below).
Detailed Analysis
After spending a good portion of the last week walking in circles and pondering how to capture the complexity of the storage access semantics into consistent API guidelines, here I am again, doing the same 😄
I am completely fine with promoting the `Option`-returning operations to be the default ones.
But we still need to capture the overall matrix of possible combinations. Since FuelLabs/sway/#4740 does not go into details there, let's do it here.
Before starting, just to emphasise the difference: the term "unchecked" in #4740 and in the RFC paragraph means two different things. In #4740 it is a synonym for unwrap, while in the RFC it represents a "special behavior"; in the concrete example, running `StorageVec::get()` without checking the vector length and out-of-bounds access.
On to the matrix of possible combinations and what we want to capture in the API.
Every operation can potentially:
- fail if the storage is not initialized. E.g., `StorageBox::read()` on an uninitialized `StorageBox`.
- fail because of a semantically wrong usage. E.g., out-of-bounds `StorageVec::get()` on an initialized `StorageVec`.
- have one or more "special behaviors" that affect what the operation does. E.g., `StorageVec::get()` without length checks, or `StorageVec::pop()` with deep clear.
  - we expect a small number of special behaviors per operation, very often none or mostly one.
  - special behaviors might require additional trait constraints. E.g., `DeepClearStorage`.
  - special behavior operations can also fail because of the first two points.
- have none of the above; it can neither fail nor does it have any special behaviors. E.g., `StorageMap::get()`.

The guidelines should reflect the above matrix.
What the RFC proposes, to simplify the matrix and the API, and assuming that we are not losing any valuable information there, is to treat access to uninitialized storage as a semantically wrong usage and not as a special technical error.
This means that every storage type assigns a meaningful default semantics to the uninitialized state. E.g., an uninitialized `StorageVec` behaves the same as an empty `StorageVec`. For all the storage types we currently support, this is possible in an intuitive way.
It also has the benefit of just creating an instance with `new()` without the need for initialization. `StorageVec::new()` behaves the same as `StorageVec::init([])` but does not produce any storage writes until used. For the storage types I've played with and the sample `StoragePair`, having this meaningful default behavior was always possible.
If we do not join those two points and consistently want to distinguish between failing because the storage is uninitialized and failing because of a semantically wrong usage, we will necessarily end up with a cumbersome API like, e.g., `StorageVec::get()` returning `Option<Option<T>>`, where the first option means failure on uninitialized storage and the second means out-of-bounds. I doubt that developers will ever want to distinguish those two cases.
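To illustrate the trade-off, here is a minimal sketch in Rust (not Sway, and not the RFC's implementation). It models storage as a `HashMap` of slots and assumes a hypothetical layout where the length lives at the self key and element `i` lives at `self_key + 1 + i`. The names `Slots`, `ModelStorageVec`, and `get_distinguishing` are made up for this example only.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for contract storage: slot key -> stored word.
type Slots = HashMap<u64, u64>;

/// Toy model of a storage vector rooted at `self_key`.
/// Assumed layout: length at `self_key`, element `i` at `self_key + 1 + i`.
struct ModelStorageVec {
    self_key: u64,
}

impl ModelStorageVec {
    /// Creating the instance performs no storage writes.
    fn new(self_key: u64) -> Self {
        Self { self_key }
    }

    /// A missing length slot (uninitialized storage) is treated as length 0.
    fn len(&self, slots: &Slots) -> u64 {
        slots.get(&self.self_key).copied().unwrap_or(0)
    }

    fn push(&self, slots: &mut Slots, value: u64) {
        let len = self.len(slots);
        slots.insert(self.self_key + 1 + len, value);
        slots.insert(self.self_key, len + 1);
    }

    /// Joined semantics: uninitialized and out-of-bounds both yield `None`.
    fn get(&self, slots: &Slots, index: u64) -> Option<u64> {
        if index >= self.len(slots) {
            return None;
        }
        slots.get(&(self.self_key + 1 + index)).copied()
    }

    /// The alternative: outer `None` means uninitialized,
    /// inner `None` means out of bounds.
    fn get_distinguishing(&self, slots: &Slots, index: u64) -> Option<Option<u64>> {
        let len = *slots.get(&self.self_key)?;
        if index >= len {
            return Some(None);
        }
        Some(slots.get(&(self.self_key + 1 + index)).copied())
    }
}

fn main() {
    let mut slots = Slots::new();
    let v = ModelStorageVec::new(100);

    // Uninitialized behaves like empty, and nothing was written so far.
    assert_eq!(v.get(&slots, 0), None);
    assert!(slots.is_empty());

    v.push(&mut slots, 11);
    assert_eq!(v.get(&slots, 0), Some(11));
    assert_eq!(v.get(&slots, 1), None);
    assert_eq!(v.get_distinguishing(&slots, 1), Some(None));
}
```

In this model, callers of `get` handle one `Option`, while callers of `get_distinguishing` must unpack two levels for a distinction they will rarely need.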
One more thing to consider in the guidelines is:
> it should be clear when an operation potentially reverts

How can we achieve this via naming convention only? (We are talking here about the computational effects. We had this discussion internally, triggered by the `#[storage]` attributes not being a part of the function type.)
One possibility would be:
- `<operation>() -> Option<T>`: does not revert.
- `<operation>_<unchecked or some better name>() -> T`: potentially reverts.
- `<operation>() -> T`: does not revert. E.g., `StorageMap::get() -> T`.
And all this should also be applicable to "special behavior" variants.
I see using the term "unchecked" as a suffix as misleading, because in the case of, e.g., `StorageBox::read_unchecked()`, it actually does check and it will revert if that check fails.
"Unchecked" as a word implies that we are actually skipping checks and who knows what can happen; that is the developer's responsibility. E.g., in the RFC, `StorageVec::get_unchecked()` cannot fail because it literally does not check the bounds and returns the calculated element. Accessing that element later on can fail, but `get_unchecked()` itself cannot.
I am arguing here for adopting some other terminology for the `<operation>_<give me the result or revert if it is not there>()` variant, and not against having that variant.
Perhaps `<operation>_or_revert()`?
To wrap it up, the proposal (for the next discussion round 😄) is:
- `fn <operation>() -> Option<T>`: `None` in case of uninitialized storage or other, semantic, error.
- `fn <operation>_or_revert() -> T`
- `fn <operation>_<special_behavior>() -> Option<T>`: `None` in case of uninitialized storage or other, semantic, error.
- `fn <operation>_<special_behavior>_or_revert() -> T`
- `fn <operation>() -> T`: operation never fails.
- `fn <operation>_<special_behavior>() -> T`: operation with special behavior never fails.

Looking at the storage types we have, there will never be an explosion of those variants per operation, and we still have clear guidelines that capture the overall complexity.
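As a rough illustration of how these flavours could sit side by side, here is a small Rust model (not Sway). It is only a sketch under stated assumptions: `Slots`, `ModelBox`, and `ModelVec` are hypothetical names, a panic stands in for a revert, and the slot layout (length at the self key, element `i` at `self_key + 1 + i`) is invented for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for contract storage: slot key -> stored word.
type Slots = HashMap<u64, u64>;

/// Hypothetical handle to a single slot, loosely modelling a `StorageBox`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ModelBox {
    key: u64,
}

impl ModelBox {
    /// `<operation>() -> Option<T>`: `None` if the slot was never written.
    fn read(&self, slots: &Slots) -> Option<u64> {
        slots.get(&self.key).copied()
    }

    /// `<operation>_or_revert() -> T`: a panic models the revert.
    fn read_or_revert(&self, slots: &Slots) -> u64 {
        self.read(slots).expect("revert: uninitialized storage")
    }
}

/// Toy storage vector. Assumed layout: length at `self_key`,
/// element `i` at `self_key + 1 + i`.
struct ModelVec {
    self_key: u64,
}

impl ModelVec {
    fn len(&self, slots: &Slots) -> u64 {
        slots.get(&self.self_key).copied().unwrap_or(0)
    }

    /// `<operation>() -> Option<T>`: `None` on uninitialized or out of bounds.
    fn get(&self, slots: &Slots, index: u64) -> Option<ModelBox> {
        if index < self.len(slots) {
            Some(self.get_unchecked(index))
        } else {
            None
        }
    }

    /// `<operation>_or_revert() -> T`: potentially reverts (panics here).
    fn get_or_revert(&self, slots: &Slots, index: u64) -> ModelBox {
        self.get(slots, index).expect("revert: index out of bounds")
    }

    /// `<operation>_<special_behavior>() -> T`: skips the length check and
    /// never fails by itself; reading the returned box later still can.
    fn get_unchecked(&self, index: u64) -> ModelBox {
        ModelBox { key: self.self_key + 1 + index }
    }
}

fn main() {
    let mut slots = Slots::new();
    slots.insert(10, 1); // length = 1
    slots.insert(11, 42); // element 0
    let v = ModelVec { self_key: 10 };

    assert_eq!(v.get(&slots, 0).and_then(|b| b.read(&slots)), Some(42));
    assert!(v.get(&slots, 1).is_none());
    assert_eq!(v.get_or_revert(&slots, 0).read_or_revert(&slots), 42);

    // `get_unchecked` itself succeeds; only the later read reports the problem.
    let past_the_end = v.get_unchecked(5);
    assert_eq!(past_the_end.read(&slots), None);
}
```

Note that in this model `get_or_revert` does perform the bounds check and reverts on failure, while `get_unchecked` performs no check at all, which is the distinction the two suffixes are meant to communicate.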
A question that immediately comes to my mind is: why have `_or_revert()`/`unchecked()` at all, when we can simply call `unwrap()`?
The `<operation>()` <-> `try_<operation>()` pairs have been there from the beginning, if that's a valid reason. `<operation>().unwrap()` results in potentially much bigger code than `<operation>_or_revert()` (see FuelLabs/sway/#5982), but this should be handled in optimizations.
Also, other APIs, aside from the storage API, do not have such pairs. E.g., `Vec` has only `get() -> Option<T>`.
So why have `_or_revert()`/`unchecked()` so prominently in the storage API?
## Storage references

[storage-references]: #storage-references

Sometimes we want to be able to store in the storage a "reference" to another storage instance. Storage API and the compiler will provide support for such use cases. On the API level, the `StorageRef` type (defined in the [`storage.sw`](../files/0013-configurable-and-composable-storage/sway-libs/storage.sw)) will contain a type-safe "reference" to a storage element. This "reference" will internally just be the `StorageKey` of the referenced storage element.

```Sway
pub enum StorageRef<TStorage> where TStorage: Storage {
    Null: (),
    Ref: StorageKey,
}
```

`StorageRef`s will be stored in [`StorageRefBox`es](../files/0013-configurable-and-composable-storage/sway-libs/storage/storage_ref_box.sw) and once retrieved from the storage it will be possible to dereference them to access the referenced storage.

Every storage type automatically provides the `StorageRef` that references it, via the `Storage::as_ref` method.

Thus, storage references provide a convenient, type-safe way to express referencing storage elements, but with a price that might be considerable in some cases. Namely, the `StorageRef` requires two storage slots to store a reference. This means that developers might consider other, less storage consuming, manual approaches to "reference" a storage entity. E.g., if referenced elements are stored in a `StorageVec`, "referencing" them by storing their vector indices as "references" might be less storage-costly. In this case, we are trading type-safety and clear built-in API for the performance.

Storage references will be supported in the `storage` declarations, by allowing the `storage` keyword to be used on the RHS of the _configure with_ operator:

```Sway
type StVecOfU64 = StorageVec<StorageBox<u64>>;
type StPairOfStVecOfU64 = StoragePair<StVecOfU64, StVecOfU64>;

storage {
    vec_1: StVecOfU64 := [11, 22, 33],
    vec_2: StVecOfU64 := [44, 55, 66],

    pair_1: StPairOfStVecOfU64 := ([0, 0, 0], [1, 1, 1]),
    pair_2: StPairOfStVecOfU64 := ([], []),

    map_1: StorageMap<str[3], StVecOfU64> := [
        ("abc", [1, 2, 3]),
        ("def", [11, 22, 33]),
        ("ghi", [111, 222, 333]),
    ],

    //--
    // Storage references can be used in storage configurations.
    // They can reference other storage elements, or their parts.
    //--
    vec_of_refs: StorageVec<StorageRefBox<StVecOfU64>> := [
        StorageRef::Null,
        storage.vec_1.as_ref(), // <<-- Note using the `storage` keyword here, as well as `as_ref`.
        storage.pair_1.first().as_ref(),
        storage.map_1.get("abc").as_ref(),
        StorageRef::Ref(some_const_fn_that_returns_the_storage_key_of_the_referenced_storage()),
    ],
}
```

For more examples on using storage references, see the [`demo_contract_storage_references.sw`](../files/0013-configurable-and-composable-storage/contracts/demo_contract_storage_references.sw).
I think `StorageRef` is an unnecessary complication.
I don't see what this adds over `Option<StorageKey>`, which would already be supported. If anything it is worse because it reintroduces a potential implicit null.
Unless there's a compelling reason to have a special type, I say we should remove this and recommend storing `StorageKey`s and `Option<StorageKey>`s for references to other parts of the storage.
I like removing the enum and `Null` and using `Option` instead a lot!
Actually, the first version of the `StorageRef` was a struct type-safely wrapping the `StorageKey`, but then I ended up thinking about how to support not having a reference and introduced `Null`, motivated by the similar programming language concept. The result was always putting the burden of checking for `Null` on dereferencing, even in situations where the reference will always be there. Using `Option<StorageRef>` where the `StorageRef` is a struct is definitely the right way!
On the question of having the dedicated `StorageRef` versus using the general `StorageKey` directly, the struct `StorageRef<TStorage>` has a benefit, for these two reasons:
1. Type-safety.
2. Having an easy-to-use, zero-(additional)-cost abstraction that clearly communicates the intent and provides a standardized usage pattern.

Ad 1) `StorageRef<TStorage>` refers to a storage of a particular type, and that is type-checked by the compiler. It also clearly communicates that it refers to a storage of a particular type. Using `StorageKey` would be more like having a raw pointer into the storage.
Ad 2) Using the naked `StorageKey` will require developers to implement their own `deref` equivalents every time they use it for referencing. Those will be cumbersome and error-prone, because they will need to use the proper way to create the proper storage type, etc. In the worst case, we will end up having that code in contracts, and in the best case, having it packed in various libraries. Having the `StorageRef` provided by the standard library gives a simple type that clearly communicates the intent and gives all the convenience in a standardized and zero-cost way.
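To make the two points tangible, here is a minimal Rust model (not Sway) of a struct-based `StorageRef<TStorage>` used together with `Option`. Everything in it is an assumption made for illustration: the `Storage` trait is reduced to `new` and `self_key`, the method the RFC calls `Storage::as_ref` is named `as_storage_ref` here to avoid confusion with Rust's own `AsRef`, and `ModelStorageVec` and `ModelStorageBox` are hypothetical storage types.

```rust
use std::marker::PhantomData;

/// Hypothetical storage key; a single number is enough for the model.
#[derive(Debug, Clone, Copy, PartialEq)]
struct StorageKey(u64);

/// Reduced model of the `Storage` trait (assumption: only these members).
trait Storage: Sized {
    fn new(self_key: StorageKey) -> Self;
    fn self_key(&self) -> StorageKey;

    /// Models `Storage::as_ref` from the RFC under a different name.
    fn as_storage_ref(&self) -> StorageRef<Self> {
        StorageRef { key: self.self_key(), _marker: PhantomData }
    }
}

/// Typed reference: just a key at runtime, the target type lives in the type system.
struct StorageRef<TStorage: Storage> {
    key: StorageKey,
    _marker: PhantomData<TStorage>,
}

impl<TStorage: Storage> StorageRef<TStorage> {
    /// Standardized dereferencing: recreate the storage instance at the stored key.
    fn deref(&self) -> TStorage {
        TStorage::new(self.key)
    }
}

struct ModelStorageVec {
    self_key: StorageKey,
}

impl Storage for ModelStorageVec {
    fn new(self_key: StorageKey) -> Self {
        Self { self_key }
    }
    fn self_key(&self) -> StorageKey {
        self.self_key
    }
}

struct ModelStorageBox {
    self_key: StorageKey,
}

impl Storage for ModelStorageBox {
    fn new(self_key: StorageKey) -> Self {
        Self { self_key }
    }
    fn self_key(&self) -> StorageKey {
        self.self_key
    }
}

fn main() {
    let vec_1 = ModelStorageVec::new(StorageKey(7));
    let box_1 = ModelStorageBox::new(StorageKey(9));

    // "No reference" is expressed with `Option`, not with a `Null` variant.
    let maybe_ref: Option<StorageRef<ModelStorageVec>> = Some(vec_1.as_storage_ref());
    let no_ref: Option<StorageRef<ModelStorageVec>> = None;

    // The next line would not compile: a box reference is not a vec reference.
    // let wrong: StorageRef<ModelStorageVec> = box_1.as_storage_ref();

    let target = maybe_ref.expect("reference is set").deref();
    assert_eq!(target.self_key(), StorageKey(7));
    assert!(no_ref.is_none());
    assert_eq!(box_1.as_storage_ref().deref().self_key(), StorageKey(9));
}
```

With a naked `StorageKey`, the commented-out assignment would be accepted, and each contract would have to write its own equivalent of `deref`.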
```
<Config> := StorageConfig<TValue>
          | [<Config>]
          | (<Config>, ...);
```
The mechanism is sound, but we need better naming than "config". It should be clearer that `StorageConfig` represents a single address initializer for consecutive slots and that `Config` is a whole, potentially composed, initializer.
Maybe `StorageInit` and `StorageAddressInit`?
Thanks for the proposal! I am not happy with the term `Config` either. While `Value` as the name of the associated type is intuitive, `Config` does not carry such clarity.
Here we need to be cautious of the overall "initialization" vs "configuration" confusion. Since `configurable`s and `storage` are actually configured in code and that configuration can be changed at deployment time, the term "configuration" is consistently used everywhere in the RFC. Sticking to that term will help to remove existing confusion, where the choice of the syntax and wording implies "initialization" and "assignment" for something that is rather a "configuration".
E.g., the `Storage::init` constructor, on the other hand, is an "initialization" because it will create a storage type instance at runtime and actually store the provided values into the storage.
On to the origin of the current, non-optimal name `Config`, which I tried to change (my thinking below). Since both the associated type `Config` and the `StorageConfig` are used only by the compiler during the configuration, I've kept the name `Config` although I don't like it.
Alternative names I've played with for the `Config` associated type were:
- `ValueLayout`: because it provides the information on how the value (described by the `Value` type) is laid out in the storage. The reason I didn't choose it is that storage types can store additional information as well, e.g., `StorageVec` its length, and not only what is in the `Value`.
- `StorageLayout`: it clashed with the `StorageLayout` enum.
- `StorageContent`: because it describes which content is stored in the storage and where exactly.

Using "Init" in the name, as explained above, can lead back to the current confusion.
Perhaps `SlotsConfig`, because it provides the information on how to configure the slots, or even better `SlotsContent`?
Additional proposals are welcome 😄
Also, regardless of the terminology we choose in the end, I like the proposed formulation (but with "initializing" changed to "configuring"):
> StorageConfig represents a single address initializer for consecutive slots and that Config is a whole, potentially composed, initializer

Hmmm, perhaps:
`Storage(Address)Config/Content` represents a configuration for storing a single value/content in consecutive slots starting at the address specified by the `storage_key`. The `SlotsConfig/Content` is a whole, potentially composed, configuration for storing the entire value contained in the `Storage`.
Whatever the final terminology will be, it should sound good in that sentence 😄
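To check how that sentence maps onto the grammar `<Config> := StorageConfig<TValue> | [<Config>] | (<Config>, ...)`, here is a small Rust model (not Sway). It is only a sketch: the field names `storage_key` and `value`, the use of `u64` words, and the `leaves` helper are assumptions made for the example, not part of the RFC.

```rust
/// Single-address configuration: a value stored in consecutive slots
/// starting at `storage_key` (field names are assumptions of this sketch).
#[derive(Debug, Clone, PartialEq)]
struct StorageConfig {
    storage_key: u64,
    value: Vec<u64>,
}

/// The whole, potentially composed, configuration, mirroring the grammar.
#[derive(Debug, Clone, PartialEq)]
enum Config {
    /// `StorageConfig<TValue>`
    Single(StorageConfig),
    /// `[<Config>]`
    Array(Vec<Config>),
    /// `(<Config>, ...)`
    Tuple(Vec<Config>),
}

impl Config {
    /// Flattens the composed configuration into its single-address parts,
    /// roughly what a compiler would need before generating slots.
    fn leaves(&self) -> Vec<&StorageConfig> {
        match self {
            Config::Single(single) => vec![single],
            Config::Array(items) | Config::Tuple(items) => {
                items.iter().flat_map(|item| item.leaves()).collect()
            }
        }
    }
}

fn main() {
    // A pair of two vectors, each configured as its length plus its elements.
    let config = Config::Tuple(vec![
        Config::Array(vec![
            Config::Single(StorageConfig { storage_key: 100, value: vec![2] }),
            Config::Single(StorageConfig { storage_key: 101, value: vec![11, 22] }),
        ]),
        Config::Single(StorageConfig { storage_key: 200, value: vec![0] }),
    ]);

    let keys: Vec<u64> = config.leaves().iter().map(|leaf| leaf.storage_key).collect();
    assert_eq!(keys, vec![100, 101, 200]);
}
```

Whatever names are chosen, the leaf type and the composed type play clearly different roles in this model, which supports giving them clearly different names.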
Capitalized STD is confusing because the standard library is already made up of `std` and `core`; either use those specifically or just say "the standard library".
files/0013-configurable-and-composable-storage/sway-libs/storage.sw

The STD will still provide the low-level storage `read`, `write`, and `clear` functions. However, unlike now when they are actually sometimes needed in contract code, e.g., to store arrays, in the new storage implementation there should never be a need to use them in contracts. Instead, the atomic `StorageBox` and `StorageEncodedBox` should be used. They provide the same low-level functionality while offering safe storage access.

The low-level API will be used only when implementing storage types, and even in those cases not always. Thus, the proposal is to move them to the module named [`internal`](../files/0013-configurable-and-composable-storage/sway-libs/storage/internal.sw) to emphasize that they are, similar to `Storage::internal_` functions, meant to be used only when developing custom storage types.
Direct programmatic manipulation of storage should not be encouraged, but it should not be this discouraged either. There are cases where it's a better, more straightforward pattern to use to implement what you want.
I do think having it in a separate module is fine, but I don't like the implication that it is "internal"; it's merely raw or low-level. It's perfectly valid for the end user and not specific to storage type implementors.
Agree!
Encouraging and/or discouraging usage is a separate topic, similar to the one of using raw pointers in contract code. It is to be reflected in the way we convey and document the library/language features, patterns, etc. But in both cases, even discouraging the usage does not justify the name "internal".
The current module name is `storage_api`, which we can keep. But I am more for `raw_api` or some other classifier that will better convey that this is the low-level API, rarely needed, if at all, in contract code.
@bitzoic Do you have a view or proposal here?

[drawbacks]: #drawbacks

The only drawback I can think of is the time and effort needed to implement the proposal and to deal with the [breaking changes](#breaking-changes). However, the impact of _not_ improving the current storage handling will over time surely be higher. It would mean carrying on the issues mentioned in the [Motivation](#motivation) and, worst of all, living with an API that is error-prone by-design.
Maybe we should have some discussion of the performance implications of this change. Can storage get slower as a result of this, if employed correctly?
I say: no 😄 (to the negative implications, not to the discussion 😄)
Or at least I don't see any negative impact on performance compared to the approach we have now. On the contrary:
- Performance was in focus and considered carefully. It resulted, e.g., in the `StorageLayout` enum being introduced.
- Just using the `StorageLayout` information consistently will help `Storage` implementors to provide implementations that optimally pack and store data across the whole hierarchy of the stored data.
- Having the convention for "special behavior" methods should encourage providing optimized versions of methods where high performance is needed and safety is guaranteed by surrounding code.

But it can be that I am missing something, so yes, we can surely discuss it further. And of course, reviewing the proposal from the performance standpoint is highly encouraged.
- naming of proposed code elements like, e.g., method/function names, type names, etc. Since those names will become the part of the storage API it is of highest importance to come up with good and expressive names for abstractions.

Also, I expect the current storage attributes, `#[storage(read, write)]`, to be discussed. In the sample code, the existing attributes are used. However, there are two questions to be decided on:
- if the meaning of `write` remains "read and write" as it is now, the attributes should be renamed to `readonly` and `readwrite`.
There are important reasons (for standards, notably) that we cannot be explicitly bound to read and write and must have read only/read write semantics.
Maybe we can just have the smallest change:
#[storage(read)]
#[storage(read_write)]

const fn new(self_key: &StorageKey) -> Self;

const fn internal_get_config(self_key: &StorageKey, value: &Self::Value) -> Self::Config;
If we go with `StorageInit` instead of `Config`, maybe this can be `internal_make_initializer`.
I like the verb `make`! I didn't like `get` and was thinking about alternatives, but didn't like them either, e.g., `create`. They all sounded very "runtimeysh". `make` somehow does not have that feel. Here is your self key and the value to store, make me the <config (good name still to find 😄)>.
As mentioned, `initializer` can mislead and, in the end, the final name behind `make` will depend on how we name the `Config` associated type.
This is where the `Storage::Config` associated type comes in play, together with the `const fn` function `internal_get_config()`.

Before generating the slots, the compiler already knows:
- the `StorageKey` at which a particular `storage` element will be stored. This key is calculated based on the element name and namespace, or it is given using the `in` keyword. This storage key is, as mentioned below, called the _self key_ of the storage element.
The name of "self key" is an artifact of our thought process and it's littered all over this proposal, but I think we should try to come up with a better one since it's no longer involved with `self` as a keyword.
Maybe "field key"? Since it's the `StorageKey` for a given `storage` field?
Well, `self_key` is the best name 😄 IMO, of course.
It very precisely communicates what it is, and that's why it is introduced and consistently used in the RFC:
> An important term to establish will be the self key. Self key is the `StorageKey` at which an instance of a storage type is stored.

Why I see it having the properties of a perfectly good name:
- In the implementations of `Storage` types, the term `self_key` immediately communicates that "this is my storage key, the storage key at which I am stored".
- The public `self_key()` method is equally clear. It gives a clear answer to the question "what is your key?". (We need this method for, e.g., referencing or any other scenario where the knowledge of the slot location of the storage type instance is needed.)
- In the documentation and communication, being reflexive, it immediately points to the storage instance we are talking about.

Any other word that does not imply a reflexive relationship (like "self" or "me") will be too broad and possibly applicable to several instances, and the question "whose, of what" will not be immediately incorporated in the word.
E.g., having `field_key` everywhere raises the question "which field? whose field?". The sentence "Field key is the `StorageKey` at which the instance of a storage type is stored" calls for explaining why the word "field". We have to keep in mind that storage type instances can be used independently of the `storage`, which makes the term "field" even less understandable and hard to justify.
Also, some storage types could have clashing terminology with the word we choose. E.g., if we chose "element key", it would clash with what an element in the `StorageVec` is.
That's why I am for the name that implies a reflexive relationship and see "self" as the perfect candidate. It also corresponds to what "self" is in Sway types: something that points to, or denotes, myself.
On first look, I am unsure how I feel about including `StorageBox` in the storage definition from a user perspective. It makes sense; however, with the current storage implementation the `StorageKey` is behind the scenes, which makes it much more readable:
storage {
    box_1: StorageBox<u64> := 0,
}
vs
storage {
    box_1: u64 = 0, // This is actually a StorageKey<u64>
}
It is unclear to me whether this is still achievable on the compiler side.
As `AbiEncode` is implemented for primitive types such as `u64`, couldn't `StorageEncodedBox` be used? How do we know we are using "the correct one"? A lot of the confusion in storage that we have seen from external developers is around having separate types for storage, i.e., `StorageVec` vs `Vec`. Requiring that developers make the choice between `StorageBox` and `StorageEncodedBox` could further that confusion.
I am also wondering what the difference is here between `StorageVec` and `Vec`:
vec_of_encoded_val_1: StorageVec<StorageEncodedBox<Vec<bool>>> := [Vec::from([true, false, true]), Vec::default(), Vec::from([true])],
As `Vec` is a heap type and is supported, wouldn't this obsolete `StorageVec`? In what case would I use one over the other? How would we ensure that developers know which to use?
I very much like the use of `write_deep_clear` and its associated functions, as well as `StorageLayout`.
files/0013-configurable-and-composable-storage/api-design/storage_pair.sw
@bitzoic A big part of this change is trying to separate and give a proper interface between the structure of what is being stored and the storage method itself. This is why we can't stay with implicit Being explicit about the storage types also has the nice benefit that It also means that we do not need to be opinionated as to how people store things, and they can build their own storage data structures as libraries (as is already being done by some users). |
Co-authored-by: Cameron Carstens <[email protected]> Co-authored-by: IGI-111 <[email protected]>
@bitzoic Why we insist on Currently, whatever the type in the storage field declaration is, it is wrapped in the The questions regarding the choices between the types are great! I expect them to be well addressed in the documentation. Actually, discussing them was also a part of the initial RFC, but I've removed them to shorten the overall length of already lengthy RFC. But I think now I'll put them back, at least a short explanation.
Currently, the negative impl in the In both cases, developers will be properly informed and guided by the compiler if they tried to do
The intended usage of the So, I expect having something like The interesting consequence of this is that the I'll address this in the RFC. We will definitely provide sufficient guidelines and examples in the documentation so that developers know which storage type to use. |
@ironcev Is there any situation in which i can use StorageBox and would want to use StorageEncodedBox instead? |
The RFC looks good to me and is a noticeable improvement on what we have now. I noticed that |
@SwayStar123 I am not aware of any. In the proposal, this line:
impl<T> !Storage for StorageEncodedBox<T> where T: Serializable { }
forbids using However, I've left the discussion question asking the same as you: is there a valid reason to allow that and treat it as a warning rather than forbidding it?
// TODO-DISCUSSION: Shall we forbid encoded-boxing serializable types and thus force
// them to be boxed in `StorageBox` or should this only be a compiler warning?
// Essentially, if a type is `Serializable` encoding it unnecessarily
// is a huge waste of computational and storage resources.
Note that we can always start as proposed, having a compiler error if a serializable type and thus
@esdrubal Currently, this is exactly what we do when storing types into storage, copying memory representation as is. At compile time storage configuration uses the Sway memory layout information to calculate the slots. The If the memory representation changes in the future, the issue might indeed arise if we e.g. have code accessing storage that gets LDCed into a contract that used old memory layout. This is a valid concern. I see it being in the same category like some others where we traded undefined behavior for performance. Paying for decoding/encoding at every storage access would definitely be a very high cost to pay for a change of memory layout that we will potentially never introduce. Also, note that we would have a similar issue with the That's why I am for addressing this concern as a breaking change in existing contracts if it ever happens. As a side note, what we might introduce in the future is the configurable memory layout equivalent to #[repr(...)] in Rust: FuelLabs/sway/#5286 In that case, we would still be copying the memory. The compiler would be aware of the CC: @IGI-111 |
I'm confused then. The stated reason for having explicit `StorageBox` and `StorageEncodedBox` is that the developer should make a conscious decision to choose between them, but apparently there's no choice at all? |
Here are some additional remarks and concerns coming from discussion with @SwayStar123.
|
Fair point 😄 We've discussed the topic of choice and the need for both the Please see the Explain difference between CC: @SwayStar123, @esdrubal, @IGI-111 |
Hmm, the two dead links in the
|
By extending the convo with @ironcev from Slack I would like to propose a concept/implementation for the StorageMap concept. I along with one of the SRs from the attackathon, have put all the info regarding that in the gist here along with the example implementation. It might be a bit clumsy right now, but at least it presents the idea. I reworked a bit the original I think it should also support nested maps, but need to test it. The API of this |
## Description

This PR fixes #6317 by implementing `storage_domains` experimental feature. This feature introduces two _storage domains_, one for the storage fields defined inside of the `storage` declaration, and a different one for storage slots generated inside of the `StorageMap`.

The PR strictly fixes the issue described in #6317 by applying the recommendation proposed in the issue. A general approach to storage key domains will be discussed as a part of the [Configurable and composable storage RFC](FuelLabs/sway-rfcs#40).

Additionally, the PR:
- adds expressive diagnostics for the duplicated storage keys warning.
- removes the misleading internal compiler error that always followed the storage key type mismatch error (see demo below).

Closes #6317, #6701.

## Breaking Changes

The PR changes the way how storage keys are generated for `storage` fields. Instead of `sha256("storage::ns1::ns2.field_name")` we now use `sha256((0u8, "storage::ns1::ns2.field_name"))`.

This is a breaking change for those who relied on the storage key calculation.

## Demo

Before, every type-mismatch was always followed by an ICE, because the compilation continued despite the type-mismatch.

![Mismatched types and ICE - Before](https://github.com/user-attachments/assets/ac7915f7-3458-409e-a2bb-118dd4925234)

This is solved now, and the type-mismatch has a dedicated help message.

![Mismatched types and ICE - After](https://github.com/user-attachments/assets/570aedd3-4c9c-4945-bfd0-5f12d68dbead)

## Checklist

- [x] I have linked to any relevant issues.
- [x] I have commented my code, particularly in hard-to-understand areas.
- [ ] I have updated the documentation where relevant (API docs, the reference, and the Sway book).
- [ ] If my change requires substantial documentation changes, I have [requested support from the DevRel team](https://github.com/FuelLabs/devrel-requests/issues/new/choose)
- [x] I have added tests that prove my fix is effective or that my feature works.
- [x] I have added (or requested a maintainer to add) the necessary `Breaking*` or `New Feature` labels where relevant.
- [x] I have done my best to ensure that my PR adheres to [the Fuel Labs Code Review Standards](https://github.com/FuelLabs/rfcs/blob/master/text/code-standards/external-contributors.md).
- [x] I have requested a review from the relevant team or maintainers.

---------

Co-authored-by: Sophie Dankel
Co-authored-by: IGI-111
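To illustrate why a domain tag separates `storage` field keys from `StorageMap` keys, here is a minimal sketch in Rust (used as a stand-in for Sway). It only builds the bytes that would be hashed; the actual key would be the `sha256` of these bytes. The function `key_preimage`, the constant names, the value `1` for the map domain, and the plain "tag byte followed by payload bytes" encoding are all assumptions made for illustration. Only the `0u8` tag for `storage` fields comes from the PR description.

```rust
// Domain tag for `storage` fields; the value 0 is taken from the PR description.
const FIELD_DOMAIN: u8 = 0;
// Hypothetical tag for `StorageMap` slots; the value is assumed, not from the PR.
const MAP_DOMAIN: u8 = 1;

/// Builds the byte string that would be fed into sha256:
/// one domain byte followed by the payload bytes.
/// The real Sway tuple encoding may differ; this layout is an assumption.
fn key_preimage(domain: u8, payload: &[u8]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(1 + payload.len());
    bytes.push(domain);
    bytes.extend_from_slice(payload);
    bytes
}
```

Because the first byte always differs between the two domains, an identical payload can never produce the same hash input in both, so a map entry cannot be crafted to hash the same bytes as a `storage` field path.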
Rendered
This RFC introduces a concept of a `Storage` trait, as well as the `StorageBox` and `StorageEncodedBox` structs. Those types are cornerstones upon which we build a fully flexible, feature rich, and robust access to storage. These concepts simplify defining dynamic storage types, like e.g., `StorageVec`. They also provide a dedicated way of configuring (in the `storage` declaration) and initializing (in the code) arbitrary storage types composed of existing storage types.

Additionally, the RFC provides API design guidelines for storage types' APIs. Those guidelines ensure that various aspects of storage access, e.g., accessing uninitialized storage, are always treated in the same way across various storage types.
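The guideline on uninitialized storage can be pictured with a toy model, written here in Rust rather than Sway. It is a sketch under stated assumptions, not the RFC's API: the in-memory `slot` field, the `uninitialized()` constructor, and the `write()` method are hypothetical, and a panic stands in for a contract revert. Only the `try_read()`/`read()` naming pair follows the guidelines quoted earlier in this conversation.

```rust
/// Toy model of a single storage location.
/// `None` means the slot has never been written (uninitialized storage).
struct StorageBox<T> {
    slot: Option<T>,
}

impl<T: Clone> StorageBox<T> {
    /// Hypothetical constructor: creates a box whose slot was never written.
    fn uninitialized() -> Self {
        Self { slot: None }
    }

    /// Fallible read: returns `None` if the storage is uninitialized.
    fn try_read(&self) -> Option<T> {
        self.slot.clone()
    }

    /// Non-try counterpart: same name without the `try_` prefix.
    /// Panics (modeling a revert) if the storage is uninitialized.
    fn read(&self) -> T {
        self.try_read().expect("revert: storage is uninitialized")
    }

    /// Hypothetical write that initializes or overwrites the slot.
    fn write(&mut self, value: T) {
        self.slot = Some(value);
    }
}
```

A caller that cannot tolerate a revert uses `try_read()` and handles `None`, while a caller that knows the slot was written uses `read()` directly.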
Closes #34.