-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathcontent
162 lines (103 loc) · 12.1 KB
/
content
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
1 Example C++ memory safety example (CWE top 25)
2 Origin of Rust (mozilla 2015, servo & multithreaded firefox)
3 Linear typesystem, ownership, borrowing (scope, lifetime)
4 Enums
5 Error handling (return types vs. C++ exception)
6 Polymorphism (Generics, traits and trait objects)
7 Concurrency (send&sync, Arcs, mutex)
8 Unsafe
9 Macros
10 Modules
11 Std::lib
12 Cargo
13 Integrated tests & benchmark
14 Releases & development
We explain syntax at each example
0. Introduction
------------------
... intro Erdem ...
My name is Micha Hergarden and I am part of the SICS team at EUV source. I have a background in Software Engineering. I came into contact with Rust at the 2015 Fosdem software conference. I got to use it for my thesis in 2018.
Our aim for tonight is to give you a short tour of the Rust and to show in what aspects it differs from its cousins c and c++.
### NEXT SHEET ###
1. Example C++ memory safety
------------------
Will this work? The answer is: maybe. In some cases the vector will allocate new memory to enlarge and move its elements to this newly allocated block. The problem is that the reference 'ref' is then pointing into unused memory. This opens the door for segmentation faults or information leaking.
### NEXT SHEET ###
The CWE website tracks common weaknesses in software development.
The top 25 shows numerous issues that may arise in software where memory management is not done properly. An internal investigation at Microsoft also showed that about 70% of the security bugs found in their products are related to memory management. The Chromium project reports the same figures for their Chrome derived browser.
This seems to suggest that even with good engineers working on the code, it is still difficult to prevent these types of errors from occuring. A common denominator for these issues is the use of languages like C or C++. These languages offer runtime efficiency and low level control, but have the trade off of requiring rigorous testing and code checking to make sure no unintended issues are introduced. In contrast, languages with a garbage collector offer better protection from some of these types of issues but have the trade off of lacking the low level control needed in certain application domains, i.e. realtime systems. Enter Rust...
### NEXT SHEET ###
2. Origin of Rust
------------------
In 2006 a Mozilla Research employee named Graydon Hoare started the Rust language project as a means to design a language that provides better memory safety guarantees while still offering the low level control that c and c++ offer.
Mozilla started sponsoring the project in 2009 and announced it in 2010. The first major release of the language was in 2015. Initially the language was used for the Servo browser project. Servo was developed in cooperation with Samsung as a means to explore the concurrency and memory safety properties of Rust in order to create a parallelized web browser. The CSS style engine has already found its way into the Firefox browser.
Rust gained popularity outside of Mozilla and has earned the honour of being the "most loved programming language" on the Stack Overflow developer survey since 2016. The industry also started adopting the language and according to the project page around 148 companies are using Rust in production code. Microsoft and Google have shown serious interest and researched the language for use in security critical components.
### NEXT SHEET ###
3. Linear typesystem, ownership, borrowing and lifetimes. (take image from thesis)
------------------
The main property that differentiates Rust from languages like C and C++ is its typesystem, which is known as a linear typesystem. Linear typesystems allow you to bind a value to a variable exactly once. To clarify this further I will explain this in terms of a c++ string object.
### NEXT SHEET ###
In c++ a string is represented by two structures. First a structure that is associated with the string variable name. This structure contains a pointer to the second structure, a length and a capacity. The length is the number of characters in the string data, and the capacity is the maximum number of characters that can be held in the string data. The pointer points to the actual characters that make up the string and are stored in the heap.
### NEXT SHEET ###
In c++ you can make a shallow copy, meaning that you create a copy of the first struct and associate that with a different name. The pointer to the heap data is identical to the first string object. The advantage of that is that you do not need to copy a potentially large amount of heap data. The drawback is that the administration in both variables need to be kept in sync.
### NEXT SHEET ###
In a linear typesystem, this is considered double use and thus prohibited. When a new variable is associated with the data, the initial struct is marked 'invalid'. Any attempt to access the data via the first variable will result in a compile time error. This concept is called 'Ownership'. It can be compared to a person owning a physical object like a book and giving it to a friend.
To summarize: at any given time, a value must have an owner and can have only one owner. When the owner is destroyed, the value is also destroyed.
### NEXT SHEET ###
In Rust it is possible to associate a second variable name to the string by means of a reference. The reference contains then a pointer to the first variable, which in turn points to the actual string data. This referencing mechanism is called 'Borrowing' and can be compared to a person lending a book to a friend. The friend may read the book, as long as the book is returned to the owner in due time. By default, a reference is immutable. Any attempt to modify the value through a reference will lead to a compile time error. A refence can explicitly be made mutable, but then there can only be a single reference.
To summarize: at any given time there may be many immutable references to a value or a single mutable reference. A reference must always be valid.
A nice consequence of using linear types is that there is no need for a garbage collector. The compiler can deduce which variable a value belongs to and what the scope of that variable is. In Rust this scope is called the Lifetime of a variable. The absence of a garbage collector lets Rust enter the domain of languages like c and c++ and this is why it is often touted as a system programming language. In the event that you really do need a copy of a value, you can create one by calling the 'clone()' method on a type. This creates a deep copy of the value.
The design principles behind Rust are there to eliminate undefined behavior and to have as much memory safety as possible.
### NEXT SHEET ###
4. Enums
------------------
----- ERDEM ----------
5. Error Handling (Don't panic!)
------------------
### NEXT SHEET ###
Rust divides error handling in two categories: non-recoverable and recoverable. A non-recoverable error is an condition in the program that should never happen and is cause for the program to end prematurely. An example of that would be a division by zero as that leads to an undefined result. In Rust such an error is called a 'panic'. In the event of a panic, the program halts and by default a panic handler is called.
Rust will start unwinding the stack and print out a stacktrace for further analysis. One thing to note is that a panic will stay within the boundaries of a thread. So if one thread panics, it does not mean that your whole program will crash. Instead, the thread will abort and the parent thread will be able to retrieve a thread result. In this case the result will be an error. This makes for more robust programs and allows the program to continue the other threads.
Panicking is generally viewed as a last resort to prevent undifined behavior.
### NEXT SHEET ###
Let's move on to recoverable errors. The Rust language does not have an exception mechanism. Instead it relies on return values. The Result enum type contains either an OK type, which contains the actual value, or an Error type which holds context information like an exception would.
Similarly, the option type contains either a Some with a value or a None type when it is OK to return nothing.
The advantage of using these Enums is that the caller must do something with them. It is not possible to accidentally ignore the error and that again reduces undefined behavior.
6. Polymorphism
------------------
### NEXT SHEET ###
Polymorphism in Rust is enabled by the use of Traits, which are similar to Interfaces in Java or Abstract classed in C++. Rust does not have classes or inheritance, but you can create structs and add methods to them by implementing a trait which declares the methods. An example of a trait and implementation is shown here. The trait or behavior Sound can then be implemented by the Dog.
Rust traits can be implemented on built in types as well, even on an integer. The compiler can often optimize away the virtual method call to make it more effictient.
A novelty in the trait system is the trait object. This allows you to create a vector of mixed types for instance. Unfortunately we cannot dive further into this now, but it is an interestic topic to look up.
### NEXT SHEET ###
Generic methods
One other way to achieve polymorphism is the use of generic methods. This is similar to Java Generics and C++ templates.
The capital T denotes any type with which this function will be called. Like templates, Rust will generate the machine code compile time for each type.
A generic type can be bound or restricted to one or more traits. In this case it is bound to the Ord trait so the function can only be used with types that can be compared.
fn min<T: Ord>(value1: T, value2: T) -> T {
if value1 <= value2 {
value1
} else {
value2
}
}
### NEXT SHEET ###
7. Concurrency
------------------
The strength of the linear typesystem lies in its application in concurrent programming. Concurrent programming has numerous pitfalls such as data race conditions, deadlocks and livelocks. These pitfalls only come into play when you have shared mutable state. The common solution to this is to remove mutability by adding locks like mutexes. The downside of this is that it is easy to forget to lock or unlock data. And it can be hard to tell if a design is deadlock or livelock free.
Rust tries to solve this in a different manner. Instead of removing the mutability, it removes the shared part by means of the ownership system.
Rust supports a number of ways to support concurrency. Such as:
- fork join
- channels
- arcs/mutex
- condition variables
### NEXT SHEET ###
A good example of the use of the linear typesystem is the channels mechanism.
Channels are thread safe one way queues for multiple producer and single consumer. Channels can be synchronous or async. When a producer sends data it transfers ownership of that data to the channel.
This guarantees that the producer will never modify the content after sending, thus preventing data races. A consumer can take data from the channel and gain ownership from the queue. After that the consumer is free to use the data in any way it wants.
Contrast this with other type systems where you either send a potentilly large copy or you send a pointer which cannot provide the guarantee of immutability. It also relieves the user from having to work with locking or shared memory.
Unfortunately we cannot dive further in this topic here,but we have coding example at the end of this presentation will show an example of fork/join.
### NEXT SHEET ###
One last word on Arcs and Mutexes. In Rust you can share immutable data between threads by means of a so called Arc (atomic reference count). This is a thread safe reference counted type, which ensures the data contained in it will not be destroyed until the counter reaches 0.
Rust does support a mutex type if you really do need to share mutable data.
The advantage of this type is that is wraps a mutex around a value, so you cannot get to the data unless you have the lock. A lock is scoped and automatically unlocked. If a thread aborts, the mutex is marked as poisoned, so that other thread cannot read the data which may be corrupted.
Unfortunately, Rust cannot protect you from deadlocks and livelocks, so consider carefully before using a mutex.