|
3 | 3 | {
|
4 | 4 | "cell_type": "markdown",
|
5 | 5 | "metadata": {
|
6 |
| - "id": "view-in-github", |
7 |
| - "colab_type": "text" |
| 6 | + "colab_type": "text", |
| 7 | + "id": "view-in-github" |
8 | 8 | },
|
9 | 9 | "source": [
|
10 | 10 | "<a href=\"https://colab.research.google.com/github/huggingface/deep-rl-class/blob/main/notebooks/unit3.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
|
|
41 | 41 | },
|
42 | 42 | {
|
43 | 43 | "cell_type": "markdown",
|
| 44 | + "metadata": { |
| 45 | + "id": "ykJiGevCMVc5" |
| 46 | + }, |
44 | 47 | "source": [
|
45 | 48 | "### 🎮 Environments: \n",
|
46 | 49 | "\n",
|
|
51 | 54 | "### 📚 RL-Library: \n",
|
52 | 55 | "\n",
|
53 | 56 | "- [RL-Baselines3-Zoo](https://github.com/DLR-RM/rl-baselines3-zoo)"
|
54 |
| - ], |
55 |
| - "metadata": { |
56 |
| - "id": "ykJiGevCMVc5" |
57 |
| - } |
| 57 | + ] |
58 | 58 | },
|
59 | 59 | {
|
60 | 60 | "cell_type": "markdown",
|
|
72 | 72 | },
|
73 | 73 | {
|
74 | 74 | "cell_type": "markdown",
|
| 75 | + "metadata": { |
| 76 | + "id": "TsnP0rjxMn1e" |
| 77 | + }, |
75 | 78 | "source": [
|
76 | 79 | "## This notebook is from Deep Reinforcement Learning Course\n",
|
77 | 80 | "<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/deep-rl-course-illustration.jpg\" alt=\"Deep RL Course illustration\"/>"
|
78 |
| - ], |
79 |
| - "metadata": { |
80 |
| - "id": "TsnP0rjxMn1e" |
81 |
| - } |
| 81 | + ] |
82 | 82 | },
|
83 | 83 | {
|
84 | 84 | "cell_type": "markdown",
|
|
114 | 114 | },
|
115 | 115 | {
|
116 | 116 | "cell_type": "markdown",
|
117 |
| - "source": [ |
118 |
| - "We're constantly trying to improve our tutorials, so **if you find some issues in this notebook**, please [open an issue on the Github Repo](https://github.com/huggingface/deep-rl-class/issues)." |
119 |
| - ], |
120 | 117 | "metadata": {
|
121 | 118 | "id": "7kszpGFaRVhq"
|
122 |
| - } |
| 119 | + }, |
| 120 | + "source": [ |
| 121 | + "We're constantly trying to improve our tutorials, so **if you find some issues in this notebook**, please [open an issue on the Github Repo](https://github.com/huggingface/deep-rl-class/issues)." |
| 122 | + ] |
123 | 123 | },
|
124 | 124 | {
|
125 | 125 | "cell_type": "markdown",
|
|
142 | 142 | },
|
143 | 143 | {
|
144 | 144 | "cell_type": "markdown",
|
| 145 | + "metadata": { |
| 146 | + "id": "Nc8BnyVEc3Ys" |
| 147 | + }, |
145 | 148 | "source": [
|
146 | 149 | "## A piece of advice 💡\n",
|
147 | 150 | "It's better to run this colab in a copy on your Google Drive, so that **if it times out** you still have the saved notebook on your Google Drive and do not need to fill in everything from scratch.\n",
|
|
151 | 154 | "Also, we're going to **train it for 90 minutes with 1M timesteps**. Typing `!nvidia-smi` will tell you what GPU you're using.\n",
|
152 | 155 | "\n",
|
153 | 156 | "And if you want to train for longer, such as 10 million steps, this will take about 9 hours, potentially resulting in Colab timing out. In that case, I recommend running this on your local computer (or somewhere else). Just click on: `File>Download`. "
|
154 |
| - ], |
155 |
| - "metadata": { |
156 |
| - "id": "Nc8BnyVEc3Ys" |
157 |
| - } |
| 157 | + ] |
158 | 158 | },
|
159 | 159 | {
|
160 | 160 | "cell_type": "markdown",
|
| 161 | + "metadata": { |
| 162 | + "id": "PU4FVzaoM6fC" |
| 163 | + }, |
161 | 164 | "source": [
|
162 | 165 | "## Set the GPU 💪\n",
|
163 | 166 | "- To **accelerate the agent's training, we'll use a GPU**. To do that, go to `Runtime > Change Runtime type`\n",
|
164 | 167 | "\n",
|
165 | 168 | "<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/gpu-step1.jpg\" alt=\"GPU Step 1\">"
|
166 |
| - ], |
167 |
| - "metadata": { |
168 |
| - "id": "PU4FVzaoM6fC" |
169 |
| - } |
| 169 | + ] |
170 | 170 | },
|
171 | 171 | {
|
172 | 172 | "cell_type": "markdown",
|
| 173 | + "metadata": { |
| 174 | + "id": "KV0NyFdQM9ZG" |
| 175 | + }, |
173 | 176 | "source": [
|
174 | 177 | "- `Hardware Accelerator > GPU`\n",
|
175 | 178 | "\n",
|
176 | 179 | "<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/gpu-step2.jpg\" alt=\"GPU Step 2\">"
|
177 |
| - ], |
178 |
| - "metadata": { |
179 |
| - "id": "KV0NyFdQM9ZG" |
180 |
| - } |
| 180 | + ] |
181 | 181 | },
|
182 | 182 | {
|
183 | 183 | "cell_type": "markdown",
|
| 184 | + "metadata": { |
| 185 | + "id": "wS_cVefO-aYg" |
| 186 | + }, |
184 | 187 | "source": [
|
185 | 188 | "# Install RL-Baselines3 Zoo and its dependencies 📚\n",
|
186 | 189 | "\n",
|
187 | 190 | "If you see `ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.`, **this is normal and it's not a critical error**: there's a version conflict, but the packages we need are installed."
|
188 |
| - ], |
189 |
| - "metadata": { |
190 |
| - "id": "wS_cVefO-aYg" |
191 |
| - } |
| 191 | + ] |
192 | 192 | },
|
193 | 193 | {
|
194 | 194 | "cell_type": "code",
|
195 |
| - "source": [ |
196 |
| - "# For now we install this update of RL-Baselines3 Zoo\n", |
197 |
| - "!pip install git+https://github.com/DLR-RM/rl-baselines3-zoo@update/hf" |
198 |
| - ], |
| 195 | + "execution_count": null, |
199 | 196 | "metadata": {
|
200 | 197 | "id": "hLTwHqIWdnPb"
|
201 | 198 | },
|
202 |
| - "execution_count": null, |
203 |
| - "outputs": [] |
| 199 | + "outputs": [], |
| 200 | + "source": [ |
| 201 | + "# For now we install this update of RL-Baselines3 Zoo\n", |
| 202 | + "!pip install git+https://github.com/DLR-RM/rl-baselines3-zoo@update/hf" |
| 203 | + ] |
204 | 204 | },
|
205 | 205 | {
|
206 | 206 | "cell_type": "markdown",
|
207 |
| - "source": [ |
208 |
| - "IF AND ONLY IF THE VERSION ABOVE DOES NOT EXIST ANYMORE. UNCOMMENT AND INSTALL THE ONE BELOW" |
209 |
| - ], |
210 | 207 | "metadata": {
|
211 | 208 | "id": "p0xe2sJHdtHy"
|
212 |
| - } |
| 209 | + }, |
| 210 | + "source": [ |
| 211 | + "IF AND ONLY IF THE VERSION ABOVE DOES NOT EXIST ANYMORE. UNCOMMENT AND INSTALL THE ONE BELOW" |
| 212 | + ] |
213 | 213 | },
|
214 | 214 | {
|
215 | 215 | "cell_type": "code",
|
216 |
| - "source": [ |
217 |
| - "#!pip install rl_zoo3==2.0.0a9" |
218 |
| - ], |
| 216 | + "execution_count": null, |
219 | 217 | "metadata": {
|
220 | 218 | "id": "N0d6wy-F-f39"
|
221 | 219 | },
|
222 |
| - "execution_count": null, |
223 |
| - "outputs": [] |
| 220 | + "outputs": [], |
| 221 | + "source": [ |
| 222 | + "#!pip install rl_zoo3==2.0.0a9" |
| 223 | + ] |
224 | 224 | },
|
225 | 225 | {
|
226 | 226 | "cell_type": "code",
|
227 |
| - "source": [ |
228 |
| - "!apt-get install swig cmake ffmpeg" |
229 |
| - ], |
| 227 | + "execution_count": null, |
230 | 228 | "metadata": {
|
231 | 229 | "id": "8_MllY6Om1eI"
|
232 | 230 | },
|
233 |
| - "execution_count": null, |
234 |
| - "outputs": [] |
| 231 | + "outputs": [], |
| 232 | + "source": [ |
| 233 | + "!apt-get install swig cmake ffmpeg" |
| 234 | + ] |
235 | 235 | },
|
236 | 236 | {
|
237 | 237 | "cell_type": "markdown",
|
|
244 | 244 | },
|
245 | 245 | {
|
246 | 246 | "cell_type": "code",
|
247 |
| - "source": [ |
248 |
| - "!pip install gymnasium[atari]\n", |
249 |
| - "!pip install gymnasium[accept-rom-license]" |
250 |
| - ], |
| 247 | + "execution_count": null, |
251 | 248 | "metadata": {
|
252 | 249 | "id": "NsRP-lX1_2fC"
|
253 | 250 | },
|
254 |
| - "execution_count": null, |
255 |
| - "outputs": [] |
| 251 | + "outputs": [], |
| 252 | + "source": [ |
| 253 | + "!pip install gymnasium[atari]\n", |
| 254 | + "!pip install gymnasium[accept-rom-license]" |
| 255 | + ] |
256 | 256 | },
|
257 | 257 | {
|
258 | 258 | "cell_type": "markdown",
|
| 259 | + "metadata": { |
| 260 | + "id": "bTpYcVZVMzUI" |
| 261 | + }, |
259 | 262 | "source": [
|
260 | 263 | "## Create a virtual display 🔽\n",
|
261 | 264 | "\n",
|
262 | 265 | "In this notebook, we'll need to generate a replay video. To do so in Colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames). \n",
|
263 | 266 | "\n",
|
264 | 267 | "Hence the following cell will install the libraries and create and run a virtual screen 🖥"
|
265 |
| - ], |
266 |
| - "metadata": { |
267 |
| - "id": "bTpYcVZVMzUI" |
268 |
| - } |
| 268 | + ] |
269 | 269 | },
|
270 | 270 | {
|
271 | 271 | "cell_type": "code",
|
|
283 | 283 | },
|
284 | 284 | {
|
285 | 285 | "cell_type": "code",
|
| 286 | + "execution_count": null, |
| 287 | + "metadata": { |
| 288 | + "id": "BE5JWP5rQIKf" |
| 289 | + }, |
| 290 | + "outputs": [], |
286 | 291 | "source": [
|
287 | 292 | "# Virtual display\n",
|
288 | 293 | "from pyvirtualdisplay import Display\n",
|
289 | 294 | "\n",
|
290 | 295 | "virtual_display = Display(visible=0, size=(1400, 900))\n",
|
291 | 296 | "virtual_display.start()"
|
292 |
| - ], |
293 |
| - "metadata": { |
294 |
| - "id": "BE5JWP5rQIKf" |
295 |
| - }, |
296 |
| - "execution_count": null, |
297 |
| - "outputs": [] |
| 297 | + ] |
298 | 298 | },
|
299 | 299 | {
|
300 | 300 | "cell_type": "markdown",
|
|
310 | 310 | "\n",
|
311 | 311 | "This is a template example:\n",
|
312 | 312 | "\n",
|
313 |
| - "```\n", |
| 313 | + "```yaml\n", |
314 | 314 | "SpaceInvadersNoFrameskip-v4:\n",
|
315 | 315 | " env_wrapper:\n",
|
316 | 316 | " - stable_baselines3.common.atari_wrappers.AtariWrapper\n",
|
|
755 | 755 | },
|
756 | 756 | {
|
757 | 757 | "cell_type": "markdown",
|
758 |
| - "source": [ |
759 |
| - "See you on Bonus unit 2! 🔥 " |
760 |
| - ], |
761 | 758 | "metadata": {
|
762 | 759 | "id": "Kc3udPT-RcXc"
|
763 |
| - } |
| 760 | + }, |
| 761 | + "source": [ |
| 762 | + "See you on Bonus unit 2! 🔥 " |
| 763 | + ] |
764 | 764 | },
|
765 | 765 | {
|
766 | 766 | "cell_type": "markdown",
|
|
773 | 773 | }
|
774 | 774 | ],
|
775 | 775 | "metadata": {
|
| 776 | + "accelerator": "GPU", |
776 | 777 | "colab": {
|
| 778 | + "include_colab_link": true, |
777 | 779 | "private_outputs": true,
|
778 |
| - "provenance": [], |
779 |
| - "include_colab_link": true |
| 780 | + "provenance": [] |
780 | 781 | },
|
| 782 | + "gpuClass": "standard", |
781 | 783 | "kernelspec": {
|
782 | 784 | "display_name": "Python 3 (ipykernel)",
|
783 | 785 | "language": "python",
|
|
823 | 825 | "_Feature"
|
824 | 826 | ],
|
825 | 827 | "window_display": false
|
826 |
| - }, |
827 |
| - "accelerator": "GPU", |
828 |
| - "gpuClass": "standard" |
| 828 | + } |
829 | 829 | },
|
830 | 830 | "nbformat": 4,
|
831 | 831 | "nbformat_minor": 0
|
|