
Commit 5982edc

intermediate/forward_ad_usage.py translation (#1031)
* translation/intermediate_source_forward_ad_usage
* translation/intermediate_source_forward_ad_usage
* Reflect correction
* Reflect correction
* Reflect correction
* Reflect Correction
* Fix missing part
* fix tesnor
* minor fix
1 parent a776f7c commit 5982edc

1 file changed: +86 -106 lines changed

@@ -1,32 +1,21 @@
 # -*- coding: utf-8 -*-
 """
-Forward-mode Automatic Differentiation (Beta)
+Forward-mode Automatic Differentiation (Beta)
 =============================================
 
-This tutorial demonstrates how to use forward-mode AD to compute
-directional derivatives (or equivalently, Jacobian-vector products).
+**Translation**: `κΉ€κ²½λ―Ό <https://github.com/BcKmini>`_
 
-The tutorial below uses some APIs only available in versions >= 1.11
-(or nightly builds).
+This tutorial demonstrates how to use forward-mode automatic differentiation (forward-mode AD) to compute directional derivatives (or, equivalently, Jacobian-vector products).
 
-Also note that forward-mode AD is currently in beta. The API is
-subject to change and operator coverage is still incomplete.
+The tutorial below uses some APIs that are only available in version 1.11 or later (or nightly builds).
 
-Basic Usage
+Also note that forward-mode AD is currently in beta. The API is therefore subject to change, and some operators may not be supported yet.
+
+Basic Usage
 --------------------------------------------------------------------
-Unlike reverse-mode AD, forward-mode AD computes gradients eagerly
-alongside the forward pass. We can use forward-mode AD to compute a
-directional derivative by performing the forward pass as before,
-except we first associate our input with another tensor representing
-the direction of the directional derivative (or equivalently, the ``v``
-in a Jacobian-vector product). When an input, which we call "primal", is
-associated with a "direction" tensor, which we call "tangent", the
-resultant new tensor object is called a "dual tensor" for its connection
-to dual numbers[0].
-
-As the forward pass is performed, if any input tensors are dual tensors,
-extra computation is performed to propagate this "sensitivity" of the
-function.
+Unlike reverse-mode AD, forward-mode AD computes gradients eagerly (without deferring the computation) as the forward pass runs. To compute a directional derivative with forward-mode AD, we first associate the input with another tensor representing the direction of the directional derivative (equivalently, the ``v`` in a Jacobian-vector product) and then perform the forward pass as before. When an input, which we call the "primal", is associated with a "direction" tensor, which we call the "tangent", the resulting new tensor object is called a "dual tensor" because of its connection to dual numbers [0].
+
+As the forward pass is performed, if any of the input tensors are dual tensors, extra computation is performed to propagate this "sensitivity" of the function.
 
 """
 
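To make the tangent-propagation idea above concrete, here is a minimal standalone sketch (not taken from the tutorial file; it assumes PyTorch >= 1.11, and the names ``x`` and ``v`` are illustrative). For f(x) = x ** 3, the propagated tangent is 3 * x**2 * v, i.e. the Jacobian-vector product J(x) @ v:

import torch
import torch.autograd.forward_ad as fwAD

x = torch.randn(3)   # primal
v = torch.randn(3)   # tangent: the direction of the directional derivative

with fwAD.dual_level():
    dual_x = fwAD.make_dual(x, v)
    dual_y = dual_x ** 3                      # forward pass; the tangent is propagated eagerly
    jvp = fwAD.unpack_dual(dual_y).tangent    # directional derivative of x ** 3 along v

assert torch.allclose(jvp, 3 * x ** 2 * v)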
@@ -39,49 +28,45 @@
 def fn(x, y):
     return x ** 2 + y ** 2
 
-# All forward AD computation must be performed in the context of
-# a ``dual_level`` context. All dual tensors created in such a context
-# will have their tangents destroyed upon exit. This is to ensure that
-# if the output or intermediate results of this computation are reused
-# in a future forward AD computation, their tangents (which are associated
-# with this computation) won't be confused with tangents from the later
-# computation.
+# All forward AD computation must be performed inside a ``dual_level`` context.
+# The tangents of all dual tensors created in such a context are destroyed when the context is exited.
+# This ensures that, if the output or intermediate results of this computation are reused in a later forward AD computation,
+# the tangents belonging to this computation are not confused with the tangents of that later computation.
 with fwAD.dual_level():
-    # To create a dual tensor we associate a tensor, which we call the
-    # primal with another tensor of the same size, which we call the tangent.
-    # If the layout of the tangent is different from that of the primal,
-    # The values of the tangent are copied into a new tensor with the same
-    # metadata as the primal. Otherwise, the tangent itself is used as-is.
+    # To create a dual tensor, we associate the "primal" tensor with another
+    # tensor of the same size, namely the "tangent".
+    # If the layout of the tangent differs from that of the primal,
+    # the values of the tangent are copied into a new tensor with the same metadata as the primal.
+    # Otherwise, the tangent itself is used as-is.
     #
-    # It is also important to note that the dual tensor created by
-    # ``make_dual`` is a view of the primal.
+    # It is also important to note that the dual tensor created by ``make_dual``
+    # is a **view (a reference that shares data)** of the primal tensor.
     dual_input = fwAD.make_dual(primal, tangent)
     assert fwAD.unpack_dual(dual_input).tangent is tangent
 
-    # To demonstrate the case where the copy of the tangent happens,
-    # we pass in a tangent with a layout different from that of the primal
+    # To demonstrate the case where the tangent is copied,
+    # we pass in a tangent whose layout differs from that of the primal.
     dual_input_alt = fwAD.make_dual(primal, tangent.T)
     assert fwAD.unpack_dual(dual_input_alt).tangent is not tangent
 
-    # Tensors that do not have an associated tangent are automatically
-    # considered to have a zero-filled tangent of the same shape.
+    # Tensors that do not have an associated tangent are automatically
+    # considered to have a zero-filled tangent of the same shape.
     plain_tensor = torch.randn(10, 10)
     dual_output = fn(dual_input, plain_tensor)
 
-    # Unpacking the dual returns a ``namedtuple`` with ``primal`` and ``tangent``
-    # as attributes
+    # Unpacking the dual tensor returns a ``namedtuple`` with ``primal`` and ``tangent``
+    # as attributes.
     jvp = fwAD.unpack_dual(dual_output).tangent
 
 assert fwAD.unpack_dual(dual_output).tangent is None
 
 ######################################################################
-# Usage with Modules
+# Usage with Modules
 # --------------------------------------------------------------------
-# To use ``nn.Module`` with forward AD, replace the parameters of your
-# model with dual tensors before performing the forward pass. At the
-# time of writing, it is not possible to create dual tensor
-# `nn.Parameter`s. As a workaround, one must register the dual tensor
-# as a non-parameter attribute of the module.
+# To use ``nn.Module`` with forward AD, replace the parameters of the model
+# with dual tensors before performing the forward pass. Currently, it is not possible
+# to create dual tensor `nn.Parameter`s. As a workaround,
+# one must register the dual tensor as a non-parameter attribute of the module.
 
 import torch.nn as nn
 
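As a quick sanity check on the basic usage above, the computed JVP can be compared with the analytic directional derivative. A standalone sketch (illustrative, not part of the patch; PyTorch >= 1.11 assumed): for fn(x, y) = x**2 + y**2 with no tangent on y, the JVP is 2 * primal * tangent.

import torch
import torch.autograd.forward_ad as fwAD

def fn(x, y):
    return x ** 2 + y ** 2

primal = torch.randn(10, 10)
tangent = torch.randn(10, 10)
plain_tensor = torch.randn(10, 10)      # no tangent -> treated as a zero tangent

with fwAD.dual_level():
    dual_input = fwAD.make_dual(primal, tangent)
    dual_output = fn(dual_input, plain_tensor)
    jvp = fwAD.unpack_dual(dual_output).tangent

# d/dt [(x + t*v)**2 + y**2] at t = 0 is 2 * x * v (the y term contributes nothing).
assert torch.allclose(jvp, 2 * primal * tangent)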
@@ -100,52 +85,52 @@ def fn(x, y):
     jvp = fwAD.unpack_dual(out).tangent
 
 ######################################################################
-# Using the functional Module API (beta)
+# Using the functional Module API (beta)
 # --------------------------------------------------------------------
-# Another way to use ``nn.Module`` with forward AD is to utilize
-# the functional Module API (also known as the stateless Module API).
+# Another way to use ``nn.Module`` with forward AD is to utilize
+# the functional Module API (also known as the stateless Module API).
 
 from torch.func import functional_call
 
-# We need a fresh module because the functional call requires the
-# the model to have parameters registered.
+# We need a fresh module because functional_call requires the
+# model to have parameters registered.
 model = nn.Linear(5, 5)
 
 dual_params = {}
 with fwAD.dual_level():
     for name, p in params.items():
-        # Using the same ``tangents`` from the above section
+        # We use the same ``tangents`` as in the section above.
         dual_params[name] = fwAD.make_dual(p, tangents[name])
     out = functional_call(model, dual_params, input)
     jvp2 = fwAD.unpack_dual(out).tangent
 
-# Check our results
+# Check our results
 assert torch.allclose(jvp, jvp2)
 
 ######################################################################
-# Custom autograd Function
+# Custom autograd Function
 # --------------------------------------------------------------------
-# Custom Functions also support forward-mode AD. To create custom Function
-# supporting forward-mode AD, register the ``jvp()`` static method. It is
-# possible, but not mandatory for custom Functions to support both forward
-# and backward AD. See the
-# `documentation <https://pytorch.org/docs/master/notes/extending.html#forward-mode-ad>`_
-# for more information.
+# Custom Functions also support forward-mode AD. To create a custom Function
+# that supports forward-mode AD, register the ``jvp()`` static method.
+# It is possible, but not mandatory, for a custom Function to support both
+# forward and backward AD. See the
+# `documentation <https://pytorch.org/docs/master/notes/extending.html#forward-mode-ad>`_
+# for more information.
 
 class Fn(torch.autograd.Function):
     @staticmethod
     def forward(ctx, foo):
         result = torch.exp(foo)
-        # Tensors stored in ``ctx`` can be used in the subsequent forward grad
-        # computation.
+        # Tensors stored in ``ctx`` can be used in the subsequent forward
+        # gradient computation.
         ctx.result = result
         return result
 
     @staticmethod
     def jvp(ctx, gI):
         gO = gI * ctx.result
-        # If the tensor stored in`` ctx`` will not also be used in the backward pass,
-        # one can manually free it using ``del``
+        # If the tensor stored in ``ctx`` will not also be used in the backward pass,
+        # you can free it from memory manually with ``del``.
         del ctx.result
         return gO
 
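The functional-call approach above can also be verified against a closed form: for ``nn.Linear``, out = input @ W.T + b, so the parameter JVP along tangents (tW, tb) is input @ tW.T + tb. A standalone sketch (not part of the patch; ``inp`` and ``expected`` are illustrative names, and ``torch.func.functional_call`` is assumed to be available):

import torch
import torch.nn as nn
import torch.autograd.forward_ad as fwAD
from torch.func import functional_call

model = nn.Linear(5, 5)
inp = torch.randn(16, 5)
params = {name: p for name, p in model.named_parameters()}
tangents = {name: torch.rand_like(p) for name, p in params.items()}

with fwAD.dual_level():
    dual_params = {name: fwAD.make_dual(p, tangents[name]) for name, p in params.items()}
    out = functional_call(model, dual_params, inp)
    jvp = fwAD.unpack_dual(out).tangent

# Closed form for a linear layer: the input carries no tangent, so only the
# parameter tangents contribute to the output tangent.
expected = inp @ tangents["weight"].T + tangents["bias"]
assert torch.allclose(jvp, expected)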
@@ -159,33 +144,30 @@ def jvp(ctx, gI):
     dual_output = fn(dual_input)
     jvp = fwAD.unpack_dual(dual_output).tangent
 
-# It is important to use ``autograd.gradcheck`` to verify that your
-# custom autograd Function computes the gradients correctly. By default,
-# ``gradcheck`` only checks the backward-mode (reverse-mode) AD gradients. Specify
-# ``check_forward_ad=True`` to also check forward grads. If you did not
-# implement the backward formula for your function, you can also tell ``gradcheck``
-# to skip the tests that require backward-mode AD by specifying
-# ``check_backward_ad=False``, ``check_undefined_grad=False``, and
-# ``check_batched_grad=False``.
+# To verify that your custom autograd Function computes the gradients correctly,
+# it is important to use ``autograd.gradcheck``. By default,
+# ``gradcheck`` only checks the reverse-mode AD gradients.
+# You can specify ``check_forward_ad=True`` to also check the forward gradients.
+# If you did not implement the backward formula for your function, specify ``check_backward_ad=False``,
+# ``check_undefined_grad=False``, and ``check_batched_grad=False`` so that
+# ``gradcheck`` skips the tests that require reverse-mode AD.
 torch.autograd.gradcheck(Fn.apply, (primal,), check_forward_ad=True,
                          check_backward_ad=False, check_undefined_grad=False,
                          check_batched_grad=False)
 
 ######################################################################
-# Functional API (beta)
+# Functional API (beta)
 # --------------------------------------------------------------------
-# We also offer a higher-level functional API in functorch
-# for computing Jacobian-vector products that you may find simpler to use
-# depending on your use case.
+# functorch also offers a higher-level functional API for computing
+# Jacobian-vector products, which you may find simpler to use depending on your use case.
 #
-# The benefit of the functional API is that there isn't a need to understand
-# or use the lower-level dual tensor API and that you can compose it with
-# other `functorch transforms (like vmap) <https://pytorch.org/functorch/stable/notebooks/jacobians_hessians.html>`_;
-# the downside is that it offers you less control.
+# The benefit of the functional API is that there is no need to understand or use
+# the lower-level dual tensor API, and that it can be composed with other
+# `functorch transforms (such as vmap) <https://pytorch.org/functorch/stable/notebooks/jacobians_hessians.html>`_. The downside is that it offers less fine-grained control.
 #
-# Note that the remainder of this tutorial will require functorch
-# (https://github.com/pytorch/functorch) to run. Please find installation
-# instructions at the specified link.
+# Note that the remainder of this tutorial requires functorch
+# (https://github.com/pytorch/functorch) to run.
+# Please find installation instructions at the link above.
 
 import functorch as ft
 
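The same ``jvp()``-plus-``gradcheck`` pattern carries over to other element-wise Functions. A standalone sketch (not from the patch; the ``Sin`` class and the input ``x`` are illustrative) that registers a forward-AD rule for sin and checks it with forward-mode ``gradcheck`` only:

import torch

class Sin(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Store what the jvp rule needs as an attribute on ctx,
        # mirroring the ``Fn`` example above.
        ctx.cos_x = torch.cos(x)
        return torch.sin(x)

    @staticmethod
    def jvp(ctx, gI):
        gO = gI * ctx.cos_x      # d/dx sin(x) = cos(x)
        del ctx.cos_x            # not needed afterwards, so free it
        return gO

x = torch.randn(4, 4, dtype=torch.double, requires_grad=True)
torch.autograd.gradcheck(Sin.apply, (x,), check_forward_ad=True,
                         check_backward_ad=False, check_undefined_grad=False,
                         check_batched_grad=False)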
@@ -197,14 +179,15 @@ def jvp(ctx, gI):
 def fn(x, y):
     return x ** 2 + y ** 2
 
-# Here is a basic example to compute the JVP of the above function.
-# The ``jvp(func, primals, tangents)`` returns ``func(*primals)`` as well as the
-# computed Jacobian-vector product (JVP). Each primal must be associated with a tangent of the same shape.
+# Here is a basic example that computes the JVP of the function above.
+# ``jvp(func, primals, tangents)`` returns the result of ``func(*primals)``
+# together with the computed Jacobian-vector product (JVP). Each primal must be
+# associated with a tangent of the same shape.
 primal_out, tangent_out = ft.jvp(fn, (primal0, primal1), (tangent0, tangent1))
 
-# ``functorch.jvp`` requires every primal to be associated with a tangent.
-# If we only want to associate certain inputs to `fn` with tangents,
-# then we'll need to create a new function that captures inputs without tangents:
+# ``functorch.jvp`` requires every primal to be associated with a tangent.
+# If we only want to associate tangents with certain inputs to ``fn``,
+# then we need to create a new function that captures the inputs without tangents:
 primal = torch.randn(10, 10)
 tangent = torch.randn(10, 10)
 y = torch.randn(10, 10)
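The "capture inputs without tangents" idea can be spelled out with a closure. The definition of ``new_fn`` is not shown in the diff (that line is unchanged), so the lambda below is only one possible way to write it; the final check assumes fn(x, y) = x**2 + y**2, for which the JVP with y held fixed is 2 * x * v:

import torch
import functorch as ft

def fn(x, y):
    return x ** 2 + y ** 2

primal = torch.randn(10, 10)
tangent = torch.randn(10, 10)
y = torch.randn(10, 10)

# Capture ``y`` in a closure so that only ``x`` gets a tangent.
new_fn = lambda x: fn(x, y)

primal_out, tangent_out = ft.jvp(new_fn, (primal,), (tangent,))

assert torch.allclose(tangent_out, 2 * primal * tangent)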
@@ -214,33 +197,30 @@ def fn(x, y):
 primal_out, tangent_out = ft.jvp(new_fn, (primal,), (tangent,))
 
 ######################################################################
-# Using the functional API with Modules
+# Using the functional API with Modules
 # --------------------------------------------------------------------
-# To use ``nn.Module`` with ``functorch.jvp`` to compute Jacobian-vector products
-# with respect to the model parameters, we need to reformulate the
-# ``nn.Module`` as a function that accepts both the model parameters and inputs
-# to the module.
+# To use ``nn.Module`` with ``functorch.jvp`` to compute Jacobian-vector products
+# with respect to the model parameters, we need to reformulate the ``nn.Module``
+# as a function that accepts both the model parameters and the inputs to the module.
 
 model = nn.Linear(5, 5)
 input = torch.randn(16, 5)
 tangents = tuple([torch.rand_like(p) for p in model.parameters()])
 
-# Given a ``torch.nn.Module``, ``ft.make_functional_with_buffers`` extracts the state
-# (``params`` and buffers) and returns a functional version of the model that
-# can be invoked like a function.
-# That is, the returned ``func`` can be invoked like
-# ``func(params, buffers, input)``.
-# ``ft.make_functional_with_buffers`` is analogous to the ``nn.Modules`` stateless API
-# that you saw previously and we're working on consolidating the two.
+# Given a ``torch.nn.Module``, ``ft.make_functional_with_buffers`` extracts the state
+# (``params`` and buffers) and returns a functional version of the model
+# that can be invoked like a function.
+# That is, the returned ``func`` can be called like ``func(params, buffers, input)``.
+# ``ft.make_functional_with_buffers`` is analogous to the stateless ``nn.Module`` API
+# that you saw earlier, and work is underway to consolidate the two.
 func, params, buffers = ft.make_functional_with_buffers(model)
 
-# Because ``jvp`` requires every input to be associated with a tangent, we need to
-# create a new function that, when given the parameters, produces the output
+# Because ``jvp`` requires every input to be associated with a tangent,
+# we need to create a new function that, given the parameters, produces the output.
 def func_params_only(params):
     return func(params, buffers, input)
 
 model_output, jvp_out = ft.jvp(func_params_only, (params,), (tangents,))
 
-
 ######################################################################
-# [0] https://en.wikipedia.org/wiki/Dual_number
+# [0] https://en.wikipedia.org/wiki/Dual_number
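Finally, the parameter JVP from the last section can be sanity-checked against the same closed form for a linear layer. A standalone sketch (not part of the patch; ``inp`` replaces the shadowed built-in name ``input``, and the check assumes ``nn.Linear`` yields its parameters in weight-then-bias order):

import torch
import torch.nn as nn
import functorch as ft

model = nn.Linear(5, 5)
inp = torch.randn(16, 5)
tangents = tuple(torch.rand_like(p) for p in model.parameters())

func, params, buffers = ft.make_functional_with_buffers(model)

def func_params_only(params):
    return func(params, buffers, inp)

model_output, jvp_out = ft.jvp(func_params_only, (params,), (tangents,))

# For out = inp @ W.T + b, the JVP along (tW, tb) is inp @ tW.T + tb.
tW, tb = tangents
assert torch.allclose(jvp_out, inp @ tW.T + tb)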