Confused about a formula in the paper

![image](https://github.com/abacusai/smaug/assets/10270306/0492754b-5713-4a87-b297-576656647cbb)
Hi, I'm trying to learn how DPOP works, but I got stuck on the derivation of DPO (the proof behind DPOP's motivation).
Specifically, I don't follow equation 4 in Appendix B.1, "Derivation for DPO".
Could you elaborate a little on how equation 4 is derived, or point me to relevant references?

Really appreciate it! 

