open-mmlab
diff --git a/.gitmodules b/.gitmodules
Lines changed: 3 additions & 0 deletions
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
Lines changed: 26 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 139 additions & 1 deletion
diff --git a/configs/base_config.yaml b/configs/base_config.yaml
Lines changed: 36 additions & 0 deletions
diff --git a/configs/disneyPixar.yaml b/configs/disneyPixar.yaml
Lines changed: 11 additions & 0 deletions
diff --git a/configs/kFelted.yaml b/configs/kFelted.yaml
Lines changed: 17 additions & 0 deletions
diff --git a/configs/moxin.yaml b/configs/moxin.yaml
Lines changed: 13 additions & 0 deletions
diff --git a/configs/origami.yaml b/configs/origami.yaml
Lines changed: 15 additions & 0 deletions
diff --git a/configs/pixart.yaml b/configs/pixart.yaml
Lines changed: 12 additions & 0 deletions
diff --git a/configs/toonyou.yaml b/configs/toonyou.yaml
Lines changed: 10 additions & 0 deletions
diff --git a/demo/.gitattributes b/demo/.gitattributes
Lines changed: 35 additions & 0 deletions
diff --git a/demo/.gitignore b/demo/.gitignore
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,3 @@
+[submodule "live2diff/MiDaS"]
+	path = live2diff/MiDaS
+	url = git@github.com:lewiji/MiDaS.git
@@ -0,0 +1,26 @@
+repos:
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    # Ruff version.
+    rev: v0.3.5
+    hooks:
+      # Run the linter.
+      - id: ruff
+        args: [ --fix ]
+      # Run the formatter.
+      - id: ruff-format
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.2.1
+    hooks:
+      - id: codespell
+        args: ["-L", "warmup,mose,parms", "--skip", "*.json"]
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.3.0
+    hooks:
+      - id: trailing-whitespace
+      - id: check-yaml
+      - id: end-of-file-fixer
+      - id: requirements-txt-fixer
+      - id: fix-encoding-pragma
+        args: ["--remove"]
+      - id: mixed-line-ending
+        args: ["--fix=lf"]
@@ -10,7 +10,13 @@
 [![arXiv](https://img.shields.io/badge/arXiv-2407.08701-b31b1b.svg)](https://arxiv.org/abs/2407.08701)
 [![Project Page](https://img.shields.io/badge/Project-Page-blue)](https://live2diff.github.io/)
 
-## Code will be release in **one week**, stay tuned!
+## Release
+
+* [2024/07/18] We release [HuggingFace space](https://huggingface.co/spaces/Leoxing/Live2Diff), code, and [checkpoints](https://huggingface.co/Leoxing/Live2Diff).
+
+## TODO List
+
+- [ ] Support Colab
 
 ## Key Features
 
@@ -33,8 +39,140 @@ The speed evaluation is conducted on **Ubuntu 20.04.6 LTS** and **Pytorch 2.2.2*
 | 768 x 512  |  **On**  | **12.15** |
 | 768 x 512  |   Off    |   6.29    |
 
+## Installation
+
+### Step0: clone this repository and submodule
+
+```bash
+git clone https://github.com/open-mmlab/Live2Diff.git
+# or via ssh
+git clone git@github.com:open-mmlab/Live2Diff.git
+
+cd Live2Diff
+git submodule update --init --recursive
+```
+
+### Step1: Make Environment
+
+Create a virtual environment via conda:
+
+```bash
+conda create -n live2diff python=3.10
+conda activate live2diff
+```
+
+### Step2: Install PyTorch and xformers
+
+Select the appropriate version for your system.
+
+```bash
+# CUDA 11.8
+pip install torch torchvision xformers --index-url https://download.pytorch.org/whl/cu118
+# CUDA 12.1
+pip install torch torchvision xformers --index-url https://download.pytorch.org/whl/cu121
+```
+
+Please refer to https://pytorch.org/ for more details.
+
+### Step3: Install Project
+
+If you want to use TensorRT acceleration (recommended), install the project with one of the following commands.
+
+```bash
+# for cuda 11.x
+pip install ."[tensorrt_cu11]"
+# for cuda 12.x
+pip install ."[tensorrt_cu12]"
+```
+
+Otherwise, you can install it via
+
+```bash
+pip install .
+```
+
+If you want to install it in development mode (a.k.a. "Editable Installs"), add the `-e` option.
+
+```bash
+# for cuda 11.x
+pip install -e ."[tensorrt_cu11]"
+# for cuda 12.x
+pip install -e ."[tensorrt_cu12]"
+# or
+pip install -e .
+```
+
+### Step4: Download Checkpoints and Demo Data
+
+1. Download StableDiffusion-v1-5
+
+```bash
+huggingface-cli download runwayml/stable-diffusion-v1-5 --local-dir ./models/Model/stable-diffusion-v1-5
+```
+
+2. Download Checkpoint from [HuggingFace](https://huggingface.co/Leoxing/Live2Diff) and put it under `models` folder.
+
+3. Download Depth Detector from MiDaS's official [release](https://github.com/isl-org/MiDaS/releases/download/v3/dpt_hybrid_384.pt) and put it under `models` folder.
+
+4. Apply for a download token from [civitAI](https://education.civitai.com/civitais-guide-to-downloading-via-api/) and then download the DreamBooth and LoRA weights via the script:
+
+```bash
+# download all DreamBooth/Lora
+bash scripts/download.sh all YOUR_TOKEN
+# or download the one you want to use
+bash scripts/download.sh disney YOUR_TOKEN
+```
+
+5. Download demo data from [OneDrive](https://pjlab-my.sharepoint.cn/:f:/g/personal/xingzhening_pjlab_org_cn/EpefezlxFXNBk93RDttYLMUBP2bofb6AZDfyRIkGapmIrQ?e=A6h2Eb).
+
+The resulting directory structure, including the `models` folder, should look like this:
+
+```bash
+./
+|-- models
+|   |-- LoRA
+|   |   |-- MoXinV1.safetensors
+|   |   `-- ...
+|   |-- Model
+|   |   |-- 3Guofeng3_v34.safetensors
+|   |   |-- ...
+|   |   `-- stable-diffusion-v1-5
+|   |-- live2diff.ckpt
+|   `-- dpt_hybrid_384.pt
+`--data
+   |-- 1.mp4
+   |-- 2.mp4
+   |-- 3.mp4
+   `-- 4.mp4
+```
+
+### Notification
+
+The above installation steps (e.g. [download script](#step4-download-checkpoints-and-demo-data)) are for Linux users and not well tested on Windows. If you face any difficulties, please feel free to open an issue 🤗.
+
+## Quick Start
+
+You can try examples under [`data`](./data) directory. For example,
+```bash
+# with TensorRT acceleration; please be patient on the first run, which may take more than 20 minutes
+python test.py ./data/1.mp4 ./configs/disneyPixar.yaml --max-frames -1 --prompt "1man is talking" --output work_dirs/1-disneyPixar.mp4 --height 512 --width 512 --acceleration tensorrt
+
+# without TensorRT acceleration
+python test.py ./data/2.mp4 ./configs/disneyPixar.yaml --max-frames -1 --prompt "1man is talking" --output work_dirs/1-disneyPixar.mp4 --height 512 --width 512 --acceleration none
+```
+
+You can adjust denoising strength via `--num-inference-steps`, `--strength`, and `--t-index-list`. Please refer to `test.py` for more details.
+
+## Troubleshooting
+
+1. If you hit a CUDA out-of-memory error with TensorRT, try reducing `t-index-list` or `strength`. During TensorRT inference we maintain a group of buffers for the kv-cache, which consumes extra memory. Reducing `t-index-list` or `strength` shrinks the kv-cache and frees GPU memory.
+
 ## Real-Time Video2Video Demo
 
+There is an interactive txt2img demo in [`demo`](./demo) directory!
+
+Please refer to [`demo/README.md`](./demo/README.md) for more details.
+
 <div align="center">
     <table align="center">
         <tbody>
 
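Review note on the README hunk above: Step4 lists several files that must exist before `test.py` can run. A minimal sketch for checking that layout is given below. The helper name `missing_paths` is hypothetical and not part of the repository; the path list is copied from the README tree and from `configs/base_config.yaml` in this diff.

```python
from pathlib import Path

# Paths taken from the README tree and configs/base_config.yaml in this diff.
EXPECTED = [
    "models/Model/stable-diffusion-v1-5",
    "models/live2diff.ckpt",
    "models/dpt_hybrid_384.pt",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


# Report anything that still needs to be downloaded into the current checkout.
for path in missing_paths("."):
    print(f"missing: {path}")
```

Running this from the repository root prints one line per missing file or directory and prints nothing once the layout matches the README.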
@@ -0,0 +1,36 @@
+pretrained_model_path: "./models/Model/stable-diffusion-v1-5"
+
+motion_module_path: './models/live2diff.ckpt'
+depth_model_path: './models/dpt_hybrid_384.pt'
+
+unet_additional_kwargs:
+  cond_mapping: true
+  use_inflated_groupnorm:          true
+  use_motion_module              : true
+  motion_module_resolutions      : [ 1,2,4,8 ]
+  unet_use_cross_frame_attention : false
+  unet_use_temporal_attention    : false
+
+  motion_module_type: Streaming
+  motion_module_kwargs:
+    num_attention_heads                : 8
+    num_transformer_block              : 1
+    attention_block_types              : [ "Temporal_Self", "Temporal_Self" ]
+    temporal_position_encoding         : true
+    temporal_position_encoding_max_len : 24
+    temporal_attention_dim_div         : 1
+    zero_initialize                    : true
+
+    attention_class_name               : 'stream'
+
+    attention_kwargs:
+      window_size: 16
+      sink_size: 8
+
+noise_scheduler_kwargs:
+  num_train_timesteps: 1000
+  beta_start:    0.00085
+  beta_end:      0.012
+  beta_schedule: "linear"
+  steps_offset:  1
+  clip_sample:   False
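Review note on `noise_scheduler_kwargs` above: the sketch below shows what these values would produce, assuming `beta_schedule: "linear"` means evenly spaced betas between `beta_start` and `beta_end`, as in common DDPM/DDIM scheduler implementations. The scheduler class this repository actually uses is not shown in the diff, and the function names here are illustrative only.

```python
def linear_betas(beta_start=0.00085, beta_end=0.012, num_train_timesteps=1000):
    """Evenly spaced betas from beta_start to beta_end, inclusive (assumed meaning of "linear")."""
    step = (beta_end - beta_start) / (num_train_timesteps - 1)
    return [beta_start + i * step for i in range(num_train_timesteps)]


def alphas_cumprod(betas):
    """Running product of (1 - beta_t): the fraction of signal kept at timestep t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


betas = linear_betas()
acp = alphas_cumprod(betas)
# The cumulative product falls monotonically from near 1 towards 0 as t grows.
print(len(betas), betas[0], acp[0], acp[-1])
```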
@@ -0,0 +1,11 @@
+# good s0.4
+base: "./configs/base_config.yaml"
+
+prompt_template: "masterpiece, best quality, intricate, print, pattern, {}"
+
+third_party_dict:
+  dreambooth: "./models/Model/disneyPixarCartoon_v10.safetensors"
+  clip_skip: 2
+
+num_inference_steps: 50
+t_index_list: [30, 36, 42]
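Review note on `prompt_template`, `num_inference_steps` and `t_index_list` above: the sketch below is one plausible reading of these keys, not the repository's confirmed behaviour. It assumes that the `{}` placeholder is filled with the `--prompt` value, that the scheduler uses DDIM-style "leading" timestep spacing with `steps_offset: 1`, and that each entry of `t_index_list` indexes into that schedule. All function names are hypothetical.

```python
def fill_template(prompt_template, prompt):
    # Assumption: the single "{}" placeholder is replaced by the user prompt.
    return prompt_template.format(prompt)


def leading_timesteps(num_inference_steps, num_train_timesteps=1000, steps_offset=1):
    # Assumption: DDIM-style "leading" spacing, listed from noisiest to cleanest.
    ratio = num_train_timesteps // num_inference_steps
    return [i * ratio + steps_offset for i in reversed(range(num_inference_steps))]


def select_timesteps(timesteps, t_index_list):
    # Assumption: each entry of t_index_list is an index into the schedule.
    return [timesteps[i] for i in t_index_list]


template = "masterpiece, best quality, intricate, print, pattern, {}"
print(fill_template(template, "1man is talking"))

schedule = leading_timesteps(50)
print(select_timesteps(schedule, [30, 36, 42]))  # [381, 261, 141]
```

Under these assumptions, larger indices pick later, less noisy timesteps, so a list such as `[30, 36, 42]` denoises only the tail of the 50-step schedule.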
@@ -0,0 +1,17 @@
+# good, s06-4 and s05-4
+base: "./configs/base_config.yaml"
+
+prompt_template: "masterpiece, best quality, felted, {},"
+
+third_party_dict:
+  dreambooth: "./models/Model/revAnimated_v2RebirthVAE.safetensors"
+  lora_list:
+    - lora: './models/LoRA/kFeltedReV.safetensors'
+      lora_alpha: 1
+  clip_skip: 2
+
+num_inference_steps: 50
+t_index_list: [25, 34, 43]
+# or
+# num_inference_steps: 50
+# t_index_list: [20, 27, 34, 41]
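Review note on the `base:` key used by the style configs: the diff does not show how `base` is resolved, so the sketch below only illustrates one common approach, a recursive merge in which the style config overrides the base config. `merge_config` is a hypothetical helper, and the dictionaries are abbreviated stand-ins for the parsed YAML files.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`; values in `override` win."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Abbreviated stand-ins for configs/base_config.yaml and configs/kFelted.yaml.
base = {
    "motion_module_path": "./models/live2diff.ckpt",
    "noise_scheduler_kwargs": {"beta_schedule": "linear", "steps_offset": 1},
}
style = {
    "prompt_template": "masterpiece, best quality, felted, {},",
    "num_inference_steps": 50,
    "t_index_list": [25, 34, 43],
}

config = merge_config(base, style)
print(sorted(config))
```

A deep copy keeps the base dictionary untouched, so the same base can be merged with several style configs in one process.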
@@ -0,0 +1,13 @@
+base: "./configs/base_config.yaml"
+
+prompt_template: 'shukezouma, negative space, shuimobysim, official art,extremely detailed CG,unity 8k wallpaper,chinese ink painting, {}'
+
+third_party_dict:
+  lora_list:
+    - lora: "./models/LoRA/MoXinV1.safetensors"
+      lora_alpha: 0.7
+  dreambooth: "./models/Model/3Guofeng3_v34.safetensors"
+  clip_skip: 2
+
+num_inference_steps: 50
+t_index_list: [30, 36, 42]
@@ -0,0 +1,15 @@
+# good, s04!!!!
+
+base: "./configs/base_config.yaml"
+
+prompt_template: "(masterpiece),best quality, a origami paper of {}"
+
+third_party_dict:
+  dreambooth: "./models/Model/helloartdoor_V122p.safetensors"
+  lora_list:
+    - lora: "./models/LoRA/ral-origami-sd15.safetensors"
+      lora_alpha: 1
+  clip_skip: 2
+
+num_inference_steps: 50
+t_index_list: [30, 36, 42]
@@ -0,0 +1,12 @@
+# good, 0.4 & 0.3
+
+base: "./configs/base_config.yaml"
+
+prompt_template: "(masterpiece), best quality, {}"
+
+third_party_dict:
+  dreambooth: "./models/Model/aziibpixelmix_v10.safetensors"
+  clip_skip: 2
+
+num_inference_steps: 4
+strength: 0.6
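Review note on `num_inference_steps: 4` with `strength: 0.6`: unlike the other style configs, this one sets `strength` instead of `t_index_list`. The sketch below uses the convention of diffusers-style img2img pipelines, where `strength` decides how many of the final scheduler steps are run. Whether `test.py` follows the same rule is an assumption, and `steps_after_strength` is a hypothetical name.

```python
def steps_after_strength(num_inference_steps, strength):
    """Return (t_start, steps_run) under the img2img strength convention (assumed)."""
    # Number of steps that are actually denoised, capped at the full schedule.
    init_steps = min(int(num_inference_steps * strength), num_inference_steps)
    # Index of the first scheduler step that is run; earlier steps are skipped.
    t_start = max(num_inference_steps - init_steps, 0)
    return t_start, num_inference_steps - t_start


print(steps_after_strength(4, 0.6))  # (2, 2)
```

Under this convention, lowering `strength` skips more of the early, noisy steps, which is consistent with the Troubleshooting advice that reducing `strength` lowers memory use.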
@@ -0,0 +1,10 @@
+base: "./configs/base_config.yaml"
+
+prompt: "masterpiece, best quality, intricate, print, pattern, {}"
+
+third_party_dict:
+  dreambooth: "./models/Model/toonyou_beta6.safetensors"
+  clip_skip: 2
+
+num_inference_steps: 50
+t_index_list: [25, 31, 37, 43]
@@ -0,0 +1,35 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
@@ -0,0 +1,6 @@
+__pycache__/
+venv/
+public/
+*.pem
+!lib/
+!static/