Humanoid control new #1042
Open: ChenControl wants to merge 3 commits into PaddlePaddle:develop from ChenControl:humanoid_control_new
@@ -0,0 +1,218 @@
This project uses PaddleScience and the DeepMind Control Suite (dm_control) to implement a motion control system for a humanoid robot (Humanoid). The system uses deep learning to learn to control the humanoid so that it moves stably. Physics-informed neural network (PINN) methods use the governing equations to accelerate the convergence of deep neural networks, and can even enable unsupervised learning when no training data is available. This project is an attempt at Humanoid control simulation.

1. Development guide - PaddleScience Docs (paddlescience-docs.readthedocs.io)
2. google-deepmind/dm_control: Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo. (github.com)

pip install dm_control

Install paddle (CUDA 11.8)
python3 -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

Install paddlescience
git clone -b develop https://github.com/PaddlePaddle/PaddleScience.git
### If cloning from GitHub is slow, you can clone from Gitee instead
### git clone -b develop https://gitee.com/paddlepaddle/PaddleScience.git
cd PaddleScience
### install PaddleScience in editable mode
python -m pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple

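Before training, it can help to confirm that the three packages are importable. The following is a minimal sketch, assuming the import names `paddle`, `ppsci`, and `dm_control`; the helper `missing_packages` is hypothetical and not part of any of these libraries.

```python
import importlib.util


def missing_packages(names):
    """Return the names in `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed for PaddlePaddle, PaddleScience, and dm_control.
required = ["paddle", "ppsci", "dm_control"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

The check only looks the packages up; it does not import them, so it says nothing about whether the GPU build or MuJoCo rendering actually works.
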
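### MuJoCo Humanoid Control with PaddleScience

### Key features
- Uses the PaddleScience framework to train the deep learning model
- Simulates the robot with the MuJoCo physics engine through dm_control
- Implements a self-supervised learning scheme
- Provides a complete training and evaluation pipeline
- Includes detailed performance analysis and visualization tools
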
## Project structure
PaddleScience/examples/
```
mujoco_control/
├── conf/
│   └── humanoid_control.yaml    # configuration file
├── humanoid_complete.py         # main program
└── outputs_HumanoidControl/     # output directory
    └── YYYY-MM-DD/              # outputs organized by date
        ├── checkpoints/         # model checkpoints
        ├── evaluation/          # evaluation results
        └── logs/                # training logs
```
```
├── conf
│   └── humanoid_control.yaml
├── humanoid_complete.py
└── outputs_HumanoidControl
    ├── 13-17-41
    │   └── mode=train
    │       ├── checkpoints
    │       │   ├── epoch_10.pdopt
    │       │   ├── epoch_10.pdparams
    │       │   ├── epoch_10.pdstates
    │       │   ├── epoch_100.pdopt
    │       │   ├── epoch_100.pdparams
    │       │   ├── epoch_100.pdstates
    │       │   ├── epoch_20.pdopt
    │       │   ├── epoch_20.pdparams
    │       │   ├── epoch_20.pdstates
    │       │   ├── epoch_30.pdopt
    │       │   ├── epoch_30.pdparams
    │       │   ├── epoch_30.pdstates
    │       │   ├── epoch_40.pdopt
    │       │   ├── epoch_40.pdparams
    │       │   ├── epoch_40.pdstates
    │       │   ├── epoch_50.pdopt
    │       │   ├── epoch_50.pdparams
    │       │   ├── epoch_50.pdstates
    │       │   ├── epoch_60.pdopt
    │       │   ├── epoch_60.pdparams
    │       │   ├── epoch_60.pdstates
    │       │   ├── epoch_70.pdopt
    │       │   ├── epoch_70.pdparams
    │       │   ├── epoch_70.pdstates
    │       │   ├── epoch_80.pdopt
    │       │   ├── epoch_80.pdparams
    │       │   ├── epoch_80.pdstates
    │       │   ├── epoch_90.pdopt
    │       │   ├── epoch_90.pdparams
    │       │   ├── epoch_90.pdstates
    │       │   ├── latest.pdopt
    │       │   ├── latest.pdparams
    │       │   └── latest.pdstates
    │       └── train.log
```
## Core components

### 1. Dataset class (HumanoidDataset)
```python
class HumanoidDataset:
    """Collects and preprocesses the training data."""

    def __init__(self, num_episodes=1000, episode_length=1000, ratio_split=0.8): ...
    def collect_episode_data(self): ...  # collect the data for a single episode
    def _flatten_observation(self): ...  # flatten an observation
    def generate_dataset(self): ...  # build the training and validation sets
```

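The two preprocessing steps named in the class outline, flattening an observation and splitting the data, can be sketched without any simulator. This is a dependency-free illustration, not the PR's implementation: the function names, the observation keys, the sorted key order, and the reading of `ratio_split` as the training share are all assumptions.

```python
def flatten_observation(observation):
    """Concatenate every entry of an observation dict into one flat list."""
    flat = []
    for key in sorted(observation):  # assumed: a fixed key order keeps features aligned
        value = observation[key]
        if isinstance(value, (list, tuple)):
            flat.extend(float(v) for v in value)
        else:
            flat.append(float(value))
    return flat


def split_dataset(samples, ratio_split=0.8):
    """Split samples into (train, valid), assuming ratio_split is the training share."""
    cut = int(len(samples) * ratio_split)
    return samples[:cut], samples[cut:]


# Hypothetical observation with made-up keys and values.
obs = {"joint_angles": [0.1, -0.2], "head_height": 1.4}
features = flatten_observation(obs)

train, valid = split_dataset(list(range(10)), ratio_split=0.8)
```

In dm_control, observations arrive as an ordered dictionary of NumPy arrays, so a real implementation would more likely concatenate arrays than extend Python lists.
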
### 2. Controller model (HumanoidController)
```python
import paddle


class HumanoidController(paddle.nn.Layer):
    """Neural network controller."""

    def __init__(self, state_size, action_size, hidden_size=256): ...
    def forward(self, x): ...  # forward pass: predict the action
```

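To make the role of `forward` concrete, here is a dependency-free sketch of the kind of computation such a controller might perform: one hidden layer followed by an output layer that is squashed into a bounded action range. The single hidden layer, the ReLU and tanh activations, and the toy sizes are assumptions for illustration; the PR's actual network is not shown in this diff.

```python
import math
import random


def init_layer(rng, n_in, n_out):
    """Create small random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Compute W @ x + b for one layer using plain Python lists."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def forward(state, hidden_layer, output_layer):
    """Map a state vector to an action vector bounded in [-1, 1]."""
    hidden = [max(0.0, h) for h in dense(state, hidden_layer)]  # ReLU
    return [math.tanh(a) for a in dense(hidden, output_layer)]  # squash to [-1, 1]


rng = random.Random(42)
state_size, action_size, hidden_size = 6, 3, 8  # toy sizes, not the real humanoid
hidden_layer = init_layer(rng, state_size, hidden_size)
output_layer = init_layer(rng, hidden_size, action_size)
action = forward([0.5] * state_size, hidden_layer, output_layer)
```

Bounding the output is a common choice when the simulator expects normalized actuator commands; whether the PR's controller does this is not stated.
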
### 3. Evaluator class (HumanoidEvaluator)
```python
class HumanoidEvaluator:
    """Evaluates the model and visualizes the results."""

    def __init__(self, model_path, num_episodes=5, episode_length=1000): ...
    def evaluate_episode(self): ...  # evaluate a single episode
    def run_evaluation(self): ...  # run the full evaluation
```

## Configuration

Main configuration parameters (in humanoid_control.yaml):

```yaml
DATA:
  num_episodes: 100      # number of training episodes
  episode_length: 500    # steps per episode

MODEL:
  hidden_size: 256       # hidden layer size

TRAIN:
  epochs: 100            # number of training epochs
  batch_size: 32         # batch size
  learning_rate: 0.001   # learning rate

EVAL:
  num_episodes: 5        # number of evaluation episodes
  episode_length: 1000   # steps per evaluation episode
```

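Because the project is configured through Hydra, individual values can be changed with dotted `SECTION.key=value` overrides instead of editing the file. The sketch below imitates that behaviour on a plain dictionary to show what an override does; `parse_value` and `apply_override` are hypothetical helpers and much simpler than Hydra's own parser.

```python
import copy


def parse_value(text):
    """Convert an override value to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def apply_override(config, override):
    """Return a copy of config with one dotted `SECTION.key=value` override applied."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    updated = copy.deepcopy(config)
    node = updated
    for name in parents:
        node = node[name]  # raises KeyError for an unknown section
    if leaf not in node:
        raise KeyError(f"unknown key: {dotted_key}")
    node[leaf] = parse_value(raw_value)
    return updated


# Values copied from the configuration parameters listed in this README.
config = {
    "MODEL": {"hidden_size": 256},
    "TRAIN": {"epochs": 100, "batch_size": 32, "learning_rate": 0.001},
}
tuned = apply_override(config, "TRAIN.batch_size=64")
tuned = apply_override(tuned, "TRAIN.learning_rate=0.0005")
```

On the command line, the same change would presumably be written as `python humanoid_complete.py mode=train TRAIN.batch_size=64 TRAIN.learning_rate=0.0005`.
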
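## Training workflow

### Training
1. Data collection:
   - Collect the initial training data with a random policy
   - Split the data into a training set and a validation set

2. Model training:
   - Use the PaddleScience training framework
   - Implement a custom loss function
   - Combine two objectives: action prediction and reward maximization

3. Training command:
```bash
python humanoid_complete.py mode=train
```
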
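The size of a training run follows directly from the configuration. The sketch below works it out for `epochs: 100` and `batch_size: 32`, together with `iters_per_epoch: 10` and `save_freq: 10` from the configuration file in this pull request. That a checkpoint is written every `save_freq` epochs is an assumption, although it is consistent with the `epoch_10` to `epoch_100` files in the project structure listing.

```python
def checkpoint_epochs(epochs, save_freq):
    """List the epochs at which a checkpoint is saved every `save_freq` epochs."""
    return list(range(save_freq, epochs + 1, save_freq))


def training_budget(epochs, iters_per_epoch, batch_size):
    """Return (optimizer steps, samples drawn) for one training run."""
    steps = epochs * iters_per_epoch
    return steps, steps * batch_size


saved = checkpoint_epochs(epochs=100, save_freq=10)
steps, samples = training_budget(epochs=100, iters_per_epoch=10, batch_size=32)
print(saved)           # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(steps, samples)  # 1000 32000
```
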
### Evaluation
1. Model evaluation:
   - Run the trained model in the actual simulation environment
   - Collect performance metrics
   - Generate an evaluation video (if available)

2. Evaluation command:
```bash
python humanoid_complete.py mode=eval EVAL.pretrained_model_path="path/to/checkpoint"
```

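The evaluation procedure amounts to rolling the policy out for a fixed number of steps and summing the rewards, repeated over several episodes. The sketch below shows that loop against a stub environment; `StubEnv`, its `(state, reward)` return value, and the zero policy are hypothetical stand-ins. A real dm_control environment returns a `TimeStep` object from `reset()` and `step()` instead.

```python
import random


class StubEnv:
    """Hypothetical stand-in for a simulator: each step returns a random reward."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def reset(self):
        return [0.0, 0.0]  # dummy initial state

    def step(self, action):
        reward = self._rng.uniform(0.0, 1.0)
        next_state = [reward, sum(action)]
        return next_state, reward


def evaluate_episode(policy, env, episode_length):
    """Run one episode and return its total reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(episode_length):
        state, reward = env.step(policy(state))
        total_reward += reward
    return total_reward


def run_evaluation(policy, env, num_episodes=5, episode_length=1000):
    """Evaluate the policy over several episodes and return each total reward."""
    return [evaluate_episode(policy, env, episode_length) for _ in range(num_episodes)]


def zero_policy(state):
    """Placeholder for the trained controller: always outputs zero actions."""
    return [0.0, 0.0]


rewards = run_evaluation(zero_policy, StubEnv(seed=42), num_episodes=5, episode_length=1000)
```
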
## Performance analysis

The evaluation produces the following analysis results:
- Overall reward statistics
- Action pattern analysis
- Performance visualization plots
- Evaluation video (if enabled)

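As an illustration of the overall reward statistics, the sketch below summarizes a list of per-episode rewards and renders the result as plain text that could be written to a statistics file. The choice of statistics, the text layout, and the reward values are assumptions, not the PR's actual output format.

```python
import statistics


def summarize_rewards(episode_rewards):
    """Compute summary statistics over per-episode total rewards."""
    return {
        "episodes": len(episode_rewards),
        "mean": statistics.mean(episode_rewards),
        "std": statistics.pstdev(episode_rewards),
        "min": min(episode_rewards),
        "max": max(episode_rewards),
    }


def format_stats(stats):
    """Render the statistics as plain text, one `name: value` pair per line."""
    return "\n".join(
        f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}"
        for name, value in stats.items()
    )


# Made-up rewards for five evaluation episodes.
summary = summarize_rewards([1.0, 2.0, 3.0, 4.0, 5.0])
report = format_stats(summary)
```
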
## Outputs

### Training outputs
- Model checkpoints
- Training logs
- Learning curves

### Evaluation outputs
- Statistics file (stats.txt)
- Performance analysis plots
- Evaluation video files (if enabled)

## Usage examples

1. Train a new model: `python humanoid_complete.py mode=train`

2. Evaluate a trained model: `python humanoid_complete.py mode=eval`

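## Notes

1. Requirements:
   - PaddlePaddle >= 3.0.0 (the installation command above installs the 3.0.0b1 pre-release)
   - dm_control
   - MuJoCo physics engine
   - Python >= 3.7 (tested with 3.10.15)

2. Performance tuning suggestions:
   - Adjust batch_size and learning_rate as appropriate
   - Modify the network architecture as needed
   - Tune the training parameters through the configuration file

3. Known issues:
   - Visualization may not work correctly under WSL2
   - A suitable rendering backend must be selected
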
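dm_control chooses its rendering backend from the `MUJOCO_GL` environment variable, which accepts `glfw`, `egl`, or `osmesa`. A minimal sketch of setting it from Python follows; the helper `select_render_backend` is hypothetical, and which backend actually works under WSL2 depends on the machine.

```python
import os


def select_render_backend(headless):
    """Set MUJOCO_GL unless already set; must run before importing dm_control."""
    backend = "egl" if headless else "glfw"
    return os.environ.setdefault("MUJOCO_GL", backend)


# On a machine without a display, try EGL; export MUJOCO_GL=osmesa
# beforehand to fall back to software rendering instead.
backend = select_render_backend(headless=True)
print("MUJOCO_GL =", backend)
```

Because `setdefault` is used, a value already exported in the shell takes precedence over the one chosen here.
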
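## Future improvements

1. Feature extensions:
   - Add more control strategies
   - Support multiple task scenarios
   - Improve the visualization features

2. Performance optimization:
   - Improve training efficiency
   - Optimize the model architecture
   - Add support for parallel training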
@@ -0,0 +1,41 @@
defaults:
  - _self_

hydra:
  run:
    dir: outputs_HumanoidControl/${now:%Y-%m-%d}/${now:%H-%M-%S}/${hydra.job.override_dirname}
  job:
    name: ${mode}
    chdir: false
  sweep:
    dir: ${hydra.run.dir}
    subdir: ./

mode: train
seed: 42
output_dir: ${hydra:run.dir}
log_freq: 20

DATA:
  num_episodes: 100
  episode_length: 500

MODEL:
  hidden_size: 256

TRAIN:
  epochs: 100
  iters_per_epoch: 10
  save_freq: 10
  learning_rate: 0.001
  batch_size: 32
  pretrained_model_path: null
  checkpoint_path: null
  eval_with_no_grad: true

EVAL:
  pretrained_model_path: outputs_HumanoidControl/2024-12-15/22-02-39/mode=train/checkpoints/latest.pdparams
  eval_with_no_grad: true
  num_episodes: 5
  episode_length: 1000
  interactive: false
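
The `hydra.run.dir` pattern in this configuration determines where each run writes its outputs: a date directory, a time directory, and a directory named after the command-line overrides. The sketch below reproduces that layout with the standard library. The `override_dirname` helper is a simplified imitation of Hydra's behaviour (sorted overrides joined by commas) and is an assumption, as are both function names.

```python
from datetime import datetime


def override_dirname(overrides):
    """Join sorted `key=value` overrides with commas (simplified imitation)."""
    return ",".join(sorted(overrides))


def run_dir(now, overrides, root="outputs_HumanoidControl"):
    """Build root/<date>/<time>/<overrides> following the pattern in hydra.run.dir."""
    return "/".join(
        [root, now.strftime("%Y-%m-%d"), now.strftime("%H-%M-%S"), override_dirname(overrides)]
    )


path = run_dir(datetime(2024, 12, 15, 22, 2, 39), ["mode=train"])
print(path)  # outputs_HumanoidControl/2024-12-15/22-02-39/mode=train
```

The printed path is the directory that holds `checkpoints/latest.pdparams` in the `EVAL.pretrained_model_path` entry of this configuration.
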