2018 02 28
Tao Luo edited this page Dec 9, 2019 · 1 revision
    Main focus: ParallelDo in Multiple GPUs
Issue:
- The problem of improving the performance of Parallel_Do
 
PR:
- Backward on parallel do using nccl:
 - Python framework:
 
Review:
- https://github.com/PaddlePaddle/Paddle/pull/8516
 - https://github.com/PaddlePaddle/Paddle/pull/8489
 - https://github.com/PaddlePaddle/Paddle/pull/8402
 - https://github.com/PaddlePaddle/Paddle/pull/8471
 
Inference:
- Integrate float16 into data_type_transform:
 - Enable is_test for batch norm and dropout op:
 - [WIP] add float16 GEMM GPU function in math_function
 - Review:
 
Inference: PR/issue:
- Profiling C++ inference api for recognize digits model: https://github.com/PaddlePaddle/Paddle/pull/8497
 - Results for this analysis: https://github.com/sidgoyal78/paddle_notes/blob/master/benchmark/recoginze_digits.md
 - Survey TensorRT for inference: https://github.com/PaddlePaddle/Paddle/issues/8492
 
- CSP
- Exposing Channel to be used as a Variable and integrating with Fluid https://github.com/PaddlePaddle/Paddle/pull/8486
 - Add unit tests for ChannelHolder https://github.com/PaddlePaddle/Paddle/pull/8486
 - Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops https://github.com/PaddlePaddle/Paddle/pull/8593
 - Adding more unit tests for ChannelHolder class https://github.com/PaddlePaddle/Paddle/pull/8668
 - Review:
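
For readers unfamiliar with CSP-style channels, the semantics behind the create/send/receive/close ops listed above can be sketched in plain Python. This is only an illustration of the general idea, not the Fluid implementation: the `Channel` class, its method names, and `run_demo` are hypothetical, and the sketch covers only a buffered channel with a single consumer.

```python
import queue
import threading

# Private marker object placed on the queue when the channel is closed.
_CLOSED = object()


class Channel:
    """Hypothetical buffered channel built on queue.Queue (capacity >= 1)."""

    def __init__(self, capacity=1):
        if capacity < 1:
            raise ValueError("this sketch only models buffered channels")
        self._queue = queue.Queue(maxsize=capacity)
        self._closed = threading.Event()

    def send(self, value):
        # Blocks while the buffer is full; sending on a closed channel is an error.
        if self._closed.is_set():
            raise RuntimeError("send on closed channel")
        self._queue.put(value)

    def close(self):
        # Mark the channel closed and enqueue the marker behind pending values.
        self._closed.set()
        self._queue.put(_CLOSED)

    def receive(self):
        # Returns (value, True) for data, or (None, False) once the channel is drained and closed.
        item = self._queue.get()
        if item is _CLOSED:
            # Put the marker back so later receives also observe the closure.
            self._queue.put(_CLOSED)
            return None, False
        return item, True


def run_demo():
    """Run a producer in a separate thread (the role a Go-style op plays) and collect its output."""
    channel = Channel(capacity=2)

    def producer():
        for i in range(3):
            channel.send(i * i)
        channel.close()

    worker = threading.Thread(target=producer)
    worker.start()

    received = []
    while True:
        value, ok = channel.receive()
        if not ok:
            break
        received.append(value)
    worker.join()
    return received


if __name__ == "__main__":
    print(run_demo())
```

The receiver loop stops on the `ok` flag rather than on a special value, so `None` remains a legal payload.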
 
 
- Exposing Channel to be used as a Variable and integrating with Fluid: https://github.com/PaddlePaddle/Paddle/pull/8486
 - Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops: https://github.com/PaddlePaddle/Paddle/pull/8593
 - nvcc fatal errors on TeamCity: https://github.com/PaddlePaddle/Paddle/issues/8501
 - PR review:
 
- inference:
- refine inference_lib_dist after code move, and add it to docker/build.sh: https://github.com/PaddlePaddle/Paddle/pull/8379
 - combine batch_size_like.cc into batch_size_like.h: https://github.com/PaddlePaddle/Paddle/pull/8604
 
 - compile:
- Move Fluid C++ code from /paddle to /paddle/fluid:
 - move Fluid API doc/code out of V2 API doc/code:
 - set the default option of WITH_FAST_BUNDLE_TEST be OFF: https://github.com/PaddlePaddle/Paddle/pull/8563
 
 - document
- reduce doc build time in travis ci (from 30+ min to 4 min):
 - Adjust the structure of API, Operators, cluster and quick start (Both Chinese and English):
 - update generate_paddle_docs.sh in paddlepaddle.org repo: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/409
 
 - code review:
- Simplify the cmake of inference: https://github.com/PaddlePaddle/Paddle/pull/8272
 - Refine cmake for cudnn op: https://github.com/PaddlePaddle/Paddle/pull/8591
 - [Intel] MKLDNN conv2d and pool2d OP kernels added: https://github.com/PaddlePaddle/Paddle/pull/8451
 - [doc] add introduction:
 
 
- fix distributed training bug and make sure the demo code works.
 - fix v2 async sgd update: https://github.com/PaddlePaddle/Paddle/pull/8474
 - English tutorial doc on cloud repo: https://github.com/PaddlePaddle/cloud/pull/621
 - review:
- https://github.com/PaddlePaddle/cloud/pull/617#pullrequestreview-99173766
 - https://github.com/PaddlePaddle/Paddle/pull/8538#pullrequestreview-99175816
 - https://github.com/PaddlePaddle/Paddle/pull/8656#pullrequestreview-100022647
 - https://github.com/PaddlePaddle/Paddle/pull/8634#pullrequestreview-99954888
 
 - turn on the cmake flag WITH_DISTRIBUTE on CI so that the wheel package supports distributed training.
- Enhance layer_generator
- https://github.com/PaddlePaddle/Paddle/pull/8543
 - mean(x=layer_out) --> mean(layer_out)
 - Moving unique_name to python. We can reset the unique_name generator now
 - Demo about switching optimizers
 - Demo about stacked denoising autoencoder
 - Make global_step as a global variable in Fluid
 - Several Enhancements
 
- learning rate decay (https://github.com/PaddlePaddle/Paddle/issues/7769)
- Fix compare op https://github.com/PaddlePaddle/Paddle/pull/8532
 - create learning rate for multi program https://github.com/PaddlePaddle/Paddle/pull/8545
 - change learning_rate_decay to learning_rate_scheduler https://github.com/PaddlePaddle/Paddle/pull/8583
 - multi gpu profile
- parallel-do should not merge the gradients of parameters with stop_gradient=True https://github.com/PaddlePaddle/Paddle/pull/8652
 - se_resnet_50 multi-gpu profile https://github.com/PaddlePaddle/Paddle/issues/8661
 - add c-api quick start https://github.com/PaddlePaddle/Paddle/pull/8566
 - discuss
- The problem of improving the performance of Parallel_Do https://github.com/PaddlePaddle/Paddle/issues/8592
 
 - Review
- Fine Tune MNIST by Adam and SGD https://github.com/PaddlePaddle/Paddle/pull/8570
 - simplify shape inference code https://github.com/PaddlePaddle/Paddle/pull/8087
 - Enhance layer_function_generator https://github.com/PaddlePaddle/Paddle/pull/8543
 - Moving unique_name to python https://github.com/PaddlePaddle/Paddle/pull/8524
 
 
- A new design of model save/load:
 - [WIP] Disassemble evaluator:
 - Reviews:
 
- SSD on Fluid:
- [Merged] Enhance bipartite_match_op to support argmax matching after bipartite matching.
 - [Merged] Register more data type for reshape operator.
 - [Merged] Enable the SSD loss to support normalization by the total number of output locations.
 - [Merged] Fix the backward transpiler bug in ssd_loss API.
 - Verify the correctness of SSD loss:
 - [WIP] Verify the correctness of detection output
 - Review:
- Fix box coder op: https://github.com/PaddlePaddle/Paddle/pull/8647
 
 
 - Other:
- Help Huawei debug the AR demo on Fluid.
 
 
DeepASR:
- Convergence verification on single GPU
 - Performance profiling
 - Some enhancements
 - Fix the profiler's bug in multi-gpu mode
 
Code Review:
Main focus: ParallelDo in Multiple GPUs
- PR:
 - Reviews:
- https://github.com/PaddlePaddle/Paddle/pull/8665#pullrequestreview-100223755
 - https://github.com/PaddlePaddle/Paddle/issues/8504#event-1488164262
 - https://github.com/PaddlePaddle/Paddle/issues/8480#event-1488164544
 - https://github.com/PaddlePaddle/Paddle/pull/8471#pullrequestreview-98784602
 - https://github.com/PaddlePaddle/Paddle/issues/8500
 - https://github.com/PaddlePaddle/Paddle/issues/8592#issuecomment-368693237
 - https://github.com/PaddlePaddle/Paddle/pull/8550#pullrequestreview-100248472
 
 - CI fixes:
 
- Fluid
- PR, conv sequence to sequence
- finished, tuning with data
 
 - PR, v2 API doc overview: https://github.com/PaddlePaddle/Paddle/pull/8547
 
 - PR, conv sequence to sequence
 - VisualDL
 
- Distributed training:
- TensorFlow: https://github.com/PaddlePaddle/Paddle/pull/8522
 - Change script to support distributed TensorFlow on k8s: https://github.com/PaddlePaddle/cloud/pull/617
 
 - Document:
- cluster train: https://github.com/PaddlePaddle/Paddle/pull/8622
 - Fix document generation bugs: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/418
 
 - Review:
 
- Inference Framework
- [Merged] Refine the inference API and unittest
 - Write a basic userguide of Fluid inference
 - Review
- combine batch_size_like.cc into batch_size_like.h: https://github.com/PaddlePaddle/Paddle/pull/8604
 - Inference example and unittest for NMT model: https://github.com/PaddlePaddle/Paddle/pull/8314
 
 
 - Get rid of the dependency of Go compiler when WITH_GOLANG is OFF
 
- Profile data reader for DeepASR https://github.com/PaddlePaddle/models/issues/673
 - RNN Beam search https://github.com/PaddlePaddle/Paddle/issues/8603 https://github.com/PaddlePaddle/models/pull/675
- Debug and tune the Transformer model, using the PyTorch implementation as a reference.
 - PR:
- Remove the losses from paddings in Transformer
 - Add learning rate scheduling in Transformer
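
The notes do not say which schedule was added. As a point of reference, here is a minimal sketch of the warmup-then-decay schedule commonly paired with Transformer models (linear warmup followed by inverse square-root decay). The function name and the default values for `d_model` and `warmup_steps` are assumptions for illustration, not taken from the PR.

```python
def noam_learning_rate(step, d_model=512, warmup_steps=4000):
    """Learning rate at a 1-based training step.

    The rate rises linearly for `warmup_steps` steps, then decays
    proportionally to 1 / sqrt(step). Defaults are illustrative assumptions.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # min() selects the linear warmup term early on and the decay term afterwards.
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


if __name__ == "__main__":
    for step in (1, 1000, 4000, 16000):
        print(step, noam_learning_rate(step))
```

The two terms inside `min` are equal at `step == warmup_steps`, which is where the rate peaks.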
 
 
- CI speed up
 - Bisect culprit commit:
 - Timeline Profiler
 - Reviews
 
- PR
- [WIP] Refine concat_op
 - Refine Sum in elementwise_op_function
 - Refine cmake for cudnn op
 - Add tuple type
 - Fix conv_op bug
 - refine FAQ doc
 - fix get_mid_dims annotation
 
 - Review
- Refine Sum in elementwise_op_function
 - Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops
 - Add unit tests for ChannelHolder
 - Extend current profiler for timeline and more features
 - Enhance bipartite_match_op to support argmax matching after bipartite matching
 - Add ceil_mode option for pool2d and pool3d
 
 
- CI updates
- CI docker graph location update, TeamCity data relocation to larger storage
 - daily TeamCity backup
 - daily docker prune
 
 - Review
 - PR
 
- GPU perf
- https://github.com/PaddlePaddle/Paddle/pull/8550
 - https://github.com/PaddlePaddle/Paddle/issues/8638
 - https://github.com/PaddlePaddle/Paddle/pull/8634
 - https://github.com/PaddlePaddle/Paddle/pull/8573
 - https://github.com/PaddlePaddle/Paddle/pull/8538
 - https://github.com/PaddlePaddle/Paddle/pull/8512
 - review: https://github.com/PaddlePaddle/Paddle/pull/8600
 - review: https://github.com/PaddlePaddle/cloud/pull/617
 - reviews etc.
 
 - EDL:
 
PaddlePaddle.org
- Fix issue of MathJax equations and images not rendering correctly when user clicks on a new link. (https://github.com/PaddlePaddle/PaddlePaddle.org/pull/412)
 - Fix issue with permalink not showing entire title (https://github.com/PaddlePaddle/PaddlePaddle.org/pull/415)
 
Paddle
- Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops (https://github.com/PaddlePaddle/Paddle/pull/8593)
 - Unittests concurrency (https://github.com/PaddlePaddle/Paddle/pull/8666)
 
Other
- Worked on Visual Debugger Tech Talk with Varun
 
- Fix error message on charts: https://github.com/PaddlePaddle/VisualDL/pull/279
 - Add the Histogram related Vue files: https://github.com/PaddlePaddle/VisualDL/pull/278
 - Add the Graph.vue and the Config.vue for Graph tab: https://github.com/PaddlePaddle/VisualDL/pull/277
 - Fix the incorrect pagination issue: https://github.com/PaddlePaddle/VisualDL/pull/276
 - Allow the navigation bar to persist the selected item style: https://github.com/PaddlePaddle/VisualDL/pull/274
 - Include the pymdownx extensions: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/425
 - Fix incorrect markdown and RST file: https://github.com/PaddlePaddle/Paddle/pull/8667
 
- Fix scalar issues: https://github.com/PaddlePaddle/VisualDL/pull/283
 - Show Scalar Data and add ExpandPanel https://github.com/PaddlePaddle/VisualDL/pull/272
 - Add theme and UI https://github.com/PaddlePaddle/VisualDL/pull/280
 - Fix font size and reorganize css and stylus https://github.com/PaddlePaddle/VisualDL/pull/287
 
- Add Go_op, Channel_create, channel_close, channel_send and channel_receive ops (https://github.com/PaddlePaddle/Paddle/pull/8593)
 - Worked on Visual Debugger Tech Talk with Thuan