Reports
Created by
Created On
Last edited
Make gather_for_metrics usage more strict #315
seq2seq example: the difference in reward/mean reflects the difference in len(eval_samples); the rest of the metrics are the same
0
2023-02-20
Make gather_for_metrics more strict #315
https://github.com/CarperAI/trlx/pull/315, no changes in behavior
0
2023-02-17
Gather experience samples #305
no changes with (1) a deterministic reward_fn and (2) a single-process run with the usual sentiment pipeline
0
2023-02-11
Add Accelerate SFT Trainer #280
https://github.com/CarperAI/trlx/pull/280
CUDA_VISIBLE_DEVICES=0 python examples/ppo_sentiments.py &
CUDA_VISIBLE_DEVICES=1 python examples/sft_sentiments.py &
CUDA_VISIBLE_DEVICES=2 python examples/ilql_sentiments.py &
0
2023-02-07
[fix] Set deepspeed's fp16 auto_cast to false #279
https://github.com/CarperAI/trlx/pull/279
0
2023-02-06
Fix distributed dataloaders & deduplicate eval #276
https://github.com/CarperAI/trlx/pull/276
0
2023-02-04