Reports
Created by
Created On
Last edited
0
2023-08-16
0
2023-06-22
0
2023-06-21
0
2023-05-12
0
2023-05-09
0
2023-05-01
0
2023-04-19
[fix] Fix ILQL head sync under ZeRO3 #387
https://github.com/CarperAI/trlx/pull/387
0
2023-03-23
Cuda OOM with PPO on GPT2-medium #372
https://github.com/CarperAI/trlx/issues/372
0
2023-03-21
0
2023-03-16
0
2023-03-06
0
2023-03-06
0
2023-03-06
0
2023-03-06
Convert the rest of configs from ymls #346
https://github.com/CarperAI/trlx/pull/346
0
2023-03-01
Add batch_size option for the reward model #322
https://github.com/CarperAI/trlx/pull/322
0
2023-02-21
Make gather_for_metrics usage more strict #315
seq2seq example: the difference in reward/mean reflects the difference in len(eval_samples); the rest of the metrics are the same
0
2023-02-20
Make gather_for_metrics more strict #315
https://github.com/CarperAI/trlx/pull/315
0
2023-02-19
Make gather_for_metrics more strict #315
https://github.com/CarperAI/trlx/pull/315 (no changes in behavior)
0
2023-02-17
0
2023-02-13
Gather experience samples #305
https://github.com/CarperAI/trlx/pull/305
0
2023-02-13
Gather experience samples #305
no changes with 1. a deterministic reward_fn and 2. single-process runs with the usual sentiment pipeline
0
2023-02-11
Gather experience samples #305
https://github.com/CarperAI/trlx/pull/305
0
2023-02-10
0
2023-02-08
Add Accelerate SFT Trainer #280
https://github.com/CarperAI/trlx/pull/280
CUDA_VISIBLE_DEVICES=0 python examples/ppo_sentiments.py &
CUDA_VISIBLE_DEVICES=1 python examples/sft_sentiments.py &
CUDA_VISIBLE_DEVICES=2 python examples/ilql_sentiments.py &
0
2023-02-07
0
2023-02-06
[fix] Set deepspeed's fp16 auto_cast to false #279
https://github.com/CarperAI/trlx/pull/279
0
2023-02-06
Improve PPO readability #210
https://github.com/CarperAI/trlx/pull/210
0
2023-02-05
Fix distributed dataloaders & deduplicate eval #276
https://github.com/CarperAI/trlx/pull/276
0
2023-02-04
0
2023-02-03
Fix heads dtype
ZeRO3 now works for ILQL; no changes for PPO
0
2023-01-23
0
2023-01-22
0
2023-01-19
0
2023-01-13
0
2023-01-12
0
2023-01-12
0
2023-01-04