jon-tow

hh-gpt-j

jon-tow

2023-03-17

trlx: Set add_special_tokens=False to not add EOS unexpectedly #287

Unable to directly repro T5 results

jon-tow

2023-02-10

trlx: refactor - remove orchestrator abstraction from API #289

ILQL sentiments reproduction from `main`

jon-tow

2023-02-08

trlx: refactor - remove orchestrator abstraction from API #289

* PPO sentiments reproduction from `main` (gpt2)
* Summarize dailymail/cnn reproduction from `main` (google/flan-t5-small)

jon-tow

2023-02-08

trlx: Add `bitsandbytes` optimizer support #133

Report for the following PR: https://github.com/CarperAI/trlx/pull/133

jon-tow

2023-01-04

trlx: Add `LORA` support #110

Report for the following PR: https://github.com/CarperAI/trlx/pull/110

jon-tow

2023-01-11

trlx: Repro `bnb` perf after t5 update

jon-tow

2023-01-10

trlx: LORA support

Results for the ILQL Sentiment task with LORA, used to observe training dynamics and memory savings.

jon-tow

2023-01-07

trlx: LORA support

Results for the PPO Sentiment task with LORA, used to observe training dynamics and memory savings.

jon-tow

2023-01-06

trlx: fix-frozen-branch PR

jon-tow

2022-12-17

trlx: Add OptimizerConfig and SchedulerConfig #135

Results for PPO/ILQL sentiments examples to demonstrate no regressions.

jon-tow

2022-12-14

trlx: `accelerate` Multi-Node DDP Benchmark

PPO Sentiments Benchmark on Multi-Node DDP setup

jon-tow

2022-12-07

Hydra GPT-J PPO Sentiment

Hydra GPT-J with all but one layer unfrozen in the base backbone

jon-tow

2022-12-06

GPT-J PPO Sentiment

Hydra GPT-J with all but one layer unfrozen in the base backbone

jon-tow

2022-12-06

trlx: GPT-Neo PPO Sentiment

ILQL sentiment results for the `Add GPTNeo support` PR.

jon-tow

2022-12-06

trlx: GPT-Neo PPO Sentiment

PPO sentiment results for the `Add GPTNeo support` PR.

jon-tow

2022-12-06

trlx: ILQL Sentiment Benchmark

ILQL Benchmark results for the wider Causal LM support PR.

jon-tow

2022-12-05

trlx: PPO Sentiment Benchmark

PPO Benchmark results for the wider Causal LM support PR.

jon-tow

2022-12-05

pythia-13b test run

jon-tow

2022-12-04

GPT-NeoX PPO Positive Sentiment (Test Run)

jon-tow

2022-12-01
