Reports

Tutorial: Fine-tuning OpenAI gpt-oss (2025-08-05)
Unlock the power of fine-tuning OpenAI's GPT-OSS models with Weights & Biases. Customize LLMs for your tasks, save costs, and boost performance.

What is RLHF? Reinforcement learning from human feedback for AI alignment (2025-07-25)
This article explains how reinforcement learning from human feedback (RLHF) is used to train language models that better reflect human preferences, including practical steps, code examples, and evaluation techniques.