In biochemistry, a ligand is a small molecule that binds to a protein. Can we predict how a protein and a ligand will interact, and in particular how tightly they will bind, from their chemical structures alone? Below are some examples of the chemical structures we're considering from the Protein Data Bank (PDB) Binding dataset, along with some random forests trained to predict the binding affinity (Ki) of the ligand to the protein.
Varying the number of estimators and the max features doesn't affect the learning curves much. You can see below that the learning curves are almost identical: the models overfit the training set quickly and improve gradually as the training set size increases.
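To make the learning-curve comparison concrete, here is a minimal sketch using scikit-learn. It rests on several assumptions: the data is a synthetic stand-in from `make_regression` rather than the real protein-ligand features, the baseline `max_features=1.0` (all features) is a guess at the starting configuration, and the helper `mean_learning_curve` is a hypothetical name. The numbers it prints will not match the plots in this report.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import learning_curve

# Synthetic stand-in for featurized protein-ligand pairs (assumption:
# the real features and Ki labels are not reproduced here).
X, y = make_regression(n_samples=300, n_features=20, noise=10.0, random_state=0)


def mean_learning_curve(n_estimators, max_features):
    """Return train sizes and the mean train/validation R^2 at each size."""
    model = RandomForestRegressor(
        n_estimators=n_estimators, max_features=max_features, random_state=0
    )
    sizes, train_scores, val_scores = learning_curve(
        model, X, y, train_sizes=np.linspace(0.2, 1.0, 4), cv=3, scoring="r2"
    )
    # Average over the cross-validation folds.
    return sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)


# Two configurations; the baseline max_features value is an assumption.
for n_estimators, max_features in [(10, 1.0), (100, "sqrt")]:
    sizes, train_r2, val_r2 = mean_learning_curve(n_estimators, max_features)
    print(f"n_estimators={n_estimators}, max_features={max_features}")
    for n, tr, va in zip(sizes, train_r2, val_r2):
        print(f"  train size {n:4d}: train R^2={tr:.3f}, validation R^2={va:.3f}")
```

A large gap between the train and validation R^2 at every training set size is the overfitting pattern described above.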
The R^2 on the validation data is more informative: increasing the number of estimators from 10 to 100 and setting the max features to "sqrt" (the square root of the number of features) improves results by up to 11 points.
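A comparison of validation R^2 across two configurations could be sketched as follows. This is an illustration under assumptions, not the code behind the results above: it uses scikit-learn with synthetic data from `make_regression`, a single 80/20 train/validation split, and `max_features=1.0` as an assumed baseline. Which configuration scores higher on this toy data says nothing about the binding affinity task.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression data standing in for the real dataset (assumption).
X, y = make_regression(n_samples=500, n_features=30, noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The baseline max_features value is an assumption; the text only states
# the change from 10 to 100 estimators and the switch to "sqrt".
configs = {
    "10 trees, all features": dict(n_estimators=10, max_features=1.0),
    "100 trees, sqrt features": dict(n_estimators=100, max_features="sqrt"),
}

scores = {}
for name, params in configs.items():
    model = RandomForestRegressor(random_state=0, **params)
    model.fit(X_train, y_train)
    # R^2 on held-out data, the metric discussed in the text.
    scores[name] = r2_score(y_val, model.predict(X_val))
    print(f"{name}: validation R^2 = {scores[name]:.3f}")
```

Repeating the split with several random seeds would show how much of any difference between configurations is noise.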