How To Get Started With Numerai Using Weights & Biases
In this article, we demonstrate how to get started with Numerai and compete in the hardest data science tournament on the planet using Weights & Biases.

Table of Contents
- What is Numerai?
- How Does Numerai Work?
- Data Processing
- The Numerai Dataset
- Metrics
- Feature Engineering and Selection
- Modeling/Hyperparameter Optimization
- Submission
- Caveat Emptor (Things to be Aware of)
- Final Tips
What is Numerai?
Numerai is a crowdsourced AI hedge fund that operates on predictions made by data scientists worldwide (like you)! Numerai was founded by Richard Craib in 2015. Some very experienced people in quantitative finance, like Howard Morgan (Co-Founder of Renaissance Technologies) and Marcos López de Prado (Professor at Cornell University and scientific advisor to Numerai), are involved with the project.
By combining the predictions of thousands of data scientists, Numerai can gain a competitive edge over other quantitative hedge funds. In turn, data scientists can benefit financially by contributing their predictions to the platform.
Practically the first thing you see on the Numerai homepage is the bold statement "The hardest data science tournament on the planet." Why is competing on Numerai (not) so hard?
Why is Numerai Hard?
- The obfuscated data makes it impossible to integrate domain knowledge and forces you to focus on feature engineering and modeling.
- There is not much signal in the dataset, so it is hard to extract features with predictive value.
- Evaluating your model is not straightforward. Your model may be good in some market situations but fail in other situations.
Why is Numerai not That Hard?
- Unlike in other data science competitions, you do not have to be at the top of the leaderboard to profit from Numerai.
- No domain knowledge of finance is required.
- All features in the dataset are regularized, and there are no categorical features (except for the era column).
- The model does not have to be interpretable. You only provide the predictions.
How Does Numerai Work?
The basis of all transactions on the Numerai platform is the Numeraire (NMR) token. This token operates on the Ethereum platform and enables Numerai to facilitate transactions to its data scientists easily.
Each week, data scientists submit predictions based on Numerai's datasets, and these predictions are used for stock investments in Numerai's meta-model. Each user can then stake as much NMR on their model as they want. Depending on the quality of your model, your NMR stake will increase or decrease. Staking ensures that users submit sensible models and precludes "Sybil attacks." If you deliver steady predictions every week, your reputation will grow along with your leaderboard position. Note that you do not have to share any details about your model, which makes it almost impossible for Numerai to reverse-engineer it. Numerai and its data scientists are therefore mutually dependent and share the risks in a balanced way.
Data Processing
Numerai has its own API client (NumerAPI) that provides a convenient interface to download the datasets, get information about the competition, and upload your predictions. We can download the latest data, unzip it, and load it in just a few lines of code.
import numerapi
import pandas as pd

NAPI = numerapi.NumerAPI(verbosity="info")

# Download the latest dataset
DIR = "my_data_directory"
NAPI.download_current_dataset(dest_path=DIR, unzip=True)

# Load data
full_path = f'{DIR}/numerai_dataset_{NAPI.get_current_round()}/'
train = pd.read_csv(full_path + 'numerai_training_data.csv')
test_df = pd.read_csv(full_path + 'numerai_tournament_data.csv')

# Split validation and test (copies avoid SettingWithCopyWarning later on)
val = test_df[test_df['data_type'] == 'validation'].copy()
test = test_df[test_df['data_type'] != 'validation'].copy()
The Numerai Dataset
[W&B panel: interactive exploration of the Numerai dataset]
Metrics
Spearman Correlation
When competing on Numerai, your model will be evaluated on the "Spearman correlation" metric. I have made a Kaggle Kernel dedicated to this metric that you can check out here. SciPy provides an excellent implementation for calculating the Spearman correlation:
from scipy.stats import spearmanr

def spearman(y_true, y_pred, axis=0):
    """ Calculate Spearman correlation """
    return spearmanr(y_true, y_pred, axis=axis)
Sharpe Ratio
Even though Spearman Correlation is the main metric, it does not take into account how stable your model is across multiple eras. Therefore, it is generally more useful to monitor the "Sharpe ratio". This metric is used a lot in quantitative finance. The basic Sharpe ratio for Numerai predictions can be calculated by taking the average correlation per era and dividing by the standard deviation of the correlations per era.
In Python code the calculation looks something like this:
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def sharpe(df: pd.DataFrame) -> np.float32:
    """
    Calculate the Sharpe ratio using grouped per-era data.

    :param df: A Pandas DataFrame containing the columns "era", "target" and "prediction"
    :return: The Sharpe ratio for your predictions.
    """
    def _score(sub_df: pd.DataFrame) -> np.float32:
        """ Calculate Spearman correlation for Pandas' apply method """
        # spearmanr returns (correlation, p-value); we only need the correlation
        return spearmanr(sub_df["target"], sub_df["prediction"])[0]

    corrs = df.groupby("era").apply(_score)
    return corrs.mean() / corrs.std()

# Get the Sharpe ratio for the validation data
sharpe(val)
For this report, we will monitor the Spearman correlation, Sharpe ratio, Numerai payout ratio, and mean absolute error (MAE). Additionally, we calculate the feature exposure, which I will discuss in the next section.
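Numerai's exact payout rule is not reproduced in this article, but a rough approximation of the payout ratio can be computed from the per-era correlations. In this sketch, the BENCHMARK and BAND values are assumptions, not official Numerai parameters:

import pandas as pd

# Hypothetical approximation of the payout ratio: per-era correlations are
# scaled by a 0.2 band and clipped to [-1, 1]. BENCHMARK and BAND are
# assumptions, not official Numerai parameters.
BENCHMARK = 0
BAND = 0.2

def payout(scores: pd.Series) -> pd.Series:
    """Approximate payout ratio for a Series of per-era correlations."""
    return ((scores - BENCHMARK) / BAND).clip(lower=-1, upper=1)

Averaging payout(corrs) over all eras then gives a single payout ratio to log alongside the other metrics.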
Feature Engineering and Selection
The features have a remarkably low correlation to the target variable. Even the most correlated features have only around 1.5% correlation with the target. Engineering useful features out of feature and era groupings is key to creating good Numerai models.
Also, the importance of features may change over time. By selecting only a limited number of features, we risk a high "feature exposure." Feature exposure can be quantified as the standard deviation of your predictions' correlations with each individual feature. You can mitigate this risk by using dimensionality reduction techniques like Principal Component Analysis (PCA) to incorporate almost all features into your model. In this starter example, we take the 150 features that are most correlated with the target variable.
# Calculate correlations with the target
full_corr = train.corr()
corr_with_target = full_corr["target"].apply(abs).sort_values(ascending=False)

# Drop the target itself before slicing so we keep exactly 150 features
corr_with_target = corr_with_target.drop("target")

# Select the features with the highest correlation to the target variable
features = corr_with_target[:150]
feature_list = features.index.tolist()
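Feature exposure itself follows directly from the definition above. Here is a minimal sketch, assuming a DataFrame that contains the feature columns together with a "prediction" column (the helper name is illustrative):

import numpy as np
import pandas as pd

def feature_exposure(df: pd.DataFrame, feature_cols: list) -> float:
    """Standard deviation of the predictions' correlations with each feature."""
    corrs = [np.corrcoef(df[f], df["prediction"])[0, 1] for f in feature_cols]
    return float(np.std(corrs))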
Modeling/Hyperparameter Optimization
To get a good first model for Numerai, we will train a LightGBM model and use Weights & Biases to run a hyperparameter sweep. In this example, we perform a grid search over some of the most important hyperparameters for LightGBM. First, we define the configuration of the sweep.
import wandb

sweep_config = {
    'method': 'grid',
    'metric': {'name': 'mse', 'goal': 'minimize'},
    'parameters': {
        "num_leaves": {'values': [30, 40, 50]},
        "max_depth": {'values': [4, 5, 6]},
        "learning_rate": {'values': [0.05, 0.01, 0.005]},
        "bagging_freq": {'values': [7]},
        "bagging_fraction": {'values': [0.6, 0.7, 0.8]},
        "feature_fraction": {'values': [0.85, 0.75, 0.65]},
    }
}
sweep_id = wandb.sweep(sweep_config, project="numerai_tutorial")
After that, we define a function (_train) that reads its hyperparameters from wandb.config so Weights & Biases can perform the grid search. We make sure to log all the metrics and can then start the agent!
import lightgbm as lgb
from wandb.lightgbm import wandb_callback

seed = 42  # fixed random seed for reproducibility (value assumed)

# Prepare data for LightGBM
dtrain = lgb.Dataset(train[feature_list], label=train["target"])
dvalid = lgb.Dataset(val[feature_list], label=val["target"])
watchlist = [dtrain, dvalid]

def _train():
    # Configure and train the model
    wandb.init(project="numerai_tutorial", name="LightGBM_sweep")
    lgbm_config = {
        "num_leaves": wandb.config.num_leaves,
        "max_depth": wandb.config.max_depth,
        "learning_rate": wandb.config.learning_rate,
        "bagging_freq": wandb.config.bagging_freq,
        "bagging_fraction": wandb.config.bagging_fraction,
        "feature_fraction": wandb.config.feature_fraction,
        "metric": 'mse',
        "random_state": seed
    }
    lgbm_model = lgb.train(
        lgbm_config,
        train_set=dtrain,
        num_boost_round=500,
        valid_sets=watchlist,
        callbacks=[wandb_callback()],
        verbose_eval=100,
        early_stopping_rounds=50
    )

    # Create predictions for evaluation
    val_preds = lgbm_model.predict(val[feature_list], num_iteration=lgbm_model.best_iteration)
    val.loc[:, "prediction"] = val_preds

    # Log metrics to W&B (evaluate is a helper function; a sketch follows below)
    spearman, payout, feature_exposure, numerai_sharpe, mae = evaluate(val)
    wandb.log({"Spearman": spearman, "Payout": payout,
               "Feature Exposure": feature_exposure,
               "Numerai Sharpe Ratio": numerai_sharpe,
               "Mean Absolute Error": mae})

# Run the sweep
wandb.agent(sweep_id, function=_train)
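The evaluate function used above is not shown in this article. A minimal sketch, assuming the validation DataFrame contains "era", "target", "prediction", and the feature columns, might look like this (the 0.2 payout band is an assumption, as noted in the Metrics section):

import numpy as np
import pandas as pd
from scipy.stats import spearmanr
from sklearn.metrics import mean_absolute_error

def evaluate(df: pd.DataFrame):
    """Illustrative helper returning the five metrics logged in the sweep."""
    # Per-era Spearman correlations between target and prediction
    corrs = df.groupby("era").apply(
        lambda d: spearmanr(d["target"], d["prediction"])[0]
    )
    spearman_corr = corrs.mean()
    numerai_sharpe = corrs.mean() / corrs.std()
    # Same assumed payout rule as in the Metrics section
    payout = (corrs / 0.2).clip(-1, 1).mean()
    mae = mean_absolute_error(df["target"], df["prediction"])
    # Feature exposure: std of the predictions' correlations with each feature
    feature_cols = [c for c in df.columns if c.startswith("feature")]
    exposures = [np.corrcoef(df[f], df["prediction"])[0, 1] for f in feature_cols]
    feature_exposure = np.std(exposures)
    return spearman_corr, payout, feature_exposure, numerai_sharpe, mae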
The results reveal that the learning rate and max_depth are the most important hyperparameters for our LightGBM model. The parallel coordinates plot below shows that the model with the highest Spearman correlation does not necessarily have the highest Sharpe ratio. Be sure to compare multiple metrics when evaluating your Numerai models.
[W&B panel: parallel coordinates plot of the sweep runs]
Submission
It is possible to upload a CSV file with your predictions directly on the Numerai tournament page. However, doing this every week becomes tedious, and NumerAPI makes it easy to upload your predictions programmatically. This requires you to add API keys when initializing NumerAPI.
Once you have obtained your API keys, you can easily submit your predictions with a few lines of code.
import numerapi

PUBLIC_ID = "MY_ID"
SECRET_KEY = "MY_SECRET_KEY"
SUB_PATH = "my_submission_directory/submission1.csv"

# Initialize the API with your API keys
NAPI = numerapi.NumerAPI(public_id=PUBLIC_ID,
                         secret_key=SECRET_KEY,
                         verbosity="info")

# Write predictions to CSV and upload the file (not the directory).
# The current round is inferred server-side, so we rely on the default
# tournament argument instead of passing the round number.
test[["id", "prediction"]].to_csv(SUB_PATH, index=False)
NAPI.upload_predictions(SUB_PATH)
Caveat Emptor (Things to be Aware of)
- Be aware that by staking on your models through Numerai, you are taking a risk on both your model and the Numeraire token itself. Since January 2020, the price of Numeraire has risen significantly (check CoinMarketCap for the latest price). However, cryptocurrencies are known to be volatile.
- Be careful not to judge the performance of your model too quickly. Good validation metrics and short-term live performance are no guarantee that your model will perform well over the long run. Take your time to set up a good (cross-)validation scheme and evaluate your models; a minimal era-aware example follows after this list. I monitored my model for two months before staking any Numeraire on it.
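When cross-validating Numerai models, it helps to keep eras intact so information from one era does not leak across folds. This is a minimal sketch of an era-aware split, reusing the train DataFrame and feature_list from earlier; the number of folds is an arbitrary choice:

from sklearn.model_selection import GroupKFold

# Group rows by era so no era is split across the train and validation folds
cv = GroupKFold(n_splits=5)
for fold, (train_idx, val_idx) in enumerate(
        cv.split(train[feature_list], train["target"], groups=train["era"])):
    print(f"Fold {fold}: {len(train_idx)} train rows, {len(val_idx)} validation rows")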
Final Tips
- Numerai offers bug bounties for people who find flaws in the platform. This is a great way to earn Numeraire if you are skeptical of investing your own money. For more information, check out the Numerai documentation on bounties.
- Numerai has a great community where interesting ideas are shared continuously. Be sure to check out their RocketChat platform and the Numerai forum.
- Confident that you can build models using techniques that may be overlooked by other users? Numerai recently introduced the ability to stake on "Meta Model Contribution" (MMC), which rewards the uniqueness of your model.
- Numerai offers "Multi-Model Support", which allows you to run and allocate Numeraire to up to 10 different models; a short sketch of submitting to a specific model follows below.
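For multi-model accounts, NumerAPI lets you target a specific model slot when uploading predictions. A minimal sketch, reusing the authenticated NAPI client and SUB_PATH from the Submission section ("my_second_model" is a hypothetical model name):

# Look up the ID of a specific model on the account and upload to that slot
model_id = NAPI.get_models()["my_second_model"]
NAPI.upload_predictions(SUB_PATH, model_id=model_id)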
I hope this introduction got you excited about starting with Numerai! If so, be sure to check out the accompanying Kaggle kernel with this report.
If you have any questions or feedback, feel free to comment below. You can also contact me on Twitter @carlolepelaars.