Next Generation AI Model Evaluation
Gunnar Carlsson, John Carlsson, Mattimore Cronin

Go beyond the leaderboard: How TDA uncovers what benchmark scores miss in model evaluation.

Evaluating models is critical to the artificial intelligence enterprise. Without an array of evaluation methods, we cannot tell whether models are doing what we want them to do, or what steps we should take to improve them. Good evaluation also matters after deployment: the input data, the way users interact with the model, and the way users react to its output all change over time. Evaluation is therefore needed not only when a model is built, but continually throughout its deployment lifecycle.
