Most people think of data augmentation as a technique to improve their model during training.
Starting from an initial dataset, you can generate synthetic copies of each sample that will make a model resilient to variations in the data.
This is true, but there's something more you can do to improve your model's predictions.
Imagine you are building a multi-class classification model.
When testing it, you take every sample, run it through the model, and determine the corresponding class from the result.
You use one picture to make a prediction, so you have one opportunity to get the correct answer.
Unfortunately, sometimes this is not enough.
You can take advantage of data augmentation to give you a better opportunity to make the correct prediction.
Test-time augmentation is a technique where you augment each sample before running it through the model, then average the prediction results.
Instead of running a picture through the model, you can generate three versions of it by changing the image's contrast, rotating it slightly, and cropping it a bit.
You now have four different pictures from which to make a prediction. Run them through the model, average the four softmax vectors you get back, and determine the final class from the result.
By augmenting the original image, you give the model more opportunities to "see" something different and compute the correct prediction.
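The procedure above can be sketched in a few lines of Python with NumPy. Note that `predict` here is a hypothetical stand-in for your real model, and the augmentations are simplified placeholders (for instance, a 90-degree turn instead of a slight rotation, to keep the sketch dependency-free):

```python
import numpy as np

def adjust_contrast(img, factor=1.2):
    # Placeholder contrast tweak: scale pixel values around the mean
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0.0, 1.0)

def small_rotation(img):
    # Stand-in for a slight rotation; in practice use a small-angle
    # rotation from your image library of choice
    return np.rot90(img)

def center_crop(img, margin=2):
    # Crop a small border, then pad back to the original size
    cropped = img[margin:-margin, margin:-margin]
    return np.pad(cropped, margin, mode="edge")

def predict(img):
    # Hypothetical model: returns a softmax vector over 3 classes
    logits = np.array([img.mean(), img.std(), img.max()])
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def tta_predict(img, augmentations):
    # Run the original plus each augmented version through the model,
    # then average the resulting softmax vectors
    versions = [img] + [aug(img) for aug in augmentations]
    probs = np.stack([predict(v) for v in versions])
    return probs.mean(axis=0)

rng = np.random.default_rng(0)
img = rng.random((28, 28))
avg = tta_predict(img, [adjust_contrast, small_rotation, center_crop])
predicted_class = int(np.argmax(avg))
```

The final class comes from the averaged vector, so a single noisy prediction is less likely to dominate the result.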
The success of test-time augmentation depends on how good your augmented samples are.
That's where most of your time will go.
Your augmented samples will have a lot of influence on the final result. If you create sloppy variations of the original image, test-time augmentation can quickly decrease your model's performance.
In practice, explore this technique using a few slight modifications to the initial picture. You'll find that most of the success comes from avoiding excessive complexity.