Processing visual information is a key task in applications such as self-driving cars, aircraft and unmanned aerial vehicles, where reliable computing is crucial. Until recently, convolutional neural networks (CNNs) were the main approach for detecting or classifying objects in an image or video.
Now, a new class of models based on 'transformers' has been developed, and these outperform CNNs in detection accuracy. They do this by correlating each region of the image with all the other regions. This strategy allows them to overcome the accuracy limit of CNNs, which can only correlate pixels that are physically close together.
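To illustrate the idea of correlating every part of an image with every other part, here is a minimal sketch of self-attention in plain Python. It is a simplification under stated assumptions: a single attention head, no learned query, key or value projections, and toy two-number "patches" (a ViT attends over image patches rather than individual pixels). The function names and data are hypothetical and are not taken from the study.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head attention with identity query/key/value projections.

    tokens: list of equal-length feature vectors, one per image patch.
    Returns (outputs, weights), where weights[i][j] is how strongly
    patch i attends to patch j.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Compare this patch with EVERY patch, near or far.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # The output is a weighted mix of all patches.
        outputs.append([sum(w * tok[d] for w, tok in zip(row, tokens))
                        for d in range(dim)])
    return outputs, weights


# Four toy patches: the first and last are far apart but look alike.
patches = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(patches)
```

In this toy case the first patch gives more weight to the distant but similar last patch than to its immediate neighbour, which is the kind of long-range link a small convolution window cannot make in a single step.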
Transformers were originally developed for natural language processing, because when we read or write we need to correlate characters and words that can be far from each other in a sentence. They became famous through ChatGPT, where the T stands for Transformer.
In this investigation, a group led by Dr Paolo Rech from the University of Trento came to the ChipIr beamline at ISIS to see how well these Vision Transformers (ViTs) hold up under radiation testing when run on Google's small, power-efficient Coral tensor processing unit (TPU) chips.
Their study, published in IEEE Transactions on Nuclear Science, is the first to investigate the impact of atmospheric neutrons on the reliability of transformers running on TPUs. The radiation testing, when scaled to the natural neutron flux in New York City, corresponds to more than 258 million years of neutron exposure.
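The scaling behind a figure like this is simple arithmetic, sketched below. The reference value of about 13 neutrons per square centimetre per hour (for neutrons above 10 MeV at sea level in New York City, as commonly quoted from the JEDEC JESD89A standard) is an assumption brought in for illustration. The article does not state the reference flux, the beam flux or the number of devices used, so none of those are reproduced here, and the function names are hypothetical.

```python
# Assumption: commonly quoted JESD89A sea-level reference flux for
# New York City, neutrons above 10 MeV, in n / (cm^2 * hour).
NYC_FLUX_PER_CM2_HOUR = 13.0
HOURS_PER_YEAR = 24 * 365.25


def equivalent_years(fluence_per_cm2, reference_flux=NYC_FLUX_PER_CM2_HOUR):
    """Years of natural exposure that deliver the given fluence (n/cm^2)."""
    return fluence_per_cm2 / (reference_flux * HOURS_PER_YEAR)


def fluence_for_years(years, reference_flux=NYC_FLUX_PER_CM2_HOUR):
    """Fluence (n/cm^2) accumulated over the given years of natural exposure."""
    return years * reference_flux * HOURS_PER_YEAR


# Total fluence that would correspond to the 258 million years quoted above,
# summed over all devices and test runs.
required_fluence = fluence_for_years(258e6)
print(f"{required_fluence:.2e} n/cm^2")
```

Under that assumed reference flux, the quoted exposure corresponds to a total fluence of a few times 10^13 neutrons per square centimetre, which an accelerated beam can deliver in days rather than geological time.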
They found that the probability of radiation-induced errors affecting the output of a model increases with the model size. The error rate is also significantly affected by the complexity of the model, with more complex ones being more susceptible to radiation effects.
As well as studying different ViTs on the Coral Edge TPU, they broke the ViTs down to see which parts fail most often and are therefore the most critical. They found that radiation-induced errors in the patch embedding layer of the transformer model are more likely to lead to misclassifications than errors in other layers of the model.
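For readers unfamiliar with the term, the patch embedding layer cuts the image into small tiles and projects each tile with the same set of weights. The sketch below shows that step and injects a single bit flip into one weight, as a stand-in for a radiation-induced upset. It is an illustration under stated assumptions, not the fault model or the method used in the study: the image, the weights and the helper names are hypothetical, and 8-bit integer weights are assumed because the Edge TPU runs quantized models.

```python
def split_into_patches(image, patch):
    """Cut a square 2D image (list of rows) into flattened patch x patch tiles."""
    size = len(image)
    tiles = []
    for top in range(0, size, patch):
        for left in range(0, size, patch):
            tiles.append([image[top + r][left + c]
                          for r in range(patch) for c in range(patch)])
    return tiles


def embed(tiles, weights):
    """Project every flattened tile with the SAME weight matrix."""
    return [[sum(w * x for w, x in zip(row, tile)) for row in weights]
            for tile in tiles]


def flip_bit_int8(value, bit):
    """Flip one bit of a signed 8-bit integer (two's complement)."""
    raw = (value & 0xFF) ^ (1 << bit)
    return raw - 256 if raw >= 128 else raw


# Toy 4x4 image with non-zero pixels and a hypothetical 2-dimensional embedding.
image = [[(r * 4 + c) % 7 + 1 for c in range(4)] for r in range(4)]
weights = [[1, -2, 3, 0], [2, 1, -1, 1]]

tiles = split_into_patches(image, 2)
clean = embed(tiles, weights)

# Inject one fault: flip bit 6 of a single weight (1 becomes 65).
faulty_weights = [row[:] for row in weights]
faulty_weights[0][0] = flip_bit_int8(faulty_weights[0][0], 6)
faulty = embed(tiles, faulty_weights)

corrupted = sum(1 for a, b in zip(clean, faulty) if a != b)
print(f"{corrupted} of {len(tiles)} patch embeddings changed")
```

Because the weights are shared across all tiles, one corrupted weight in this sketch changes every patch embedding at once, before any later layer sees the image. This is one plausible intuition for why faults in this layer could be especially harmful; it is not presented here as the explanation given in the paper.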
With this in mind, they were able to develop methods to improve the resilience of this layer of the ViT model. These insights can help designers create more reliable ViTs by strengthening the critical parts of the system while keeping the additional computing costs low.
The full paper can be found at DOI: 10.1109/TNS.2024.3513774