To other participants:
Please be aware that the 'FlopCountAnalysis' function from fvcore used in the provided example code is not reliable:
- Activation functions and trigonometric functions are not counted.
- Element-wise operations such as addition and multiplication are not counted.
- Bias in linear and convolution layers is not counted.
- Many functions in torch.nn.functional are not counted, including 'scaled_dot_product_attention' and 'cosine_similarity'.
This example illustrates it quite well: https://gist.github.com/SimonLarsen/0f79127a02f29ad44ed2a5153cadfac4
Both models are reported to use 1.812 GFLOPs, but in reality the second one exceeds 0.5 TFLOPs.
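For anyone who wants to reproduce the behaviour without opening the gist, here is a minimal sketch of my own (the models and layer sizes are arbitrary illustrative choices, not the ones from the gist). Both models report the same FLOP total even though the second one does far more element-wise and trigonometric work:

```python
# Minimal sketch (not the linked gist; layer sizes are arbitrary) showing
# how FlopCountAnalysis misses element-wise and trigonometric operations.
import torch
import torch.nn as nn
from fvcore.nn import FlopCountAnalysis


class PlainLinear(nn.Module):
    """A single linear layer -- its matmul FLOPs are counted by fvcore."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(1024, 1024)

    def forward(self, x):
        return self.fc(x)


class LinearPlusElementwise(nn.Module):
    """The same linear layer followed by heavy element-wise work."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(1024, 1024)

    def forward(self, x):
        y = self.fc(x)
        # None of the operations below are registered by FlopCountAnalysis,
        # even though they add real compute at inference time.
        for _ in range(100):
            y = torch.sin(y) * y + y
        return y


x = torch.randn(1, 1024)
for model in (PlainLinear(), LinearPlusElementwise()):
    flops = FlopCountAnalysis(model, x)
    print(type(model).__name__, flops.total())
# Both models print the same total: the sin/mul/add loop is invisible to
# the counter (fvcore only emits warnings about unsupported operators).
```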
For a more realistic example, I have trained a model for this competition that I manually estimate at around 3.93 TFLOPs. However, fvcore estimates this model at just 1.90 TFLOPs.
To the organizers:
How will the FLOP count be evaluated for submitted entries?
Manual calculation is of course quite difficult, but automated tools are very imprecise and invite cheating (intentional or not).
I completely agree and hope the organizers can update how the computation constraints are calculated. Otherwise it is unfair, since someone could use tricks to keep computation out of the FlopCountAnalysis count.
Posted by: fire @ March 3, 2025, 5:55 a.m.
We thank the participant for reporting the issue. We have reviewed it and discussed it internally.
In the end, we decided to keep this method, since the limitations include not only the FLOPs but also the number of parameters.
If there are any other issues, please let us know.
Posted by: Sangmin-Lee @ March 4, 2025, 8:07 a.m.