I'm just curious if it's possible to get good results without using attention.
(kind of a yes/no question)
Yes, I am :)
Posted by: TacchinoLeso @ Dec. 30, 2021, 2:44 p.m.
Are you guys using some framework or building transformer architectures from scratch?
Posted by: MarcoBonalumi @ Jan. 5, 2022, 9:58 a.m.