RT @Thomas__Ple@twitter.com
@simonbatzner@twitter.com @jppiquem@twitter.com I think there are two main advantages of the positional encoding. First is convenience: you don't have to re-train from scratch when you add data for a new element. Second, it helps with generalization, as can be seen in the dissociation of H-F, which was not in the training set.
🐦🔗: https://twitter.com/Thomas__Ple/status/1619022660574531585