Major breakthrough

hexaflexagonbear [he/him]@hexbear.net · 5 months ago

Major breakthrough

ta00000 [none/use name]@hexbear.net · 5 months ago

Since LLMs essentially decide on one character at a time, I wonder if they would have better accuracy if asked to tell you the sum backwards. That’s how we teach kids to add, right to left, carry the 1.

hexaflexagonbear [he/him]@hexbear.net · 5 months ago

I think this is essentially what they did. The point of the paper is they made an architecture to make the LLM more aware of an individual digit’s position in a number. It helped with addition, multiplication, and even sorting.

HexLlama [it/its, she/her]@hexbear.net · 5 months ago

It’s technically true that it decides one token at a time, but it also takes previous tokens into account.

ta00000 [none/use name]@hexbear.net · edit-2 5 months ago

That’s why it’s easier. If you’re going left to right, you have to figure out not only the sum at the first digit position but also whether there’s a 1 to carry into it. Going right to left, you only have to focus on one single-digit addition at a time, and you already know whether there’s a carry from the previous addition.
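
Rough sketch of what I mean, in Python. This is just schoolbook addition written so the digits come out least-significant first. It is not the method from the paper, and `add_reversed` is a name I made up for illustration. Each output digit depends only on the two current digits and the carry from the step before.

```python
def add_reversed(a: str, b: str) -> str:
    """Add two non-negative decimal strings.

    Returns the digits of the sum in reverse order (ones place first),
    which is the order you'd produce them in when adding right to left.
    """
    # Pad the shorter number with leading zeros so the digits line up.
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)

    carry = 0
    out = []
    # Walk both numbers from the ones place upward.
    for da, db in zip(reversed(a), reversed(b)):
        carry, digit = divmod(int(da) + int(db) + carry, 10)
        out.append(str(digit))  # this digit is final as soon as it is emitted

    # A leftover carry becomes one extra leading digit of the sum.
    if carry:
        out.append("1")
    return "".join(out)


# 478 + 256 = 734, so the reversed output is "437".
print(add_reversed("478", "256"))
```

Going left to right, you can’t commit to the first output digit until you’ve effectively scanned everything to its right for a carry, which is the part I’d guess is hard to do one token at a time.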