It reminds me of schemes like https://en.wikipedia.org/wiki/Elias_delta_coding , https://en.wikipedia.org/wiki/Elias_omega_coding
As well as https://en.wikipedia.org/wiki/Levenshtein_coding which also encodes iterated lengths.
It's a neat encoding, but the writeup is needlessly confusing. The introduction would really benefit from a graph or two, or even just a textual sequence showing how the value evolves as you iterate on it.
Similarly, while the end result is quite elegant, the consequences of the choices made to get there aren't really explained. It's quite a bit of extra effort to reason out what would have happened if things had been done differently.
Agreed; I had to ask ChatGPT to explain this to me with an example number, and then I was able to understand the approach.
It was successfully able to run through the math? I'm surprised. That's quite impressive.
How feasible is it to do arithmetic directly in this representation?
Well, the good news is that multiplication of n-bit numbers simplifies to handling the sign and adding (n-1)-bit numbers.
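To illustrate why (plain floats standing in for the encoded exponents here; this is just the identity x*y = (sx*sy) * 2^(ex+ey), not the article's code):

```python
import math
import random

# If x = sx * 2**ex and y = sy * 2**ey, then x * y = (sx * sy) * 2**(ex + ey),
# so multiplying two encoded values is a sign fix plus an addition of exponents
# (and in this format the exponents are themselves (n-1)-bit encoded numbers).
for _ in range(5):
    x = random.choice((-1, 1)) * random.uniform(0.1, 10.0)
    y = random.choice((-1, 1)) * random.uniform(0.1, 10.0)
    sx, ex = math.copysign(1.0, x), math.log2(abs(x))
    sy, ey = math.copysign(1.0, y), math.log2(abs(y))
    assert math.isclose(x * y, sx * sy * 2.0 ** (ex + ey))
```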
For low-bit representations, you could just pre-calculate all the possible answers and cache those. A simple algorithm would be to decode the numbers, do the arithmetic, and re-encode the answer. With the seven-bit example used in the article, there are only 2^7 = 128 unique numbers, so you'd need 16KB of memory to store all possible multiplications (128 x 128 one-byte entries). With 16-bit numbers that grows to about 8GB (2^32 two-byte entries); beyond that, it probably gets too expensive rapidly. You might get some optimizations out of the symmetry for positive and negative.
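A minimal sketch of the table-building step, assuming some decode(code) -> float and encode(value) -> code pair for the format (hypothetical signatures, e.g. the linked notebook's helpers adapted so that codes are plain ints 0..127):

```python
from typing import Callable, Dict, Tuple

def build_mul_table(decode: Callable[[int], float],
                    encode: Callable[[float], int],
                    n_bits: int = 7) -> Dict[Tuple[int, int], int]:
    """Precompute code * code -> code for every pair of n-bit codes."""
    size = 1 << n_bits  # 128 codes for 7 bits, so 16384 table entries
    return {
        (a, b): encode(decode(a) * decode(b))
        for a in range(size)
        for b in range(size)
    }
```

Lookup is then just `mul_table[(a, b)]`, and commutativity (`mul_table[(a, b)] == mul_table[(b, a)]`) roughly halves the storage if you need it.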
That's exactly the crux. Addition and subtraction in log space are not exactly straightforward.
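For ordinary single-level log space the standard workaround is the log-sum-exp identity; the iterated version of the format only makes this harder. A quick check of the single-level identity (plain Python, not the article's code):

```python
import math

def log2_add(a: float, b: float) -> float:
    """log2(x + y) given a = log2(x) and b = log2(y), for x, y > 0."""
    if b > a:
        a, b = b, a                      # keep the larger term outside the log
    return a + math.log2(1.0 + 2.0 ** (b - a))

x, y = 6.0, 10.0
assert math.isclose(log2_add(math.log2(x), math.log2(y)), math.log2(x + y))
```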
Given an integer n, what is the number of bits required to represent every integer in the range -n..n ?
Exact integers don't seem to be its strong suit. Can it even represent 3 exactly?
Running code from the linked notebook (https://github.com/AdamScherlis/notebooks-python/blob/main/m...), I can see that a 32 bit representation of the number 3 decodes to the following float: 2.999999983422908
(This is from running `decode(encode(3, 32))`)
Yeah I think any numbers away from 0 or infinity aren’t its strong suit.
> Given an integer n, what is the number of bits required to represent every integer in the range -n..n ?
log2(n) + 1
That's true for a normal binary encoding of integers, but I think we should understand the question in the context of the post: what's the number of bits required in iterated log coding?
Empirically, it seems to grow more like 2*log2(n)+1. A handwavy argument: the first bit distinguishes the positive values from the negative ones, but after that, on average, every second bit only adds precision to values that are already distinguishable or out of range, and doesn't help with values whose representations share the same prefix. I don't know how to make that airtight, though...
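A rough sketch of such an empirical check, assuming the encode(value, n_bits) / decode(code) helpers from the linked notebook (names and call shapes taken from the `decode(encode(3, 32))` example elsewhere in the thread, and reading "represent" as "rounds back to the same integer"; treat this as pseudocode against that API):

```python
def min_width_for_range(n, encode, decode, max_width=64):
    """Smallest bit-width at which every integer in -n..n (0 skipped, since it
    may need special handling) round-trips through encode/decode to within 0.5."""
    for width in range(1, max_width + 1):
        if all(round(decode(encode(k, width))) == k
               for k in range(-n, n + 1) if k != 0):
            return width
    return None

# for n in (1, 2, 4, 8, 16, 32):
#     print(n, min_width_for_range(n, encode, decode))
```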
It feels like you could use e (or I guess any number) as the logarithm base rather than 2. I wonder if that would make it a bit more uniform in some sense.
Isn't this just a variant of Dirac's solution for representing any number using sqrt, log, and the number 2?
It is not. (There are no square roots, for instance.)
It very much is. sqrt(x) is just x^(1/2), which is x^(2^-1). Dirac's solution uses an iterated square root of 2, effectively generating a sequence similar to the one used in this post.
Okay, but iterating square roots like √√2 = (2^(2^-1))^(2^-1) recurses into the base, whereas the equivalent iterated log is 2^(2^-1 × 2^-1) = 2^(2^-2) = +2^(+2^(-2^(+2^0))) with the bit representation [1 1 0 0 1 0 0 ...], i.e. it recurses into the exponent.
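For the record, a quick numeric check that these are all the same value (plain Python, nothing from the article's code):

```python
import math

a = math.sqrt(math.sqrt(2))              # iterated sqrt: recursion in the base
b = 2 ** (2 ** -2)                       # same value with the exponent collapsed
c = 2 ** (2 ** -(2 ** (2 ** 0)))         # same value as a nested-exponent tower
print(a, b, c)                           # all ~1.189207115002721
```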
So uh, how is that not a variation, exactly?
Dirac's solution was also arbitrarily restricted by the problem definition.
Do you believe that if you asked someone familiar with the solution to come up with a bit efficient variant they would not have trivially come up with the encoding in this post and called it a variation?
I don't, for a second, believe that.
Not really. Dirac's trick works entirely at a depth of two logs, using sqrt like unary to increment the number. It requires O(n) symbols to represent the number n, i.e. O(2^n) symbols to represent n bits of precision. This thing has arbitrary nesting depth of logs (or exps), and can represent a number to n bits of precision in O(n) symbols.
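A tiny sketch of that unary behaviour (the classic form of Dirac's trick: n nested square roots of 2 give 2^(2^-n), so two log2's and a sign flip recover n):

```python
import math

def dirac(n: int) -> float:
    """n = -log2(log2(sqrt(sqrt(...sqrt(2))))) with n nested square roots."""
    x = 2.0
    for _ in range(n):          # each sqrt is one unary "tick"
        x = math.sqrt(x)        # after the loop, x == 2 ** (2 ** -n) (up to rounding)
    return -math.log2(math.log2(x))

print([round(dirac(n)) for n in range(1, 8)])   # [1, 2, 3, 4, 5, 6, 7]
```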
Sure, but that was because of an arbitrary restriction on the problem Dirac solved.
Still seems a fairly simple variation once you remove the arbitrary restriction. To the point: I don't believe for a second that anyone familiar with his solution, asked to make it more bit efficient, would not have come up with this. Nor do I believe they would call it anything other than a variation.
That doesn't make it less cool, but I don't think it's amazingly novel.
Neat, looks like it would work well for values clustered around 0.5, and which hang more closely around 0 and 1.
Nice and elegant! Surprising it was not discovered earlier (unless it was and we're unaware of it).
It was. It's a variant of Dirac's solution to representing any number using sqrt, log, and the number 2.
> Any value representable in n bits is representable in n+1 bits
> [1 1 1 1 1 1 1] 2.004e+19728
Does that mean that the 8-bit version has numbers larger than this? Doesn't seem very useful, since 10^100 is already infinity for all practical purposes.
> Does that mean that the 8-bit version has numbers larger than this?
Yes. Not just a bit larger, but an even more ridiculous leap. Notice that you're iterating exponents: that last one (7 bits) was 2^65536, so the next one (8 bits) will be 2^(2^65536).
Python objects to printing 2^65536, complaining that the base-10 string representing the integer would contain more than 4300 digits, so it gives up.
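For what it's worth, that limit only applies to int-to-decimal-string conversion (Python 3.11+); computing the number is fine, and you can get the digit count from a logarithm or raise the cap:

```python
import math
import sys

x = 2 ** 65536                      # computing it is fine; printing it trips the limit
print(x.bit_length())               # 65537

# Digit count without any string conversion:
print(math.floor(65536 * math.log10(2)) + 1)   # 19729 digits, i.e. ~2.004e+19728

# Or raise the default 4300-digit cap if you really want the full string:
sys.set_int_max_str_digits(20000)
print(len(str(x)))                  # 19729
```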
> Doesn't seem very useful
By that logic we should just use fixed point, because who needs to work with numbers as large as 2^1023 (64-bit IEEE 754)? These things aren't useful unless you're doing something that needs them, in which case they are. I could see the 5- or 6-bit variant of such an iterated scheme potentially being useful as an intermediate representation for certain machine learning applications.
Who said anything about practical purposes?
I usually like my float representations to have some use. But sure, this does seem like a very interesting idea.
> I usually like my float representations to have some use.
If this does have some use, I don’t think representing real-world numbers is one of them, for the reasons you intuited.
I thought it was some log file hackery, but it's actually about logarithms.
Holy Father! This is amazing; the things a human mind can do. Ask an LLM for that and it'll give you 10 more complicated and unmaintainable ways to do it.
Cute! I wonder if it would be amenable for use as a variable-width encoding for, say, DCT coefficients in a JPEG-like codec..?
Could be good for AI representation theory. Cool idea.
Feels a bit like binary encoding with + and - instead of 0 and 1