Re: Population count
Posted: Sun Apr 10, 2016 2:57 am
There was once a stupid trick using a small local lookup table for direct conversion, via xlat or its modern equivalent (an indexed byte load).
You can build a 256-entry table mapping each 8-bit value to its number of set bits, and then do an xlat. It's only useful in stupid cases...and nothing else.
With a swap and an add, you can do 16 bits: look up the low byte, swap the bytes, look up the other one, and add the two counts. Duplicate for 32, etc. I don't know modern latencies for memory access, so I'd guess it will be horribly slow. Also, building the table is very slow, so it will only look good if you measure performance after the table is built.
I will be surprised if it even pays off today. But it was the fast way when processors were slower and had fewer instructions.
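For anyone who wants to try it, here is a rough C sketch of the table approach described above. It is not the original xlat code: the function names (build_table, popcount8, popcount16, popcount32) are made up for illustration, and a shift stands in for the byte swap, since C has no direct xlat. Whether it beats anything on a modern CPU is untested.

```c
#include <stdint.h>

/* 256-entry table: bit_count_table[v] = number of set bits in the byte v. */
static uint8_t bit_count_table[256];

/* One-time setup. Each entry reuses the entry for v >> 1, which is
   already filled in because v >> 1 < v for every v >= 1. */
void build_table(void) {
    bit_count_table[0] = 0;
    for (int v = 1; v < 256; v++) {
        bit_count_table[v] = (uint8_t)((v & 1) + bit_count_table[v >> 1]);
    }
}

/* 8 bits: a single table lookup (what xlat did in one instruction). */
unsigned popcount8(uint8_t x) {
    return bit_count_table[x];
}

/* 16 bits: look up each byte and add (the "swap and an add" step). */
unsigned popcount16(uint16_t x) {
    return popcount8((uint8_t)(x & 0xFF)) + popcount8((uint8_t)(x >> 8));
}

/* 32 bits: the same thing duplicated for both 16-bit halves. */
unsigned popcount32(uint32_t x) {
    return popcount16((uint16_t)(x & 0xFFFF)) + popcount16((uint16_t)(x >> 16));
}
```

Call build_table() once before any of the count functions. As noted above, that setup cost is not free, so any timing comparison should say whether it is included.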