How to compress this ?

Just starting out? Need help? Post your questions and find answers here.
Olli
Addict
Addict
Posts: 1200
Joined: Wed May 27, 2020 12:26 pm

How to compress this ?

Post by Olli »

This code gives a numbers list, I want to compress.
I would like to reach 2 bits per number. This seems crazy, but there are lots of identical values, lots of small curves, and lots of periodic patterns. This example runs to 10000 numbers, but the compression algo has to go to 400000000, that allows us to get a 100 megabytes sized binary file containing all the 32 bits sized prime numbers.

(each number gives a gap between two consecutives prime numbers, so each number gives indirectly - and very quickly - a prime number greater or equal to 5)

Code: Select all

n = 5
ff = 1
For i = 1 To 10000
 n0 = n
 Repeat
  n + 1
  sqn = Sqr(n)
  isp = 1
  For j = 2 To sqn
   If n % j = 0
    isp = 0
    Break
   EndIf
  Next
 Until isp
 gap = n
 gap - n0
 gap / 2
 tmp = gap
 g2 = 0
 While tmp
  tmp - ff
  g2 + 1
  ff ! %11
 Wend
 Debug Space(g2) + Str(g2)
Next
User avatar
netmaestro
PureBasic Bullfrog
PureBasic Bullfrog
Posts: 8451
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Re: How to compress this ?

Post by netmaestro »

2 bits per number isn't possible.
BERESHEIT
Olli
Addict
Addict
Posts: 1200
Joined: Wed May 27, 2020 12:26 pm

Re: How to compress this ?

Post by Olli »

netmaestro wrote: Tue Oct 04, 2022 12:32 am 2 bits per number isn't possible.
Your answer implies that you know how to store a good bunch of prime numbers with 7 bits per number!
User avatar
jacdelad
Addict
Addict
Posts: 1993
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: How to compress this ?

Post by jacdelad »

Come on, Olli, with two bits you can distinguish 4 numbers. With 7 bits you have 128. You basically want to get all prime numbers up to #MaxLong. The result is less than #MaxLong, but to store the exact value you still need a long. You could try to put all relevant values into a row and compress them with some known algorithm, some are built in. But it won't work in any other way, unless you find a way to store more than one bit in one bit.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
User avatar
NicTheQuick
Addict
Addict
Posts: 1504
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken
Contact:

Re: How to compress this ?

Post by NicTheQuick »

I guess your code itself is the best compression for this. To decompress the list, just compile and run it. :wink:
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
Olli
Addict
Addict
Posts: 1200
Joined: Wed May 27, 2020 12:26 pm

Re: How to compress this ?

Post by Olli »

NicTheQuick wrote:To decompress the list[...]
If I find the time, I will add the code to decompress it on the head of the subject, with the 1st code.

A 8-bits processor which put from 4 to 6 gaps "g2" (g like Gap, no secret), through 256 op codes, I think it is doable.

added : not infinite.
Olli
Addict
Addict
Posts: 1200
Joined: Wed May 27, 2020 12:26 pm

Re: How to compress this ?

Post by Olli »

Here is the flip-flop function to uncompress the list of "g2" values.

Code: Select all

Procedure fff(i)
   Protected r, j
   Static ff
   If ff = 0
      ff = 1
   EndIf
   For j = 1 To i
      r + ff
      ff ! %11
   Next
   ProcedureReturn r
EndProcedure

pn = 5
repeat
 Debug pn
 ; import g2 (compressed code) here
 pn + 2 * fff(g2)
until pn > 100
Olli
Addict
Addict
Posts: 1200
Joined: Wed May 27, 2020 12:26 pm

Re: How to compress this ?

Post by Olli »

Now, for my "idea" of 2 bits per number, and specifically 4*2 bits per a set of 4 to 6 numbers, the way is clear : not possible, as netmaestro said and as you, JacDeLad and NicTheQuick explained.

But... This list is very strange. I cannot read it now, but any weeks ago, I studied it to millions of values.

On one side, no compressor is able to zip it with a good coefficient.

On the other side, as the 1st code does it, if a insert a left margin sized by the displayed value, my eyes see some curves. If I turn the picture (90° left), I get a landscape. There are lots of symetries, and lots of waves.

And if continue to the right of the landscape, to the numbers bigger and bigger, I see a second landscape on the background, behind the fist and main landscape which is more and more shaked.

It gives the feeling we could do a 2D world, replacing numbers by any tiles.
Olli
Addict
Addict
Posts: 1200
Joined: Wed May 27, 2020 12:26 pm

Re: How to compress this ?

Post by Olli »

Uosh ! This list is already done here !

--> oeis.org : A213902
(author : Alex Rathushniak )
User avatar
Piero
Addict
Addict
Posts: 865
Joined: Sat Apr 29, 2023 6:04 pm
Location: Italy

Re: How to compress this ?

Post by Piero »

Olli wrote: Fri Jul 18, 2025 11:42 pm Uosh ! This list is already done here !
--> oeis.org : A213902
(author : Alex Rathushniak )
Not sure if I fully understood…

Primes >3 are 6N±1, so you can save them in the form N+1bit.
To decode: 6*N+(bit*2-1)

…but if you need the gaps between primes:
https://codegolf.stackexchange.com/ques ... -distances

PS: the code on your 1st post gives wrong results here, if it's supposed to output "uncompressed" prime gaps :cry:
SMaag
Enthusiast
Enthusiast
Posts: 302
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: How to compress this ?

Post by SMaag »

that allows us to get a 100 megabytes sized binary file containing all the 32 bits sized prime numbers.
You do not save any space if you do like this!
For Prime numbers use Modulo 30 Space. In a Modulo 30 Space existing maximal 8 Prime Numbers.
So you can use a single byte to store the Prime Numbers in each Modulo 30 Block.

For signed 32 Bit you have 31 Bit, so the maximum number is 2147483647
2147483647/30 = 71582788 Byte / (1024*1024) = 68MB
To store all primes from 1 to 2147483647 you need 68MB, if you use 1 Bit per Number!

But this doesn't make sense bacause the Prime Calculation with Miller Rabbin Test needs only mircoseconds to test a Number for prime

My Ryzen 5800 needs ~19 seconds to test all numbers from 1 to 400.000.000 for prime and found 21.336.326 primes

https://www.purebasic.fr/german/viewtop ... en#p361719

Just tested your basic code up to 1Mio = 4,8sec ; up to 10Mio = 158sec
User avatar
Piero
Addict
Addict
Posts: 865
Joined: Sat Apr 29, 2023 6:04 pm
Location: Italy

Re: How to compress this ?

Post by Piero »

I'm sure I still didn't understand a thing about this all (including the double-slit experiment)

So: (6N)Ciao(±1)
User avatar
Lord
Addict
Addict
Posts: 900
Joined: Tue May 26, 2009 2:11 pm

Re: How to compress this ?

Post by Lord »

Olli wrote: Mon Oct 03, 2022 10:02 pm This code gives a numbers list, I want to compress.
I would like to reach 2 bits per number.
...
Maybe I'm totally wrong and misunderstud the problem.
What about using one bit for each number?
This is what come to my mind after reading this thread (pseudo code):

Code: Select all

; Insert IsPrime_x64() here

PrimMem=AllocateMemory(400000000)

If PrimMem
  For N = 0 To 400000000  
    If IsPrime_x64(N)
      Byte=N/8
      Bit=N & $F
      ; set Bit Byte+Bit here for each primenumber
      
    EndIf
  Next
  ; save PrimeMem compressed or uncompressed
  ; or use PrimeMem directly by accessing Bit in memory
  
  FreeMemory(PrimMem)
EndIf
Image
SMaag
Enthusiast
Enthusiast
Posts: 302
Joined: Sat Jan 14, 2023 6:55 pm
Location: Bavaria/Germany

Re: How to compress this ?

Post by SMaag »

Code: Select all

; PrimMem=AllocateMemory(400000000)
PrimMem=AllocateMemory( MaxNumber/30+1 ) ; 1 Bit for each prime candidate in Modulo 30
; in Modulo 30 only the RestClass {1, 7, 11, 13, 17, 19, 23, 29} are possible primes.
; so for each block of 30 Numbers we need 8Bit to store the primes.
User avatar
Piero
Addict
Addict
Posts: 865
Joined: Sat Apr 29, 2023 6:04 pm
Location: Italy

Re: How to compress this ?

Post by Piero »

SMaag wrote: Wed Jul 23, 2025 10:41 am For Prime numbers use Modulo 30 Space. In a Modulo 30 Space existing maximal 8 Prime Numbers.
So you can use a single byte to store the Prime Numbers in each Modulo 30 Block.

in Modulo 30 only the RestClass {1, 7, 11, 13, 17, 19, 23, 29} are possible primes.
so for each block of 30 Numbers we need 8Bit to store the primes.
I must say that's very clever 👍

PS/Edit: are you sure that for each 30 you can MAX get EIGHT in the RestClass? Can't quickly find on web…
Post Reply