RC4 lib
Re: RC4 lib
well, the libs were useful for a couple days anyway, lol
im just super happy it was the first time i compiled a lib for all 3 OS! [✓ bucketlist]. Im guessing the download links will be dead in 30 days anyway, lol
With the tiny size of RC4 yet genuine cryptographic quality (DropN'd of course!) and ease of being a stream cipher i really like how it can be used in place of that Xor/Rot13 family (hey it's only a couple bytes more!) for those times when Top Secret AES block cipher encryption is overkill yet still want something better than 8bit/32bit key "weak scramble cipher". A great middleground i think! brilliant work again wilbert!
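For anyone wondering what "DropN'd" means in practice: RC4-drop[N] simply throws away the first N keystream bytes after key setup, since the earliest output is the part known to be biased. Below is a minimal sketch of that idea in plain C. All names here (`rc4_state`, `rc4_init`, `rc4_next`, `rc4_drop`, `rc4_xor`) are made up for illustration and are not the API of the lib from this thread, and the value of N is left to the caller (multiples of 256 such as 768 are commonly suggested).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the API of the lib in this thread. */
typedef struct {
    uint8_t s[256];   /* the permutation table */
    uint8_t i, j;     /* the two running indices */
} rc4_state;

/* Key setup (KSA): start from the identity table and shuffle it by the key. */
void rc4_init(rc4_state *st, const uint8_t *key, size_t keylen)
{
    assert(keylen > 0);               /* an empty key would divide by zero below */
    for (int n = 0; n < 256; n++)
        st->s[n] = (uint8_t)n;
    uint8_t j = 0;
    for (int n = 0; n < 256; n++) {
        j = (uint8_t)(j + st->s[n] + key[n % keylen]);
        uint8_t t = st->s[n]; st->s[n] = st->s[j]; st->s[j] = t;
    }
    st->i = 0;
    st->j = 0;
}

/* Produce one keystream byte (PRGA step). */
uint8_t rc4_next(rc4_state *st)
{
    st->i = (uint8_t)(st->i + 1);
    st->j = (uint8_t)(st->j + st->s[st->i]);
    uint8_t t = st->s[st->i]; st->s[st->i] = st->s[st->j]; st->s[st->j] = t;
    return st->s[(uint8_t)(st->s[st->i] + st->s[st->j])];
}

/* The "drop N" part: generate n keystream bytes and discard them. */
void rc4_drop(rc4_state *st, size_t n)
{
    while (n--)
        (void)rc4_next(st);
}

/* XOR a buffer with the keystream; the same call encrypts and decrypts. */
void rc4_xor(rc4_state *st, const uint8_t *in, uint8_t *out, size_t len)
{
    for (size_t n = 0; n < len; n++)
        out[n] = in[n] ^ rc4_next(st);
}
```

Typical use would be `rc4_init(&st, key, keylen); rc4_drop(&st, 768); rc4_xor(&st, in, out, len);`, with both sides agreeing on the same N. Without the drop call this is plain RC4.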
Re: RC4 lib
Keya wrote: im just super happy it was the first time i compiled a lib for all 3 OS! [✓ bucketlist].
That is a great achievement.
There's lots of C code available and compiled with optimization on, it's fast and easy to use.
It has also been helpful to see the difference for the different optimization settings; good to know not to compile with O0
Keya wrote: With the tiny size of RC4 yet genuine cryptographic quality (DropN'd of course!) and ease of being a stream cipher i really like how it can be used in place of that Xor/Rot13 family (hey it's only a couple bytes more!)
It is a nice algorithm and indeed tiny!
Windows (x64)
Raspberry Pi OS (Arm64)
Re: RC4 lib
wilbert wrote: There's lots of C code available and compiled with optimization on, it's fast and easy to use.
yeah! talk about doors -> OPEN!!! im discovering there are so many amazing libs out there, and Fred was awesome enough to add Import functionality to PB, i think he's empowered us with such a great capability there but I get the feeling most PB'ers don't take advantage of it unless there's precompiled libs simply because C compilers are such a {you know that long evergrowing list of expletives! yeah that one!}, and I can only speak for myself here but the C language itself I find a bit intimidating and weird ... basic syntax just makes sense to me and I'm so happy Purebasic is really all I need.
But the way I see it is this: we don't really have to learn to write any C anyway, we just have to (to use libs) learn how to compile an existing .c to .o, so surely the C programmer has already done 99% of the hard work for us?!? so I refuse to be scared by C, i don't feel I have to learn it, so yes I'm up for that final 1%, and I will conquer my fear of C compilers!!!
One Lib At A Time™ ...
wilbert wrote: It has also been helpful to see the difference for the different optimization settings; good to know not to compile with O0
Yeah, also good to know GCC's highest level of optimization has got nothing on your asm game hey wilbert!?!? lol
btw I was just reading https://gcc.gnu.org/onlinedocs/gcc/Opti ... tions.html
and I just learned of the -Os optimize for SMALL SIZE option, which is basically a skinnier -O2: it turns on the -O2 optimizations except the ones that often increase code size.
Anyway that took its RC4 .o lib down from 920 to 875 bytes, and the resulting Rc4Xor is 95 bytes:
004020DA 55 push ebp
004020DB 89E5 mov ebp, esp
004020DD 57 push edi
004020DE 56 push esi
004020DF 53 push ebx
004020E0 52 push edx
004020E1 31F6 xor esi, esi
004020E3 8B45 08 mov eax, dword ptr [ebp+8]
004020E6 3B75 14 cmp esi, dword ptr [ebp+14]
004020E9 74 48 je short 00402133
004020EB 8B18 mov ebx, dword ptr [eax]
004020ED 8D53 01 lea edx, dword ptr [ebx+1]
004020F0 0FB6D2 movzx edx, dl
004020F3 0FB67C10 08 movzx edi, byte ptr [eax+edx+8]
004020F8 8910 mov dword ptr [eax], edx
004020FA 89FB mov ebx, edi
004020FC 897D F0 mov dword ptr [ebp-10], edi
004020FF 0378 04 add edi, dword ptr [eax+4]
00402102 89F9 mov ecx, edi
00402104 0FB6F9 movzx edi, cl
00402107 8A4C38 08 mov cl, byte ptr [eax+edi+8]
0040210B 8978 04 mov dword ptr [eax+4], edi
0040210E 884C10 08 mov byte ptr [eax+edx+8], cl
00402112 885C38 08 mov byte ptr [eax+edi+8], bl
00402116 8A4D F0 mov cl, byte ptr [ebp-10]
00402119 024C10 08 add cl, byte ptr [eax+edx+8]
0040211D 8B5D 0C mov ebx, dword ptr [ebp+C]
00402120 0FB6C9 movzx ecx, cl
00402123 8A1433 mov dl, byte ptr [ebx+esi]
00402126 8B5D 10 mov ebx, dword ptr [ebp+10]
00402129 325408 08 xor dl, byte ptr [eax+ecx+8]
0040212D 881433 mov byte ptr [ebx+esi], dl
00402130 46 inc esi
00402131 ^ EB B3 jmp short 004020E6
00402133 58 pop eax
00402134 5B pop ebx
00402135 5E pop esi
00402136 5F pop edi
00402137 5D pop ebp
00402138 C3 retn
Re: RC4 lib
Keya wrote: ... and it just OUTPERFORMED all the other gcc -O's! Check out the updated timings on the previous page!
That's great!
Do you also have the size and asm code for the 64 bit version (-Os) ?
Would be interesting to compare it with the 32 bit version.
Re: RC4 lib
so x64 dropped the .o file from 1220 to 1195 bytes with gcc -Os, and Rc4Xor() is 85 bytes. Ooh the joy of more registers...
The disassembler is x64dbg. I had to do a check to make sure "dil" was an x64 register, lol ... don't think i've seen it used before, but maybe people like me should be using dil instead of rax
00000001400020E8 | 57 | push rdi
00000001400020E9 | 56 | push rsi
00000001400020EA | 53 | push rbx
00000001400020EB | 31 DB | xor ebx,ebx
00000001400020ED | 41 39 D9 | cmp r9d,ebx
00000001400020F0 | 76 47 | jbe rc4.140002139
00000001400020F2 | 8B 01 | mov eax,dword ptr ds:[rcx]
00000001400020F4 | 44 8B 51 04 | mov r10d,dword ptr ds:[rcx+4]
00000001400020F8 | FF C0 | inc eax
00000001400020FA | 0F B6 C0 | movzx eax,al
00000001400020FD | 89 01 | mov dword ptr ds:[rcx],eax
00000001400020FF | 44 0F B6 5C 01 08 | movzx r11d,byte ptr ds:[rcx+rax+8]
0000000140002105 | 45 01 DA | add r10d,r11d
0000000140002108 | 45 0F B6 D2 | movzx r10d,r10b
000000014000210C | 44 89 51 04 | mov dword ptr ds:[rcx+4],r10d
0000000140002110 | 42 8A 7C 11 08 | mov dil,byte ptr ds:[rcx+r10+8]
0000000140002115 | 40 88 7C 01 08 | mov byte ptr ds:[rcx+rax+8],dil
000000014000211A | 46 88 5C 11 08 | mov byte ptr ds:[rcx+r10+8],r11b
000000014000211F | 44 02 5C 01 08 | add r11b,byte ptr ds:[rcx+rax+8]
0000000140002124 | 45 0F B6 DB | movzx r11d,r11b
0000000140002128 | 42 8A 44 19 08 | mov al,byte ptr ds:[rcx+r11+8]
000000014000212D | 32 04 1A | xor al,byte ptr ds:[rdx+rbx]
0000000140002130 | 41 88 04 18 | mov byte ptr ds:[r8+rbx],al
0000000140002134 | 48 FF C3 | inc rbx
0000000140002137 | EB B4 | jmp rc4.1400020ED
0000000140002139 | 5B | pop rbx
000000014000213A | 5E | pop rsi
000000014000213B | 5F | pop rdi
000000014000213C | C3 | ret
Re: RC4 lib
Keya wrote: so x64 dropped the .o file from 1220 to 1195 bytes with gcc -Os, and Rc4Xor() is 85 bytes. The disassembler is x64dbg. I had to do a check to make sure "dil" was an x64 register, lol ... don't think i've seen it used before, but maybe people like me should be using dil instead of rax
Thank you very much Keya.
Nice to see the 64 bit code is even smaller compared to the 32 bit code and makes optimal use of the additional registers.
It probably also explains why the 64 bit code outperforms the 32 bit code.
I also had never heard of 'dil'. I understand now it's a name for the lowest 8 bits from the rdi register.
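Reading the two listings side by side, both appear to implement the same per-byte loop. Here is a rough C reconstruction of it, purely as a reading aid. The struct layout (two 32-bit indices followed by the 256-byte table at offset 8), the argument order, and the 32-bit length are inferred from the disassembly; the type name `Rc4Ctx`, the field names, and the file name in the build comment are assumptions, and this is not the lib's actual source.

```c
#include <stdint.h>

/* Build comment (file name is hypothetical): gcc -Os -c rc4.c -o rc4.o */

/* Layout inferred from the listings; names are assumptions. */
typedef struct {
    uint32_t i;        /* [ctx+0]  first index, kept in the range 0..255 */
    uint32_t j;        /* [ctx+4]  second index, kept in the range 0..255 */
    uint8_t  s[256];   /* [ctx+8]  permutation table */
} Rc4Ctx;

/* XOR len bytes of in[] with the RC4 keystream and write them to out[]. */
void Rc4Xor(Rc4Ctx *ctx, const uint8_t *in, uint8_t *out, uint32_t len)
{
    for (uint32_t n = 0; n < len; n++) {
        uint32_t i  = (ctx->i + 1) & 255;      /* inc + movzx in the listings */
        uint8_t  si = ctx->s[i];               /* old S[i], kept for the swap */
        uint32_t j  = (ctx->j + si) & 255;

        ctx->i = i;
        ctx->j = j;

        ctx->s[i] = ctx->s[j];                 /* swap S[i] and S[j] */
        ctx->s[j] = si;

        /* keystream byte = S[(S[i] + S[j]) mod 256], taken after the swap */
        out[n] = in[n] ^ ctx->s[(uint8_t)(si + ctx->s[i])];
    }
}
```

The context has to be filled in by a key setup routine first, which is not part of either listing. The 32-bit build has to juggle these values through a stack slot (`[ebp-10]`), while the 64-bit build keeps them all in registers, which fits the size difference noted above.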
Re: RC4 lib
wilbert wrote: I also had never heard of 'dil'. I understand now it's a name for the lowest 8 bits from the rdi register.
ahhh, that makes sense. So i take it there's also a DIH? please tell me there's no 16bit DIX
[update] well there's no DIH for those upper 8bits (only DIL for lowest), makes sense i guess, but just "DI" for the 16bit'er...
Re: RC4 lib
wilbert wrote: I also had never heard of 'dil'. I understand now it's a name for the lowest 8 bits from the rdi register.
Keya wrote: ahhh, that makes sense. So i take it there's also a DIH? please tell me there's no 16bit DIX [update] well there's no DIH for those upper 8bits (only DIL for lowest), makes sense i guess, but just "DI" for the 16bit'er...
Yes, I found this page which explains it:
https://www.tortall.net/projects/yasm/m ... sters.html
Re: RC4 lib
cl, dl, ah, bh, ch, dh, al, bl, cl, dl, dil, sil, bpl, spl, r8l, r9l, r10l, r11l, r12l, r13l, r14l, r15l, al, bl, cl, dl, sil, dil, bpl, spl, r8b, r9b, r10b, r11b, r12b, r13b, r14b, r15b, ax, dx, di, si, bp, sp, ax, bx, cx, dx, di, si, bp, sp, r8w, r9w, r10w, r11w, r12w, r13w, r14w, r15w, eax, ebx, ecx, edx, edi, esi, ebp, esp, r8d, r9d, r10d, r11d, r12d, r13d, r14d, r15d, rax, rbx, rcx, rdx, rdi, rsi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, xmm8, xmm9, xmm10, xmm11, xmm12, xmm13, xmm14, xmm15, cs, ds, ss, es, fs, gs, eip, rip, mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7, st0, st1, st2, st3, st4, st5, st6, st7, fpr0, fpr1, fpr2, fpr3, fpr4, fpr5, fpr6, fpr7, xmm0, xmm1, xmm2, xmm3, xmm4, xmm5, xmm6, xmm7, mxcsr, ymm0, ymm1, ymm2, ymm3, ymm4, ymm5, ymm6, ymm7, ymm8, ymm9, ymm10, ymm11, ymm12, ymm13, ymm14, ymm15, cr0, cr2, cr3, cr4, gdtr, ldtr, idtr, dr0, dr1, dr2, dr3, dr6, dr7, cr8, tpr, tr, gdt, ldt, idt, oh and tr, ok cool and I think i've memorized them all ....... hmm nope.
I KNOW - I'LL TURN THEM INTO A POEM.

