Find IPs in a byte sequence

Posted: Sun Oct 12, 2008 11:31 pm
by Psychophanta
In reply to
http://www.purebasic.fr/english/viewtopic.php?t=34709

; Find IPs in a string or any byte sequence ending with 'endchar' character (usually #Null).
; Psychophanta 20081013
Procedure.i FindIP(*seq.character,a.l(1),pos.i=0,endchar.c=0)
  ;USAGE:
  ;the IPs are delivered to a one-dimensional array of 'long' type variables: a.l()
  ;the function returns the number of matches found.
  ;Use at your own risk, and test for yourself whether it is a bullet-proof function. :p
  Protected *p.character=*seq+pos,d.c=1,n.i=0,val.w=0,field.c=0
  Repeat
    Select *p\c
    Case '0' To '9'
      val*d+*p\c-$30
      d=10
      If val&$FF00
        Repeat:*p+1:Until *p\c<'0' Or *p\c>'9'
        If *p\c=endchar
          If n:ReDim a(n-1):EndIf
          Break
        EndIf
        a(n)=0:d=1:field=0:val=0
      EndIf
    Case '.'
      If field=24 Or d<>10 Or val&$FF00
        a(n)=0:d=1:field=0:val=0:*p+1:Continue
      EndIf
      a(n)|val<<field
      field+8:d=1:val=0
      If field>24
        field=0
      EndIf
    Default
      If field=24 And d=10
        a(n)|val<<field
        n+1:ReDim a(n)
      Else
        a(n)=0
      EndIf
      If *p\c=endchar
        If n:ReDim a(n-1):EndIf
        Break
      EndIf
      d=1:field=0:val=0
    EndSelect
    *p+1
  ForEver
  ProcedureReturn n
EndProcedure

;TEST:
Dim ips.l(0)
n.i=FindIP(@"I made it intentionally like this: Any number >255 is taken as garbage (just like letters), so: 355.04.3.4.234 should be 4.3.4.234, but something like 24.23.243.67.44.44.44, is nothing... 5 3 52.003.025.036 .48..5.7.6.8 wert  ",ips())
For t=0 To n-1
  Debug IPString(ips(t))
Next
;
ReDim ips.l(0)
n.i=FindIP(@"Some 15634 foxes jump over 169.45.0.2 lazy dogs on wrong address 169.45.0. Another proper one is 3.145.5.72",ips())
For t=0 To n-1
  Debug IPString(ips(t))
Next
;
a$ = "the quick brown fox jumps over 15634 lazy dogs on 2.169.45.0.2.2 without any regrets!" 
b$ = "the quick brown fox jumps over 15634 lazy dogs on 169.45.0. without any regrets! 169.45.0.1" 
c$ = "the quick brown fox jumps over 15634 lazy dogs on 169..0.4 without any regrets! 169.45.0.1" 
a$+b$+c$
ReDim ips.l(0)
n.i=FindIP(@a$,ips())
For t=0 To n-1
  Debug IPString(ips(t))
Next
;
Dim ips.l(0)
n.i=FindIP(@"35.0455.3.4 should be nothing, while ... 00000005.000093.52.003 is a IP and 192.168.1.1.36 is not.",ips())
For t=0 To n-1
  Debug IPString(ips(t))
Next
EDIT 200810132026: fixed a risky bug that occurred when a string contains a number >255 immediately followed by the null char.
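
A note on the storage format: the line a(n)|val<<field shifts successive octets by 0, 8, 16 and then 24 bits, so the first octet of an address ends up in the lowest byte of the long. The test code passes those longs straight to IPString(), which suggests this is the layout the PureBasic network library expects, at least on little-endian x86. A minimal sketch to check that assumption, using the address 169.45.0.2 from the tests and the built-in MakeIPAddress() for comparison (the variable name ip is only for illustration):

```purebasic
; Pack 169.45.0.2 by hand, first octet in the lowest byte, as FindIP() does.
; Assumption: little-endian x86; not verified on other platforms.
ip.l = 0
ip = ip | (169 << 0)   ; first octet  -> bits 0..7
ip = ip | (45 << 8)    ; second octet -> bits 8..15
ip = ip | (0 << 16)    ; third octet  -> bits 16..23
ip = ip | (2 << 24)    ; fourth octet -> bits 24..31

Debug IPString(ip)                            ; hand-packed value
Debug IPString(MakeIPAddress(169, 45, 0, 2))  ; built-in packing for comparison
```

If the assumption holds, both Debug lines show the same address.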

Posted: Mon Oct 13, 2008 2:23 am
by Little John
Note: The thread http://www.purebasic.fr/english/viewtopic.php?t=34709 itself also contains code which solves the problem.

Posted: Mon Oct 13, 2008 11:25 am
by Kaeru Gaman
XXX

Posted: Mon Oct 13, 2008 3:16 pm
by Psychophanta
About yours:
For 169..0.3 it returns 169.0.0.3 which is wrong.
For 169.345.0.2.6 it returns 45.0.2.6, which is wrong.
For 169.3 345.0.2.6 it returns 3.0.2.6, which is wrong.
...

Posted: Mon Oct 13, 2008 6:02 pm
by Kaeru Gaman
XXX

Posted: Mon Oct 13, 2008 6:40 pm
by Little John
Kaeru Gaman wrote: completely silly input will not be parsed correctly, why should it?
Why should it not be parsed correctly?

None of 169..0.3, 169.345.0.2.6 or 169.3 345.0.2.6 is a valid IP address.
So IMHO it's a matter of course that a smart parsing routine should not treat any of them as one.

Regards, Little John
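
One reading of "valid" that rejects all three examples is: exactly four decimal fields separated by dots, none empty and none above 255. A minimal sketch of that reading as a whole-string check follows. IsDottedQuad() is a hypothetical helper, not code from this thread; it tests a single candidate string rather than scanning a byte sequence as FindIP() does, and like FindIP() it tolerates leading zeros:

```purebasic
; Hypothetical helper: returns #True only when text$ consists of exactly
; four dot-separated decimal fields, each non-empty and in the range 0..255.
Procedure.i IsDottedQuad(text$)
  Protected i.i, j.i, ch.i, num.i, part$
  If CountString(text$, ".") <> 3
    ProcedureReturn #False              ; too few or too many fields
  EndIf
  For i = 1 To 4
    part$ = StringField(text$, i, ".")
    If Len(part$) = 0
      ProcedureReturn #False            ; empty field, as in 169..0.3
    EndIf
    num = 0
    For j = 1 To Len(part$)
      ch = Asc(Mid(part$, j, 1))
      If ch < '0' Or ch > '9'
        ProcedureReturn #False          ; anything other than a decimal digit
      EndIf
      num = num * 10 + ch - '0'
      If num > 255
        ProcedureReturn #False          ; field out of range, as in 345
      EndIf
    Next
  Next
  ProcedureReturn #True
EndProcedure

Debug IsDottedQuad("169.45.0.2")       ; 1
Debug IsDottedQuad("169..0.3")         ; 0
Debug IsDottedQuad("169.345.0.2.6")    ; 0
Debug IsDottedQuad("169.3 345.0.2.6")  ; 0
```

Checking the running value inside the digit loop, instead of calling Val() on the whole field, keeps very long digit runs from overflowing before the range test.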

Posted: Mon Oct 13, 2008 6:56 pm
by Kaeru Gaman
whooo okay!

Posted: Mon Dec 22, 2008 3:41 pm
by SFSxOI
Psychophanta wrote: About yours:
For 169..0.3 it returns 169.0.0.3 which is wrong.
For 169.345.0.2.6 it returns 45.0.2.6, which is wrong.
For 169.3 345.0.2.6 it returns 3.0.2.6, which is wrong.
...

Well, of course it's going to be wrong if you feed it incorrect information to begin with, is it not?

If it's expecting an IP address format of, for example, XXX.XXX.XXX.XXX and you feed it xxx.XXX.XXX.XXX.XXX (4 extra characters) or XXX..X.X (fewer characters, or extra unneeded characters, than required), then it will give incorrect info, won't it?

I find that if I feed it the correct number of characters in the correct format, it returns the correct information. We are dealing with IP addresses here, are we not? With values up to 255.255.255.255, right? And we are not dealing with invalid IP address formats of the type you posted? Or are you saying that it should somehow determine what you meant instead of what you asked for?

Posted: Tue Dec 23, 2008 5:10 am
by Rescator
The normal (and not so normal) notation of IPv4 can be seen here: http://en.wikipedia.org/wiki/IPv4
And for IPv6 here: http://en.wikipedia.org/wiki/IPv6

It does not seem that .. is accepted for IPv4; it has to be .0.
But when it comes to IPv6 things are a bit more complicated.

PS! A proper routine should handle both IPv4 according to the standard,
as well as IPv6 which "is" becoming more common, albeit slowly. :wink: