Proper path sanitization? (OS independent, robust)

Posted: Sun Dec 16, 2007 5:42 am
by superadnim
Hi guys... I'm looking for a robust way to check whether a path or filename is valid. So far I've been using the WinAPI function MakeSureDirectoryPathExists(), but it's not Unicode-aware and, again, it's not a cross-platform solution (it also relies on DbgHelp).

The CheckFilename() in PB only works for file names, not for full paths or bare directory paths. What can I use in this case?

Re: Proper path sanitization? (OS independent, robust)

Posted: Sun Dec 16, 2007 5:47 am
by PB
Do you mean something like this?

Code: Select all

f$="c:\program files\internet explorer\iexplore.exe"

Debug FileSize(f$) ; Returns >-1 if the file exists.
Debug FileSize(GetPathPart(f$)) ; Returns -2 if the path exists.
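
The return values used above can be wrapped in a small helper. This is only a sketch: DescribePath() is a hypothetical name, and it relies on FileSize() returning -1 when nothing exists at the path and -2 when the path is a directory. Note that it queries the file system, so it is an existence check rather than pure string validation.

```purebasic
; Hypothetical helper (not part of PB): classifies a path using FileSize().
Procedure.s DescribePath(path$)
  Select FileSize(path$)
    Case -1
      ProcedureReturn "not found"   ; nothing exists at this path
    Case -2
      ProcedureReturn "directory"   ; the path names a directory
    Default
      ProcedureReturn "file"        ; 0 or more: an existing file and its size
  EndSelect
EndProcedure

Debug DescribePath("c:\program files\internet explorer\iexplore.exe")
```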

Posted: Sun Dec 16, 2007 5:48 am
by superadnim
Doesn't FileSize attempt to open a file handle? How does this function work internally? I'd just like string sanitization.

Posted: Sun Dec 16, 2007 6:28 am
by PB
I guess there's a need for a CheckPath() command to complement CheckFilename().

Here's something I just knocked up for you, but it's definitely not 100% foolproof.
For example: Pass "c" to it and it says it's valid. So, something to work on. :)

Code: Select all

Procedure CheckPath(path$)
  ok=1 ; Assume ok for now.
  For p=1 To Len(path$)
    c=Asc(Mid(path$,p,1))
    ; Now check if path contains * ? " < > |
    If c=42 Or c=63 Or c=34 Or c=60 Or c=62 Or c=124
      ok=0 : Break
    EndIf
  Next
  ProcedureReturn ok
EndProcedure

Debug CheckPath("c:\program files\internet explorer\") ; Valid.
Debug CheckPath("c:\program files\*internet explorer*\") ; Invalid.

Posted: Sun Dec 16, 2007 6:59 am
by superadnim
Thanks PB, here's my go after being inspired by your code :)

Code: Select all

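Procedure.l CheckPath( *this.character )
	
	Define.l lwResult, lwCheck
	If *this
		
		; While/Wend instead of Repeat/Until, so an empty string is never read past its terminator.
		While *this\c
			
			; The path must contain at least one ':' or '\' to be accepted.
			If (*this\c = ':' Or *this\c = '\')
				lwCheck = #True
			EndIf
			
			; Reject wildcard, redirection and pipe characters (the double quote is not checked here).
			If (*this\c = '*' Or *this\c = '?' Or *this\c = '<' Or *this\c = '>' Or *this\c = '|')
				lwResult = #False
				Break
			Else
				lwResult = #True
			EndIf
			
			*this + SizeOf(Character)
		Wend
		
		ProcedureReturn  (  lwCheck & lwResult )
	EndIf
	
EndProcedure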

Debug CheckPath(@"c:\program files\internet explorer\") ; Valid.
Debug CheckPath(@"c:\program files\*internet explorer*\") ; Invalid. 
Debug CheckPath(@"c") ; Invalid. 

It's 9+ times faster and the bug is apparently solved, but I need testers!

Notice that it's faster to pass a pointer, and I don't mind doing so; that's why it's written the way it is :P

Posted: Sun Dec 16, 2007 7:26 am
by superadnim
By the way, is there anything I should take care of if the string is Unicode, or do the same rules apply? Also, what if the file system is not NTFS or FAT32? Does the OS even return a set of invalid characters one could check against?

On a side note, getting rid of the ELSE in the second IF and assigning 1 to the result variable when it is defined did not seem to improve performance. Perhaps I couldn't see any decrease in time because the timer I was using lacks resolution...

Anyway:

Code: Select all

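Procedure.c CheckPath( *this.character )
	
	Define.c lwResult, lwCheck
	
	If *this
		
		lwResult   		= #True
		; While/Wend instead of Repeat/Until, so an empty string is never read past its terminator.
		While *this\c
			
			; The path must contain at least one ':' or '\' to be accepted.
			If ( *this\c = ':' Or *this\c = '\' )
				lwCheck 	= #True
			EndIf
			
			; Reject wildcard, redirection and pipe characters (the double quote is not checked here).
			If ( *this\c = '*' Or *this\c = '?' Or *this\c = '<' Or *this\c = '>' Or *this\c = '|' )
				lwResult 	= #False
				Break
			EndIf
			
			*this + SizeOf(Character)
		Wend
		
		ProcedureReturn  ( lwCheck & lwResult )
	EndIf
	
EndProcedure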

It uses the .c type all the way through to avoid conversions; it's about 0.18 times faster now, at least on my machine.

Posted: Sun Dec 16, 2007 11:24 am
by Trond
Valid paths that fail:
"."
".."
"NUL"
"LPT"

Sometimes / is accepted instead of \, sometimes it isn't. "\." is considered valid, but "/." isn't. But "c:/" is considered valid.
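
One way to get consistent results for the cases listed above is to normalise the separators first and then check the path one component at a time. The following is a minimal sketch under Windows-style assumptions, not a complete validator: CheckPathString() is a hypothetical name, a colon is only accepted as the second character of the first component (the drive letter position), and reserved device names such as "NUL" are simply accepted as ordinary names.

```purebasic
; Hypothetical helper: validates a path purely as a string, component by component.
Procedure.i CheckPathString(path$)
  Protected i, j, n, part$, ch$

  If path$ = ""
    ProcedureReturn #False
  EndIf

  path$ = ReplaceString(path$, "/", "\")   ; treat both separators alike
  n = CountString(path$, "\") + 1          ; number of components

  For i = 1 To n
    part$ = StringField(path$, i, "\")
    For j = 1 To Len(part$)
      ch$ = Mid(part$, j, 1)
      ; Characters rejected everywhere: * ? < > | and the double quote.
      If FindString("*?<>|" + Chr(34), ch$, 1)
        ProcedureReturn #False
      EndIf
      ; Assumption: ':' is only allowed right after a drive letter.
      If ch$ = ":" And (i <> 1 Or j <> 2)
        ProcedureReturn #False
      EndIf
    Next
  Next

  ProcedureReturn #True
EndProcedure

Debug CheckPathString("..")                                     ; 1
Debug CheckPathString("c:/")                                    ; 1
Debug CheckPathString("c:\program files\*internet explorer*\")  ; 0
```

Because "/" is converted to "\" up front, "\." and "/." are treated the same way, and relative names such as "." or "NUL" pass because no separator or colon is required.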

Posted: Sun Dec 16, 2007 3:54 pm
by netmaestro
Use PathFileExists_() on Windows, and use a CompilerIf block to substitute a Linux version for cross-platform builds. Much easier than reinventing an old wheel.
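
A rough sketch of that CompilerIf approach could look like this. PathExists() is a hypothetical wrapper name, and using FileSize() as the non-Windows fallback is an assumption rather than a specific Linux API. Like PathFileExists_(), it tests whether something actually exists on disk.

```purebasic
; Hypothetical wrapper: existence check selected at compile time per OS.
Procedure.i PathExists(path$)
  CompilerIf #PB_Compiler_OS = #PB_OS_Windows
    ; Windows API call; returns non-zero if the file or directory exists.
    ProcedureReturn PathFileExists_(@path$)
  CompilerElse
    ; Assumed fallback: FileSize() returns -1 only when nothing exists at the path.
    If FileSize(path$) <> -1
      ProcedureReturn #True
    EndIf
    ProcedureReturn #False
  CompilerEndIf
EndProcedure

Debug PathExists("C:\bla.bin")
```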

Posted: Sun Dec 16, 2007 8:09 pm
by superadnim
Thanks Trond, I'll take a look at the procedure later.

It's a trivial feature, and the fact that PB, as a high-level language, doesn't support it makes me wonder...

netmaestro: that API routine checks whether the file or path actually exists, whereas all I need is string sanitization. So I'm not reinventing anything here.

If you don't believe me, go ahead and try it: given this well-formed path, with no real file at that location, the routine won't return true:

Code: Select all

Debug PathFileExists_(@"C:\bla.bin")
Just that :)

I might have to choose between absolute and relative paths, though.

Posted: Sun Dec 16, 2007 8:21 pm
by superadnim
OK, I split it up with a flag to check for either an absolute or a relative path, and that works. But now I'm wondering about \\\ and escaped slashes... Where can I find a document describing what's valid and what's not for path sanitization on most platforms and file systems?

In relative paths, things like "...\" are invalid, right? If so, my check should go a little further, at least until the first slash is encountered in this case.
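
For the absolute/relative flag, the distinction itself can be decided from the first few characters of the string. This is only a sketch under stated assumptions: IsAbsolutePath() is a hypothetical name, a leading "\" or "/" (which also covers UNC-style "\\" prefixes) is treated as absolute, and a drive letter only counts when it is followed by ":" and a separator. It does not answer whether a component such as "..." is valid.

```purebasic
; Hypothetical helper: decides absolute vs. relative from the string alone.
Procedure.i IsAbsolutePath(path$)
  Protected first$, third$, c

  first$ = Left(path$, 1)

  ; Leading separator: rooted path (also matches "\\server\share" prefixes).
  If first$ = "\" Or first$ = "/"
    ProcedureReturn #True
  EndIf

  ; Drive letter form: letter, colon, then a separator (for example "c:\").
  If Mid(path$, 2, 1) = ":"
    c = Asc(UCase(first$))
    third$ = Mid(path$, 3, 1)
    If c >= 'A' And c <= 'Z' And (third$ = "\" Or third$ = "/")
      ProcedureReturn #True
    EndIf
  EndIf

  ProcedureReturn #False
EndProcedure

Debug IsAbsolutePath("c:\program files\")   ; 1
Debug IsAbsolutePath("..\data")             ; 0
```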