Posted: Thu Aug 01, 2002 4:29 am
by BackupUser
Restored from previous forum. Originally posted by Art Sentinel.

Hi, has anyone had any experience with altering multi-line text within an external file using PureBasic? I am very eager to hear your excellent suggestions as to the fastest way to parse and manipulate such a file.

Here is what I am doing:

1. I have an external file named Folder.whtml. This file resides in the same directory as my executable. Within this file is the location of a data base folder I want any processed information stored in. (In this case, it is set to D:\Data Base\.)

2. Now, my exe accesses this .whtml file to find out which directory to work in.

3. Next, it reads from a text file containing links I have added.

4. It converts this text file into an HTML equivalent.

5. It saves this new HTML version in the directory I specified using the .whtml configuration file.

That part was simple! (It works as I wanted.) But I have noticed a 3 second delay when processing ~400 KB of freeware links on my 388 MHz computer. This leads me to think that there must be a more optimized way to achieve the same results, and thus decrease the time required to parse and manipulate the 10,000+ lines of data. I know many of you are highly creative and skilled programmers. I would greatly appreciate any friendly advice you could share with me, or any clever brainstorming ideas you think up.

What would be the fastest way to do this?

I have considered placing the HTML formatting text within another text file (and not using WriteStringN statements). My application would then loop through each line and copy it to the final HTML file in that manner. While this would decrease the application size and make future customization much easier, I do not see it as a solution to my speed puzzle. (I will, however, take that external HTML code approach in my completed project, since it is a far wiser way to go. This code is merely for testing purposes right now, so please forgive the messiness.)

Code: Select all

;Test to parse a text file and then write a HTML file using the text obtained
;Art Sentinel
;July 29, 2002
;
;ILYLCBD
;



;Declare Variables

WriteWhat$ = "text files\New Folder\test\"
DatabaseFolder$ = "Folder.whtml"
Contents$ = ""
CreatedFile$ = ""
ReadText$ = "links.txt"
WriteFile$ = "links.sat"
Header$ = "][ - Freeware Links - ]["
BodyStart$ = ""
BodyEnd$ = ""


;Test for Database folder

If ReadFile(0, DatabaseFolder$)

  DatabaseFolder$ = ReadString()  
  
  CloseFile(0)
  
EndIf


;Test for command line parameters
;Empty for now...


;Process parameters
;Empty for now...


;Write HTML file
CreatedFile$ = DatabaseFolder$ + WriteWhat$ + WriteFile$

If OpenFile(0, CreatedFile$)
    
  WriteStringN("")
  WriteStringN("")
  WriteStringN(Header$)
  WriteStringN(BodyStart$)
  WriteStringN("")
  WriteStringN("  ")
  WriteStringN("     ")
  WriteStringN("       ][   - ")
  WriteStringN("          Saturday List -   ][")
  WriteStringN("    ")
  WriteStringN("  ")
  WriteStringN("  ")
  WriteStringN("     ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("      ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("    ")
  WriteStringN("     ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("    ")
  WriteStringN("     ")
  WriteStringN("       ")
  WriteStringN("      ")
  WriteStringN("       ")
  WriteStringN("           ")
  WriteStringN("            ")
  WriteStringN("            ")
  WriteStringN("            ")
  WriteStringN("          ")
  WriteStringN("           ")
  WriteStringN("            ")
  WriteStringN("            ")
  WriteStringN("                   ")
   
    If ReadFile(1, ReadText$)
        
      Repeat
         
        UseFile(1)
        Contents$ = ReadString()
        
          If Contents$ = ""
          
            Contents$ = ""
            
          Else
          
            Contents$ = Contents$ + ""
            
          EndIf
          
        UseFile(0)
        WriteString(Contents$)
      
      Until Eof(1) <> 0
      
      CloseFile(1)
      
   EndIf
   
  WriteStringN("")
  WriteStringN("            ")
  WriteStringN("          ")
  WriteStringN("           ")
  WriteStringN("            ")
  WriteStringN("            ")
  WriteStringN("            ")
  WriteStringN("          ")
  WriteStringN("        ")
  WriteStringN("      ")
  WriteStringN("       ")
  WriteStringN("    ")
  WriteStringN("     ")
  WriteStringN("       ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("       ")
  WriteStringN("    ")
  WriteStringN("     ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("    ")
  WriteStringN("     ")
  WriteStringN("       ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("       ")
  WriteStringN("    ")
  WriteStringN("     ")
  WriteStringN("       ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("       ")
  WriteStringN("    ")
  WriteStringN("     ")
  WriteStringN("       ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("      ")
  WriteStringN("       ")
  WriteStringN("    ")
  WriteStringN("     ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("      ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("       ")
  WriteStringN("    ")
  WriteStringN("  ")
  WriteStringN("   ")
  WriteStringN("   ")
  WriteStringN("  ILYLCBD")
  WriteStringN("")
   
  WriteStringN(BodyEnd$)
  WriteStringN("")
   
  
   
  CloseFile(0)
   
EndIf  

;Finish up
End

Thank you for your help! Take care.


- Art Sentinel
http://www.artsentinel.net


--------------

Top Ten Reasons Not To Procrastinate:


Coming Soon...

Edited by - Art Sentinel on 01 August 2002 05:33:52

Posted: Thu Aug 01, 2002 6:05 am
by BackupUser
Restored from previous forum. Originally posted by horst.

ReadString() is rather slow.
I would suggest reading the complete file into memory,
and then write lines from memory, appending the "".
There is an IncludeFile (ASM coded) that handles this:

see: http://home.mnet-online.de/horst.muc/pb/

Then the operation would look like this:

Code: Select all

If LoadFileToMem(1,ReadText$)
  While MoreInMem()
    WriteString(ReadLineFromMem()+"") 
  Wend 
  CloseFileMem()
Endif 


Horst
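
For readers without the external include, the read-everything-first idea can be sketched roughly as follows. This is only a sketch under stated assumptions: it uses the file syntax of current PureBasic releases (explicit file numbers, no UseFile), not the 2002 syntax shown in this thread; the output name "links.html" is hypothetical; and "<br>" stands in for the tag that the forum archive stripped from the posts. It also ends every output line with CR+LF, which the original loop did not do.

```purebasic
; Sketch only: current PureBasic syntax, not the 2002 syntax used in this thread.
EnableExplicit

Define Suffix$ = "<br>"   ; assumed tag, the original one is missing from the archive
Define *Buffer, Size.i, Index.i, LineStart.i, LineEnd.i

If ReadFile(1, "links.txt")
  Size = Lof(1)
  If Size > 0
    *Buffer = AllocateMemory(Size)
    If *Buffer
      ReadData(1, *Buffer, Size)                 ; one read for the whole file
      If CreateFile(0, "links.html")             ; hypothetical output name
        While Index < Size
          If PeekA(*Buffer + Index) = 10         ; LF marks the end of a line
            LineEnd = Index
            If LineEnd > LineStart
              If PeekA(*Buffer + LineEnd - 1) = 13   ; drop the CR of a CR+LF pair
                LineEnd = LineEnd - 1
              EndIf
            EndIf
            If LineEnd > LineStart               ; copy the line bytes unchanged
              WriteData(0, *Buffer + LineStart, LineEnd - LineStart)
            EndIf
            WriteString(0, Suffix$ + #CRLF$, #PB_Ascii)
            LineStart = Index + 1
          EndIf
          Index = Index + 1
        Wend
        If LineStart < Size                      ; last line without a line break
          WriteData(0, *Buffer + LineStart, Size - LineStart)
          WriteString(0, Suffix$ + #CRLF$, #PB_Ascii)
        EndIf
        CloseFile(0)
      EndIf
      FreeMemory(*Buffer)
    EndIf
  EndIf
  CloseFile(1)
EndIf
```

The point of the design is that the source file is read with a single ReadData() call and the line bytes are copied straight from the buffer, so no string is built per line. Whether that is actually faster on a given machine would need to be measured.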

Posted: Thu Aug 01, 2002 8:33 am
by BackupUser
Restored from previous forum. Originally posted by PB.

Code: Select all

UseFile(1)
Contents$ = ReadString()
;
If Contents$ = ""
  Contents$ = ""
Else
  Contents$ = Contents$ + ""
EndIf
;
UseFile(0)
WriteString(Contents$)

I don't know how much speed difference it will make, but the above can be shortened
to the following, which removes the decision-making part:

Code: Select all

UseFile(1) : Contents$ = ReadString()
UseFile(0) : WriteString(Contents$ + "")

You don't need to test for an empty string, because both branches of your original
code add "" to Contents$, regardless of whether it's empty or has data...

PB - Registered PureBasic Coder

Edited by - PB on 01 August 2002 09:35:35

Posted: Thu Aug 01, 2002 11:21 am
by BackupUser
Restored from previous forum. Originally posted by Pupil.

Hi, I've got a suggestion for a replacement for all your WriteStringN() statements. What you can do is create two files from them and include those in your exe. Something like this perhaps:

Code: Select all

;Write HTML file
CreatedFile$ = DatabaseFolder$ + WriteWhat$ + WriteFile$

If OpenFile(0, CreatedFile$)
  WriteData(?StartFirstSection, ?EndFirstSection-?StartFirstSection)
   If ReadFile(1, ReadText$)
    Repeat
      UseFile(1)
      Contents$ = ReadString()
      UseFile(0)
      WriteString(Contents$ + "")
    Until Eof(1) <> 0
    CloseFile(1)
  EndIf
  WriteData(?StartSecondSection, ?EndSecondSection-?StartSecondSection)
  CloseFile(0)
endif

; all the other stuff
 ...
End

StartFirstSection:
  IncludeBinary "FirstHalfOfHTMLFile.html"
EndFirstSection:

StartSecondSection:
  IncludeBinary "SecondHalfOfHTMLFile.html"
EndSecondSection:



Note! I haven't tested the code, so there might be some spelling mistakes in it; check that the variable names match...

Edited by - Pupil on 01 August 2002 12:25:35

Posted: Thu Aug 01, 2002 11:31 am
by BackupUser
Restored from previous forum. Originally posted by PB.

> create two files of them and include in you exe.

That's assuming the file to be parsed ("Folder.whtml") doesn't change...
It's a bit hard to tell from the original post if this is the case, but
I understood the post as needing to parse "Folder.whtml" on the fly, for
example after downloading it from the net somewhere.


PB - Registered PureBasic Coder

Posted: Thu Aug 01, 2002 12:43 pm
by BackupUser
Restored from previous forum. Originally posted by Pupil.
> > create two files of them and include in you exe.
>
> That's assuming the file to be parsed ("Folder.whtml") doesn't change...
> It's a bit hard to tell from the original post if this is the case, but
> I understood the post as needing to parse "Folder.whtml" on the fly, for
> example after downloading it from the net somewhere.

I think you misunderstood me. What I meant was that the data contained in all the WriteStringN() commands should be saved to two separate files and then be included.
I.e. the line starting with:

Code: Select all

WriteStringN("")
 ...
; to the ending line
WriteStringN("                   ")

should be saved to a single file


and the line starting with:

Code: Select all

WriteStringN("")
 ...
; to the ending line
WriteStringN("")

should be saved as another file. These are the two files I'm referring to in my post. Hope that clears things up. This saves at least 50 lines of WriteStringN() for every HTML file that is created.


Update: OK, I see now, checking a little closer, that the lines I'm referring to above hold some dynamic data, i.e.:
WriteStringN(Header$)
WriteStringN(BodyStart$)

But that doesn't change much, as they are only at the beginning and could still be written with WriteStringN() and thus be omitted from the include file. The code would then look something like this:

Code: Select all

 WriteStringN("")
 WriteStringN("")
 WriteStringN(Header$)
 WriteStringN(BodyStart$)
 WriteData(?StartFirstSection, ?EndFirstSection-?StartFirstSection)

And you can do something similar for the second section.

Edited by - Pupil on 01 August 2002 13:57:24

Posted: Fri Aug 02, 2002 4:20 am
by BackupUser
Restored from previous forum. Originally posted by ricardo.


Maybe I'm missing something, but I think that ReplaceString is an easier way...

You could read the whole document to be parsed into one string or into memory, and then add the tag to the end of every line.

Code: Select all

NewText$ = ReplaceString(Text$,chr(13) + chr(10), "" + chr(13) + chr(10))

This simple step will add the tag to ALL of your lines without much complication.
ReplaceString was written in ASM by Fred, which tells me it is optimized ASM, so it should be faster than ANY unoptimized code that we could write.
And it's simple...
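
Applied to a whole file rather than a test string, the ReplaceString idea might look like the sketch below. Assumptions, stated plainly: it uses current PureBasic syntax (file numbers passed to every call, and the #PB_File_IgnoreEOL flag to read the whole file into one string), which differs from the 2002 syntax elsewhere in this thread; "links.html" is a hypothetical output name; and "<br>" is a stand-in for the tag that the forum archive stripped.

```purebasic
; Sketch only: current PureBasic syntax, not the 2002 syntax used in this thread.
EnableExplicit

Define Suffix$ = "<br>"   ; assumed tag, the original one is missing from the archive
Define Text$, Result$

If ReadFile(1, "links.txt")
  ; Read the complete file, line breaks included, into a single string
  Text$ = ReadString(1, #PB_Ascii | #PB_File_IgnoreEOL)
  CloseFile(1)

  ; Put the tag in front of every CR+LF pair in one call
  Result$ = ReplaceString(Text$, #CRLF$, Suffix$ + #CRLF$)

  If CreateFile(0, "links.html")                ; hypothetical output name
    WriteString(0, Result$, #PB_Ascii)
    CloseFile(0)
  EndIf
EndIf
```

Note that a final line with no trailing line break gets no tag in this version, and that the source is assumed to use CR+LF line endings.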

Posted: Fri Aug 02, 2002 5:08 am
by BackupUser
Restored from previous forum. Originally posted by Art Sentinel.

Thank you to everyone! My brain is working in over-drive right now thanks to all of your excellent ideas. :)

horst, your ASM IncludeFile is an exciting alternative. I am just now testing it and I can say I am receiving a substantial speed boost. There is however an important point that I wanted to bring up (that ricardo generously beat me to.) :wink: I was reading about the ReplaceString function. It seems like a great idea to just take a complete HTML template file, add a special commented-out pointer tag, then load it entirely to memory using your LoadFileToMem. Next load the ASCII text file to memory, then find a way to tell PureBasic to take the ASCII text file contents and paste them directly after the tag in the HTML document (which is still in memory). The crucial thing is that this function will act like an insert-enabled paste, so that it does not overwrite the HTML file from the paste point on but only adds some text within it at a specified point.

This seems obvious and elementary. But either I am not reading the PB help correctly, or there are no functions to add text at a desired location within an entirely loaded document. The PB help says:

Try to find any occurrences of 'StringToFind$' into the given 'String$' and replace them with 'StringToReplace$'.

The entire document would not fit inside a string at once. Only line by line until a carriage return was encountered. Is there a function, horst, in your ASM IncludeFile to add or replace text at a given point within the memory-loaded document? This would be immensely helpful! :wink: [hint] [hint]

Doh! Thanks PB for that helpful advice. As to why the heck I designed an unnecessary extra step to add a tag to the end of a line when it would be added anyhow? I will blame that flaw in logic on Johnny Bravo. It is sometimes difficult to code and keep one eye on the TV set at the same time. Haha..

Pupil, I like your idea very much of including the HTML code within a separate file. I would very much like to learn more about embedding a file into PB and then accessing it at runtime. As for my current project, it is important that the HTML file stay external so that it can be easily updated. Parsing needs to be done on the fly.

Are there string commands that I am missing that allow text to be added to a FILE and not just a string? For example, I could simply make a copy of my HTML template to the working directory specified by the .whtml file, then open it, find the special tag, and then lastly 'paste' the contents of the ASCII file there. (Of course the ASCII file would first be loaded into memory and the tags added.)

Is there a String 'PasteIn' or 'PasteOver' command I am missing? I have just begun learning WinASM, so I am completely unprepared to add such a function myself. If I could, you can be sure I would do it and share it with everyone here..

Thank you again!

- Art Sentinel
http://www.artsentinel.net
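
The template-with-marker idea described in this post can be sketched with ReplaceString alone, with no separate 'PasteIn' command. This is a sketch under assumptions, not a tested solution for the 2002 compiler: it uses current PureBasic syntax; "template.html" and "links.html" are hypothetical file names; and the marker "<!--LINKS-->" is invented for illustration, since the tag named in the post was stripped by the forum archive.

```purebasic
; Sketch only: current PureBasic syntax, not the 2002 syntax used in this thread.
EnableExplicit

Define Marker$ = "<!--LINKS-->"   ; hypothetical marker, the original tag is not preserved
Define Template$, Links$, Page$

If ReadFile(0, "template.html")                 ; hypothetical external template
  Template$ = ReadString(0, #PB_Ascii | #PB_File_IgnoreEOL)
  CloseFile(0)
EndIf

If ReadFile(1, "links.txt")
  Links$ = ReadString(1, #PB_Ascii | #PB_File_IgnoreEOL)
  CloseFile(1)
EndIf

If FindString(Template$, Marker$)
  ; Keep the marker and place the links right after it; the rest of the
  ; template follows untouched, so nothing is overwritten
  Page$ = ReplaceString(Template$, Marker$, Marker$ + #CRLF$ + Links$)

  If CreateFile(2, "links.html")                ; hypothetical output name
    WriteString(2, Page$, #PB_Ascii)
    CloseFile(2)
  EndIf
EndIf
```

Because the marker is replaced by itself plus the new text, the result behaves like an insert-enabled paste, and the template stays an external file that can be edited without recompiling.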

Posted: Fri Aug 02, 2002 5:59 am
by BackupUser
Restored from previous forum. Originally posted by horst.

> .. The crucial thing is that
> this function will act like an insert-enabled paste so that it does
> not erase over the HTML file from the paste point on--only add some
> text within it at a specified point.

You do not need the entire product file in memory, since you will
write it to a file anyhow.

So the best thing would be to produce and write chunks of that file,
until everything is completed.

1st step: Write the first 4 lines (as in your posting)

2nd step: Read the following lines from a prepared HTML file to a
memory buffer, using ReadData(*MemoryBuffer, LengthToRead) and write
to your file with WriteData(..)

3rd step: Use my tool to read & write the data file adding the ""s

4th step: Read the following constant lines from a prepared file & write
(see step 2)

5th step: add the rest..



Horst
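
The five steps above could be sketched as follows. Everything here is an assumption for illustration: current PureBasic syntax rather than the 2002 syntax of this thread, the hypothetical helper AppendFile(), the hypothetical file names "top.html", "bottom.html" and "links.html", and "<br>" as a stand-in for the tag stripped by the forum archive. Step 3 uses a plain ReadString loop in place of the external include.

```purebasic
; Sketch only: current PureBasic syntax, not the 2002 syntax used in this thread.
EnableExplicit

; Hypothetical helper: copies a whole file into an already open output file
Procedure AppendFile(OutFile.i, FileName$)
  Protected *Chunk, Count.i, InFile.i
  InFile = ReadFile(#PB_Any, FileName$)
  If InFile
    *Chunk = AllocateMemory(65536)
    If *Chunk
      While Eof(InFile) = 0
        Count = ReadData(InFile, *Chunk, 65536)   ; number of bytes actually read
        If Count <= 0 : Break : EndIf
        WriteData(OutFile, *Chunk, Count)
      Wend
      FreeMemory(*Chunk)
    EndIf
    CloseFile(InFile)
  EndIf
EndProcedure

Define Suffix$ = "<br>"   ; assumed tag, the original one is missing from the archive

If CreateFile(0, "links.html")                    ; hypothetical output name
  ; Step 1: the few lines that contain dynamic data
  WriteStringN(0, "][ - Freeware Links - ][", #PB_Ascii)
  ; Step 2: constant HTML before the links
  AppendFile(0, "top.html")
  ; Step 3: the data file, one line at a time, with the tag appended
  If ReadFile(1, "links.txt")
    While Eof(1) = 0
      WriteStringN(0, ReadString(1, #PB_Ascii) + Suffix$, #PB_Ascii)
    Wend
    CloseFile(1)
  EndIf
  ; Steps 4 and 5: constant HTML after the links
  AppendFile(0, "bottom.html")
  CloseFile(0)
EndIf
```

Writing in chunks like this means the finished page never has to exist in memory as one piece; only the current chunk or line does.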

Posted: Fri Aug 02, 2002 6:13 am
by BackupUser
Restored from previous forum. Originally posted by ricardo.

Maybe I was not clear in my post.
Using ReplaceString you can do the whole job in ONE step, ONE LINE.
You can use it with a string or directly from memory.

See this example:

;THIS FIRST PART IS JUST TO PUT IN MEMORY A FAKE TEXT FILE:

Buffer = AllocateMemory(0, 10000, 0)

EOL$ = Chr(10) + Chr(13) ; The End Of Line of text file has this characters to jump to next line

String$ = "line1 " + EOL$ ; This is a simple trick just to make a fake text file
For I = 2 To 20
String$ + "line " + Str(I) + EOL$
Next

MessageRequester("",String$,0) ; see how is your actual txt file
PokeS(Buffer,String$,Len(String$) ) ; store it on memory


;--------------------------------------------------------------------
;THIS SECOND PART IS THE TIP ITSELF, once you have your file on memory
;DOES the parsing in ONE step:


Result$ = ReplaceString (PeekS(Buffer),EOL$,"" + EOL$) ; make ALL the changes at ONCE

MessageRequester("",Result$,0) ; Your file parsed easily and fast

;But im using strings JUST TO SHOW RESULT, you can do ALL directly to MEMORY

PokeS(Buffer,ReplaceString (PeekS(Buffer),EOL$,"" + EOL$),Len(ReplaceString (PeekS(Buffer),EOL$,"" + EOL$)) )

MessageRequester("Result parsed and stored on memory",PeekS(Buffer),0)

;Hope it helps : )

Posted: Fri Aug 02, 2002 6:17 am
by BackupUser
Restored from previous forum. Originally posted by ricardo.
> The entire document would not fit inside a string at once. Only line by line until a carriage return was encountered. Is there a function, horst, in your ASM IncludeFile to add or replace text at a given point within the memory-loaded document? This would be immensely helpful!

My example was to answer this question, run it and you will see.
It works exactly the same on text files.

Posted: Fri Aug 02, 2002 7:12 am
by BackupUser
Restored from previous forum. Originally posted by Pupil.
> EOL$ = Chr(10) + Chr(13) ; The End Of Line of text file has this characters to jump to next line

Your EOL sequence is wrong; it should be EOL$ = Chr(13)+Chr(10).
Have you tested your code with strings larger than 64 KB? I believe some of the PureBasic string commands have an upper limit of about that size...

Posted: Fri Aug 02, 2002 4:14 pm
by BackupUser
Restored from previous forum. Originally posted by ricardo.

> Your EOL character is wrong, should be EOL$ = Chr(13)+Chr(10).

Yes, sorry, I typed it wrong last night.

Posted: Mon Aug 05, 2002 8:49 pm
by BackupUser
Restored from previous forum. Originally posted by Art Sentinel.

Hmm.. Thank you all for the wonderful help! You have set my mind buzzing with new ideas. Once I sort this text to html template executable out, I will post the code here for everyone to learn and benefit from. :)

- Art Sentinel
http://www.artsentinel.net

Posted: Wed Oct 23, 2002 1:12 pm
by BackupUser
Restored from previous forum. Originally posted by Fangbeast.
> Originally posted by Art Sentinel
>
> Hmm.. Thank you all for the wonderful help! You have set my mind buzzing with new ideas. Once I sort this text to html template executable out, I will post the code here for everyone to learn and benefit from. :)
>
> - Art Sentinel
> http://www.artsentinel.net

Art, I don't know if you ever solved this problem, but for 3 of my packages I store the HTML files as data statements in the EXE, from where I process and write them out on the fly. I have header, footer and body sections. It makes my life easier.

Fangles woz ear orright den?