MachineCode wrote:In your code for sorting, you saved all lines to a linked list, but the manual says arrays are faster than those. Should we be storing in an array instead, then? Your code is fast enough anyway... but I was thinking maybe arrays would give a better speed gain, based on the manual's comment?
I would first like to say that when I benchmarked the different procedures I got varying results. Changing one sort procedure would affect the measured performance of other sort procedures that had not been changed (they each acted on the same non-random data). So you will have to evaluate things for yourself to make the final call.
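If you want to check the numbers on your own machine, a rough timing harness along these lines may help. It is only a sketch: the run count, the generated test data and the stand-in sortText() body are all placeholders I made up for illustration, so swap in whichever version you are measuring. Run it with the debugger turned off, since the debugger slows everything down and skews the comparison.

```purebasic
;Hypothetical timing harness (names and sizes are illustrative only)
#Runs = 10

Procedure.s sortText(text$) ;stand-in: paste the version you want to measure here
  ProcedureReturn text$
EndProcedure

;build deterministic (non-random) test input, one #CRLF$-terminated line per entry
For i = 1 To 10000
  sample$ = sample$ + "line " + Str((i * 7919) % 10007) + #CRLF$
Next

minTime = 2147483647
maxTime = 0
total = 0
For run = 1 To #Runs
  startTime = ElapsedMilliseconds()
  result$ = sortText(sample$)
  elapsed = ElapsedMilliseconds() - startTime
  If elapsed < minTime : minTime = elapsed : EndIf
  If elapsed > maxTime : maxTime = elapsed : EndIf
  total + elapsed
Next

;use a requester instead of Debug so the numbers still show with the debugger off
MessageRequester("sortText timing", "min " + Str(minTime) + " ms, max " + Str(maxTime) + " ms, total " + Str(total) + " ms over " + Str(#Runs) + " runs")
```

Looking at the spread between the minimum and maximum over several runs gives a better idea of how much of a measured difference is just noise.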
I don't think there is a significant (>10%) speed difference between an array version and a list version. In my tests the array version was within +/- 7% of the list version's speed. Here is what the array version looks like, if you are interested in how the code differs:
Code:
#soc = SizeOf(Character) ;bytes per character: 1 in ASCII mode, 2 in Unicode mode
#crlf_bytes = 2 * #soc ;assumed value (size of #CRLF$ in bytes); drop this line if you already define it
;---------
Procedure.s sortText(text$)
  numLines = CountString(text$, #CRLF$)
  Dim *textLines.String(numLines)
  otext$ = text$ ;This is simply to reserve an output buffer of the same size as the original string
  *start = @text$
  *end.Character = *start
  currentIndex = 0
  ;setup pointers to each substring
  While *end\c
    If *end\c = #LF ;assumes that #LF is the last character of each substring
      *end\c = #Null ;replace #LF temporarily to mark it as a substring
      *end + #soc
      If *end - *start > #crlf_bytes ;skip lines containing only #CRLF$
        *textLines(currentIndex) = *start ;simply point to the string segment without copying it
        currentIndex + 1
      EndIf
      *start = *end
    Else
      *end + #soc
    EndIf
  Wend
  currentIndex - 1
  SortStructuredArray(*textLines(), #PB_Sort_Ascending, 0, #PB_String, 0, currentIndex)
  ;assemble the output string from the pointers to the sorted substrings
  *end = @otext$
  CopyMemoryString("", @*end) ;prime the copy function
  For i = 0 To currentIndex
    CopyMemoryString(*textLines(i))
    *end\c = #LF ;restore the #LF and 'unmark' the substring
    *end + #soc ;step one full character (not one byte) so Unicode builds work too
  Next
  *end\c = #Null ;mark end of string in case output is smaller than input
  ProcedureReturn otext$
EndProcedure
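One thing to watch with the loop above: a line is only recorded when its #LF is reached, so any text after the last #LF never makes it into the array. If your input might not end with a line break, a small helper like the following could be applied first. This is just a sketch, and ensureTrailingCRLF() is a name I made up for illustration.

```purebasic
;Hypothetical helper: make sure the text ends with a line break before sorting
Procedure.s ensureTrailingCRLF(text$)
  If text$ <> "" And Right(text$, 1) <> #LF$
    text$ = text$ + #CRLF$ ;append the missing line ending
  EndIf
  ProcedureReturn text$
EndProcedure

Debug Len(ensureTrailingCRLF("abc")) ;5: the three letters plus #CR and #LF
```

You would then call sortText(ensureTrailingCRLF(text$)) instead of sortText(text$).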
MachineCode wrote:Another question (hope you don't mind, sensei).

So that I can process my own lines one-by-one and not hassle you anymore for help, how would your code be changed so that I can iterate every line of text in turn and do something to each on the fly? I know it's to do with the code before the SortStructuredArray() line, but when I try to PeekS() the lines to get them, it fails (the debug output shows the same line twice or more). Feel up to another lesson? Let's assume I just want to sort through that big Win32api.txt file and put "pure" at the start of each line and "basic" at the end, or basically just do anything to each line as fast as possible. Can you show me a skeleton code that grabs each line, stores it in s$, then adds s$ to a new output? That way, I can modify each s$ on the fly, before it's built into the new final output (with CopyMemoryString?).
Because you want to modify the lines rather than simply sort them, the procedure has to make copies of them instead of just pointing to them. The extra copying adds another 92% to the running time. You will notice that the following skeleton code is otherwise almost exactly the same. One major change is that the input string is no longer modified. Another is that the output string can be longer than the input string, depending on how the lines were modified just prior to the sort.
If you want to go that route, you should be able to adapt this to use an array instead of a list by following the earlier example. With this you should be ready to knock many challenges down to size, grasshopper.
Code:
#soc = SizeOf(Character) ;bytes per character: 1 in ASCII mode, 2 in Unicode mode
;---------
Procedure.s sortText(text$)
  NewList textLines.String()
  *start = @text$
  *end.Character = *start
  charCount = 1
  outputLength = 0
  ;split text into separate lines ending in #CRLF$, remove lines containing only #CRLF$
  While *end\c
    If *end\c = #LF ;assumes that #LF is the last character of each substring
      If charCount > 2 ;2 = Len(#CRLF$)
        AddElement(textLines())
        textLines()\s = PeekS(*start, charCount) ;any changes to the line can be performed here, such as additions, replacements, deletions
        outputLength + charCount ;you would replace this with a calculation for your modifications, e.g. outputLength + Len(textLines()\s)
      EndIf
      *end + #soc
      *start = *end
      charCount = 1
    Else
      *end + #soc
      charCount + 1
    EndIf
  Wend
  SortStructuredList(textLines(), #PB_Sort_Ascending, 0, #PB_String)
  ;assemble the output string from each of the sorted substrings
  otext$ = Space(outputLength) ;Space() takes a character count, so no #soc scaling is needed
  *end = @otext$
  CopyMemoryString("", @*end) ;prime the copy function
  ForEach textLines()
    CopyMemoryString(textLines()\s)
  Next
  ProcedureReturn otext$
EndProcedure
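For the "pure" ... "basic" case you mentioned, the change at the PeekS() line could look something like the sketch below. decorateLine() is a hypothetical helper name of mine, and it assumes every line handed to it still ends in #CRLF$ (two characters), which is what the skeleton above stores.

```purebasic
;Hypothetical helper: wrap one line (including its trailing #CRLF$) in a prefix and a suffix
Procedure.s decorateLine(line$, prefix$, suffix$)
  ;strip the trailing #CRLF$ (2 characters), add the extras, then put the #CRLF$ back
  ProcedureReturn prefix$ + Left(line$, Len(line$) - 2) + suffix$ + #CRLF$
EndProcedure

Debug decorateLine("MessageBox" + #CRLF$, "pure", "basic")

;inside the skeleton, the two lines after AddElement(textLines()) would then become:
;  textLines()\s = decorateLine(PeekS(*start, charCount), "pure", "basic")
;  outputLength + Len(textLines()\s)
```

Counting with Len(textLines()\s) instead of charCount matters here, because the output buffer has to be large enough for the lengthened lines.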
On a side note, you sent me a PM that I am unable to respond to because your current user settings prevent you from receiving PMs from forum members.
