
Anyone got a Windows Help file for ASM?

Posted: Sun Jul 27, 2003 7:55 am
by oldefoxx
If I try to find Help on an ASM command in PureBASIC, I get the error message that the IDE could not find an ASM.hlp file in the folder PureBASIC\Help. Turns out that this has to be a Windows Help file, which means that it would load and display in the Windows Help file format.

Doing some research on this, I find that Windows has at least two help formats that it uses - the older WinHelp format with a .hlp extension, and the newer HTML Help format with a .chm extension. The example file Help.pb shows how to access either type.

The problem is, searching the internet produced several links for ASM.hlp, but none of these correspond to the right format. There are also several tools available to create a help file in one or both formats. I worked all day with one trying to construct my own version (I've done some Assembler programming from time to time), but the demo version refuses to reopen the project I was working on, so that effort came to naught. I guess I could get back into it if I coughed up about $300, but I have yet to see if that would do the trick. I guess I will have to try one of the other programs instead.

But if someone has the ASM.hlp file in a usable format, I would really like to find out if it is available. It would save me a lot of rework, although I have to admit that trying to do it this way has given me a chance to learn something new.

Posted: Sun Jul 27, 2003 1:51 pm
by Fred
I have packaged the ASM help file I got. Just unpack it into the Help/ directory and you should have F1 help support in the IDE. www.purebasic.com/download/AsmHelp.zip

Posted: Sun Jul 27, 2003 3:08 pm
by Flype
ho, good news, thanx Fred :!:

Of course, now I need more...

Posted: Tue Jul 29, 2003 3:40 am
by oldefoxx
I really appreciate the link for the x86 instructions to help me with Assembler coding.

I can't find a good source that lists and explains FASM directives. I did find one online for MASM, but I know that there are going to be differences, even though I haven't had time to look at it yet. Anyone know where to look, or have one on hand?

Posted: Tue Jul 29, 2003 1:44 pm
by freak
You can download the full FAsm package from http://flatassembler.net/
It contains a 'FASM.TXT' file which explains all directives.

Timo

Sorry, it's not there.

Posted: Tue Jul 29, 2003 5:49 pm
by oldefoxx
I had downloaded FASM a couple of days ago when I found out that it was being used with PureBASIC, then I went through the package. There is no FASM.txt file in there - in fact the only text file is LICENSE.txt. So if you have the FASM.txt file, would you mind sharing? It may be that they forgot to bundle it when they did a version upgrade. There is very little in the way of additional support or active links at that site (still too new, I guess), and I could find nothing specific elsewhere.

Posted: Tue Jul 29, 2003 6:02 pm
by freak
Oops, I just noticed that only the DOS/win32 console package contains this file.

get it here:
http://flatassembler.net/fasm148.zip

For more information on fasm, you could have a look at the FAsm message
board here: http://board.flatassembler.net/

Timo

Getting started with Assembler

Posted: Wed Aug 06, 2003 4:58 am
by oldefoxx
First of all, Assembler is most often used to speed up some part of your program, or to add a capability to your program that is not supported directly by statements and functions in that language. The idea is that if it can be done by any means possible, then Assembler is the key for getting it done, and if it needs to be fast, nothing is faster than Assembler.

So why isn't everything written in Assembler, also known as ASM for short? Because it isn't easy to pick up everything that writing a lot of ASM code requires. It is possible to write a high level language for different operating systems, and even different computer processors, but when you get down into ASM code, then the way a particular processor is designed and how it works become very important.

Fortunately, all x86 processors have evolved to have certain shared capabilities, and by relying on that commonality, you can write your ASM instructions so that most PCs today can properly carry out those instructions. Some advanced capabilities are only shared by higher-end processors, so it is important to know where that dividing line is. For most purposes, the dividing line is pre-x386 (or 80386), and x386 and beyond. Before the x386, it was necessary to handle everything with 8-bit and 16-bit registers. With the x386, most of the registers were extended to 32 bits, and additional instructions were added to work in 32-bit mode, while continuing to allow use of 16-bit and 8-bit register access.

Today, the x386 is considered positively ancient, and most programmers now write to that processor and above, so that they can use 32-bit registers and addressing forms. This also allows them to use the Floating Point Unit (FPU) instructions, since it is from around this time that the FPU became incorporated into the hardware. Before that, floating point calculations had to be done with software routines, which were very slow. There have been other hardware improvements with succeeding generations of processors, and they can be important in such things as game programming and making sure the graphics displays update as fast as possible, but that is far beyond the scope of a beginner to deal with. Even handling floating point numbers from ASM is rarely done, as integer arithmetic suffices for many uses, and is much easier and faster to program, especially in Assembler.

In order to learn Assembler, it is essential to find several books on the subject. Much has been written on the x86 instruction set, from old books that cover only the 8085/8086, to more recent books that cover the 80386/80486. Any of these would help. A good book will cover the instructions in considerable detail, which is not possible here. And by having several books on the subject, you can follow one text (whichever one is easiest to understand) and use the others for reference. This is important, because no single text is going to be entirely complete or give a balanced treatment to all topics -- there is just too much involved for that to happen.

In general, there are several different aspects of the x86 structure that need to be understood. Together, these are referred to as the computer architecture:

Registers - where the contents of memory are placed when computer operations are performed.
Stack - a portion of memory set aside to hold temporary values on a Last-In, First-Out (LIFO) basis.
Memory - the amount of information that the computer can hold in immediate access before having to read or write to a device such as a hard drive. Usually called RAM, for Random Access Memory, and directly accessible via the address registers.
Storage - anything where data (which includes programs) can be stored until needed later. A floppy drive or hard drive would count as storage. Generally, a printer would not, since it is difficult to get printout information back into a computer.
Screen Memory - information in screen memory is automatically displayed on the screen. Behind everything that appears on the screen, there is some amount of memory involved. The only thing different between memory and screen memory is the fact that it supports a display. A graphics card translates the contents of screen memory into a viewable image. The amount of memory used for this purpose, and how it is set up, determines many aspects of what is seen: the size, the resolution (amount of detail), the color range, and even the speed at which those images can be redrawn.

There are a number of registers in an x86 processor. Some are general purpose, meaning they can be used in a number of different ways. Others are special purpose, meaning they are restricted to doing just one thing. In general, when you start out programming, you are most interested in the general purpose registers. That is because you can do a lot of things with them, and because they are so versatile, they are rarely ever reserved exclusively by any one program so as not to be used by another. However, even with the general purpose registers, some conventions have grown up about their use in certain contexts, and this might have been in part due to limitations or design features of the x86 itself, or accepted conventions adopted as part of the operating system.

As an example of a hardware limitation, string instructions read from locations specified by the DS:SI register pair, and store results using the ES:DI register pair. DS, for Data Segment, points to an area of memory where data is to be read from. By having an ES, or Extra Segment register, it is possible then to write data to a different area of memory. The SI (Source Index) and DI (Destination Index) registers then actually keep track of the precise memory address in those areas that will be read from or written to.
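As a minimal sketch of how those register pairs get used (FASM-style syntax; the labels source and dest are just hypothetical data labels, and DS and ES are assumed to already point at the right segments):

mov si, source     ; DS:SI -> where the bytes will be read from
mov di, dest       ; ES:DI -> where the bytes will be written to
mov cx, 100        ; number of bytes to copy
cld                ; clear the direction flag so SI and DI count upward
rep movsb          ; repeat MOVSB CX times: copy from DS:SI to ES:DI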

Another hardware limitation is that the CS (Code Segment) register points to the area of memory where the current program is located and running. You are not allowed to directly change the contents of CS, as this would corrupt the running process, but it is automatically changed when you execute any FAR jump commands in your program.

An operating system limitation would be the fact that your segment registers have to be preset at the start of your program to point to your program and to the data segment where any data is to be stored. That means that at least the DS register has to be defined so that your program deals with the right data. If you change the DS register, you have to be prepared to restore it later in case you need to access more data. Depending upon the model you adopt for your program (tiny, small, large, or huge), the other segment registers may or may not be
preassigned, and they may or may not be set to the same segment.

To store a register's contents temporarily, it is a common practice to put it on the stack. The stack is like a large array which is filled by using a PUSH command to put something on it, and a POP command to remove it later. It is very important that the stack be returned to its original state before your program ends, which normally means that every PUSH has to have a later POP to balance it out. The stack is in memory, so it is accessible by a segment register -- which just happens to be called SS, for Stack Segment. There is an index register uniquely designed to work with the stack, called the Stack Pointer, or SP for short. The stack does not build upwards in memory, but rather builds down, so that as you add items with a PUSH, it drops lower, and as you POP items off the stack, it moves back towards its point of origin. Often, in an effort to preserve the stack against all possibilities, a separate STACK is created in the user's data area, and just used for local operations related to that program. So if you encounter a STACK command in a program, they either wanted to try to protect the original stack, or they wanted to ensure that enough stack space is reserved to do a lot of PUSH instructions or CALL commands.
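For example, a routine that needs to scratch over AX and BX for a moment might protect them like this (note that the POPs come in the reverse order of the PUSHes):

push ax            ; save AX on the stack
push bx            ; save BX on top of it
mov ax, 1234h      ; ...do some work that clobbers AX and BX...
mov bx, 5678h
pop bx             ; last pushed, first popped: BX comes back first
pop ax             ; then AX, and the stack is back where it started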

CALL commands in Assembler are like GOSUB, Function, or Sub calls in a higher level language. The address of the next instruction is PUSHed automatically onto the stack, and a later RETURN or EXIT command will POP the return address off the stack and cause the program to resume operation where it left off when it made the call. GOSUB generally translates into a local call (NEAR call) in Assembler, whereas Function and Sub translate into FAR calls, with parameters placed on the stack before the call is executed. NEAR and FAR refer to whether two addresses are close to each other, and whether the same segment is involved when the action is performed. For a FAR call, the current segment and the jump offset both have to be placed on the stack, since you are jumping outside the current segment. For FAR returns, both the offset and the segment have to be retrieved. For NEAR calls, only the offset is put on the stack and removed again; the current segment remains unchanged.
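Here is a small sketch of a NEAR call (the label AddThem is hypothetical):

mov ax, 2
mov bx, 3
call AddThem       ; pushes the return offset, then jumps to AddThem
; execution resumes here, with the sum waiting in AX

AddThem:
add ax, bx         ; AX = AX + BX
ret                ; pops the return offset and jumps back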

From inside a GOSUB - RETURN portion of your program, everything looks the same. You have the same variables and structures, and they still have the same contents. However, a Function or Sub has the parameters being passed to it, if any, on the stack above the return segment and offset. These can then be accessed by using the stack segment and pointer, plus an offset, to retrieve those values without having to use a POP instruction. This is good, because a POP always takes whatever is currently on top of the stack (the most recently pushed item), so the first POPs would take our return segment and offset, and potentially could mean that there would be no way to exit the Function or Sub properly later.

There are a couple of ways to get around this. The most common method is probably to use the BP, or Base Pointer. We can address locations at an offset from the base pointer, which conveniently uses the SS register as its default segment register. But the BP might be used for something else somewhere else, so a good practice is to save its contents first. Thus, you may find the following code sequence in some assembler routines that are set up as procedures (functions or subs):

PUSH bp ;save the contents of BP on the stack
MOV bp,sp ;copy the contents of SP into BP
MOV ax,[bp+8] ;copy the contents of the bp+8 location into AX

This approach fixes BP at a precise reference point on the stack, and from there you can use a constant offset to get at any specific parameter and put it where you want.
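A slightly fuller sketch of the same idea, with the matching epilogue at the end (the offsets here assume a NEAR call with one word-sized parameter; the exact numbers depend on the call type and the sizes of the parameters):

PUSH bp            ; save the caller's BP
MOV bp,sp          ; BP now marks a fixed point in this stack frame
MOV ax,[bp+4]      ; [bp+0]=saved BP, [bp+2]=return offset, [bp+4]=parameter
; ... body of the routine ...
POP bp             ; put the caller's BP back
RET 2              ; NEAR return that also discards the 2-byte parameter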

But wait! What about the previous contents of AX? Won't they be lost? Yes, that is true, and this may or may not be important, because the general registers are the same physical registers in every program and every routine called. So if you want to protect them, you have to save them as well, which means you might have to do a PUSH ax first.

But then what about the contents of the registers BX, CX, DX, the index registers (SI and DI), and so on? The answer was to create one instruction, called PUSHA, which pushes the most commonly used or impacted registers onto the stack, one after another. Its counterpart, the POPA command, gets the original values and puts them all back in the original registers. Of course you may not need to use all those registers, or you may want to return a value in a register which a PUSHA and POPA would overwrite, so there are many times when a PUSHA and POPA would not be practical.
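A sketch of the pairing in use (PUSHA stores AX, CX, DX, BX, SP, BP, SI, and DI in that order; POPA restores them all, except that the stored SP value is simply discarded):

PUSHA              ; save all the common registers in one go
MOV ax, 0          ; ...routine body that is free to clobber registers...
MOV cx, 100
POPA               ; every saved register comes back the way it was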

The most common use of PUSHA and POPA is for things like interrupt handlers, often used with DOS-based programs, and also with programs that run in a DOS-compatibility box under Windows. Any PUSH or POP command places or removes some even increment of bytes on the stack, usually 2 or 4. There is never an odd number of bytes placed on the stack as a result of a PUSH or POP. The number of bytes passed in this manner depends upon the source or destination: if a 16-bit register, there will be two bytes; if a 32-bit register, there will be four bytes.

When you execute a routine that was written in Assembler, you will generally find this arrangement on the stack:

1st parameter
2nd parameter
3rd parameter
return segment
return pointer
pushed BP register ;BP now set to SP, pointing here
local pushed register
local pushed register
local pushed register

This arrangement is somewhat different when using a C/C++ compatibility mode, since the parameters are then passed in reverse order. There you would find:

3rd parameter
2nd parameter
1st parameter
(the rest the same)

The advantage of the C/C++ form is that you do not have to pass all the parameters every time. You might have a situation where you only need to pass the first parameter. Of course there has to be some way of knowing when a parameter is not passed, and this method often involves passing a NULL or NULL$ reference to mark the end or absence of input parameters. Another way is to pass a count of the number of parameters that have been placed on the stack. Here, some accord between the routine and the calling process has to be reached.
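As a sketch of what a C-style call looks like from the caller's side, shown here in 32-bit form (SomeCRoutine is just a hypothetical routine name; in the C convention the caller also removes the parameters afterwards):

push dword 3       ; 3rd parameter goes on first...
push dword 2       ; ...then the 2nd...
push dword 1       ; ...so the 1st ends up nearest the return address
call SomeCRoutine  ; the routine finds its parameters above the return address
add esp, 12        ; caller cleans up: three 4-byte parameters = 12 bytes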

Another technique for accessing parameters on the stack would be to set another segment register equal to SS, so that you could use other methods of reading the stack. There is no real advantage to this approach, and it ties up more registers just to access the stack, which is why it is not in wide use. Note that you cannot load a segment register directly with an immediate value or with another segment register. You can push these values onto the stack and pop them into the segment registers, or you can move them by way of a general purpose register.
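For example, to point DS at a particular segment value (1234h here is just an arbitrary illustration):

mov ax, 1234h      ; you cannot do MOV ds, 1234h directly...
mov ds, ax         ; ...so move the value by way of a general register
push es            ; or go through the stack:
pop ds             ; DS now holds the same value as ES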

In the old DOS world, system services were obtained through the use of the operating system's interrupts. These were processes that had their values passed through the general registers, not by parameters on the stack. Conversely, the Windows environment has all of its parameters passed via the stack, and only returns values in registers. More on this shortly.

Arrays and structures, and even strings, are passed by pointer reference. This may mean that an address pointer is passed, understood to be within the Data Segment pointed to by the DS register, or it may mean that both the segment and the pointer are passed on the stack. You may also have to pass a size value, or, as is often the case with the Windows API (Application Programming Interface) routines, you must define a structure to the size and form required by the routine to be called, and no specific indication of that size or form is passed -- but if you make a mistake by not having it properly dimensioned, you can inadvertently cause memory corruption as a consequence, meaning a process may fail or the system may crash.

Subs do not return a value, but by convention, a function returns precisely one value, which could be anything. In a 16-bit system, the value would normally be returned in the AX register. For a 32-bit return value, it would normally be returned in the DX,AX pair of registers, with DX carrying the upper 16 bits. With the 32-bit architecture available from the x386 on, you have extended registers, where EAX, EBX, ECX, and EDX are each 32 bits wide, so it is possible to return 32 bits in EAX instead of using both DX and AX. The question is, though, when referring to routines in the operating system, which handle 16-bit addressing in a pre-x386 environment -- how would that routine act under an x386 environment instead?

Technically, it might be possible to determine which architecture is used in a given computer and adapt routines accordingly, but it appears that if you use an older operating system on newer hardware, you are only going to have the advantages supported by that operating system. No cutting edge decision making about giving you the results one way on one platform, and a different way on a more recent platform. So if you get results in the DX,AX pair for that operating system, you will continue to do so for that operating system, even if a much newer processor is involved.

Calling Windows API routines is very much like calling routines from other libraries or sources. You have to know the name of the process, you have to know how many parameters are involved, you have to know the type for each parameter, and you have to pass those parameters in the correct order. And if any result comes back, it will be via the registers,
usually the AX or EAX register, but in an older environment, it may be by the DX,AX register pair.
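To make the two conventions concrete, here is a sketch of a routine returning the 32-bit value 12345678h under each one:

; 16-bit convention: high word in DX, low word in AX
mov ax, 5678h
mov dx, 1234h
ret

; 32-bit convention: the whole value in EAX
mov eax, 12345678h
ret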

Note that since the values are just in the registers, this does not impact the stack arrangement at all, so you could call a function as though it were a sub, with no ill effects. So if you had a procedure that returns a value (a function), such as a$=Reverse(b$), it would reverse the order of characters in b$ and send the results to a$. Depending on whether it actually changed b$ or not, b$ may or may not be reversed. Usually, a good procedure is to do the work on a copy of what was in b$, and then just return the copy, so that b$ itself remains unchanged. Thus, a$=Reverse(b$) would copy the reversed copy of b$ into a$, but leave b$ intact. If you then did a Reverse(b$), acting on it like a Sub instead, a reversed copy would be made, but there would be no assignment for the results when done, and in this case, it would be as though nothing had taken place.

The opposite is technically true, too: you could call a Sub as though it were a Function by using it after an assignment, such as a=subproc(b$). The problem, though, is that whatever comes back would be whatever the registers held when the procedure ended, and that might not be usable. Most compilers will catch and reject this approach on the grounds that the results would be meaningless, and that trying to use a Sub as a Function might be a coding error of some sort.

The purpose of covering the stack in some depth is that when you write your own inline assembler routines, there is a good chance they will be within some procedure. You can access the local variables and structures, as well as any global ones, by using the segment registers and pointers. You can also assign any parameters to local variables before trying to access them in your own code. But if you want to know where they actually are when you are inside a procedure, they are located on the stack, and the BP and/or SP registers, coupled with the SS segment register, can help you determine just where that might be. Besides, being able to manipulate the stack with a PUSH here and a POP there is a good way to juggle some values by keeping things temporarily on the stack.

The general registers are AX, BX, CX, and DX. AX is also the Accumulator Register, so it is principally used with arithmetic and logical operators. The BX register can be used in a similar fashion, but some instructions also regard it as an index or offset pointer. The CX register, again, can do several things, but its speciality is being a counter register for repeated string operations or loop control. The DX register is again able to do various things, but is mostly used for integer multiply and divide operations, when the results exceed what can be held in the AX register alone. The DS and ES registers, as mentioned, are used with the SI and DI index registers to process repeated string operations, and for accessing the Data Segment in general. There is also a Status Flag register, which remembers the results from certain operations. In general, these are the results of any arithmetic, comparison, or logical operation. Other operations, such as MOV, do not change the state of the status flags, so it is important to know what prior operation caused the status bits to be set as they are when you check them.
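A small sketch of the flags in action (the label TheyMatch is hypothetical; note that the MOV right after the CMP does not disturb the flags the CMP set):

cmp ax, bx         ; the comparison sets the status flags; AX is unchanged
mov dx, 0          ; MOV does not disturb those flags
je TheyMatch       ; the conditional jump still sees the result of the CMP
mov dx, 1          ; reached only when AX and BX differed
TheyMatch:         ; DX is now 0 if they matched, 1 if they did not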

You can load registers from memory, you can rotate and shift them, you can add them, subtract them, compare them, or perform logical operations on them. You can push and pop them to and from the stack,
you can MOVe one to another, you can exchange their contents, you can
alter them in many ways, and as a result, you can make them represent almost anything and any process imaginable. And you can pair them in unusual ways.

As pointed out, early x86 processors used 16-bit addressing to access memory. Now 16 bits, taken together, can represent only 65,536 possible distinct combinations (0 to 65,535 decimal). Even then, it was possible to consider having up to 512 kbytes, or as much as 640 kbytes, of memory. That meant that the computer could only access a small part of the total number of addresses at any given time. How to deal with this? Believing that only a factor of ten was involved (10 times 65,536 = 655,360 combinations, more than was really needed), the engineers at Intel made it so that the segment registers address memory in blocks of 16 bytes. So a segment of 0000h pointed to segment 0 (addresses from 0 on up), a segment of 0001h pointed to segment 1 (addresses 16 on up), and so on. The index registers, used as an offset, could point to a total of 65,536 distinct addresses per segment, and naturally that meant a lot of segment overlaps. So referring to a segment and index pair of 0:16, where 16 is decimal, was the same as saying 1:0, where the segment address indicated segment 1. In fact, for most locations in memory, there were multiple ways to address the same memory location by different means, including different values in the segment and index pair. This pair, incidentally, was usually written as [ds:si] or [es:di], or in a similar form for whatever segment and index were involved.
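The arithmetic behind that overlap is simply: physical address = (segment x 16) + offset. So, for example:

0000h:0010h -> (0 x 16) + 16 = 16
0001h:0000h -> (1 x 16) + 0 = 16   ; the same byte, reached by two different pairs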

Just to make things a little bit more confusing, the computer instruction set includes segment override capabilities. That is, if a certain instruction normally defaults to one segment register, you can use a prefix to force it to use a different segment register instead. It makes for a slightly longer instruction as a result, but it increases the versatility of the addressing modes available.

But segment override brings up another topic, which is the ASSUME directive. If you point a segment register towards a certain region of memory, such as to the Data Segment, or to a buffer area, you want the assembler to know that you want it to use that segment register whenever you move data into or out of that area. So after you get the segment register set up as you want it, you can use an ASSUME directive so that the assembler will refer to the right segment register when it has an instruction that refers to a specific item in that region. If a segment override is then needed, the assembler can determine this and apply it automatically.
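A MASM-style sketch of the idea (ASSUME is a MASM directive; FASM handles segments differently, and MyData here is just a hypothetical segment name):

mov ax, MyData     ; load the segment's value through a general register...
mov ds, ax         ; ...since a segment register cannot take it directly
assume ds:MyData   ; tell the assembler that DS now covers MyData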

Every assembler has its own set of directives, and learning assembler also means making an effort to learn about these as well. They do a lot to control just how the resulting program is built, how small it is, and how fast it runs, as well as what it is best used for. FASM, the flat assembler, has its share. As it happens though, when you write inline assembler code, most of the needed directives have already been issued and executed by the PureBASIC compiler. That usually means you just have to kick in some assembler code to do a few tricks that aren't so easy in the BASIC syntax, then drop back into PureBASIC to finish up. Sounds easy enough, and after you get a bit of a handle on registers and moving data into and out of them, or getting it off the stack for yourself, the rest will be almost a cakewalk.

That is assuming you have gotten yourself some good books on the subject and passed a few evenings browsing through them. And don't be afraid to work through any sample code or "borrow" code from various sources. Stepping through code and watching the registers, with some familiarity with the assembler references, will teach you a lot very quickly.

Posted: Wed Aug 06, 2003 9:30 am
by LarsG
wow, that was a long read... 8O

Nice "article" though.. :)

-Lars

Using extended registers for memory access

Posted: Wed Aug 06, 2003 8:38 pm
by D'Oldefoxx
Thanks, it was long, but there is a lot to understanding the X86 architecture and instructions. Not that it is especially hard, just more to it than copying someone else's routine into your code, then not knowing why it didn't work the way you expected.

I got tired and stopped, posting just what I had written. But there was a bit more I was inclined to say, so here goes:

The use of seg:index register pairs was forced on us by the limits of having just 16 bits for addressing, meaning only 64K of memory could be accessed at one time without having to modify the segment register. This was largely overcome when the architecture was extended to 32-bit registers, which now allow us to use a 64K-squared address space, or over 4 gigabytes of addresses. Since most computers do not have more than 256 MBytes or 512 MBytes, or max out at 2 GBytes, it looks like this might suffice for most of us for a while. But backwards compatibility meant that the computer still had to work properly with the old seg:index addressing form, so how to avail ourselves of this new power?

First of all, it was recognized that the seg:index form did offer certain advantages in the way it worked. Using two registers together, with one acting as a base and the other as an index, paid dividends in the ease with which block transfers could be set up. So why not continue to support the old mode, but allow some of the extended registers, the ones expanded to 32 bits, to be arbitrarily designated as either the base or the index? The understanding was that the base (rather like the segment registers) is not incremented or decremented, while the index registers can be. So this brought out a new form of memory addressing. Whereas the old form was [seg:index+offset] (if there was an offset), the new form is [base+index+offset]. Note that the base and index both have to be extended registers (such as EAX or EBX), and the offset is always a constant. The following extended registers can be used as the base: eax, ebx, ecx, edx, esp, ebp, esi, and edi. All of these can also be used as the index, except esp. That register, esp, is the extended stack pointer, and it cannot be locked into automatic increments or decrements as part of some instruction cycle, because that would destroy the integrity of the stack, which is the reason it has to be excluded from this mode. So the question is, would a base + index form of [ebx + ebx] be valid? Yes, you can have the same register for both the base and the index, although the need to do this is not apparent. Without further testing, I would assume that as a base it would neither increment nor decrement, but as an index it could do either, so in effect, I believe you would have a base and index that both incremented and decremented, once per cycle.
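A couple of lines showing the new form in use (the 8 is just an arbitrary constant offset):

mov eax, [ebx+esi]     ; EBX as the base, ESI as the index
mov edx, [ebx+esi+8]   ; the same pair with a constant offset of 8 added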

But could you have something like [ds:ecx] or [eax + cx] as addressing modes? No, you would be trying to bridge two distinct addressing forms, one a legacy form and the other an extended form. For the extended form, you can only use extended registers.

How does the assembler know which register is the base and which is the index? By convention, MASM (the Microsoft Assembler) recognizes the first extended register identified as the base, and the second one identified in the statement as the index. Most other assemblers have probably adopted this rule, but some others may have a more explicit way of making this fact known.

What is the difference between the address forms [ds:si+bx], [ds:si][bx], [si+bx], and [si][bx]? Not much, really. The inclusion of ds: just makes the addressed segment explicit. Without it, it is just inferred. The only problem might be if the assembler had been instructed to ASSUME that some other segment register was to be used. Grouping [bx] by itself, or putting it inside the other brackets behind a "+" sign, just means that the contents of BX are included in the step to compute the referenced address.

If I added two extended registers together, could I access 8 GBytes of memory (4 GBytes + 4 GBytes = 8 GBytes, right)? No, that is not the way it works. You can only reference as much memory as you actually have, regardless of what count you might end up with. If you have 256 MBytes, that is all you can address. But it is actually even more restrictive than that. The operating system (Windows) takes up a significant part of the available memory, and it has to preserve as many "virtual" memory areas as needed to run different programs in, so that you can have several programs running at once which do not interfere with each other. Each is only allowed to see its own memory space. If any program tries to address memory that is outside of this area, the operating system will detect this and shut off the offending program with an Access Violation warning, forcing it to terminate.

Because subdividing memory up this way means that the resulting memory areas are rather small, the operating system would quickly seize up if it did not have some way to release some of the demand for memory. So one of the things the operating system does, as it allocates memory for new processes or for additional structures, is to see if it is currently available, and if not, to look for some inactive process that can be temporarily offloaded to a swap space on the hard drive. That will temporarily free up some memory that can then be handed to the new process. When that process is done, the stored portion of the saved process can be reloaded to continue operating as before.

In this way the operating system is able to maintain the illusion of having more memory, but the cost is high in terms of performance, because hard drives are thousands of times slower than internal memory. This is the reason that the number one improvement for any PC is to increase the amount of internal memory, or RAM, that it has available. The other three improvements are better graphics cards with more memory (mostly for games), faster and bigger hard drives, and a faster processor, in just about that order.

So in other words, the x86 instruction set and registers may appear to give you the tools for accessing all of memory, but that is not your prerogative, and the operating system will do its best to keep your program from screwing something else up.