
Posted: Mon Jul 21, 2003 2:11 pm
by ricardo
El_Choni wrote: Let me know if someone is interested.
Of course I'm interested!! :D

Yes

Posted: Mon Jul 21, 2003 3:56 pm
by LJ
Yes, I am interested, and maybe we could fix the <p> being empty and things like that for poorly formatted .html

Re: Yes

Posted: Mon Jul 21, 2003 4:11 pm
by ricardo
LJ wrote:Yes, I am interested, and maybe we could fix the <p> being empty and things like that for poorly formatted .html
Hi LJ!!

That was exactly the first problem I found when trying to parse the HTML code (in DOM style), because <p> tags are sometimes empty and sometimes not.
So I went to tidy.ocx to fix it (convert to XHTML), then continued in JavaScript and forgot my original intention to parse it in PB.

With <img>, <br>, <hr>, etc. there is no problem.

If you fix it I hope you share your code!! :D

Message Forum

Posted: Mon Jul 21, 2003 9:03 pm
by LJ
Hi Ricardo,

What is interesting is that the code I posted in this thread could be used to create a message forum. Basically, the message forum code is in PureBasic, and people who have access to the message forum will have the PureBasic code running on their computer. When they run the program it checks for new messages and, if it finds any, downloads them to the client's computer from a .html web page, using the code I posted to download just the new messages. It starts grabbing data at the tag sequence:

Code: Select all

<b><br><b><b></b>
Now when they want to post a message, it saves their message in a .html file with an embedded tag

Code: Select all

<b><br><b><b></b>
sequence at the beginning and end of the message.
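
To make the idea concrete, here is a minimal sketch in Python (not the PureBasic code from this thread). It assumes the marker is the exact tag sequence shown above and that every message is wrapped in one marker at each end; the function names extract_messages and wrap_message are made up for this illustration.

```python
# The tag sequence from the post, used purely as a delimiter.
MARKER = "<b><br><b><b></b>"

def wrap_message(text, marker=MARKER):
    """Put the marker at the beginning and end of a message before saving it."""
    return f"{marker}{text}{marker}"

def extract_messages(page, marker=MARKER):
    """Return every piece of text that sits between a pair of markers."""
    messages = []
    pos = 0
    while True:
        start = page.find(marker, pos)
        if start == -1:
            break  # no further opening marker
        start += len(marker)
        end = page.find(marker, start)
        if end == -1:
            break  # opening marker without a closing one: stop
        messages.append(page[start:end].strip())
        pos = end + len(marker)
    return messages

# Build a small page with two wrapped messages and read them back.
page = "<html>" + wrap_message("Hello") + "<hr>" + wrap_message("Second post") + "</html>"
found = extract_messages(page)
```

Anything outside a marker pair (the <hr> here) is ignored, so the rest of the page can be formatted freely.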

Using the code posted in Tricks and Tips on ftp(ing) files, the program then uploads the .html page to the web server.

And voilà! The backbone of an .html-based message board. The reason I would require anyone viewing the message forum to have the PureBasic program is that the message forum would be an integral part of a larger program. For example, if I created a program for teachers that helped them with seating charts and grades for their students, I could include an option to hook up with other teachers via the built-in message forum to talk about how to best use the program and anything else teachers might want to talk about. By having the message forum coded into the program, I could also monitor whether my program was being illegally copied by allowing only 1 e-mail address and password to access the forum per user.

This is only 1 of many possibilities with the code by Pille that I posted in this forum. I think this is big and the possibilities are many. We should not assume that the program's only use is to "rip" copyrighted information from someone else's web site, and then worry that if they change the format it would mess the program up. I think this code is much more valuable in that you can put up a web page, format it just the way you want, and then use this program to many advantages.

Another possibility is that this program can be used as the basis for a TOP 10 score board in video games. It logs into your web page, grabs the data between any two tags you create, saves it on the local computer and then displays the top scores. If the player makes it onto the top scores, it logs into the TOP 10 web page, formats the data between two tags, and then uploads it with the product #. Once again, you can discourage illegal copies by allowing only 1 account for acceptance in the TOP 10 web page. That is, people can't use the name Joe for one game, then Bob for the next, under a single product #.

In other words, you can create whatever format you'd like, for example: the 5 characters after the first START text contain the hit points for the Elf. Then upload this file to the web using the FTP method posted in another message thread in Tricks and Tips.

Remember, the usefulness of this code goes beyond grabbing data from someone else's web page. When you are the author of the web page, and you control all the data, this is a very effective way of remote storage of variables/information that can be precisely pulled by another program.
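
A fixed-width field like that could be read as in the following Python sketch. The page content, the zero-padded value and the helper name read_field are all assumptions made up for the illustration; only the "5 characters after the first START" rule comes from the post.

```python
def read_field(data, start_text, width):
    """Return the `width` characters that follow the first occurrence of start_text."""
    index = data.find(start_text)
    if index == -1:
        raise ValueError(f"{start_text!r} not found")
    begin = index + len(start_text)
    return data[begin:begin + width]

# Hypothetical page: the Elf's hit points are stored as 5 zero-padded digits.
page = "<html>START00042 more data</html>"
hitpoints = int(read_field(page, "START", 5))
```

Because the width is fixed, the writer has to pad every value to exactly 5 characters, otherwise the reader picks up part of whatever follows.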

So the next goal is to modify the code I posted in this thread to add the option of uploading the TEST.HTML file that is created, to the web.

Anybody have the time to do this or do I gotta do it?

Re: Message Forum

Posted: Mon Jul 21, 2003 11:25 pm
by ricardo
@LJ

You are right.

What you are planning has a name: XML.

It's a very easy and intelligent format for sharing information between different applications in a way that is easy to parse.

Of course, the format only defines how the data is structured; the implementation depends on each application.

In fact XML is very similar to HTML, but without the 'unclean' possibility of having orphan tags (tags without closing tags).

In summary, it is the idea of having a hierarchy (parent, children, etc.) and 'browsing' the data that way.

In XML you can use any tag name you like inside <> (names cannot contain spaces), so this would be fine:

Code: Select all


<software>
  <code>
    <project1>
      bla bla
    </project1>
    <project2>
      <subproject2>
      </subproject2>
    </project2>
  </code>
</software>
Here every tag that appears after an opening tag and before its closing tag is a child of that tag.

In my example <code> is a child of <software>, <project1> and <project2> are children of <code>, and <project2> even has its own child, etc.

I should be able to ask the parser for a list of all the children of some element and/or its inner text or even its inner XML (the children with their tags, etc.).

So the parser must know that every time it finds an opening tag, the next opening tag is a child of that one until it finds the closing tag, etc.
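
The rule described here (an opening tag goes one level down, a closing tag goes back up) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the input is well formed, tag names contain no spaces, and comments, CDATA and entities are not handled. The Node class and parse function are names invented for this sketch.

```python
import re

class Node:
    """One element in the tree: a name, a parent, children and its own text."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.text = ""

# Either a tag (<name ...>, </name>, <name ... />) or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>|([^<]+)")

def parse(source):
    root = Node("#document")
    current = root  # the element we are currently inside
    for closing, name, self_closing, text in TOKEN.findall(source):
        if text:
            current.text += text.strip()
        elif closing:
            if current.name != name:
                raise ValueError(f"unexpected </{name}> inside <{current.name}>")
            current = current.parent  # closing tag: go back up one level
        else:
            child = Node(name, current)
            current.children.append(child)
            if not self_closing:
                current = child  # opening tag: following tags are its children
    if current is not root:
        raise ValueError(f"<{current.name}> was never closed")
    return root

doc = parse("<software><code><project1>bla bla</project1>"
            "<project2><subproject2/></project2></code></software>")
code = doc.children[0].children[0]
child_names = [child.name for child in code.children]
```

The single `current` pointer plays the role of a stack: because every node remembers its parent, going "up" needs no extra bookkeeping.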

In HTML the problem with doing this is that you don't know whether a <p> has a closing </p> tag or not. You need to parse it twice to find out.

With XHTML (well-structured HTML without this kind of thing) there is no problem, but most web pages are not written in XHTML.

So we could start by writing an XML parser and then add a pre-parsing step so it can handle regular HTML too.

I got it!

Posted: Tue Jul 22, 2003 2:25 am
by LJ
Ricardo,

I got it! Okay, I understand what you mean now about the parent, child relationship. It could be anything, like maybe:

Code: Select all

<software>
     <window 800 x 600>
          <window title = "Hello World">
                <a$ = "this is a test">
                     <messagerequestor a$, 1>
That is, the message requester is the child of the window, which is a child element of the command software. This way ownership can be established! Okay! In the example above, we establish that a$ belongs to the window Hello World! Or we could use <GLOBAL a$> to make a$ available to any window. Well, maybe these examples are not so good, but now I understand why you have been talking about child/parent relationships. This is much more advanced than the simple .html parser in this message thread, but we can at least use this same program to parse.

That's fascinating that we stumbled into XML in this conversation. And I think you are right, rather than write our own language interpreter/parser, we should use XML because it will allow for interaction with many other web sites and programs.

I am reading the information at: http://www.w3.org/XML/. I think we can do this: create an XML-capable browser in PureBasic. The uploading via the FTP code posted in another message thread in Tricks and Tips is sort of a different subject. We could always do that later once we have worked out the XML browser.

Internet Explorer already has XML

Posted: Tue Jul 22, 2003 2:39 am
by LJ
Check out http://www.freecell.com/xmleditor/editor.htm

It's an XML web page.

Re: Internet Explorer already has XML

Posted: Tue Jul 22, 2003 4:27 am
by ricardo
@LJ

We need to be able to parse any XML document in PB:
a fast and reliable set of procedures that let us parse XML.

It's not too difficult since every tag MUST have its own closing tag.

<tag>
</tag>

and everything inside is a child. It's very similar to a treeview.
The important thing is that the procedures must be very fast, reliable & optimized.

DOM works this way:

Document means the whole document

Document.Body

means that I'm asking for the child named Body in the document

MyVar$ = Document.Body.MyTable.InnerHTML

means that I want the Body tag that is a child of the document, then the MyTable tag inside it, and its inner HTML (all the code, including tags, that MyTable contains).

Document.Body.MyTable.InnerHTML = MyVar$

means that I'm setting the HTML inside the MyTable element.

The way to handle it is easy: the dot is the delimiter between an element and its children, or between an element and its 'properties'. This simple convention makes it easy to code.

In PB the dot is used by Structures.
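
As a rough illustration of the dotted-path idea, here is a Python sketch built on the standard xml.etree.ElementTree module. It assumes each path segment is a tag name (in a real DOM, MyTable would more likely be an id or name attribute), and resolve and inner_xml are helper names invented for this example.

```python
import xml.etree.ElementTree as ET

def resolve(root, path):
    """Walk a dotted path such as 'Body.MyTable' down from root, one child per segment."""
    element = root
    for name in path.split("."):
        child = element.find(name)  # first direct child with that tag name
        if child is None:
            raise KeyError(f"no child <{name}> under <{element.tag}>")
        element = child
    return element

def inner_xml(element):
    """Return the element's content, child tags included, as a string."""
    parts = [element.text or ""]
    for child in element:
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)

document = ET.fromstring(
    "<Document><Body><MyTable><tr><td>cell</td></tr></MyTable></Body></Document>"
)
# Rough equivalent of: MyVar$ = Document.Body.MyTable.InnerHTML
my_var = inner_xml(resolve(document, "Body.MyTable"))
```

Splitting the path on the dot keeps the lookup a simple loop; distinguishing children from 'properties' such as InnerHTML would need one extra check on the last segment.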

OK

Posted: Tue Jul 22, 2003 5:20 am
by LJ
Before we re-create the engine, I found this:
http://www.xml.com/1999/11/parser/harness.zip

The article that accompanies this zip is here:
http://www.xml.com/1999/11/parser/

Notably:
"Last September, David Brownell conducted a review of XML parsers for XML.com, testing them for conformance to the XML 1.0 specification. In this follow-up article, he tests Microsoft's MSXML.DLL parser, as found in Internet Explorer 5. Unlike previously tested parsers, the Microsoft parser does not provide a SAX interface, used in the testing procedure. As a result of collaboration with Microsoft, the author constructed a Javascript DOM-based test harness. "

When you download the zip file, you will find a cool file called template.xml that shows a tree view in a web page; try clicking on the + on the far left side of the screen next to the text.

Accompanying this is a .js file that shows how to access the msxml.dll from JavaScript, I think. I don't know JavaScript, and I can't get it to run on my computer.

I did a Search for the file msxml.dll and I found it in my Windows/system32 folder.

Do you have this file also?

Is it possible to call this .dll from within Purebasic?

Posted: Tue Jul 22, 2003 10:41 am
by El_Choni
It's not too difficult since every tag MUST have its own closing tag.
AFAIK, empty elements are represented like this in XML/XHTML: <br /> In my code, HTML empty elements are hardcoded.
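
One way a hardcoded list of empty elements might be used is sketched below in Python. This is not El_Choni's code, which has not been posted; the tag list is a partial, assumed one and the function name nesting_depths is invented for the example.

```python
import re

# Assumed, partial list of HTML elements that never have a closing tag.
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}

# Matches <name ...>, </name> and <name ... />.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")

def nesting_depths(html):
    """Return (tag name, depth) for each opening tag, never descending into empty elements."""
    depth = 0
    result = []
    for closing, name, self_closing in TAG.findall(html):
        name = name.lower()
        if closing:
            if name not in VOID_ELEMENTS:  # ignore a stray </br> and the like
                depth -= 1
        else:
            result.append((name, depth))
            # Only a real container opens a new level.
            if name not in VOID_ELEMENTS and not self_closing:
                depth += 1
    return result

depths = nesting_depths("<div><img src='a.png'><br><p>text</p><hr /></div>")
```

With the list in place, <img> and <br> are treated the same whether or not the page writes them XHTML-style as <br />.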

I can't post the code right now, I'll try to do it the 6th or so.

Reading further...

Posted: Tue Jul 22, 2003 3:39 pm
by LJ
As I read the article (posted in earlier message) it says: "The version of the Microsoft XML (MSXML) processor reviewed here is the one that has been bundled with Microsoft's Internet Explorer 5.0 web browser. It can be accessed as "MSXML.DLL," and can be redistributed with other software, as part of Win32 applications. Since it provides a COM API, it can be used from JavaScript, C/C++, Visual Basic, and other COM-aware programming languages."

This article is several years old and I think Microsoft has released a newer, better version of the msxml.dll than what this author is talking about. If you've got Internet Explorer 6+ on your computer then you have it.

Notice it can be redistributed with other Win32 applications. Notice that it says "Since it provides a COM API, it can be used from JavaScript, C/C++, Visual Basic, and other COM-aware programming languages." The question is, is Purebasic a "COM-aware" programming language? And how do we access it from Purebasic?

SOAP (XML)

Posted: Tue Jul 22, 2003 3:54 pm
by LJ
Hi Ricardo,

I found an article that explains how to call the msxml.dll and set up an XML (SOAP) application in Visual Basic on your own local computer.

It's at: http://www.vbip.com/xml/soap_syd.asp

I think the example code assumes you are setting this up locally on your own computer which is a good thing for beta testing. What I don't understand is how the Visual Basic code is calling the msxml.dll. It looks like it might be done with:

Code: Select all

   Dim objHTTP As New MSXML.XMLHTTPRequest
   Dim strEnvelope As String
   Dim strReturn As String
   Dim objReturn As New MSXML.DOMDocument
   Dim dblTax As Double
   Dim strQuery As String

Do you have Visual Basic to test this?

Re: SOAP (XML)

Posted: Wed Jul 23, 2003 5:15 pm
by ricardo
@LJ

Hi, sorry for the delay.

Yep, all this stuff works in Visual Basic and even in JavaScript.
That's why I use it to work with the WebGadget with JavaScript and VBS.

But I'm planning to develop (not now, but not too far from now) an XML parser for PB.

I don't want anything to do with VB anymore :twisted: :twisted: :twisted:

JavaScript is OK, VBS too, but VB (the bloated compiler) noooooo :D

Posted: Wed Jul 23, 2003 5:48 pm
by Kale
XML Parsing in PureBasic 8O

PureXMLParser v0.4 by Mathias Karstaedt:
http://www.handcoder.de/purexmlparser.htm

XMLParser by Unknown:
http://www.garyw.uklinux.net/PB/XMLParser.zip

Maybe you could continue these projects :twisted:

Posted: Wed Jul 23, 2003 6:10 pm
by ricardo
Thanks Kale!!! :D :D :D :D :D :D