Need The Full List of PureBASIC Keywords

Tristano
Enthusiast
Posts: 190
Joined: Thu Nov 26, 2015 6:52 pm
Location: Italy

Post by Tristano »

User_Russian wrote: New version of the purebasic.php file for GeSHi.
I've updated the PB file in the GeSHi source on GitHub, using your changes.

In the commit message I credited you and backlinked here:


https://github.com/GeSHi/geshi-1.0/comm ... e8b09a5d65


Unfortunately, it seems the PB language definition didn't make it into the newer GeSHi v1.1.

I don't use GeSHi, so I wouldn't know how to contribute to it.
The PureBASIC Archives: FOSS Resources:

Post by Tristano »

I've created a new "Syntax Highlighting Guidelines" section in The PureBASIC Archives project, summarizing the key points of this thread.
It currently lives in a separate dev branch, and it will take some time before it's ready to go into the main branch. I have some scattered bits and pieces of resources that I intend to put in there: anything that can help creators and maintainers of syntax highlighters and code editor definitions. For example, I've kept some palette files (in various formats) of the native PB IDE's default color scheme, plus a CSS version of it; this might be handy for developers of syntax themes.

Of course, the main goal is to create a list of all the keywords, built-in procedures, commands, etc. I still have to work out a lightweight format that could hold this info for each PB version. I was thinking of a JSON file, which is better suited to version control than a database file (the database can be built from this file). To avoid data redundancy, I was thinking of storing the full list of keywords for PB 5.00 and, from then on, just tracking additions, deprecations and renamings; i.e. for each release the list would contain its version number followed by the new and deleted keywords. The tool that builds a database from this JSON file would simply rebuild the full list from the previous release's information plus the new info. But it's just a draft idea right now ... (for code editors, additional fields might be required, like the companion instruction of "paired instructions" that have a BEGIN/END structure, such as DataSection/EndDataSection, and so on).

So, if you have any suggestions, links, code snippets, or whatever might be useful on the topic, feel free to make a pull request.

@Marc56us: is it OK with you if I publish your commandindex.html parsing code? I might change a few things (add comments, and maybe write to file), but authorship would be credited to you, with a backlink here. Any licensing suggestions?
Marc56us
Addict
Posts: 1479
Joined: Sat Feb 08, 2014 3:26 pm

Post by Marc56us »

Tristano wrote: @Marc56us: is it OK with you if I publish your commandindex.html parsing code? I might change a few things (add comments, and maybe write to file), but authorship would be credited to you, with a backlink here. Any licensing suggestions?
Hi Tristano,
You can use, modify, and publish as much as you want. No special license.

:wink:

Post by Marc56us »

Version with output to a txt file.
It processes functions and constants in one run (writing two files).

Code: Select all

; PB_Keywords_Lister.pb
; List all Functions and Constants from PB html help 
; (Not basic keywords for the moment)
; Written by Marc56us, 2017-04-25
; License: Unrestricted usage permission.
; Feel free to use, modify, ameliorate, simplify as you want
; Topic from Tristano, http://www.purebasic.fr/english/viewtopic.php?13&p=506269

InitNetwork()

Repeat
     Read.s Info_Type.s
     If Info_Type = "END"
          Debug "OK All Done."
          End
     EndIf
     
     Debug "--- Extract: " + Info_Type.s + "..."    
     
     Read.s Info_URL.s
     *Buffer = ReceiveHTTPMemory(Info_URL)
     
     If *Buffer
          Taille = MemorySize(*Buffer)
          All_Functions.s = PeekS(*Buffer, Taille, #PB_UTF8|#PB_ByteLength)
          FreeMemory(*Buffer)
     Else
          Debug "Download Fail (" + Info_URL + ")"
          End
     EndIf
     
     Read.s Info_RegEx.s
     If Not CreateRegularExpression(0, Info_RegEx) 
          Debug "Can't create regular expression " + Info_RegEx
          End
     EndIf
     
     Output_FileName.s = GetTemporaryDirectory() + "PB_" + Info_Type + ".txt"
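     ; CreateFile() always starts from an empty file; OpenFile() would keep
     ; leftover content from a previous, longer run of this tool
     If Not CreateFile(0, Output_FileName)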
          Debug "Can't create output file for: " + Info_Type
          End
     EndIf
     
     If ExamineRegularExpression(0, All_Functions)
          While NextRegularExpressionMatch(0)
               ; Debug RegularExpressionGroup(0, 1)
               WriteString(0, RegularExpressionGroup(0, 1) + #CRLF$)
          Wend 
     EndIf
     CloseFile(0)
     RunProgram(Output_FileName)
     Debug "    Done. " + #CRLF$ + "    FileName: " + Output_FileName + #CRLF$
     FreeRegularExpression(0)
ForEver


DataSection
     ; Name, URL, Regular expression needed
     Data.s "Functions", "http://www.purebasic.com/documentation/reference/commandindex.html", ">(.+)</a><br>",
            "Constants", "http://www.purebasic.com/documentation/reference/pbconstants.html", "(#PB_[\w\d]+)",     
            "END"
EndDataSection
This is quick and dirty code. A structure would probably be better than a DataSection for this kind of usage. Adapt it as you want.
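To see what the two regular expressions from the DataSection actually capture, here is a minimal Python sketch. The HTML fragments are hypothetical stand-ins written to resemble the help pages, not real downloads, and the `sorted(set(...))` de-duplication is an extra step that the PureBasic code above does not perform.

```python
import re

# Hypothetical fragments resembling the two help pages (assumption, not real data).
command_index_html = (
    '<a href="window/openwindow.html">OpenWindow</a><br>\n'
    '<a href="window/closewindow.html">CloseWindow</a><br>\n'
)
constants_html = "<td>#PB_Window_SystemMenu</td><td>#PB_Any</td> see also #PB_Any"

# Same patterns as in the DataSection. "." does not match a newline,
# so each link has to sit on its own line for the first pattern to work.
functions = re.findall(r">(.+)</a><br>", command_index_html)

# The same constant can appear several times on a page, so de-duplicate.
constants = sorted(set(re.findall(r"(#PB_[\w\d]+)", constants_html)))

print(functions)
print(constants)
```

Note that the first pattern is greedy: if a line held more than one tag before the link text, the capture would start at the first ">" on that line.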

:wink:
Josh
Addict
Posts: 1183
Joined: Sat Feb 13, 2010 3:45 pm

Post by Josh »

Marc56us wrote:"Constants", "http://www.purebasic.com/documentation/ ... tants.html", "(#PB_[\w\d]+)"
This is a very very small part. PureBasic knows about 16000 constants.
sorry for my bad english

Post by Tristano »

Josh wrote:
Marc56us wrote:"Constants", "http://www.purebasic.com/documentation/ ... tants.html", "(#PB_[\w\d]+)"
This is a very very small part. PureBasic knows about 16000 constants.
I think the documented constants are only those relevant to PureBASIC commands (i.e. excluding the WinAPI on Windows, or constants belonging to third-party libraries).

Do all these 16000 constants appear in the Structure Viewer's "Constants" pane? I think the Structure Viewer should show the same structures, interfaces and constants that the compiler outputs when invoked interactively. Either of them (or both) might be OS dependent, and not show constants relating to libraries or commands unsupported by the OS in use.

Any ideas on that?

PS: This list of constants (at least those relating to PB commands) would be useful for designing a syntax package for SublimeText (as an example) in order to implement auto-completion suggestions. SublimeText is very robust and doesn't usually lag when handling long lists of tokens for highlighting and autocompletion. If an automated system for extracting the tokens of each PB release can be achieved, keeping such packages up to date would become a breeze (and we could avoid the risk of packages becoming stale and outdated).

Post by Tristano »

I'm almost there ... I've just finished working on a tool called "Syntax Highlighting DLL Parser".

You can feed the tool the `SyntaxHilighting.dll` from any version of PureBASIC, and it will parse it and extract the list of keywords it contains:

https://github.com/tajmone/purebasic-ar ... ing-DLL.pb

This app parses the SyntaxHilighting.dll, extracts the list of keywords it contains, and saves them to a text file. The final list will contain three lists of keywords (in this order, with no separation between them):
  1. PureBASIC pseudotypes
  2. PureBASIC keywords
  3. ASM keywords
Any PureBASIC user should be able to easily tell where one list ends and the next one begins, both from knowing the keywords and from the fact that the alphabetical ordering starts over again with each new list.
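The "alphabetical ordering starts over" observation can also be used mechanically. Below is a minimal Python sketch that splits a flat list wherever the ordering goes backwards. It assumes each sub-list is sorted case-insensitively, which is a guess about the parser's output; the sample tokens are illustrative only and were not produced by the tool.

```python
def split_on_order_restart(tokens):
    """Split a flat token list into runs; a new run starts when the order goes backwards."""
    groups = []
    for token in tokens:
        # Case-insensitive comparison (an assumption about how the lists are sorted).
        # Equal neighbours (duplicates) stay in the same run.
        if groups and token.casefold() >= groups[-1][-1].casefold():
            groups[-1].append(token)
        else:
            groups.append([token])
    return groups


# Illustrative sample only, not real parser output.
sample = ["p-ascii", "p-utf8", "And", "Array", "XOr", "AAA", "AAD"]
groups = split_on_order_restart(sample)
for number, group in enumerate(groups, start=1):
    print(number, group)
```

This is only a heuristic: a boundary goes unnoticed if the next list happens to start with a token that sorts after the last token of the previous list, and stray out-of-order strings would open extra groups, so the result still needs a quick look by eye.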

Please note that the final list will contain some duplicate keywords and a couple of strings that are not strictly keywords (i.e. P and ABCUWLSFDQI, the former relating to accessing procedure variables from inline Assembly, the latter being the list of characters used for native types).

The purpose of this tool is not to create a ready-to-use list of tokens: the purpose is to pass the lists to a diffing tool in order to see what has changed between different PureBASIC releases. This gives a quick way to see whether tokens have been added or removed since the last version, so that maintainers of language definitions can manually adjust their lists.

Bear in mind that the tokens list will not contain the built-in commands: the PureBASIC IDE highlights user-created procedures and built-in functions in the same manner. For a full list of the built-in commands, refer to the Commands Index section of this document.

....

I don't think we can ever achieve a fully automated (and reliable) way of maintaining the list of PB's syntax tokens. But with @Marc56us' code to parse the Commands Index page, and this DLL parser, maintaining an existing list becomes a breeze: you only need to diff the lists generated for the latest PB version against those of the previous one.

Now I only need to change Marc56us' code to parse from a file instead of the website, to get the lists for previous versions of the documentation, which can only be found locally. Once the full list is made, Marc56us' version that reads from the website is all that will be needed for maintenance of the commands index (unless we forget to parse a release).

Post by Marc56us »

Code: Select all

; Extract "commands index" from PB .CHM file
; Marc56us - 2017-04-27
; License: Unrestricted usage permission.
; Feel free to use, modify, ameliorate, simplify as you want
; Topic from Tristano, http://www.purebasic.fr/english/viewtopic.php?13&p=506269

; Note: a .CHM can be opened with 7zip (not with PB's internal LZMA functions)
; Download 7zip (http://www.7-zip.org/) and copy 7zG.exe into the folder, or modify the path

HelpFile.s = OpenFileRequester("", "PureBasic.chm", "*.chm", 0, 0)

If FileSize(HelpFile) < 1
     Debug "No File or Cancel. Quit"
     End
EndIf

Debug "Open"

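; Use 7z.exe or 7zG.exe (prefer 7zG, the GUI version, because it shows graphical error messages)
; The help file path is quoted so that folders containing spaces still work.
; With #PB_Program_Wait, a non-zero result only means 7-Zip was launched,
; not that the extraction itself succeeded.
If RunProgram("7zG.exe", "e " + #DQUOTE$ + HelpFile + #DQUOTE$ + " Reference\commandindex.html", "", #PB_Program_Wait)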
     Debug "Extract OK :-)"
     RunProgram("commandindex.html")
Else
     Debug "Extract KO :-("
EndIf
:wink:
IdeasVacuum
Always Here
Posts: 6425
Joined: Fri Oct 23, 2009 2:33 am
Location: Wales, UK

Post by IdeasVacuum »

.... I still think it would be a whole lot better if a full list (text file) was delivered with each version of PB. Has there been no work from Fred or Freak?
IdeasVacuum
If it sounds simple, you have not grasped the complexity.

Post by Tristano »

Thanks a lot @Marc56us! I've added it to the repo's branch.

Currently, I'm trying to work out a JSON format (and possibly a schema) to handle the keywords, but I can't make up my mind.

The general idea is to build a progressive list, grouped by PB release number. For each release an array will be passed for added, renamed, and deleted tokens. This means that the JSON object for PB 5.00 will contain an added array of all the initial tokens. Then, the successive objects (PB 5.10, 5.20, etc) will contain arrays of further additions, renamings, and deletions.

It will be up to the final application to decide how to handle the data. For example, a code beautifier will most likely build up an additive list of tokens (so that the syntax of both older and newer code is covered), therefore renamed and deleted tokens will be kept in the final list.

On the other hand, when building a syntax for a code editor it's likely that one wishes to target only the latest version: in this case the list is built progressively, starting from PB 5.00 upward, but deletions and renamings are applied while building the final list.

This progressive system also makes it possible to build a list of tokens for a specific PureBASIC version (useful when working with both the latest release and the LTS version, for example) by stopping the process at any given version number. I deem it a better approach than keeping a full list of tokens for every version (which would produce a huge redundancy of data). Also, once the initial list for PB 5.00 is ready, the JSON file can be updated manually by reading the changelog (a release usually brings some new keywords, and now and then some renamings or deletions).
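A minimal Python sketch of how a consumer might rebuild a token list from such a progressive file. It assumes per-release objects keyed by version strings like "500", each with optional "tokens-add" and "tokens-delete" arrays (the key names used in the draft posted further down this thread). The function name, its parameters and the sample data are hypothetical.

```python
def build_token_list(history, up_to=None, additive=False):
    """Replay the per-release changes in version order and return the resulting tokens.

    up_to    -- stop after this version key (e.g. "520"); None means the latest.
    additive -- if True, deleted tokens are kept (code beautifier use case).
    """
    tokens = set()
    # Version keys such as "500" or "522" are compared numerically (assumed format).
    for version in sorted(history, key=int):
        if up_to is not None and int(version) > int(up_to):
            break
        entry = history[version]
        tokens.update(entry.get("tokens-add", []))
        if not additive:
            tokens.difference_update(entry.get("tokens-delete", []))
    return sorted(tokens, key=str.casefold)


# Cut-down, illustrative history (not the real file).
history = {
    "500": {"changes": True,
            "tokens-add": ["Base64Decoder", "Base64Encoder", "OpenWindow"]},
    "510": {"changes": False},
    "560": {"changes": True,
            "tokens-delete": ["Base64Decoder", "Base64Encoder"],
            "tokens-add": ["Base64DecoderBuffer", "Base64EncoderBuffer"]},
}

print(build_token_list(history))                 # editor targeting the latest release
print(build_token_list(history, additive=True))  # beautifier covering old and new code
print(build_token_list(history, up_to="520"))    # list for a specific older release
```

Using a set keeps the replay simple and makes duplicates harmless; the final sort is only there to produce stable output for diffing.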

What I'm still not sure about is how to handle the JSON data structures. Should I keep the keywords, functions and ASM keywords lists in separate files, or put them all in a single JSON file with subgroupings? Also, as noted in the documentation I've written in the repo, some syntax definitions distinguish between general keywords, debugger directives and compiler directives. So it might be good to sub-group keywords accordingly, which raises the question of how to handle these groups without creating a huge JSON monster of nested structures.

Since the idea is that maintenance should (or could) be done by manually editing the JSON file, I'd like to keep it clean.

Any suggestions on the JSON structure?

Until I actually know all the potential applications and maintainers' needs, I'd like to keep it simple (and therefore use separate files). For my current purposes (maintaining the two highlighter definitions) I only need the PB keywords (no ASM support currently, nor built-in functions). For this simple task, automating the token lists of the syntax definitions is really just around the corner; not much work is needed there. But sooner or later I'd like to broaden the range of tokens in both these definitions: for now I just wanted to mimic the way PB's native IDE highlights, but these highlighters support a broader classification of tokens.

So far I haven't found online examples of PureBASIC definitions for third-party code editors (SublimeText, Atom, Vim, etc.) except for the Kate definition (which I'm not sure is used for editing or just for beautifying), and I'd really like to see some real-world uses to better understand how this list could be built in a way that would be useful for most people.

Someone here mentioned coding PB in third party editors, could you tell me more about it?

Post by Tristano »

I've pushed a first draft of the JSON list of PB keywords from 5.00 to 5.60:

https://raw.githubusercontent.com/tajmo ... words.json

Here is a cut-down version:

Code: Select all

{
  "500": {
    "changes": true,
    "tokens-add": [
      "And",
      "Array",
      ...
      "With",
      "XIncludeFile",
      "XOr"
    ]
  },
  "510": {
    "changes": true,
    "tokens-add": [
      "Align",
      "CompilerElseIf",
      "MacroExpandedCount",
      "UndefineMacro"
    ]
  },
  "511": {
    "changes": false
...
  "560": {
    "changes": false
  }
}
... and you get the idea. Each object represents a PB version number (as a string key), and contains a boolean "changes" key that allows quickly querying whether any changes are present. How does it look? It seems simple enough to maintain (even manually), and JSON is a good standard that can be handled by any language.
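As a quick illustration of the "changes" flag, here is a minimal Python sketch that lists the releases with changes and checks that the flag agrees with the presence of token arrays. The data is the cut-down sample from above (with the elided entries left out), and the `flag_is_consistent` helper is hypothetical.

```python
import json

# Cut-down sample based on the draft above; elided entries are simply omitted.
draft = json.loads("""
{
  "500": {"changes": true,
          "tokens-add": ["And", "Array", "With", "XIncludeFile", "XOr"]},
  "510": {"changes": true,
          "tokens-add": ["Align", "CompilerElseIf", "MacroExpandedCount", "UndefineMacro"]},
  "511": {"changes": false},
  "560": {"changes": false}
}
""")

# Releases that bring changes, in numeric version order.
changed = [v for v in sorted(draft, key=int) if draft[v]["changes"]]


def flag_is_consistent(entry):
    """True if "changes" matches whether any non-empty "tokens-*" array is present."""
    has_tokens = any(key.startswith("tokens-") and entry[key] for key in entry)
    return entry["changes"] == has_tokens


# Useful when the file is edited by hand: catches a forgotten flag update.
inconsistent = [v for v in sorted(draft, key=int) if not flag_is_consistent(draft[v])]

print(changed)
print(inconsistent)
```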

In this particular list, only keyword additions are present. If it were a list of commands, version 5.60 could also contain a "tokens-rename" key like this:

Code: Select all

  "560": {
    "changes": true,
    "tokens-rename": {
      "Base64Decoder": "Base64DecoderBuffer",
      "Base64Encoder": "Base64EncoderBuffer"
    }
  }
... but this would introduce nested objects instead of arrays. Maybe the renaming aspect could be ignored, by just deleting the renamed keyword and adding the new name, i.e. if there is no need for parsers to know when a renaming occurs (I can't think of why that would be needed, unless some attributes need to be carried over). The latter solution would be neater to implement and would keep the schema simpler:

Code: Select all

  "560": {
    "changes": true,
    "tokens-delete": [
      "Base64Decoder",
      "Base64Encoder"
    ],
    "tokens-add": [
      "Base64DecoderBuffer",
      "Base64EncoderBuffer"
    ]
  }
The reason I originally thought of having a "rename" key was in the context of subgrouping keywords (e.g. compiler directives, general, debug, etc.), which some code editors need, and which means that keywords would have to be added in groups (separate arrays for each group). The advantage of renaming would be that it requires no reference to groups, because you are pointing to tokens which are already categorized. I'd go for the easier method (delete and add), unless someone thinks implementing renaming lists might be useful.
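If a "tokens-rename" object were ever used, it could still be flattened into the simpler delete-and-add form before processing. A minimal Python sketch, using the 5.60 example above; the `expand_renames` helper is hypothetical.

```python
def expand_renames(entry):
    """Return a copy of a release entry with "tokens-rename" rewritten as delete + add."""
    result = {key: value for key, value in entry.items() if key != "tokens-rename"}
    renames = entry.get("tokens-rename", {})
    if renames:
        # Old names go to the delete list, new names to the add list.
        result["tokens-delete"] = list(entry.get("tokens-delete", [])) + list(renames.keys())
        result["tokens-add"] = list(entry.get("tokens-add", [])) + list(renames.values())
    return result


entry_560 = {
    "changes": True,
    "tokens-rename": {
        "Base64Decoder": "Base64DecoderBuffer",
        "Base64Encoder": "Base64EncoderBuffer",
    },
}

converted = expand_renames(entry_560)
print(converted)
```

The conversion is lossy in one direction only: the pairing between old and new names disappears, which matters only if a consumer needs to carry attributes over from the old token.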

I've noticed that usually no keyword changes occur within PATCH releases, only when the MAJOR or MINOR number changes. Which makes sense, since in most version numbering schemes these should be bug fixes only (not sure how PB goes about it though).
Therefore:
  • shouldn't the JSON file only consider MAJOR and MINOR version jumps (5.00, 5.10, 5.20 ... 5.60)?
  • Is it safe to assume that no keyword or command changes would occur within PATCH releases?

Post by Tristano »

IdeasVacuum wrote:.... I still think it would be a whole lot better if a full list (text file) was delivered with each version of PB. Has there been no work from Fred or Freak?
I've sent a PM to Freak on the issue, but haven't got a reply yet.

I agree, it would be ideal if such info came with the SDK. When it comes to obtaining info of this type (or the versions of the third-party libs) there are often obstacles in the way (the compiler will only list functions supported by the OS, purelibs versions have to be looked up in the changelog, etc.). I think it's a pity, because having to hack your way to info that is usually openly available in programming languages can get tiring. PureBASIC is a beautiful language, and comes with a great IDE and debugging tools.

I was really happy to create the syntax highlighting definitions for two major highlighters (highlight.js and Highlight). The work I'm doing to create a maintainable list of keywords is meant to help keep up to date the PureBASIC definition that now ships with pandoc, and hopefully to soon see PB among the languages supported by GitHub's syntax highlighter and by tools like Pygments. I believe the reason so few highlighters support PureBASIC syntax (or that the existing ones are out of date) has to do with the difficulty of maintaining such a list of keywords. So, hopefully, the final list and tools we're working on will help ensure that existing definitions are kept up to date, and that new ones pop up.

I'm quite confident that if PB's developers see that a significant number of users are interested in the list of keywords, and strive to maintain it, they will consider providing the list of tokens in the SDK and/or documentation. It's clear to me that every user thinks his needs are "THE" needs, and would like to see this or that feature implemented. I also understand that maintaining features is a harder task than introducing them, and that the PB team is small. PureBASIC is a solid language, and its developers have proven constant in their development efforts over the years, making it a solid and steadily evolving language and IDE.

I also realize that the license fee we paid for PureBASIC is a really small price when you think that it's a lifetime license covering all updates, including major versions. On the users' download page, we can read:
Since updates are free, those who want to support the further PureBasic development can do it here. Thank you !
... and I think that this is something we users should consider seriously. I can't fail to notice that with SpiderBASIC the team has opted for a yearly license; so maybe maintenance is a harder task than we realize, and I guess this might be the reason for a change of policy with their new product.

I also believe that if Fantaisie Software gave more visibility to this way of contributing to PB development, users would happily take it into consideration. Right now it's just a paragraph that appears on the download page. If there was a more visible donations page, listing all support donations (something like KickStarter), users might be reminded more often of this need, and be encouraged to contribute.

Of course, users have always contributed code and libraries to PureBASIC, which plays an important role in making a language grow. This is the public side of contribution, the "PB Community", and it has always been a strong incentive to participate. Support donations, on the other hand, are currently a private thing (you can donate with a click, and it ends there). Many open source projects have special pages on their websites to show donations, and this has a strong motivational impact on users.

PureBASIC is not open source, but its license fee is almost nominal if you think of the years of continuous work that have gone, and keep going, into it. It simply isn't like other commercial software that sells you licenses for major releases only, and often bumps up a version just to make you upgrade. And we mustn't forget that PB 4 for Amiga was released as open source when support for Amiga was dropped, which does make part of PB open source. This also suggests that at the end of its journey PureBASIC will probably go open source.

Now ... talking about all this without doing anything about it doesn't make much sense. So I've made a small donation right now, while writing this reply. It's something I've been thinking about quite often, and I realized I wanted to do it, not because I expect this or that feature to be implemented (I have accepted that you don't often get replies to PMs sent to the devs, because they surely have a life of their own besides maintaining PureBASIC and following all users' requests) but because I really appreciate that PureBASIC is always evolving and don't want it to ever become stale.

Post by Tristano »

I've extracted the list of commands from the commandindex.html page of each PB 5.xx release.

What I've discovered is that there can be keyword changes even in PATCH releases (so PB versioning probably isn't strictly following SemVer).

For example, PB 5.22 introduces the following changes in command keywords:
  • EventlParam (new)
  • EventwParam (new)
  • NodeAnimationKeyFramePosition (removed)
  • NodeAnimationKeyFrameRotation (removed)
  • NodeAnimationKeyFrameScale (removed)
  • SetNodeAnimationKeyFramePosition (new)
  • SetNodeAnimationKeyFrameRotation (new)
  • SetNodeAnimationKeyFrameScale (new)
... at least, this is what shows up when diffing the command lists extracted from the documentation.
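The diffing step itself is easy to script. Here is a minimal Python sketch that compares two extracted command lists and emits an entry in the delete-and-add JSON form discussed earlier. The two lists are tiny stand-ins built from the names above plus one unchanged command, not the real extracted files, and the `diff_release` helper is hypothetical.

```python
import json


def diff_release(previous, current):
    """Compare two command lists and build a release entry with the differences."""
    previous, current = set(previous), set(current)
    added = sorted(current - previous)
    deleted = sorted(previous - current)
    entry = {"changes": bool(added or deleted)}
    if added:
        entry["tokens-add"] = added
    if deleted:
        entry["tokens-delete"] = deleted
    return entry


# Stand-in lists (assumption): only the commands named above plus one unchanged command.
previous_commands = [
    "NodeAnimationKeyFramePosition",
    "NodeAnimationKeyFrameRotation",
    "NodeAnimationKeyFrameScale",
    "OpenWindow",
]
commands_522 = [
    "EventlParam",
    "EventwParam",
    "OpenWindow",
    "SetNodeAnimationKeyFramePosition",
    "SetNodeAnimationKeyFrameRotation",
    "SetNodeAnimationKeyFrameScale",
]

entry_522 = diff_release(previous_commands, commands_522)
print(json.dumps({"522": entry_522}, indent=2))
```

Working on sets means the order of the extracted files does not matter, and duplicates in either list are ignored.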

Anyhow, I'll soon be publishing plain txt files with the list of commands from each version. Adapting them to one's needs will be a simple task.

Post by Tristano »

I've published a JSON file with all the PureBASIC commands from v5.00 to 5.60:

https://github.com/tajmone/purebasic-ar ... mands.json

Like the previous one, it starts by adding all tokens from 5.00, and with each following release it just adds new ones and deletes old ones.

Keep in mind that PureBASIC 5.54 is missing from the "museum" download page, so I only had the changelog for that version.

If the Commands Index page correctly represents the actual built-in commands, then the JSON file is a valid reference for building syntax files for both code beautifiers and editors.