Source language lexer

Posted: Thu Mar 27, 2025 6:02 pm
by PBJim
I'm doing research work on source code lexers in general, or more precisely lexical analysis. I know there are some forum members who occasionally work on the IDE and editor who might be able to answer this question: does the IDE editor use a lexer (distinct from a parser) to determine syntax colouring? If so, could you possibly point me to where it lives in the GitHub repo? Thanks.

Re: Source language lexer

Posted: Fri Mar 28, 2025 3:21 am
by moricode
PureBasic is not a popular language like C, C++, ASM, Pascal or QuickBasic.

To make it more widely accepted and increase its popularity, the language should publish its specification and encourage alternative compiler implementations.

Even Liberty BASIC has an alternative compiler/interpreter.

Does any expert here intend to create an alternative, compatible compiler for the base language that complies with the PB specification?

It would not be necessary to implement all the library functions, since most of them are taken from open source, like SQLite, zip, PNG...

This would boost the popularity.

Re: Source language lexer

Posted: Fri Mar 28, 2025 6:29 pm
by Sicro
The PureBasic IDE uses Scintilla for the token highlighting in the editor, and it comes with a lexer:
https://www.scintilla.org/ScintillaDoc.html#Lexer
This code sets up the lexer:
https://github.com/fantaisie-software/p ... ighting.pb
This code is also used for highlighting:
https://github.com/fantaisie-software/p ... gEngine.pb
This code is used for autocomplete, variableviewer, procedurebrowser etc.:
https://github.com/fantaisie-software/p ... eParser.pb

I have written two lexers for the PureBasic programming language. Here is my own regex engine, which compiles several regexes into an NFA or a very fast DFA:
https://github.com/SicroAtGit/RegEx-Engine
The project focuses on ensuring that the regex engine is suitable for creating lexers. The above DFA-based PureBasic lexer also utilizes the DFA generated by this regex engine. In the code examples of the project, you will also find a simple lexer example.
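
For readers new to the idea, here is a rough sketch of how a DFA-driven lexer works in general. It is written in Python for brevity and is not taken from the RegEx-Engine project or its API. The transition table is written by hand, whereas a generator would normally build it from the regexes. All state names, token kinds and the `lex` function are made up for illustration.

```python
# Hand-written DFA for three token classes. A regex-to-DFA generator would
# normally produce this table; every name here is hypothetical.

def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch.isspace():
        return "space"
    return "other"

START = "start"
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("start", "digit"): "int",
    ("start", "space"): "ws",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
    ("int", "digit"): "int",
    ("ws", "space"): "ws",
}
# Accepting states and the token kind each one reports.
ACCEPTING = {"ident": "identifier", "int": "integer", "ws": "whitespace"}

def lex(source):
    """Split source into (kind, text) tokens using longest-match scanning."""
    tokens = []
    pos = 0
    while pos < len(source):
        state = START
        last_accept = None  # (end_index, kind) of the longest match so far
        i = pos
        while i < len(source):
            state = TRANSITIONS.get((state, char_class(source[i])))
            if state is None:
                break  # no transition: the DFA is stuck, stop scanning
            i += 1
            if state in ACCEPTING:
                last_accept = (i, ACCEPTING[state])
        if last_accept is None:
            # No rule matched: emit one character as an "unknown" token.
            last_accept = (pos + 1, "unknown")
        end, kind = last_accept
        tokens.append((kind, source[pos:end]))
        pos = end
    return tokens
```

Calling `lex("abc1 42+")` walks the table once per token and always keeps the longest accepted match, which is the usual rule for lexers built on a DFA.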

Re: Source language lexer

Posted: Fri Mar 28, 2025 8:36 pm
by wro
Yeah, most modern IDEs do use a lexer for syntax highlighting. It's usually separate from the parser since highlighting doesn't need the full understanding of the code like parsing does. For example, in some editors, you'll find a lexer that just looks for keywords, strings, comments, etc., and assigns colors based on those. As for the GitHub repo, it really depends on the IDE, but in general, look for files or modules related to syntax highlighting or tokenization, and it’s likely there.
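
To make that concrete, here is a minimal sketch of such a highlighting lexer, written in Python rather than PureBasic. It is not how Scintilla or the PureBasic IDE does it. The keyword subset, the token kinds, the colour names and the `tokenize` function are all assumptions made for illustration. The only PureBasic facts relied on are that comments start with `;` and strings use double quotes.

```python
import re

# Tiny keyword subset, for illustration only (not the full PureBasic list).
KEYWORDS = {"if", "else", "endif", "procedure", "endprocedure", "for", "next"}

# Hypothetical colour assignment per token kind.
COLOURS = {
    "keyword": "blue",
    "string": "red",
    "comment": "green",
    "number": "purple",
}

TOKEN_RE = re.compile(r"""
    (?P<comment>;[^\n]*)         # comment: from a semicolon to end of line
  | (?P<string>"[^"\n]*"?)       # double-quoted string, possibly unterminated
  | (?P<number>\d+(?:\.\d+)?)    # integer or decimal literal
  | (?P<word>[A-Za-z_]\w*)       # identifier or keyword
  | (?P<space>\s+)               # whitespace, including newlines
  | (?P<other>.)                 # any other single character
""", re.VERBOSE)

def tokenize(source):
    """Return a list of (kind, text) pairs covering the whole input."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        text = match.group()
        if kind == "word":
            # Keywords are matched case-insensitively in this sketch.
            kind = "keyword" if text.lower() in KEYWORDS else "identifier"
        tokens.append((kind, text))
    return tokens

def colour_of(kind):
    """Look up the colour for a token kind; anything unlisted stays default."""
    return COLOURS.get(kind, "default")
```

Note that nothing here understands the structure of the program. The lexer only classifies runs of characters, which is why it can stay separate from the parser.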

Re: Source language lexer

Posted: Fri Mar 28, 2025 9:08 pm
by PBJim
Sicro wrote: Fri Mar 28, 2025 6:29 pm The PureBasic IDE uses Scintilla for the token highlighting in the editor, and it comes with a lexer:
Thanks very much Sicro, that's just what I was looking for. In fact, I had been reading at least one of those sections of IDE code yesterday, looking for references to what might define it as a 'lexer' but didn't realise I was looking at it already.

Okay, I have a lot of reading up to do now. It's an interesting area, I can see.

Re: Source language lexer

Posted: Fri Mar 28, 2025 9:10 pm
by PBJim
wro wrote: Fri Mar 28, 2025 8:36 pm Yeah, most modern IDEs do use a lexer for syntax highlighting.
Thanks for the comments Wro, interesting to read. I'm getting into this now. :D

Re: Source language lexer

Posted: Sun May 25, 2025 8:57 pm
by BlameTroi
Tangential: does anyone know of a Tree-sitter grammar for PureBasic?

Re: Source language lexer

Posted: Mon May 26, 2025 10:27 am
by DarkDragon
JFlex & GrammarKit EBNF if you prefer this.