Yet another LinkGrammar Wrapper issue
Hi all,
I'm still working on a LinkGrammar (http://www.abisource.com/projects/link-grammar/) wrapper for PB, and I've run into an interesting issue.
Here's what I have so far: https://dl.dropboxusercontent.com/u/287 ... 3.zip?dl=1
The package includes an x86 DLL and static lib, the includes, and a basic example.
I'm not sure why, but LinkGrammar is unable to find the specified dictionary (see line 7 in lg_test.pb). The example is compiled as unicode.
What I have found out so far is that the functions, particularly Dictionary_Set_Data_Dir and possibly Dictionary_Create_Lang, expect a UTF-8 string. However, if I pass a data directory to Dictionary_Set_Data_Dir and read it back with a peek, the string is cut off right before the colon, e.g. "C:\test" becomes just "C".
If no data path is specified, a peek returns the proper program directory.
My guess is that the dictionary pointer is null because PB does not convert to UTF-8 when a string is passed in a function call, but "en" should be the same in both unicode and UTF-8... I am really out of ideas.
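One plausible explanation for the "C" truncation, sketched below without the library: if the path reaches the C function as UTF-16, the second byte of the first character is zero, and anything that reads the buffer as a char* stops there. The buffer and its size are made up for illustration; this is a sketch of the effect, not the wrapper's actual code.

```purebasic
; Assumption: the library receives the string exactly as it sits in memory.
*buf = AllocateMemory(64)
If *buf
  ; Store the path the way a unicode executable holds its strings (UTF-16).
  PokeS(*buf, "C:\test", -1, #PB_Unicode)
  ; Read it back byte by byte, as a C function expecting char* would.
  ; The bytes are 43 00 3A 00 ..., so reading stops after "C".
  Debug PeekS(*buf, -1, #PB_Ascii)

  ; Store the same path as UTF-8 instead: no embedded zero bytes.
  PokeS(*buf, "C:\test", -1, #PB_UTF8)
  Debug PeekS(*buf, -1, #PB_UTF8)
  FreeMemory(*buf)
EndIf
```

The same applies to "en": the characters are identical, but in UTF-16 each one is followed by a zero byte, so a char* reader would see only "e".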
If anyone could take a look at this, and we could fix it, the PB community would gain a quite useful library for language parsing IMHO.
Thanks for any help provided in advance!
Erion
Last edited by erion on Sat Apr 19, 2014 7:02 pm, edited 2 times in total.
To see a world in a grain of sand,
And a heaven in a wild flower,
Hold infinity in the palm of your hand,
And eternity in an hour.
- W. B.
Visit my site, also for PureBasic goodies http://erion.tdrealms.com
Re: Yet another LinkGrammar Wrapper issue
Are you using the appropriate pseudo-types and/or Unicode versions of commands when handling strings?
I only took a quick glance, but it didn't seem like it.
Re: Yet another LinkGrammar Wrapper issue
Hi,
Thanks for your reply!
I have modified the two functions, where I replaced .s with .p-unicode. Unfortunately, there is no change.
Shouldn't PB auto convert the parameters though, especially if the executable is compiled as unicode? Just wondering...
Erion
Re: Yet another LinkGrammar Wrapper issue
Honestly, don't know. I've never compiled a unicode application before.
There is also PeekU(), try that?
One of the Peek commands also has flags to specify Unicode, ASCII, etc.
Try looking up "Unicode" in the help CHM and see if you find anything useful.
Syntax
Text$ = PeekS(*MemoryBuffer [, Length [, Format]])
Description
Reads a string from the specified memory address.
Parameters
*MemoryBuffer The address to read from.
Length (optional) The maximum number of characters to read. If this parameter is not specified or -1 is used then there is no maximum. The string is read until a terminating null-character is encountered or the maximum length is reached.
Format (optional) The string format to use when reading the string. This can be one of the following values:
#PB_Ascii : Reads the strings as ascii
#PB_UTF8 : Reads the strings as UTF8
#PB_Unicode: Reads the strings as unicode
The default is #PB_Unicode if the program is compiled in unicode mode and #PB_Ascii otherwise.
Return value
Returns the read string.
See Also
PokeS(), MemoryStringLength(), CompareMemoryString(), CopyMemoryString()
Supported OS
All
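A short sketch of how the Format parameter changes what PeekS() returns. The buffer is filled by hand to stand in for a char* handed back by a C library; the sample text is arbitrary and only there to include one non-ASCII character.

```purebasic
; Chr(252) is u-umlaut, which takes two bytes in UTF-8.
text$ = "Z" + Chr(252) + "rich"
*buf = AllocateMemory(StringByteLength(text$, #PB_UTF8) + 1)
If *buf
  PokeS(*buf, text$, -1, #PB_UTF8)
  ; Decoded as UTF-8: the original text comes back.
  Debug PeekS(*buf, -1, #PB_UTF8)
  ; Decoded one byte per character: the umlaut shows up as two characters.
  Debug PeekS(*buf, -1, #PB_Ascii)
  FreeMemory(*buf)
EndIf
```

For a pointer returned by a library that works in UTF-8, passing #PB_UTF8 explicitly avoids depending on the compile mode default.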
Re: Yet another LinkGrammar Wrapper issue
Hi,
Yes. As I wrote in the first post, if I peek a path that I set using Dictionary_Set_Data_Dir, it returns only up to the first colon. PeekU returns only the first unicode character (2 bytes) of a buffer, so unfortunately it won't work here.
What baffles me is why the dictionaries are not found even with the default, unspecified dictionary path, which is the program's directory. Could it be because PB is not converting the call arguments to UTF-8, while the function expects a UTF-8 path/dictionary language string?
I might end up contacting the LinkGrammar people, but I thought I'd ask here first, in case it's related to PB.
In case this helps, here's an official example http://abiword.com/projects/link-grammar/api/index.html
This is roughly the same as what I have in the LinkGrammar archive, except my *dict pointer is always null.
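Before blaming the string encoding, it may be worth ruling out the file system side. The sketch below only checks that the expected files exist; the layout data\en\4.0.dict is an assumption about how the package is unpacked, so adjust the paths to your archive.

```purebasic
; Assumed layout: <program dir>\data\en\4.0.dict
dataDir$  = GetPathPart(ProgramFilename()) + "data"
dictFile$ = dataDir$ + "\en\4.0.dict"

If FileSize(dataDir$) <> -2          ; -2 means the path is a directory
  Debug "Data directory not found: " + dataDir$
ElseIf FileSize(dictFile$) < 0       ; -1 means the file does not exist
  Debug "Dictionary file not found: " + dictFile$
Else
  Debug "Dictionary file present: " + dictFile$
EndIf
```

Note that when a program is run from the IDE, the temporary executable may live in a different folder than the source file, so GetPathPart(ProgramFilename()) may not be where the data folder is.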
Erion
Re: Yet another LinkGrammar Wrapper issue
Hi erion,
The following files are missing from your LinkGrammar package:
- msys-1.0.dll
- msys-regex-1.dll
Addressing your original question:
- change the following (lg_test.pb):
-- remove Unicode from "Compiler Options"
-- remove backslash from "data": Dictionary_Set_Data_Dir(GetPathPart(ProgramFilename())+"data")
-- change the error MessageRequester:
--- MessageRequester("Error", PeekS(Dictionary_Get_Data_Dir()))
- change to the following (link-includes.pbi):
-- linkgrammar_get_dict_version(Dictionary.p-utf8)
-- dictionary_create_lang(lang.p-utf8)
-- dictionary_set_data_dir(path.p-utf8)
*** This will only show a new set of problems ***
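To see what the p-utf8 pseudotype does without involving the library, here is a minimal sketch that routes a string through a Prototype into an ordinary procedure. The names TakesUtf8 and ShowUtf8 are made up for the illustration.

```purebasic
; A prototype whose parameter is converted to UTF-8 at call time.
Prototype.i TakesUtf8(text.p-utf8)

; Stand-in for a C function taking const char *.
Procedure.i ShowUtf8(*text)
  ; *text points at a temporary UTF-8 copy of the caller's string.
  Debug PeekS(*text, -1, #PB_UTF8)
  ProcedureReturn *text
EndProcedure

Define fn.TakesUtf8 = @ShowUtf8()
fn("C:\test")
```

The procedure receives a zero-terminated UTF-8 buffer regardless of whether the executable is compiled as unicode, which is what the declarations in link-includes.pbi rely on.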
Note (missing from your example):
- dict : dictionary Structure
-- required Sub-Structures
- sent : Sentence Structure
-- required Sub-Structures
- link : linkage Structure
-- required Sub-Structures
- etc.
Just a suggestion: Take what you've already done - Start from scratch - Build the includes - Test each step.
- begin with the file [ api-structures.h ] from the download [ link-grammar-5.0.5 ]
I started the following... Structures still need work, but it's how I would begin:
- simple example: http://abiword.com/projects/link-gramma ... l#example1
- break the code into separate includes only after you have a working example
- add Constants, Structures, Macros, Functions only as needed, until you have a working example
Code: Select all
Enumeration ConstituentDisplayStyle
  #NO_DISPLAY = 0
  #MULTILINE = 1
  #BRACKET_TREE = 2
  #SINGLE_LINE = 3
  #MAX_STYLES = 3
EndEnumeration

Structure Dictionary Align #PB_Structure_AlignC
EndStructure

Structure Sentence Align #PB_Structure_AlignC
EndStructure

Structure Linkage Align #PB_Structure_AlignC
EndStructure

Structure Resources Align #PB_Structure_AlignC
  max_parse_time.l
  max_memory.l
  time_when_parse_started.d
  space_when_parse_started.l
  when_created.d
  when_last_called.d
  cumulative_time.d
  memory_exhausted.l
  timer_expired.l
EndStructure

Structure Cost_Model Align #PB_Structure_AlignC
  type.l
  *compare_fn
EndStructure

Structure Parse_Options Align #PB_Structure_AlignC
  verbosity.l
  *debug
  *test
  use_sat_solver.b
  use_viterbi.b
  linkage_limit.l
  disjunct_cost.d
  min_null_count.l
  max_null_count.l
  null_block.l
  islands_ok.b
  twopass_length.l
  max_sentence_length.l
  short_length.l
  all_short.b
  use_spell_guess.b
  repeatable_rand.b
  cost_model.Cost_Model
  resources.Resources
  display_short.b
  display_word_subscripts.b
  display_link_subscripts.b
  display_walls.b
  allow_null.b
  use_cluster_disjuncts.b
  echo_on.b
  batch_mode.b
  panic_mode.b
  screen_width.l
  display_on.b
  display_postscript.b
  display_constituents.l
  display_bad.b
  display_disjuncts.b
  display_links.b
  display_morphology.b
  display_senses.b
EndStructure

ImportC "liblink-grammar-5.lib"
  dictionary_create_lang(lang.p-utf8)
  dictionary_set_data_dir(path.p-utf8)
  sentence_create(input_string.p-utf8, *dict)
  parse_options_create()
  sentence_split(*sent, *opts)
  sentence_parse(*sent, *opts)
  linkage_create(index, *sent, *opts)
  linkage_print_diagram(*linkage)
EndImport

; Set the data directory first; dictionary_set_data_dir() does not return a dictionary.
dictionary_set_data_dir("data")
; dictionary_create_lang() returns the dictionary pointer (0 on failure).
*dict = dictionary_create_lang("en")
If *dict
  *sent = sentence_create("My dog likes dog food.", *dict)
  *options.Parse_Options = parse_options_create()
  ; Parse before creating a linkage; sentence_parse() returns the number of linkages found.
  If sentence_parse(*sent, *options) > 0
    *link = linkage_create(0, *sent, *options)
    ; linkage_print_diagram() returns a char*, so read it as a UTF-8 string.
    Debug PeekS(linkage_print_diagram(*link), -1, #PB_UTF8)
  EndIf
EndIf
If you're not investing in yourself, you're falling behind.
My PureBasic Stuff ➤ FREE STUFF, Scripts & Programs.
My PureBasic Forum ➤ Questions, Requests & Comments.
Re: Yet another LinkGrammar Wrapper issue
Hi JHPJHP,
Huge thanks for your help!
JHPJHP wrote:
The following files are missing from your LinkGrammar package:
- msys-1.0.dll
- msys-regex-1.dll
I noticed that the regexp library was missing right after I uploaded the package to Dropbox, but Windows did not complain, so I thought it would work anyway. Interestingly, msys-1.0.dll did not show up when I checked with Dependency Walker.
It seems that you do have to specify the argument type, according to your code. In the PB manual, however, the Import : EndImport section says:
"The compiler will automatically converts the strings to unicode when needed."
It seems that this is not entirely automatic, at least not when it comes to unicode/UTF-8.
JHPJHP wrote:
Note (missing from your example):
- dict : dictionary Structure
-- required Sub-Structures
- sent : Sentence Structure
-- required Sub-Structures
- link : linkage Structure
-- required Sub-Structures
- etc.
I did not find a structure for these in the sources when I was searching for struct dict, struct link, etc.
Once again, thank you very much for your help!
Erion
Re: Yet another LinkGrammar Wrapper issue
To add to what I have said previously:
I compiled LG 5.06 https://dl.dropboxusercontent.com/u/287 ... 6.zip?dl=1
The missing structures should now be included, along with JHPJHP's example as test.pb.
The problem is still with dictionary_create_lang: invalid memory read at address 16. No matter which data directory I set, or whether I set one at all, LinkGrammar is unable to open the specified dictionary.
JHPJHP's example fails in exactly the same way.
Erion
Re: Yet another LinkGrammar Wrapper issue
Hi erion,
I'm not sure if you're still looking for a solution...
A couple observations:
- missing the Sentence Structure
- link-includes.pbi: there are Enumeration and Structure declarations inside the ImportC declaration
-- putting them outside allows the GUI debugger to work correctly
Something to try:
While working on the OpenCV framework, I found that PureBasic for the most part doesn't allow values from a Function to be returned directly to a Structure (outside of a Pointer or using ASM). The following worked for me, and it or a variation of it may provide a solution for you.
Standard (Function returns *dict pointer):
Code: Select all
ImportC "..\liblink-grammar-5.lib"
  dictionary_create_lang(lang.p-utf8)
EndImport
Global *dict.Dictionary = Dictionary_Create_Lang("en")
Alternate (@dict pointer included in the Function):
Code: Select all
ImportC "..\liblink-grammar-5.lib"
  dictionary_create_lang(*dict, lang.p-utf8)
EndImport
Global dict.Dictionary
Dictionary_Create_Lang(@dict, "en")
Re: Yet another LinkGrammar Wrapper issue
Hi,
Thank you! The alternate approach of passing a pointer for the returned struct solved the invalid memory read at address 16 error. Unfortunately, the *dict pointer is still zero, which seems to indicate that LinkGrammar is unable to find the appropriate dictionary.
What is even more frustrating is that I have no idea whether this is an issue on PB's end or a LinkGrammar bug.
To add to this, PB seems to find the function names whether I use Import or ImportC.
Erion