Help with Read Byte

Posted: Wed Mar 29, 2017 7:36 pm
by tpsreports
Hello,

I am trying to read XML files character by character. This part works, but whenever there is a "&" in the value of a node, the returned text has "amp;" after it. I have no idea why this is happening. The method I am using to read the value is:

Code: Select all

If ReadFile(#c_file_id, in_file_name)
        
        Dim w_level_id.s(#c_max_level)
        Dim w_value.s(#c_max_level)
        Dim w_sequence.l(#c_max_level)
        
        w_next_sequence = 1
        
        w_cur_level     = 0
        w_in_tag        = #False
        w_in_tag_type   = 0
        w_in_quotes     = #False
        w_in_h_tag      = 0
        w_in_value      = #False
        
        w_tag_id.s    = ""
        w_value_cur.s = ""
        
        While Not Eof(#c_file_id)
          i = ReadByte(#c_file_id)
          w_in_char.s = Chr(i)

and the string that I am trying to read in is:

"B R FUNSTEN & CO DBA TOM DUFFY"

Any help that you can offer would be great.

__________________________________________________
Code tags added
26.04.2017
RSBasic

Re: Help with Read Byte

Posted: Wed Mar 29, 2017 8:21 pm
by infratec
Hi,

It's not easy to say, because you don't tell us whether the file is ASCII, UTF-8 or Unicode.
An XML file could be written in UTF-8, so check the encoding of that file.
But why do you read it byte by byte? This is slow and normally not needed.
And if it has to be one unit at a time, why not ReadCharacter()? That saves the conversion.
I would read the complete file into a buffer and use CatchXML().
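
A minimal sketch of the buffer approach, assuming a file named "example.xml" (a hypothetical name) and file number 0; it only prints the name of the main node to show that parsing worked:

```purebasic
; Sketch: read the whole file into memory, then parse it with CatchXML().
; "example.xml" and file number 0 are assumptions for illustration.
If ReadFile(0, "example.xml")
  Size = Lof(0)                          ; file length in bytes
  If Size > 0
    *Buffer = AllocateMemory(Size)
    If *Buffer
      ReadData(0, *Buffer, Size)         ; copy the raw bytes into the buffer
      XML = CatchXML(#PB_Any, *Buffer, Size)
      If XML
        If XMLStatus(XML) = #PB_XML_Success
          Debug GetXMLNodeName(MainXMLNode(XML))
        Else
          Debug XMLError(XML)            ; parser message if the file is malformed
        EndIf
        FreeXML(XML)
      EndIf
      FreeMemory(*Buffer)
    EndIf
  EndIf
  CloseFile(0)
EndIf
```

If the data already lives in a file, LoadXML() does the same in one call; CatchXML() is mainly useful when the bytes come from somewhere else, such as a network buffer.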

Also, I don't understand 100% what you mean: in XML a & is escaped as &amp;

Bernd

Re: Help with Read Byte

Posted: Wed Mar 29, 2017 8:28 pm
by normeus
&amp;
Sounds like "&" HTML-encoded to "&amp;"

Single-byte to Unicode or UTF-8 translation error.
Search the forums for an XML library so it will be less complicated for you to handle.

Your compiler is set to Unicode (unless you are using an older compiler), so Chr() is trying to display a Unicode character.
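
If the file has to be read by hand, a sketch like the following lets PureBasic do the UTF-8 decoding instead of building characters from single bytes with Chr(). The file name "example.xml" and file number 0 are assumptions. Note that this only fixes the character encoding; an escaped "&amp;" in the file is still returned as written.

```purebasic
; Sketch: read a UTF-8 text file line by line.
; "example.xml" and file number 0 are assumptions for illustration.
If ReadFile(0, "example.xml")
  ReadStringFormat(0)                  ; skips a byte order mark if one is present
  While Eof(0) = 0
    Line$ = ReadString(0, #PB_UTF8)    ; decode the line as UTF-8
    Debug Line$
  Wend
  CloseFile(0)
EndIf
```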

Norm

Re: Help with Read Byte

Posted: Wed Mar 29, 2017 9:29 pm
by tpsreports
Hi,

Thanks for the responses.

As to why I am doing this character by character: I am taking this over from someone else, and we have to use standard PureBasic with none of the extension libraries.

The file is a UTF-8 file, and I have tried ReadString(), ReadCharacter() and ReadByte(). All of them return the same result.

I am losing my mind a bit on this one.

Mark

Re: Help with Read Byte

Posted: Wed Mar 29, 2017 10:10 pm
by infratec
Hi,

XML is included :wink:

Try this:

Code: Select all

Procedure WalkThrough(*CurrentNode, CurrentSublevel)
  
  If XMLNodeType(*CurrentNode) = #PB_XML_Normal
    
    Text$ = GetXMLNodeName(*CurrentNode) + " (Attributes: "
    
    If ExamineXMLAttributes(*CurrentNode)
      While NextXMLAttribute(*CurrentNode)
        Text$ + XMLAttributeName(*CurrentNode) + "=" + Chr(34) + XMLAttributeValue(*CurrentNode) + Chr(34) + " "
      Wend
    EndIf
    
    Text$ + ")"
    
    Debug Text$
    
    *ChildNode = ChildXMLNode(*CurrentNode)
    
    While *ChildNode <> 0
      WalkThrough(*ChildNode, CurrentSublevel + 1)      
      *ChildNode = NextXMLNode(*ChildNode)
    Wend        
    
  EndIf
  
EndProcedure



Filename$ = OpenFileRequester("Choose the xml file", "", "XML|*.xml;All|*.*", 0)
If Filename$ <> ""
  
  XML = LoadXML(#PB_Any, Filename$)
  If XML
    If XMLStatus(XML) = #PB_XML_Success
      *MainNode = MainXMLNode(XML)      
      If *MainNode
        WalkThrough(*MainNode, 0)
      EndIf
    Else
      Debug XMLError(XML)
    EndIf
    FreeXML(XML)
  EndIf
  
EndIf
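
The walk above prints node names and attributes. For the value of a node, GetXMLNodeText() can be used; a minimal sketch, assuming a one-element document built around the string from the first post (the element name "name" is made up for illustration). The parser should hand back the text with the entity already resolved, so no manual handling of "&amp;" is needed.

```purebasic
; Sketch: parse a small XML string and read the text of its main node.
; The <name> element is hypothetical; the text is the string from the first post.
Input$ = "<name>B R FUNSTEN &amp; CO DBA TOM DUFFY</name>"

XML = ParseXML(#PB_Any, Input$)
If XML
  If XMLStatus(XML) = #PB_XML_Success
    Debug GetXMLNodeText(MainXMLNode(XML))   ; node text, entities decoded by the parser
  Else
    Debug XMLError(XML)
  EndIf
  FreeXML(XML)
EndIf
```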

Bernd

Re: Help with Read Byte

Posted: Thu Mar 30, 2017 6:55 am
by infratec
To eliminate the &amp; you can use URLDecoder()
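
As far as I know, URLDecoder() is meant for percent-encoded URLs (such as %20), so it may leave an XML entity untouched. If that happens, a small helper built on ReplaceString() is one alternative. This is only a sketch: DecodeXMLEntities() is a hypothetical helper name, and it covers just the five predefined XML entities, not numeric character references.

```purebasic
; Sketch: replace the five predefined XML entities in a string.
; DecodeXMLEntities() is a hypothetical helper, not a PureBasic built-in.
Procedure.s DecodeXMLEntities(Text$)
  Text$ = ReplaceString(Text$, "&lt;", "<")
  Text$ = ReplaceString(Text$, "&gt;", ">")
  Text$ = ReplaceString(Text$, "&quot;", Chr(34))
  Text$ = ReplaceString(Text$, "&apos;", "'")
  ; "&amp;" goes last so that "&amp;lt;" becomes "&lt;" and not "<"
  Text$ = ReplaceString(Text$, "&amp;", "&")
  ProcedureReturn Text$
EndProcedure

Debug DecodeXMLEntities("B R FUNSTEN &amp; CO DBA TOM DUFFY")
```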

Bernd

Re: Help with Read Byte

Posted: Thu Mar 30, 2017 9:50 pm
by tpsreports
Thanks for the help, guys. Using what infratec posted, I was able to get the application working.

Mark