FormatXML and #PB_ReFormat

Post by Lukaso »

Hi,

Has anyone already thought about how the output looks when FormatXML is used with #PB_ReFormat?

Example:

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<gwcore>
  <module name="gwpp" load="true">
    <test.test>
      some text
    </test.test>
    <test.long>
      299 
    </test.long>
  </module>
</gwcore>
The problem is that GetXMLNodeText will return all the spaces and newlines, which is awkward when you load a formatted, saved XML file back into PB and want to work with it.

Formatted XML output like this (the way IE or Firefox display it) would be better in my opinion.

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<gwcore>
  <module name="gwpp" load="true">
    <test.test>some text</test.test>
    <test.long>299</test.long>
  </module>
</gwcore>
What do you guys think?

Post by Thorsten1867 »

A small workaround:

Code: Select all

Procedure SaveBeautifyXML(XMLId.l, XMLFile$)
  Protected i, Lpos, Rpos = 1, pos = 1, indent = 0, tag$ = "", ntag$, txt$, xml$, new$, encoding$
  If IsXML(XMLId)
    xml$ = Space(ExportXMLSize(XMLId))
    ExportXML(XMLId, @xml$, Len(xml$))
    xml$ = RemoveString(xml$, Chr(13))
    For i=1 To CountString(xml$, "<")
      Lpos = FindString(xml$, "<", Rpos)
      Rpos = FindString(xml$, ">", Lpos)
      ntag$ = Mid(xml$,Lpos, Rpos-Lpos+1)
      If tag$="" ; <?xml version="1.0" encoding="UTF-8"?>
        new$ = ntag$
        If FindString(ntag$, "UTF-8",1)
          encoding$ = "UTF-8"
        EndIf
        indent - 2
      ElseIf Left(tag$,2) <> "</" And Left(ntag$,2) = "</" ; <Tag></Tag>
        txt$ = Trim(Mid(xml$, pos, Lpos-pos))
        If Left(txt$,1) = #LF$
          txt$ = Trim(Mid(txt$,2))
        EndIf
        If Right(txt$,1) = #LF$
          txt$ = Trim(Left(txt$,Len(txt$)-1))
        EndIf
        new$ + txt$ + ntag$ 
      Else
        If Left(tag$,2) = "</" And Left(ntag$,2) = "</" ; </Tag2></Tag1>
          indent - 2
        ElseIf Left(tag$,2) <> "</" And Left(ntag$,2) <> "</" ; <Tag1><Tag2>
          indent + 2
        EndIf
        new$ + #LF$ + Space(indent) + ntag$
      EndIf
      tag$ = ntag$
      pos = Rpos+1
    Next
    If CreateFile(0, XMLFile$)
      If encoding$ = "UTF-8"
        WriteStringFormat(0, #PB_UTF8)
        WriteString(0, new$, #PB_UTF8)
      Else
        WriteString(0, new$)
      EndIf
      CloseFile(0)
    EndIf
  EndIf
EndProcedure


If LoadXML(1, "address.xml")
  SaveBeautifyXML(1, "adress.xml")
  FreeXML(1)
EndIf
Before:

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>

<Adressen Anzahl="1">
  <Adresse id="1">
    <Name>
      Thorsten Hoeppner
    </Name>
    <Strasse>
      Alte Zeile 18
    </Strasse>
    <Ort>
      87600 Kaufbeuren
    </Ort> 
  </Adresse>
</Adressen>
After:

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<Adressen Anzahl="1">
  <Adresse id="1">
    <Name>Thorsten Hoeppner</Name>
    <Strasse>Alte Zeile 18</Strasse>
    <Ort>87600 Kaufbeuren</Ort>
  </Adresse>
</Adressen>

Post by Little John »

I support this feature request.

However, as another workaround, instead of using

Code: Select all

text$ = GetXMLNodeText(*node)
use something like this

Code: Select all

#WHITESPACE$ = " " + #TAB$ + #CRLF$

Procedure.s TrimChars (source$, charList$=#WHITESPACE$)
   ; in : string
   ; out: string without any leading or trailing characters,
   ;      that are contained in 'charList$'
   Protected left, right, length=Len(source$)

   ; Trim left
   left = 1
   While (left <= length) And FindString(charList$, Mid(source$,left,1), 1)
      left + 1
   Wend

   ; Trim right
   right = length
   While (left < right) And FindString(charList$, Mid(source$,right,1), 1)
      right - 1
   Wend

   ProcedureReturn Mid(source$, left, right-left+1)
EndProcedure

text$ = TrimChars(GetXMLNodeText(*node))
Regards, Little John

Post by Perkin »

I've just needed to use this formatting, but I've run into one problem.
When a tag is followed directly by its closing tag (i.e. it is empty), the output is changed to <Tag/>, whereas I would like to keep the original <Tag></Tag>.

Code: Select all

<String id="Author"></String>
becomes

Code: Select all

<String id="Author"/>
I've looked at the code, but cannot see how it's eliminating the end tag.

Can anyone help?

Post by Little John »

Please try this code.

Regards, Little John

Post by Perkin »

That code didn't work either.

Here's a shortened version of the original input file (it appears as one long line in the XML file):

Code: Select all

<?xml version="1.0" encoding="ISO-8859-1"?><Template><Version>1.0</Version><Page><Name>A4</Name><Properties><String id="Author"></String><String id="Width">297.000000MM</String><String id="Height">210.000000MM</String></Properties><Content></Content></Page></Template>
Now here's the output from that code:

Code: Select all

<?xml version="1.0" encoding="ISO-8859-1"?>
<Template>
   <Version>1.0</Version>
   <Page>
      <Name>A4</Name>
      <Properties>
         <String id="Author"/>
         <String id="Width">297.000000MM</String>
         <String id="Height">210.000000MM</String>
      </Properties>
      <Content/>
   </Page>
</Template>
Notice the <String id="Author"/> and the <Content/> tags.

Here's what I need as the result.

Code: Select all

<?xml version="1.0" encoding="ISO-8859-1"?>
<Template>
  <Version>1.0</Version>
  <Page>
    <Name>A4</Name>
    <Properties>
      <String id="Author"></String>
      <String id="Width">297.000000MM</String>
      <String id="Height">210.000000MM</String>
    </Properties>
    <Content></Content>
  </Page>
</Template>
Any help will be appreciated.

Post by Little John »

It's been some time since I dealt with this stuff ...
Well, I see the problem now: The combination of PB's LoadXML() and ExportXML() does the unwanted conversion. Fortunately, for your problem at hand we do not need it anyway. :-)

So please see my new procedure FormatXMLfile() here.

Regards, Little John
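
The linked FormatXMLfile() is not reproduced in this thread. As a rough illustration of the idea described above (re-indenting the file text directly instead of sending it through LoadXML() and ExportXML()), a minimal sketch might look like the following. IndentXMLText() and its indentStep parameter are hypothetical names, not Little John's code. The sketch assumes the input has no comments, no CDATA sections and no ">" inside attribute values, and it only keeps text that sits directly between an opening tag and its closing tag.

```purebasic
; Hypothetical sketch, NOT Little John's FormatXMLfile().
; Assumptions: no comments, CDATA sections or ">" inside attribute values;
; text is only kept when it sits directly between <Tag> and </Tag>.
Procedure.s IndentXMLText(xml$, indentStep = 2)
  Protected out$, tag$, txt$
  Protected lpos, rpos, pos = 1, depth = 0, lastWasOpen = #False

  ; Work on one long line, like the input file shown earlier in the thread
  xml$ = RemoveString(RemoveString(xml$, #CR$), #LF$)

  Repeat
    lpos = FindString(xml$, "<", pos)
    If lpos = 0 : Break : EndIf
    rpos = FindString(xml$, ">", lpos)
    If rpos = 0 : Break : EndIf

    txt$ = Trim(Mid(xml$, pos, lpos - pos))   ; text in front of this tag (spaces trimmed)
    tag$ = Mid(xml$, lpos, rpos - lpos + 1)   ; the tag itself, including < and >

    If Left(tag$, 2) = "</"
      ; Closing tag: one level up
      If depth > 0 : depth - 1 : EndIf
      If lastWasOpen
        out$ + txt$ + tag$                    ; <Tag>text</Tag> or <Tag></Tag> stays on one line
      Else
        out$ + #LF$ + Space(depth * indentStep) + tag$
      EndIf
      lastWasOpen = #False
    Else
      ; Every other tag starts a new line at the current depth
      If out$ <> "" : out$ + #LF$ : EndIf
      out$ + Space(depth * indentStep) + tag$
      If Left(tag$, 2) = "<?" Or Left(tag$, 2) = "<!" Or Right(tag$, 2) = "/>"
        lastWasOpen = #False                  ; declaration, DOCTYPE or <Tag/>: nothing to close
      Else
        depth + 1
        lastWasOpen = #True
      EndIf
    EndIf

    pos = rpos + 1
  ForEver

  ProcedureReturn out$
EndProcedure

Debug IndentXMLText("<a><b></b><c>x</c></a>")
```

Because the tags are copied from the input unchanged, an empty pair such as <b></b> is written back as a pair rather than being collapsed to <b/>. The result is returned as a string, so writing it to a file with the right encoding is left to the caller.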

Post by Perkin »

That does it. :) Thanks Little John