Regex to replace apostrophes in hyperlinks with speech marks

Seymour Clufley
Addict
Posts: 1266
Joined: Wed Feb 28, 2007 9:13 am
Location: London

Post by Seymour Clufley »

I need to ensure that every hyperlink in a string encloses its href value in speech marks (double quotes) rather than apostrophes, without changing any other apostrophes in the string. In other words, I need to turn:

Code: Select all

754'536 <a href="https://purebasic.com">PB</a>, 523'434 <a href='https://purebasic.com'>PB</a> 837'562
into:

Code: Select all

754'536 <a href="https://purebasic.com">PB</a>, 523'434 <a href="https://purebasic.com">PB</a> 837'562
Regex is something I struggle to understand, but I'm fairly sure it is the ideal tool for this. Is there some simple expression that would do it?
JACK WEBB: "Coding in C is like sculpting a statue using only sandpaper. You can do it, but the result wouldn't be any better. So why bother? Just use the right tools and get the job done."
jacdelad
Addict
Posts: 2032
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: Regex to replace apostrophes in hyperlinks with speech marks

Post by jacdelad »

RegEx is good for finding, not ideal for replacing. However, if it's only for the hrefs, I can build you something in the evening (I just turned off my computer), if nobody else is faster.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
Seymour Clufley

Re: Regex to replace apostrophes in hyperlinks with speech marks

Post by Seymour Clufley »

Ah, thank you for that tip. I've managed it that way.

Code: Select all

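Procedure.s ConformAllHyperlinks(h.s)
  Static r.i ; regex handle, created once and reused on later calls
  Protected detect1.i, detect2.i, tag.s
  If Not r
    ; Lazy match of a single-quoted href value; (.+?) needs at least one character
    r = CreateRegularExpression(#PB_Any, "href='(.+?)'")
  EndIf

  If r And ExamineRegularExpression(r, h)
    While NextRegularExpressionMatch(r)
      detect1 = RegularExpressionMatchPosition(r)
      detect2 = RegularExpressionMatchLength(r)
      tag = Mid(h, detect1, detect2)
      ; Swapping ' (39) for " (34) keeps the length, so later match positions stay valid
      tag = ReplaceString(tag, Chr(39), Chr(34))
      h = Left(h, detect1 - 1) + tag + Mid(h, detect1 + detect2)
    Wend
  EndIf

  ProcedureReturn h
EndProcedure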
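
The procedure above edits h in place, which is safe here because an apostrophe and a speech mark are both one character long. If the replacement ever changed the length, the match positions would drift. A minimal sketch of an alternative that rebuilds the result in a separate string is shown here; QuoteHrefs and sample are hypothetical names, not from the thread, and the pattern uses (.*?) on the assumption that an empty href='' should be converted too.

```purebasic
; Sketch only: copy unmatched text and re-quoted hrefs into a new string.
Procedure.s QuoteHrefs(h.s)
  Protected re.i, out.s, pos.i, offset.i = 1
  re = CreateRegularExpression(#PB_Any, "href='(.*?)'")
  If re
    If ExamineRegularExpression(re, h)
      While NextRegularExpressionMatch(re)
        pos = RegularExpressionMatchPosition(re)
        If pos > offset
          out + Mid(h, offset, pos - offset) ; untouched text before this match
        EndIf
        ; group 1 is the href value without its apostrophes
        out + "href=" + Chr(34) + RegularExpressionGroup(re, 1) + Chr(34)
        offset = pos + RegularExpressionMatchLength(re)
      Wend
    EndIf
    FreeRegularExpression(re)
  EndIf
  ProcedureReturn out + Mid(h, offset) ; append whatever follows the last match
EndProcedure

Define sample.s = ~"754'536 <a href=\"https://purebasic.com\">PB</a>, 523'434 <a href='https://purebasic.com'>PB</a> 837'562"
Debug QuoteHrefs(sample)
```

The regex is created and freed on every call to keep the sketch short; caching it in a Static variable, as in the procedure above, avoids that cost.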
Piero
Addict
Posts: 1040
Joined: Sat Apr 29, 2023 6:04 pm
Location: Italy

Re: Regex to replace apostrophes in hyperlinks with speech marks

Post by Piero »

Code: Select all

; https://www.purebasic.fr/english/viewtopic.php?p=575871

Structure ReplaceGr : pos.i : ngr.i : group.s : EndStructure

Procedure RegexReplace2(RgEx, *Result.string, Replace0$)
	Protected i, CountGr, Pos, Offset = 1
	Protected Replace$
	Protected NewList item.s()
	Protected LenT, *Point
	Protected RE2
	Protected NewList ReplaceGr.ReplaceGr()
	CountGr = CountRegularExpressionGroups(RgEx):If CountGr > 9:CountGr = 9 :EndIf ; max 9 groups
	If ExamineRegularExpression(RgEx, *Result\s)
		RE2 = CreateRegularExpression(#PB_Any, "\\\d")
		If RE2
			If ExamineRegularExpression(RE2, Replace0$)
				While NextRegularExpressionMatch(RE2)
					If AddElement(ReplaceGr())
						ReplaceGr()\pos = RegularExpressionMatchPosition(RE2) ; position
						ReplaceGr()\ngr = ValD(Right(RegularExpressionMatchString(RE2), 1)) ; group number
						ReplaceGr()\group = RegularExpressionMatchString(RE2) ; group text
					EndIf
				Wend
			EndIf
			FreeRegularExpression(RE2) ; remove this line if RE2 is made Static
		EndIf
		If Not ListSize(ReplaceGr())
			*Result\s = ReplaceRegularExpression(RgEx, *Result\s, Replace0$)
			ProcedureReturn
		EndIf
		SortStructuredList(ReplaceGr(), #PB_Sort_Descending, OffsetOf(ReplaceGr\pos), TypeOf(ReplaceGr\pos))
		While NextRegularExpressionMatch(RgEx)
			Pos = RegularExpressionMatchPosition(RgEx)
			Replace$ = Replace0$
			ForEach ReplaceGr()
				If ReplaceGr()\ngr
					Replace$ = ReplaceString(Replace$, ReplaceGr()\group, RegularExpressionGroup(RgEx, ReplaceGr()\ngr), #PB_String_NoCase, ReplaceGr()\pos, 1)
				Else
					Replace$ = ReplaceString(Replace$, ReplaceGr()\group, RegularExpressionMatchString(RgEx), #PB_String_NoCase, ReplaceGr()\pos, 1) ; backreference \0
				EndIf
			Next
			If AddElement(item())
				item() = Mid(*Result\s, Offset, Pos - Offset) + Replace$
			EndIf
			Offset = Pos + RegularExpressionMatchLength(RgEx)
		Wend
		If AddElement(item())
			item() = Mid(*Result\s, Offset)
		EndIf
		LenT = 0
		ForEach item()
			LenT + Len(item()) ; compute the total length needed to hold all the text parts
		Next
		*Result\s = Space(LenT) ; create the string, filling it with spaces
		*Point = @*Result\s	   ; get the address of the string
		ForEach item()
			CopyMemoryString(item(), @*Point) ; copy the next piece to the pointer
		Next
		FreeList(item()) ; free the list, although this is probably not required inside a procedure
	EndIf
EndProcedure

Define Text.string
Text\s = ~"754'536 <a href=\"https://purebasic.com\">PB</a>, 523'434 <a href='https://purebasic.com'>PB</a> 837'562"
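; Match each single-quoted href on its own; a greedy (.*href=) prefix would only convert the last one in the string
CreateRegularExpression(0, "href='(.+?)'")
RegexReplace2(0, @Text, ~"href=\"\\1\"")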
FreeRegularExpression(0)
Debug Text\s
Piero

Re: Regex to replace apostrophes in hyperlinks with speech marks

Post by Piero »

Some notes about my post above:

With (.+?), href='something' must contain at least one character; to also match an empty href='', use (.*?) instead.

I'm Italian, but in this case I wouldn't apostrophize any quote as if it was a speech mark, whether it's double or not

I bet #PB_String_CaseSensitive would not really make it faster…
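
The first note above can be checked with a small sketch. The variable names and the sample string are made up for illustration, and the sample deliberately contains no other apostrophes: with a later apostrophe in the same line, (.+?) could stretch across the empty pair and match up to that later apostrophe instead.

```purebasic
; Sketch only: compare (.+?) and (.*?) on an empty href value.
Define rePlus.i = CreateRegularExpression(#PB_Any, "href='(.+?)'")
Define reStar.i = CreateRegularExpression(#PB_Any, "href='(.*?)'")
Define emptyLink.s = "<a href=''>nothing here</a>"

If rePlus And reStar
  ; (.+?) needs at least one character between the apostrophes: no match, result is zero
  Debug MatchRegularExpression(rePlus, emptyLink)
  ; (.*?) also accepts zero characters: the empty href matches, result is non-zero
  Debug MatchRegularExpression(reStar, emptyLink)
  FreeRegularExpression(rePlus)
  FreeRegularExpression(reStar)
EndIf
```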
THANKS AZJIO!!! Got a Mac?
AZJIO
Addict
Posts: 2225
Joined: Sun May 14, 2017 1:48 am

Re: Regex to replace apostrophes in hyperlinks with speech marks

Post by AZJIO »

Piero wrote: Fri Aug 23, 2024 6:37 pm THANKS AZJIO!!! Got a Mac?
Do I have it? No.
Piero

Re: Regex to replace apostrophes in hyperlinks with speech marks

Post by Piero »

AZJIO wrote: Fri Aug 23, 2024 7:38 pm
Piero wrote: Fri Aug 23, 2024 6:37 pm THANKS AZJIO!!! Got a Mac?
Do I have it? No.
What did I do? Why are you so cruel? :cry: