Regex to replace apostrophes in hyperlinks with speech marks

Posted: Thu Aug 22, 2024 3:43 am
by Seymour Clufley
I need to ensure that all hyperlinks in a string enclose their href parameter with speech marks rather than apostrophes, without changing any other apostrophes in the string. In other words, I need to turn:

754'536 <a href="https://purebasic.com">PB</a>, 523'434 <a href='https://purebasic.com'>PB</a> 837'562
into:

754'536 <a href="https://purebasic.com">PB</a>, 523'434 <a href="https://purebasic.com">PB</a> 837'562
Regex is something I struggle to understand, but I'm fairly sure it is the ideal tool for this. Is there some simple expression that would do it?

Re: Regex to replace apostrophes in hyperlinks with speech marks

Posted: Thu Aug 22, 2024 5:14 am
by jacdelad
RegEx is good for finding, but not ideal for replacing. However, if it's only for the hrefs, I can build you something this evening (I just turned off my computer), if nobody else is faster.

Re: Regex to replace apostrophes in hyperlinks with speech marks

Posted: Thu Aug 22, 2024 6:17 am
by Seymour Clufley
Ah, thank you for that tip. I've managed it that way.

Procedure.s ConformAllHyperlinks(h.s)
  Static r.i
  Protected detect1.i, detect2.i, tag.s
  
  ; Compile the expression once and reuse it on later calls
  If Not r
    r = CreateRegularExpression(#PB_Any,"href='(.+?)'")
  EndIf
  
  If r And ExamineRegularExpression(r,h)
    While NextRegularExpressionMatch(r)
      detect1 = RegularExpressionMatchPosition(r)
      detect2 = RegularExpressionMatchLength(r)
      tag = Mid(h,detect1,detect2)
      ; Swap ' for " inside the matched attribute only; the length is unchanged,
      ; so the positions of later matches stay valid
      tag = ReplaceString(tag,Chr(39),Chr(34))
      h = Left(h,detect1-1)+tag+Mid(h,detect1+detect2)
    Wend
  EndIf

  ProcedureReturn h
EndProcedure
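
For comparison, here is a self-contained sketch of the same find-then-rebuild idea that assembles the result from pieces instead of editing the string in place. It is only an illustration: the procedure name QuoteHrefs is hypothetical, and the pattern `[^']*` assumes the href value itself never contains an apostrophe.

```purebasic
EnableExplicit

; Hypothetical helper: rebuilds every href='...' as href="..."
Procedure.s QuoteHrefs(h.s)
  Protected r.i, pos.i, length.i, result.s, offset.i = 1
  
  ; Assumption: the href value contains no apostrophes
  r = CreateRegularExpression(#PB_Any, "href='([^']*)'")
  If r
    If ExamineRegularExpression(r, h)
      While NextRegularExpressionMatch(r)
        pos = RegularExpressionMatchPosition(r)
        length = RegularExpressionMatchLength(r)
        ; Copy the untouched text that precedes this match
        If pos > offset
          result = result + Mid(h, offset, pos - offset)
        EndIf
        ; Re-emit the attribute with double quotes around group 1
        result = result + "href=" + Chr(34) + RegularExpressionGroup(r, 1) + Chr(34)
        offset = pos + length
      Wend
    EndIf
    FreeRegularExpression(r)
  EndIf
  
  ; Append whatever follows the last match (the whole string if none matched)
  ProcedureReturn result + Mid(h, offset)
EndProcedure

Define s.s = ~"754'536 <a href=\"https://purebasic.com\">PB</a>, 523'434 <a href='https://purebasic.com'>PB</a> 837'562"
Debug QuoteHrefs(s)
```

With the sample string from the first post, only the second link should change; the apostrophes in the numbers are never part of a match, so they are left alone.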

Re: Regex to replace apostrophes in hyperlinks with speech marks

Posted: Fri Aug 23, 2024 7:42 am
by Piero

; https://www.purebasic.fr/english/viewtopic.php?p=575871

Structure ReplaceGr : pos.i : ngr.i : group.s : EndStructure

Procedure RegexReplace2(RgEx, *Result.string, Replace0$)
	Protected i, CountGr, Pos, Offset = 1
	Protected Replace$
	Protected NewList item.s()
	Protected LenT, *Point
	Protected RE2
	Protected NewList ReplaceGr.ReplaceGr()
	CountGr = CountRegularExpressionGroups(RgEx):If CountGr > 9:CountGr = 9 :EndIf ; max 9 groups
	If ExamineRegularExpression(RgEx, *Result\s)
		RE2 = CreateRegularExpression(#PB_Any, "\\\d")
		If RE2
			If ExamineRegularExpression(RE2, Replace0$)
				While NextRegularExpressionMatch(RE2)
					If AddElement(ReplaceGr())
						ReplaceGr()\pos = RegularExpressionMatchPosition(RE2) ; position
						ReplaceGr()\ngr = ValD(Right(RegularExpressionMatchString(RE2), 1)) ; group number
						ReplaceGr()\group = RegularExpressionMatchString(RE2) ; group text
					EndIf
				Wend
			EndIf
			FreeRegularExpression(RE2) ; remove this line if RE2 is made Static
		EndIf
		If Not ListSize(ReplaceGr())
			*Result\s = ReplaceRegularExpression(RgEx, *Result\s, Replace0$)
			ProcedureReturn
		EndIf
		SortStructuredList(ReplaceGr(), #PB_Sort_Descending, OffsetOf(ReplaceGr\pos), TypeOf(ReplaceGr\pos))
		While NextRegularExpressionMatch(RgEx)
			Pos = RegularExpressionMatchPosition(RgEx)
			Replace$ = Replace0$
			ForEach ReplaceGr()
				If ReplaceGr()\ngr
					Replace$ = ReplaceString(Replace$, ReplaceGr()\group, RegularExpressionGroup(RgEx, ReplaceGr()\ngr), #PB_String_NoCase, ReplaceGr()\pos, 1)
				Else
					Replace$ = ReplaceString(Replace$, ReplaceGr()\group, RegularExpressionMatchString(RgEx), #PB_String_NoCase, ReplaceGr()\pos, 1) ; backreference \0
				EndIf
			Next
			If AddElement(item())
				item() = Mid(*Result\s, Offset, Pos - Offset) + Replace$
			EndIf
			Offset = Pos + RegularExpressionMatchLength(RgEx)
		Wend
		If AddElement(item())
			item() = Mid(*Result\s, Offset)
		EndIf
		LenT = 0
		ForEach item()
			LenT + Len(item()) ; compute the total length needed to hold all the text pieces
		Next
		*Result\s = Space(LenT) ; create the string, filling it with spaces
		*Point = @*Result\s ; get the address of the string
		ForEach item()
			CopyMemoryString(item(), @*Point) ; copy the next piece to the pointer position
		Next
		FreeList(item()) ; free the list, although inside a procedure this is probably not required
	EndIf
EndProcedure

Define Text.string
Text\s = ~"754'536 <a href=\"https://purebasic.com\">PB</a>, 523'434 <a href='https://purebasic.com'>PB</a> 837'562"
CreateRegularExpression(0 , "(.*href=)'(.+?)'(.*)" )
RegexReplace2(0, @Text, ~"\\1\"\\2\"\\3" )
FreeRegularExpression(0)
Debug Text\s

Re: Regex to replace apostrophes in hyperlinks with speech marks

Posted: Fri Aug 23, 2024 6:37 pm
by Piero
Some notes about my post above:

The pattern expects href='something', so the value must contain at least one character; to also allow an empty value, use (.*?) instead of (.+?)
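
To illustrate the note on empty values, here is a small sketch comparing the two quantifiers on an empty href. The names rePlus, reStar and sample are made up for this example, and the sample string deliberately contains no other apostrophes.

```purebasic
; Sketch only: rePlus, reStar and sample are illustrative names
Define rePlus.i = CreateRegularExpression(#PB_Any, "href='(.+?)'")
Define reStar.i = CreateRegularExpression(#PB_Any, "href='(.*?)'")
Define sample.s = "<a href=''>empty</a>"

If rePlus And reStar
  ; (.+?) needs at least one character between the apostrophes
  If MatchRegularExpression(rePlus, sample)
    Debug "(.+?) matched"
  Else
    Debug "(.+?) did not match"
  EndIf
  ; (.*?) also accepts an empty value
  If MatchRegularExpression(reStar, sample)
    Debug "(.*?) matched"
  Else
    Debug "(.*?) did not match"
  EndIf
  FreeRegularExpression(rePlus)
  FreeRegularExpression(reStar)
EndIf
```

On this sample, the (.+?) pattern should find nothing, while the (.*?) pattern should match the empty attribute.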

I'm Italian, but in this case I wouldn't apostrophize any quote as if it was a speech mark, whether it's double or not

I bet #PB_String_CaseSensitive would not really make it faster…
THANKS AZJIO!!! Got a Mac?

Re: Regex to replace apostrophes in hyperlinks with speech marks

Posted: Fri Aug 23, 2024 7:38 pm
by AZJIO
Piero wrote: Fri Aug 23, 2024 6:37 pm THANKS AZJIO!!! Got a Mac?
Do I have it? No.

Re: Regex to replace apostrophes in hyperlinks with speech marks

Posted: Fri Aug 23, 2024 10:49 pm
by Piero
AZJIO wrote: Fri Aug 23, 2024 7:38 pm
Piero wrote: Fri Aug 23, 2024 6:37 pm THANKS AZJIO!!! Got a Mac?
Do I have it? No.
What did I do? Why are you so cruel? :cry: