PureBasic Forum
http://forums.purebasic.com/english/

Regular Expression dude
http://forums.purebasic.com/english/viewtopic.php?f=13&t=71917
Author:  zikitrake [ Thu Dec 13, 2018 6:44 pm ]
Post subject:  Regular Expression dude

Hi, in Notepad++ I can do this search/replace:

Find: <a [[:<:]]href(.*?)<\/a>
Replace with: <zz href$1</zz>

And it will change all <a href...>I'm a phrase</a> tags in a document to <zz href...>I'm a phrase</zz>

<a href=\"https://www.sample.com\">I'm a phrase</a> will be replaced with
<zz href=\"https://www.sample.com\">I'm a phrase</zz>

(I don't want to change pairs without an href attribute, such as <a noPop" onClick="var e=document.createElement('script');>Another phrase</a>)

How can I do this in PB? Currently I use this code:

Code:
Procedure.s Ereg_Replace(Text$, Pattern$, Replace$ = "", Options.l = #PB_RegularExpression_DotAll |  #PB_RegularExpression_Extended |  #PB_RegularExpression_AnyNewLine)
  Protected hRegex = CreateRegularExpression(#PB_Any, Pattern$, Options)
  Protected Dim result.s(0)
  If hRegex
    Repeat
      ReDim result(0)
      ExtractRegularExpression(hRegex, Text$, result())
      Text$ = ReplaceRegularExpression(hRegex, Text$, Replace$)
      Delay(0)
    Until ArraySize(result())=0

    FreeRegularExpression(hRegex)
  Else
    Debug "Can't create a Regex with this pattern : " + Pattern$
  EndIf
  ProcedureReturn Text$
EndProcedure

Text$ = ~"<a href=\"https://www.sample.com\">I'm a phrase</a>"

Debug Ereg_Replace(Text$, "<a href(.*?)<\/a>", "<zz href$1</zz>", #PB_RegularExpression_DotAll|#PB_RegularExpression_MultiLine|#PB_RegularExpression_NoCase)


Thank you, and sorry for my English!

Author:  normeus [ Thu Dec 13, 2018 9:18 pm ]
Post subject:  Re: Regular Expression dude

You are using a back reference ($1 or \1), which does not work in PB.
"ReplaceRegularExpression" just replaces the whole match with the replacement string.

You might want to do a simple find and replace if the patterns are simple: start with "<a href=" and end with "</a>".
Or, if it is more like
Code:
"<a style="margin: 0;" href="
then run a regex for the opening part and get the size of the match,
take Mid( string, sizeoffoundregex) to get the rest of the string,
remove the last "</a>" from that result and store it in newlycreatedstring,
and finally build "<zz href" + newlycreatedstring + "</zz>".

Norm.
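
The steps described above can also be done without a back reference by walking the matches and rebuilding each one from its captured group. The following is only a sketch, assuming a PureBasic version that provides ExamineRegularExpression() and RegularExpressionGroup(); ReplaceHrefTags() is a hypothetical helper name, not part of PB, and the pattern is a simplified variant of the one in the question.

```purebasic
; Sketch: emulate the "$1" back reference by rebuilding every match
; from its captured group. ReplaceHrefTags() is a hypothetical name.
Procedure.s ReplaceHrefTags(Text$, TagIn$, TagOut$)
  Protected Result$, LastPos = 1, MatchPos
  ; Group 1 captures everything from "href" up to (not including) the closing tag
  Protected Pattern$ = "<" + TagIn$ + " (href.*?)</" + TagIn$ + ">"
  Protected Regex = CreateRegularExpression(#PB_Any, Pattern$, #PB_RegularExpression_DotAll | #PB_RegularExpression_NoCase)

  If Regex = 0
    Debug "Can't create regex: " + RegularExpressionError()
    ProcedureReturn Text$
  EndIf

  If ExamineRegularExpression(Regex, Text$)
    While NextRegularExpressionMatch(Regex)
      MatchPos = RegularExpressionMatchPosition(Regex)
      ; Copy the unchanged text that precedes this match
      If MatchPos > LastPos
        Result$ + Mid(Text$, LastPos, MatchPos - LastPos)
      EndIf
      ; Rebuild the element with the new tag name around the captured group
      Result$ + "<" + TagOut$ + " " + RegularExpressionGroup(Regex, 1) + "</" + TagOut$ + ">"
      LastPos = MatchPos + RegularExpressionMatchLength(Regex)
    Wend
  EndIf

  ; Copy whatever follows the last match
  Result$ + Mid(Text$, LastPos)

  FreeRegularExpression(Regex)
  ProcedureReturn Result$
EndProcedure

Text$ = ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<a NoReplaceMe>Share</a>" + #CRLF$
Debug ReplaceHrefTags(Text$, "a", "zz")
```

Because the text is rebuilt in a single pass, no Repeat loop is needed, and tags without an href attribute are copied through unchanged.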

Author:  zikitrake [ Thu Dec 13, 2018 9:44 pm ]
Post subject:  Re: Regular Expression dude

Thank you! For now, I wrote this ugly but functional code:
Code:
Procedure.s Ereg_ReplaceTags(Text$, tagIn$, tagOut$)
  Protected Pattern$
  Pattern$ = "<" + tagIn$ + " href(.*?)<\/" + tagIn$ + ">"
  Debug Pattern$
  Protected hRegex = CreateRegularExpression(#PB_Any, Pattern$, #PB_RegularExpression_DotAll|#PB_RegularExpression_MultiLine|#PB_RegularExpression_NoCase)
  Protected Dim result.s(0)
  Protected aux$, cont.l, count.l
  If hRegex
    ; Use the match count returned by ExtractRegularExpression() instead of ArraySize()
    count = ExtractRegularExpression(hRegex, Text$, result())
    For cont = 0 To count - 1
      aux$ = result(cont)
      ; Swap the tag name in the opening and closing tags of this match
      aux$ = ReplaceString(aux$, "<" + tagIn$ + " ", "<" + tagOut$ + " ", #PB_String_NoCase)
      aux$ = ReplaceString(aux$, "</" + tagIn$ + ">", "</"+tagOut$ + ">", #PB_String_NoCase)
      ; Write the rewritten match back into the full text
      Text$ = ReplaceString(Text$, result(cont), aux$)
    Next
    FreeRegularExpression(hRegex)
  Else
    Debug "Can't create a Regex with this pattern : " + Pattern$
  EndIf
  ProcedureReturn Text$
EndProcedure

Text$ = ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<a NoReplaceMe>Share</a>" + #CRLF$
Text$ + ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$

Debug Ereg_ReplaceTags(Text$, "a", "zz")

Author:  RASHAD [ Thu Dec 13, 2018 10:31 pm ]
Post subject:  Re: Regular Expression dude

Be careful with Escape String
Yours misses "href=\"
Code:
Text$ = ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<a NoReplaceMe>Share</a>" + #CRLF$
Text$ + ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$


Dim String$(0)
CreateRegularExpression(0, "(?<=<)a(?=\s+href)|(?<=</)a(?=>)", #PB_RegularExpression_NoCase)
For k = 1 To 100
  ReDim String$(k)
  String$(k) = StringField(Text$, k,#CRLF$)
  If String$(k) = ""
    Break
  ElseIf Left(String$(k),7)= "<a href"
    new$ = ReplaceRegularExpression(0, UnescapeString(string$(k)), "zz")
    String$(k) = EscapeString(new$)
  EndIf
  final$ = final$ + string$(k)+#CRLF$
Next
FreeRegularExpression(0)
Debug final$

Author:  zikitrake [ Fri Dec 14, 2018 9:23 am ]
Post subject:  Re: Regular Expression dude

RASHAD wrote:
Be careful with Escape String
Yours misses "href=\"
...
Thank you, RASHAD, but the original source doesn't have '\'; it's only there to handle the double quotes in the input text :)
Code:
Text$ = ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
or
Text$ = "<a href=" + #DQUOTE$ + "https://www.sample.com" + #DQUOTE$ +">Share</a>" + #CRLF$
I only need to get <a href="https://www.sample.com">Share</a>

PS: If your comment meant something else, excuse me; my English is limited and I often misunderstand the real meaning of comments.
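
A quick way to check that both literals above produce the same text, with no backslash in the stored string, is to compare them directly. This is only a small sketch; the variable names Escaped$ and Concatenated$ are made up for illustration.

```purebasic
; ~"..." is an escaped string literal: \" stands for a plain double quote
Escaped$ = ~"<a href=\"https://www.sample.com\">Share</a>"
; The same text built by concatenation with the #DQUOTE$ constant
Concatenated$ = "<a href=" + #DQUOTE$ + "https://www.sample.com" + #DQUOTE$ + ">Share</a>"

If Escaped$ = Concatenated$
  Debug "Both literals produce the same string"
EndIf

; FindString() returns 0 when the searched text is not found
Debug FindString(Escaped$, "\")
Debug Escaped$
```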

All times are UTC + 1 hour