All times are UTC + 1 hour




 Post subject: Regular Expression dude
Posted: Thu Dec 13, 2018 6:44 pm
Joined: Thu Mar 25, 2004 2:15 pm
Posts: 702
Location: Spain
Hi, in Notepad++ I can do this search/replace:

Find: <a [[:<:]]href(.*?)<\/a>
Replace with: <zz href$1</zz>

This changes every <a href...>I'm a phrase</a> tag in a document to <zz href...>I'm a phrase</zz>.

For example, <a href=\"https://www.sample.com\">I'm a phrase</a> will be replaced with
<zz href=\"https://www.sample.com\">I'm a phrase</zz>

(I don't want to change pairs without an href attribute, such as <a noPop" onClick="var e=document.createElement('script');>Another phrase</a>.)

How can I do this in PB? Currently I use this code:

Code:
Procedure.s Ereg_Replace(Text$, Pattern$, Replace$ = "", Options.l = #PB_RegularExpression_DotAll |  #PB_RegularExpression_Extended |  #PB_RegularExpression_AnyNewLine)
  Protected hRegex = CreateRegularExpression(#PB_Any, Pattern$, Options)
  Protected Dim result.s(0)
  If hRegex
    Repeat
      ReDim result(0)
      ExtractRegularExpression(hRegex, Text$, result())
      Text$ = ReplaceRegularExpression(hRegex, Text$, Replace$)
      Delay(0)
    Until ArraySize(result())=0

    FreeRegularExpression(hRegex)
  Else
    Debug "Can't create a Regex with this pattern : " + Pattern$
  EndIf
  ProcedureReturn Text$
EndProcedure

Text$ = ~"<a href=\"https://www.sample.com\">I'm a phrase</a>"

Debug Ereg_Replace(Text$, "<a href(.*?)<\/a>", "<zz href$1</zz>", #PB_RegularExpression_DotAll|#PB_RegularExpression_MultiLine|#PB_RegularExpression_NoCase)


Thank you, and sorry for my English!

_________________
PB 5.7x, PureVision User.


 Post subject: Re: Regular Expression dude
Posted: Thu Dec 13, 2018 9:18 pm
Joined: Fri Apr 20, 2012 8:09 pm
Posts: 284
You are using a back reference ($1 or \1), which does not work in PB.
ReplaceRegularExpression() simply replaces the whole match with the replacement string.

If the patterns are simple (they start with "<a href=" and end with "</a>"), you might want to do a plain find and replace.
If the tag looks more like
Code:
"<a style="margin: 0;" href="
then use a regex to match the beginning of the tag and get the length of the match,
take the rest of the string with Mid(string, lengthOfMatch),
remove the last "</a>" from that result and store it in a new string,
and finally build "<zz href" + newString + "</zz>".
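The steps above can be sketched with the match-iteration functions of the RegularExpression library (ExamineRegularExpression(), NextRegularExpressionMatch(), RegularExpressionGroup()), which as far as I know are available in PB 5.20 and later. This is only a sketch under assumptions: ReplaceTagWithHref() is a hypothetical helper name, not a PB built-in, and the pattern is the one from the first post. Group 1 plays the role of $1 in the Notepad++ replacement.

```purebasic
; Sketch: emulate "<zz href$1</zz>" by walking the matches and rebuilding the text.
; ReplaceTagWithHref() is a hypothetical helper name, not part of PureBasic.
Procedure.s ReplaceTagWithHref(Text$, TagIn$, TagOut$)
  Protected Result$, Pos, LastPos = 1
  Protected Pattern$ = "<" + TagIn$ + " href(.*?)</" + TagIn$ + ">"
  Protected Regex = CreateRegularExpression(#PB_Any, Pattern$, #PB_RegularExpression_DotAll | #PB_RegularExpression_NoCase)

  If Regex = 0
    Debug "Can't create the regex: " + RegularExpressionError()
    ProcedureReturn Text$
  EndIf

  If ExamineRegularExpression(Regex, Text$)
    While NextRegularExpressionMatch(Regex)
      Pos = RegularExpressionMatchPosition(Regex)
      ; Copy the untouched text that precedes this match
      If Pos > LastPos
        Result$ + Mid(Text$, LastPos, Pos - LastPos)
      EndIf
      ; Group 1 is what $1 would be in the Notepad++ replacement
      Result$ + "<" + TagOut$ + " href" + RegularExpressionGroup(Regex, 1) + "</" + TagOut$ + ">"
      LastPos = Pos + RegularExpressionMatchLength(Regex)
    Wend
  EndIf
  FreeRegularExpression(Regex)

  ; Copy whatever follows the last match
  ProcedureReturn Result$ + Mid(Text$, LastPos)
EndProcedure

Text$ = ~"<a href=\"https://www.sample.com\">I'm a phrase</a>" + #CRLF$
Text$ + "<a NoReplaceMe>Another phrase</a>" + #CRLF$
Debug ReplaceTagWithHref(Text$, "a", "zz")
```

Only anchors that start with "<a href" are rebuilt; everything else is copied through unchanged, so the second line of the sample should stay as it is.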

Norm.

_________________
google Translate;Makes my jokes fall flat- Fait mes blagues tombent à plat- Machte meine Witze verpuffen- Eh cumpari ci vo sunari


 Post subject: Re: Regular Expression dude
Posted: Thu Dec 13, 2018 9:44 pm
Joined: Thu Mar 25, 2004 2:15 pm
Posts: 702
Location: Spain
normeus wrote:
You are using a back reference which does not work with PB ($1 or \1).
"ReplaceRegularExpression" is just replacing the whole expression.

...
Thank you! For now, I wrote this ugly but functional code:
Code:
Procedure.s Ereg_ReplaceTags(Text$, tagIn$, tagOut$)
  Protected Pattern$
  Pattern$ = "<" + tagIn$ + " href(.*?)<\/" + tagIn$ + ">"
  Debug Pattern$
  Protected hRegex = CreateRegularExpression(#PB_Any, Pattern$, #PB_RegularExpression_DotAll|#PB_RegularExpression_MultiLine|#PB_RegularExpression_NoCase)
  Protected Dim result.s(0)
  Protected aux$, cont.l, count.l
  If hRegex
    ; ExtractRegularExpression() returns the number of matches stored in result()
    count = ExtractRegularExpression(hRegex, Text$, result())
    For cont = 0 To count - 1
      aux$ = result(cont)
      ; Rename the opening and closing tag inside the matched text
      aux$ = ReplaceString(aux$, "<" + tagIn$ + " ", "<" + tagOut$ + " ", #PB_String_NoCase)
      aux$ = ReplaceString(aux$, "</" + tagIn$ + ">", "</"+tagOut$ + ">", #PB_String_NoCase)
      ; Put the renamed version back into the full text
      Text$ = ReplaceString(Text$, result(cont), aux$)
    Next
    FreeRegularExpression(hRegex)
  Else
    Debug "Can't create a Regex with this pattern : " + Pattern$
  EndIf
  ProcedureReturn Text$
EndProcedure

Text$ = ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<a NoReplaceMe>Share</a>" + #CRLF$
Text$ + ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$

Debug Ereg_ReplaceTags(Text$, "a", "zz")

 Post subject: Re: Regular Expression dude
Posted: Thu Dec 13, 2018 10:31 pm
Joined: Sun Apr 12, 2009 6:27 am
Posts: 3458
Be careful with escaped strings.
Yours misses the "href=\" part.
Code:
Text$ = ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<a NoReplaceMe>Share</a>" + #CRLF$
Text$ + ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<p> Share</p>" + #CRLF$
Text$ + ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$


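Dim String$(0)
; Match only the tag name "a": after "<" and before " href", or between "</" and ">"
CreateRegularExpression(0, "(?<=<)a(?=\s+href)|(?<=</)a(?=>)", #PB_RegularExpression_NoCase)
For k = 1 To 100
  ReDim String$(k)
  ; Take the k-th line of the text
  String$(k) = StringField(Text$, k,#CRLF$)
  If String$(k) = ""
    Break ; stop at the first empty line (end of the text)
  ElseIf Left(String$(k),7)= "<a href"
    ; Only lines that start with "<a href" get their tag name replaced
    new$ = ReplaceRegularExpression(0, UnescapeString(string$(k)), "zz")
    String$(k) = EscapeString(new$)
  EndIf
  final$ = final$ + string$(k)+#CRLF$
Next
FreeRegularExpression(0)
Debug final$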
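A note on why the Left(...) = "<a href" check in the code above matters: the second alternative of the pattern, (?<=</)a(?=>), matches the closing tag of any anchor, with or without href. The following small sketch (variable names are only for illustration) applies the same pattern to one line of each kind:

```purebasic
; Sketch: the same lookaround pattern applied to an anchor with and without href.
Line1$ = ~"<a href=\"https://www.sample.com\">Share</a>"
Line2$ = "<a NoReplaceMe>Share</a>"

If CreateRegularExpression(0, "(?<=<)a(?=\s+href)|(?<=</)a(?=>)", #PB_RegularExpression_NoCase)
  ; Both the opening and the closing tag name match here
  New1$ = ReplaceRegularExpression(0, Line1$, "zz")
  ; Only the closing tag matches here, which would leave a mismatched pair
  New2$ = ReplaceRegularExpression(0, Line2$, "zz")
  FreeRegularExpression(0)
  Debug New1$
  Debug New2$
EndIf
```

So the pattern is only safe on lines that have already been filtered, as the loop above does; applied to the whole text in one call it would also touch the closing tag of anchors that have no href.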

_________________
Egypt my love


 Post subject: Re: Regular Expression dude
Posted: Fri Dec 14, 2018 9:23 am
Joined: Thu Mar 25, 2004 2:15 pm
Posts: 702
Location: Spain
RASHAD wrote:
Be careful with Escape String
Yours misses "href=\"
...
Thank you, RASHAD, but the original source doesn't have '\'; it's only there so I can write the input text with double quotes :)
Code:
Text$ = ~"<a href=\"https://www.sample.com\">Share</a>" + #CRLF$
or
Text$ = "<a href=" + #DQUOTE$ + "https://www.sample.com" + #DQUOTE$ +">Share</a>" + #CRLF$
I only need to get <a href="https://www.sample.com">Share</a>
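A quick way to check this, as a minimal sketch (A$ and B$ are just illustrative variable names): the ~"..." escaped literal and the #DQUOTE$ concatenation should give the same string, and no backslash is stored in it.

```purebasic
; Sketch: an escaped literal and #DQUOTE$ concatenation build the same text.
A$ = ~"<a href=\"https://www.sample.com\">Share</a>"
B$ = "<a href=" + #DQUOTE$ + "https://www.sample.com" + #DQUOTE$ + ">Share</a>"

If A$ = B$
  Debug "Same string: " + A$
EndIf

; FindString() returns 0 when the searched text is not present
Debug FindString(A$, "\")
```

The backslashes exist only in the source code of the escaped literal; the regular expression never sees them.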

PS: If your comment meant something else, excuse me; my English is limited and I often misunderstand the real meaning of comments.
