This is topic Mayfly: Anyone here good with regular expressions? in forum Books, Films, Food and Culture at Hatrack River Forum.


To visit this topic, use this URL:
http://www.hatrack.com/ubb/main/ultimatebb.php?ubb=get_topic;f=2;t=055912

Posted by Chris Bridges (Member # 1138) on :
 
We used to use a 3rd party photo print service. We no longer do so, and I need to remove the specialized URLs that still refer to it from our website. There are a few thousand of them.

Dreamweaver allows regular expressions in its search and replace function, but as far as I can tell there is no way to use a single wildcard to find a variable amount of text.

What regular expression would I use that could find all of these chunks of code?

| <a href="http://fictitioussite.com/perl/ptp/oursite?photo_name=/downloads/photoname040408.JPG&title=Photo caption">BUY THIS PRINT</a>

| <a href="http://fictitioussite.com/perl/ptp/oursite?photo_name=/special/woohoohoo/otherphotoname072309.JPG&title=Other photo caption">BUY THIS PRINT</a>

| <a href="http://fictitioussite.com/perl/ptp/oursite?photo_name=/otherotherphotoname121206.JPG&title=Other other photo caption">BUY THIS PRINT</a>

The domain name is the same, the closing code is always the same, but after that the URLs are of wildly varying lengths and characters, and as far as I can tell DW's regex only handles single characters. Any suggestions?

I can also use Homesite or another text editor if that would help. Anything beats manually editing thousands of pages.
 
Posted by King of Men (Member # 6684) on :
 
Are you completely sure there's no "the preceding item 0 or more times" in these regexps?

Is there a "this item optional" marker? As a crude kludge, you could do (optional: any character) and do the repetition by hand.

Failing that, I can only suggest you go to some editor with a more powerful regexp system.
 
Posted by fugu13 (Member # 2859) on :
 
The Dreamweaver regex tutorial shows that it has pretty much all the standard regex capabilities: http://www.adobe.com/devnet/dreamweaver/articles/regular_expressions.html

So, to do the replace about as I guess you'd want to do it, you'd do the following.

In the find box (assuming title is always present):

code:
http://fictitioussite.com/perl/ptp/oursite\?photo_name=([^&"]+)&title=([^"]+)

And in the replace box:

code:
http://our.new/photo/url/prefix$1?title_if_we_want_it=$2

Note: not tested, but should be about right. The ? after "oursite" has to be escaped as \?, since a bare ? is a quantifier and would make the pattern miss these URLs. The . in ".com" can be left alone (too specific a location to matter).
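
For anyone who would rather script this than run it through Dreamweaver, here is a rough Python sketch of the same capture-and-rewrite idea. It is only a sketch under a few assumptions: the target prefix is the made-up placeholder from the replace box above, the bounded groups [^&"]+ and [^"]+ are one way of keeping each group inside its own field, and Python spells the backreferences \1 and \2 where Dreamweaver uses $1 and $2.

```python
import re

# Find pattern: the "?" is escaped, and each group is bounded so it stops
# at the "&" or at the closing quote (an assumption about the markup).
FIND = re.compile(
    r'http://fictitioussite\.com/perl/ptp/oursite'
    r'\?photo_name=([^&"]+)&title=([^"]+)'
)

# Placeholder target from the replace box; \1 and \2 are Python's
# equivalents of $1 and $2.
REPLACE = r'http://our.new/photo/url/prefix\1?title_if_we_want_it=\2'

sample = ('| <a href="http://fictitioussite.com/perl/ptp/oursite'
          '?photo_name=/downloads/photoname040408.JPG&title=Photo caption">'
          'BUY THIS PRINT</a>')

rewritten = FIND.sub(REPLACE, sample)
print(rewritten)
```

The link text and the rest of the tag are left alone here; only the URL inside the href is rewritten.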
 
Posted by Chris Bridges (Member # 1138) on :
 
What I want to do is find all those URLs and remove them entirely, so that

Photo by Photographer Name | <a href="http://fictitioussite.com/perl/ptp/oursite?photo_name=/downloads/photoname040408.JPG&title=Photo caption">BUY THIS PRINT</a>

becomes

"Photo by Photographer Name"
 
Posted by King of Men (Member # 6684) on :
 
If Dreamweaver has the usual regexp options as fugu says, then I'm confused about what the problem is. Can you give some detail on what you've tried so far, and what result you got that you didn't like?
 
Posted by fugu13 (Member # 2859) on :
 
That should be easy, then. Just add more surrounding stuff to the first regex, then put nothing in the replace field and do a full search/replace.
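
A minimal Python sketch of that suggestion, for illustration only: the URL pattern is wrapped in the markup that surrounds it, and the replacement is an empty string. The optional leading space (so the text before the bar is left clean) and the bounded [^&"]+ and [^"]* pieces are assumptions on my part, not something tested in Dreamweaver.

```python
import re

# URL pattern plus the surrounding markup: an optional space, the "| ",
# the opening <a href="...">, and the fixed closing link text.
WRAPPED = re.compile(
    r' ?\| <a href="http://fictitioussite\.com/perl/ptp/oursite'
    r'\?photo_name=[^&"]+&title=[^"]*">BUY THIS PRINT</a>'
)

sample = ('Photo by Photographer Name | <a href="http://fictitioussite.com'
          '/perl/ptp/oursite?photo_name=/downloads/photoname040408.JPG'
          '&title=Photo caption">BUY THIS PRINT</a>')

# An empty replacement deletes the whole link, vertical bar included.
cleaned = WRAPPED.sub('', sample)
print(cleaned)
```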
 
Posted by Shmuel (Member # 7586) on :
 
Search for:

\| <a href="http://fictitioussite\.com/perl/ptp/oursite\?.+?PRINT</a>

Replace with nothing.

(Assuming that there are no links you want to keep to that site and path which are preceded by a vertical bar and a space.)
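
The reason for .+? rather than .+ can be sketched in Python, whose quantifiers behave the same way on this point. The sample line with two links on it and the short file names are invented for the demonstration.

```python
import re

PREFIX = r'\| <a href="http://fictitioussite\.com/perl/ptp/oursite\?'

# Lazy: .+? stops at the first "PRINT</a>" it can reach.
LAZY = re.compile(PREFIX + r'.+?PRINT</a>')
# Greedy: .+ runs on to the last "PRINT</a>" on the line.
GREEDY = re.compile(PREFIX + r'.+PRINT</a>')

# Hypothetical line carrying two links, to show the difference.
line = ('Photo A | <a href="http://fictitioussite.com/perl/ptp/oursite'
        '?photo_name=/a.JPG&title=A">BUY THIS PRINT</a> '
        'Photo B | <a href="http://fictitioussite.com/perl/ptp/oursite'
        '?photo_name=/b.JPG&title=B">BUY THIS PRINT</a>')

lazy_result = LAZY.sub('', line)      # removes each link separately
greedy_result = GREEDY.sub('', line)  # removes everything between them too

print(repr(lazy_result))
print(repr(greedy_result))
```

With the lazy version both credits survive and both links go; the greedy version also eats the "Photo B" text sitting between the two links.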
 
Posted by Chris Bridges (Member # 1138) on :
 
Shmuel's worked perfectly, thanks!

That was the first one I tried when I got in this morning, so fugu13's may work just as well. I'll try that elsewhere (first one is still replacing).
 
Posted by fugu13 (Member # 2859) on :
 
Mine wouldn't work for what you want without tweaking as I indicated, which would basically make it Shmuel's.
 


Copyright © 2008 Hatrack River Enterprises Inc. All rights reserved.
Reproduction in whole or in part without permission is prohibited.

