[osflash] Find/replace xml tags - best approach?
postmaster at glenpike.co.uk
Mon Oct 8 10:31:00 PDT 2007
This is not AS3, but since you mentioned PHP in a previous message,
here is a PHP option that may help...
I wrote a "scraper" in PHP, based on an example in the PEAR
XML_HTMLSax package, to rewrite bad HTML into XHTML.
This was a server-side thing that read pages from the NYTimes site,
pulled out segments of HTML between known points, and cleaned up
unclosed paragraphs and bad <br /> tags.
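To give a rough idea of what that kind of SAX-style clean-up looks like,
here is a minimal sketch. It is not the PHP code from my ZIP and does not
use XML_HTMLSax; it assumes Python's standard html.parser module as a
stand-in, and the names XhtmlRewriter, to_xhtml and VOID_TAGS are made up
for illustration. It only closes unclosed paragraphs and normalises
<br>-style tags.

```python
from html import escape
from html.parser import HTMLParser

# Tags written as self-closing in XHTML (illustrative subset, not exhaustive).
VOID_TAGS = {"br", "hr", "img"}


class XhtmlRewriter(HTMLParser):
    """Event-driven rewriter: re-emits tags as they arrive, fixing them up."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # output fragments, joined at the end
        self.open_tags = []  # stack of elements still waiting for a close tag

    def _close_through(self, tag):
        # Pop and close everything up to and including the named tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_starttag(self, tag, attrs):
        attr_text = "".join(
            ' %s="%s"' % (name, escape(value or "", quote=True))
            for name, value in attrs
        )
        if tag in VOID_TAGS:
            self.out.append("<%s%s />" % (tag, attr_text))
            return
        if tag == "p" and "p" in self.open_tags:
            # A new paragraph implicitly ends an unclosed one.
            self._close_through("p")
        self.out.append("<%s%s>" % (tag, attr_text))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Ignore stray closes and closes for void tags such as </br>.
        if tag in VOID_TAGS or tag not in self.open_tags:
            return
        self._close_through(tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        # Close anything the source left open.
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())
        return "".join(self.out)


def to_xhtml(html_text):
    rewriter = XhtmlRewriter()
    rewriter.feed(html_text)
    rewriter.close()  # flush any buffered trailing text
    return rewriter.result()


if __name__ == "__main__":
    print(to_xhtml("<p>one<br>two<p>three<br/>"))
```

Comments and doctypes are simply dropped here, which a real scraper
would probably want to handle.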
I guess this may be a similar application. You will need the PHP
PEAR libraries installed, which can be fiddly if you don't have
shell access, but there is a PEAR installer that runs from an HTML page
that you can visit.
Worth a try if you have used PEAR before.
Also, someone once recommended not using regexes to parse other
people's XML / XHTML / HTML because it breaks so easily. If you are
using regexes on your own code, they may suffice, provided your XML is
very strict and does not break the "rules".
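As an illustration of why a parser copes with nesting where a regex tends
not to, here is a small sketch of tag replacement on a parsed tree. It is
Python rather than AS3 or PHP, it assumes well-formed XML, and the
replace_tags function and the TAG_MAP mapping are hypothetical names
picked for the example, not anything from the original project.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping of old tag names to replacements (not from the thread).
TAG_MAP = {"b": "strong", "i": "em"}


def replace_tags(xml_text, tag_map=TAG_MAP):
    """Parse well-formed XML and rename every element listed in tag_map."""
    root = ET.fromstring(xml_text)
    # iter() visits the root and all descendants, so nested tags are covered.
    for element in root.iter():
        if element.tag in tag_map:
            element.tag = tag_map[element.tag]
    # Text, tails and attributes are untouched; only element names change.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<p>Some <b>bold <i>nested</i></b> text</p>"
    print(replace_tags(sample))
    # prints: <p>Some <strong>bold <em>nested</em></strong> text</p>
```

If the input is not well formed, ET.fromstring raises a ParseError, which
at least fails loudly instead of quietly mangling the markup.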
I uploaded a ZIP file with the classes in for you to look at. It is
well commented so you may be able to see what's going on and if it is
> Yeah, funny, I've come to the same conclusion myself. I'm looking into
> the php libxslt right now, actually.
> Now, all I have to do is learn XSLT again... not touched it for years.
> On 08/10/2007, Peter Hall <peter.hall at memorphic.com> wrote:
>> The first two choices are pretty much the same thing, just different
>> ways of selecting the nodes in the first place. Once you have them,
>> it's still a bit of a pain to replace nodes.
>> The best solution is probably XSLT, depending on how complex the
>> transform that you actually want to do. It could just be overkill. I
>> am planning to build a full XSLT implementation at some point, but it
>> won't be any time in the next few months unless there are
>> On 10/8/07, Alias™ <alias at proalias.com> wrote:
>>> Hi guys,
>>> I'm wondering if anyone has any opinions on this. I'm faced with the
>>> need to search and replace a bunch of XML tags in an AS3 project. The
>>> tags are going to be nested, and will probably be basic HTML elements,
>>> and replacing them with other html elements. For various reasons it
>>> seems that this is necessary because of the project's localisation
>>> I'm currently considering the following options:
>>> - native E4X
>>> pros:built in, simple
>>> cons:not really powerful enough without writing a lot of code
>>> - the memorphic xpath library (http://www.memorphic.com/news/?page_id=16)
>>> pros:xpath is nice and what I'm used to
>>> cons:might be using a sledgehammer to crack a nut
>>> - native regex:
>>> pros:built in, lots of prewritten magic regexes which could do the job
>>> cons: lots of prewritten magic regexes which could do the job,
>>> but might also mysteriously fail further down the line
>>> Has anyone had any experiences with this that they'd like to share?
>>> Thanks in advance,
>>> osflash mailing list
>>> osflash at osflash.org