[osflash] Find/replace xml tags - best approach?
Glen Pike
postmaster at glenpike.co.uk
Mon Oct 8 10:31:00 PDT 2007
Hi,
This is not AS3, but since you mentioned PHP in a previous message,
here is a PHP option that may help.
I wrote a "scraper" in PHP, based on an example in the PEAR
XML_HTMLSax package, to rewrite bad HTML into XHTML.
http://pear.php.net/package/XML_HTMLSax/
This was a server-side script that read pages from the NYTimes site,
pulled out segments of HTML between known points, and cleaned up
unclosed paragraphs and bad <br /> tags.
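To give a rough idea of the SAX-style approach without opening the ZIP, here is a minimal sketch in Python rather than PHP. It is not the code from my classes and does not use the XML_HTMLSax API; the names XhtmlRewriter, VOID_TAGS and to_xhtml are made up for illustration. It only handles the two cases mentioned above: paragraphs left open and sloppy <br> tags.

```python
from html import escape
from html.parser import HTMLParser

# Tags written as self-closing in XHTML (a deliberately short, assumed list).
VOID_TAGS = {"br", "hr", "img"}


class XhtmlRewriter(HTMLParser):
    """Event-driven rewriter: re-emits markup as XHTML, closing open tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of rewritten markup
        self.open_tags = []  # stack of currently open non-void tags

    def _close_through(self, tag):
        # Pop and close tags until the named tag has been closed.
        while self.open_tags:
            closed = self.open_tags.pop()
            self.out.append("</%s>" % closed)
            if closed == tag:
                break

    def handle_starttag(self, tag, attrs):
        # A new <p> implicitly ends a paragraph that was never closed.
        if tag == "p" and "p" in self.open_tags:
            self._close_through("p")
        attr_text = "".join(
            ' %s="%s"' % (name, escape(name if value is None else value, quote=True))
            for name, value in attrs
        )
        if tag in VOID_TAGS:
            self.out.append("<%s%s />" % (tag, attr_text))
        else:
            self.out.append("<%s%s>" % (tag, attr_text))
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Ignore end tags for void elements (e.g. </br>) and stray closers.
        if tag in VOID_TAGS or tag not in self.open_tags:
            return
        self._close_through(tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def finish(self):
        # Close anything still open at the end of the fragment.
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())


def to_xhtml(fragment):
    parser = XhtmlRewriter()
    parser.feed(fragment)
    parser.close()
    parser.finish()
    return "".join(parser.out)


print(to_xhtml("<p>One<br>two<p>Three</p>"))
# prints: <p>One<br />two</p><p>Three</p>
```

The point is that the parser hands you start-tag, end-tag and text events, and you decide what to write back out, so a missing </p> is repaired by the handler rather than by pattern matching on the raw string.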
I guess this may be a similar application. You will need the PHP
PEAR libraries installed, which can be fiddly if you don't have
shell access, but there is a PEAR installer that runs from an HTML page
you can visit.
Worth a try if you have used PEAR before.
Also, someone once recommended not using regexes to parse other
people's XML / XHTML / HTML because they break so easily. If you are
using regexes on your own markup, they may suffice, provided your XML
is very strict and does not break the "rules".
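As a small illustration of how a regex can break on nested tags, here is a hedged sketch, again in Python rather than PHP or AS3. The sample markup and the function names (regex_rename, tree_rename, is_well_formed) are invented for the example. It renames <b> to <strong> once with a typical non-greedy regex and once by walking the parsed tree.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample: a <b> nested inside another <b>.
SAMPLE = "<p><b>outer <b>inner</b> tail</b></p>"


def regex_rename(markup, old="b", new="strong"):
    # Typical "magic" regex: match an opening tag up to the nearest closing tag.
    pattern = r"<%s>(.*?)</%s>" % (re.escape(old), re.escape(old))
    return re.sub(pattern, r"<%s>\1</%s>" % (new, new), markup)


def tree_rename(markup, old="b", new="strong"):
    # Parse first, then rename each matching element wherever it is nested.
    root = ET.fromstring(markup)
    for elem in list(root.iter(old)):
        elem.tag = new
    return ET.tostring(root, encoding="unicode")


def is_well_formed(markup):
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(regex_rename(SAMPLE))
# prints: <p><strong>outer <b>inner</strong> tail</b></p>
print(tree_rename(SAMPLE))
# prints: <p><strong>outer <strong>inner</strong> tail</strong></p>
print(is_well_formed(regex_rename(SAMPLE)), is_well_formed(tree_rename(SAMPLE)))
# prints: False True
```

The regex pairs the outer opening tag with the inner closing tag, so the result is no longer well-formed. The tree version only works if the input parses as XML in the first place, which is the "very strict" condition above.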
I uploaded a ZIP file with the classes in for you to look at. It is
well commented so you may be able to see what's going on and if it is
useful.
http://glenpike.co.uk/sd/HTML/HTMLHandler.zip
HTH
Glen
Alias™ wrote:
> Yeah, funny, I've come to the same conclusion myself. I'm looking into
> the php libxslt right now, actually.
>
> Now, all I have to do is learn XSLT again... not touched it for years.
>
> Cheers!
> Alias
>
> On 08/10/2007, Peter Hall <peter.hall at memorphic.com> wrote:
>
>> The first two choices are pretty much the same thing, just different
>> ways of selecting the nodes in the first place. Once you have them,
>> it's still a bit of a pain to replace nodes.
>>
>> The best solution is probably XSLT, depending on how complex the
>> transform that you actually want to do. It could just be overkill. I
>> am planning to build a full XSLT implementation at some point, but it
>> won't be any time in the next few months unless there are
>> volunteers...
>>
>> Peter
>>
>>
>> On 10/8/07, Alias™ <alias at proalias.com> wrote:
>>
>>> Hi guys,
>>>
>>> I'm wondering if anyone has any opinions on this. I'm faced with the
>>> need to search and replace a bunch of XML tags in an AS3 project. The
>>> tags are going to be nested, and will probably be basic HTML elements,
>>> and replacing them with other html elements. For various reasons it
>>> seems that this is necessary because of the project's localisation
>>> goals.
>>>
>>> I'm currently considering the following options:
>>>
>>> - native E4X
>>>   pros: built in, simple
>>>   cons: not really powerful enough without writing a lot of code
>>> - the memorphic xpath library (http://www.memorphic.com/news/?page_id=16)
>>>   pros: xpath is nice and what I'm used to
>>>   cons: might be using a sledgehammer to crack a nut
>>> - native regex
>>>   pros: built in, lots of prewritten magic regexes which could do the job
>>>   cons: lots of prewritten magic regexes which could do the job,
>>>   but might also mysteriously fail further down the line
>>>
>>> Has anyone had any experiences with this that they'd like to share?
>>>
>>> Thanks in advance,
>>> Alias
>>>
>>> _______________________________________________
>>> osflash mailing list
>>> osflash at osflash.org
>>> http://osflash.org/mailman/listinfo/osflash_osflash.org
>>>
>>>