Parse HTML the Groovy way
Posted by thomas - 09/06/08 at 05:06:53 pm
Over the last couple of weeks I often had to download a lot of files that had been submitted to a web-based teaching platform. Downloading all these files by hand is very annoying, so I wrote a short Groovy script. Although Groovy has great support for parsing well-formed, XML-like data, it fails when you want to parse unstructured and nasty HTML code.
So I went looking for a Java library with an HTML parser and found TagSoup. It is a SAX-compliant HTML parser that specializes in reformatting and cleaning up faulty HTML code. For example,
This is <B>bold, <I>bold italic, </b>italic, </i>normal text
will be rewritten to
This is <b>bold, <i>bold italic, </i></b><i>italic, </i>normal text.
One advantage of plugging TagSoup into Groovy's XmlSlurper is the XPath-like query mechanism (GPath). The slurper parses the HTML code and generates an object structure representing its content, so you can navigate to individual elements. One possible example could be:
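// Hand TagSoup's SAX parser to XmlSlurper so it can read messy HTML
def slurper = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser())
def html = slurper.parse("an_example_file.html")
// Walk body > div#content > form > table#attempts
def table = html.body.div.find { it.@id == "content" }.form.table.
        find { it.@id == "attempts" }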
This retrieves the table “attempts” placed inside a form in the div “content”. While find() returns only the first match, findAll() takes a closure and returns every element that satisfies it, for example all elements with a given attribute value or with given child elements.
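To make the findAll() part a bit more concrete, here is a minimal sketch. The HTML snippet, the class name "submission" and the file names are invented for illustration, and the script assumes Groovy 3 or later (where XmlSlurper lives in groovy.xml; on older versions the import can be dropped) plus TagSoup 1.2.1 fetched via @Grab.

```groovy
// Sketch under assumptions: sample HTML, class names and hrefs are made up.
@Grab('org.ccil.cowan.tagsoup:tagsoup:1.2.1')
import org.ccil.cowan.tagsoup.Parser
import groovy.xml.XmlSlurper

// Deliberately sloppy HTML: upper-case tags, unquoted attributes, unclosed <P>
def page = '''
<HTML><BODY>
<DIV ID=content>
  <P><A CLASS=submission HREF=alice.zip>Alice</A>
  <P><A CLASS=submission HREF=bob.zip>Bob</A>
  <P><A HREF=help.html>Help</A>
  <P>No link in this paragraph
</DIV>
</BODY></HTML>
'''

def html = new XmlSlurper(new Parser()).parseText(page)

// '**' walks the whole tree depth-first; findAll keeps every node
// for which the closure returns true (here: filter by attribute value)
def submissions = html.'**'.findAll { it.name() == 'a' && it.@class == 'submission' }
def hrefs = submissions.collect { it.@href.text() }

// Filter by child elements: paragraphs that contain at least one link
def paragraphsWithLinks = html.'**'.findAll { it.name() == 'p' && it.a.size() > 0 }

println hrefs
println paragraphsWithLinks.size()
```

The depth-first search is a design choice for this sketch: it avoids spelling out the full path, which is handy when you are not sure what extra elements the cleanup step has added around your target.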
In the end I fell in love with TagSoup. It saves a lot of work when you have to access the HTML content of websites, portals, or similar services that are not able to send XHTML 1.x compliant responses. But that is another topic.
Fancy Zoom
Posted by Nils - 10/02/08 at 05:02:03 pm
Fancy Zoom is a great way of adding zoom capabilities to your website in the blink of an eye.
JavaScript-powered and cross-browser compatible.
Have a look at the demo-site!
To add Fancy Zoom to your site, follow these easy steps:
1. Download Fancy Zoom over at Cabel’s place.
2. Upload the contents of the package to your webroot.
3. Add the following JavaScript files to your HTML pages:
4. Enable the body to load the necessary JavaScript stuff:
And you’re done!
From now on, every link to an image gets zoomed. Even text links to images!
Thank you, Cabel, for this great addition!
- Nils
Theme updates
Posted by Nils - 25/01/08 at 09:01:52 pm
first geekery! *hooray*
So before anyone could even START blogging here, the theme first had to be brought up to the latest hip web-two-point-oh mega standard. Nothing like working with professionals, right?!
If your WordPress ever tells you “Your theme is not widgetized”, just follow these simple instructions:
Your sidebar probably looks something like this:
To build in WordPress’ great “Widget” feature, simply add the following to the sidebar:
<?php if ( !function_exists('dynamic_sidebar') || !dynamic_sidebar() ) ?>
and create the file functions.php in the theme directory (if it does not exist yet):
<?php
// Register the sidebar only if this WordPress version supports widgets
if ( function_exists('register_sidebar') )
    register_sidebar();
?>
The latter tells WordPress that the sidebar supports widgets.
From now on, you can configure the sidebar by drag and drop under “Presentation -> Widgets”, as we did here, for example, with the beautiful and still very empty tag cloud.
cheers, Nils
