Peter Breuls's Weblog

 Tuesday, 23 August 2005

Converting HTML to plaintext

This afternoon I wrote a simple PHP function to convert an HTML-formatted message to plain text. I'm using it to create a non-HTML version of a newsletter. It strips out all the HTML, but before it does that, it captures all links (modeled after this method) and turns HTML list items into plaintext list items. I thought I'd share the code:

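function html2text($text, $wrap = 0) {
    // Collect simple paired tags (<tag ...>text</tag>) so links can be captured before stripping.
    preg_match_all("/(<([\w]+)[^>]*>)([^<]*)(<\/\\2>)/", $text, $matches, PREG_SET_ORDER);
    // Convert line breaks, paragraphs and list markup to plain-text equivalents.
    $text = str_replace("<br />", "\n", $text);
    $text = str_replace("<br>", "\n", $text);
    $text = str_replace("<BR>", "\n", $text);
    $text = str_replace("<p>", "\n\n", $text);
    $text = str_replace("<P>", "\n\n", $text);
    $text = str_replace("<LI>", "\n * ", $text);
    $text = str_replace("<li>", "\n * ", $text);
    $text = str_replace("</LI>", "", $text);
    $text = str_replace("</li>", "", $text);
    $text = str_replace("</UL>", "\n\n", $text);
    $text = str_replace("</ul>", "\n\n", $text);
    // Replace each link with its label followed by a numbered reference.
    $urlcount = 0;
    $urllist = array();
    foreach ($matches as $val) {
        if ($val[2] == "a" || $val[2] == "A") {
            // \\1 (not \1) so the regex engine receives a backreference to the opening quote.
            if (preg_match_all("|href\=([\"'`])(.+?)\\1|i", $val[1], $urls)) {
                $urlcount++;
                $urllist[$urlcount] = $urls[2][0];
                $text = str_replace($val[0], $val[3] . " [$urlcount]", $text);
            }
        }
    }
    // Strip whatever HTML is left.
    $text = strip_tags($text);
    // Only wrap when a width was given.
    if ($wrap > 0) {
        $text = wordwrap($text, $wrap, "\n");
    }
    // Append the numbered list of captured URLs.
    foreach ($urllist as $key => $url) {
        $text .= "\n[" . $key . "] " . $url;
    }
    return $text;
}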

Feel free to use it if you need it. Please mind that the code is wrapped to fit on this page.
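
For anyone who only needs the link-capturing step, here is a minimal standalone sketch of that idea. It is not the function from this post: the name links_to_footnotes is made up for illustration, it uses preg_replace_callback instead of the match-then-replace approach, and it assumes PHP 7.1 or later and hrefs quoted with single or double quotes.

```php
<?php
// Hypothetical helper (not part of the original post): replaces every
// <a href="...">label</a> with "label [n]" and collects the URLs in order.
function links_to_footnotes(string $html): array
{
    $urls = [];
    $text = preg_replace_callback(
        // Group 1: opening quote, group 2: URL, \1: matching closing quote, group 3: link label.
        '~<a\s[^>]*href=(["\'])(.+?)\1[^>]*>([^<]*)</a>~i',
        function (array $m) use (&$urls): string {
            $urls[] = $m[2];
            return $m[3] . ' [' . count($urls) . ']';
        },
        $html
    );
    return [$text, $urls];
}

[$text, $urls] = links_to_footnotes(
    'Read <a href="http://example.com/">this</a> and <a href=\'http://example.org/\'>that</a>.'
);
echo $text, "\n"; // Read this [1] and that [2].
foreach ($urls as $i => $url) {
    echo '[', $i + 1, '] ', $url, "\n";
}
```

The pattern is written in a single-quoted string so that \1 reaches the regex engine as a backreference; in a double-quoted PHP string it would have to be written as \\1.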

Planning ahead

Systems Engineer: How long will it take for you to implement [the customer]'s changes?
Engineer: About two-three weeks. So four weeks.
Systems Engineer: Good. And how long will it take you to make your changes?
Intern: Well, I already did it, and it took an hour.
Systems Engineer: Okay, I'll tell them five weeks total.
Sounds like an everyday situation at work.

Google Desktop 2

The question is: does Google Desktop 2 work with Firefox? It doesn't say.

Maybe I should try it out. Looks interesting, with the sidebar and all.