Poco Forums • View topic - How to strip HTML from messages saved in Poco

How to strip HTML from messages saved in Poco

Post by frazmi » Mon Aug 16, 2004 12:28 pm

Elsewhere, someone asked for a way to strip HTML code out of a message before saving it. The following script does a pretty good job. It is not perfect by any means, but I use it on my system.

It depends on having a free tray utility called "clippy" running. Clippy takes whatever is on the clipboard and transforms it. One of its options is to strip HTML. See the end of this post for a link.

Code:
{ Script to strip HTML from the message body before saving it.
{ Assumes that Clippy is running...
{ Script by frazmi. Placed in public domain.
{ Please test thoroughly before committing to its use.  This script works
{ on my system, but that's the only warranty there is :-).
IntToChar $CR 13
IntToChar $LF 10
AddStrings $CRLF $CR $LF

ReadAllHeaders $headers %message
ReadBody $body %message

CopyToClipboard $body

Set $a "Message body has been copied to clipboard."
AddStrings $a $CRLF "Click on the Clippy icon to strip HTML."
AddStrings $a $CRLF "After you've done that, close this message box."
AddStrings $a $CRLF "The message body should then be replaced with the stripped text."
MessageBox $a

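{ Read the cleaned text back from the clipboard into the message body.
PasteFromClipboard $body
AssignBody %message $body

{ Replace the stored copy with the cleaned message.
DeleteMessage %message
SaveMessage %message $CurrentMailbox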

EXIT


You can find clippy here...
http://wots.coolfreepage.com/link.php

If you don't like clippy, there are a number of other ways to solve this problem. For example, you could use an external program to strip HTML (search for "HTML stripper" AND program AND download in a search engine). Then, instead of copying the message body to the clipboard, you would save it to a file, use execute and wait to pass the file to the external stripper, and read the stripped file back into Poco.
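
As a rough illustration of the external-program route, here is a minimal sketch of what such a stripper could look like. Python, the names `TextOnly`, `strip_html` and `strip_file`, and the choice to rewrite the file in place are all assumptions made for this example. None of it is part of Poco or clippy.

```python
import sys
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text content and drop tags (hypothetical helper, not a Poco API)."""

    SKIP = {"script", "style"}  # contents of these elements are not message text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in ("br", "p", "div"):
            # Keep a line break where the HTML had a visual break.
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def strip_html(html):
    """Return the text of an HTML string with all tags removed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def strip_file(path):
    """Rewrite the file at `path` in place with its HTML stripped."""
    with open(path, encoding="utf-8", errors="replace") as f:
        html = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(strip_html(html))


if __name__ == "__main__" and len(sys.argv) == 2:
    strip_file(sys.argv[1])
```

A script like this would be handed the path of the file that Poco saved, and Poco would read the same file back once the program exits.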

Post by Michael » Tue Aug 17, 2004 9:08 am

The ReadBody command itself strips the HTML, so there should be no need for third-party software. (PS: Use the ReadRawBody command to preserve HTML.)

Thus, I believe that on an incoming message all you need is:
Code:
ReadBody $Body %Message
ClearBody %Message
AssignBody %Message $Body


Caution: This has not been tested.

Post by frazmi » Tue Aug 17, 2004 12:18 pm

"Should be" is correct. The ReadBody command does not do a good job on poorly formed HTML. I had about a thousand messages that ReadBody did not strip properly because the HTML was not perfect. That is why I wrote the script above. Clippy is much more robust at actually stripping HTML, I suspect because it does not expect a "complete message" but simply looks for tags and kills them.

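The "look for tags and kill them" idea from the last post can be sketched in a few lines. This is only a guess at the general technique, not clippy's actual implementation. `kill_tags` is a hypothetical name, and Python is used purely for illustration.

```python
import html
import re

# Anything from "<" up to the next ">" is treated as a tag, whether or not
# the surrounding document is well formed.
TAG = re.compile(r"<[^>]*>")


def kill_tags(text):
    """Remove tag-like spans, then decode entities such as &amp;."""
    return html.unescape(TAG.sub("", text))


# A fragment with an unclosed <b> and no <html> or <body> wrapper.
print(kill_tags("<p>Dear <b>customer,<br>your order &amp; invoice</p>"))
```

Because nothing here depends on tags being balanced or nested correctly, imperfect HTML does not trip it up. The trade-off is that a literal "<" in the message text, followed later by a ">", would be removed along with everything between them.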
