Poco Forums • View topic - Running Whois Query against URL in Body

Running Whois Query against URL in Body

Scripting questions and ideas

Moderators: Eric, Tomas, robin, Michael

Post by spamsmasher » Thu Dec 10, 2009 9:32 am

OK...maybe not "fun", but definitely interesting...

What would be the best way to grab a URL from the body of an email, dump it to whois, and then append the results to an email?

I see the biggest challenge as parsing the URL from the body of the email and dumping it to whois. Not only does the script have to find the URL (look for "http://" perhaps?), but it also has to take everything between "http://" and the first slash. Otherwise, the whois tool wouldn't know what to do with a full URL such as http://www.example.com/path/to/something.htm.
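
As a rough illustration of the "everything between http:// and the first slash" step, here is a minimal Python sketch (Python rather than PocoScript is an assumption, and the helper name host_from_url is made up for this example). It leans on the standard library's URL splitter instead of hand-rolled string searching:

```python
from urllib.parse import urlsplit


def host_from_url(url):
    """Return the lowercased host part of a URL, or None if there is none."""
    # urlsplit only recognises a host when a scheme such as http:// is present;
    # .hostname also drops any port number or user:password@ prefix.
    return urlsplit(url).hostname


print(host_from_url("http://www.example.com/path/to/something.htm"))
# prints: www.example.com
```

Note that this returns the full host (www.example.com), which is not necessarily what a whois server expects; trimming it down to the registered domain is a separate problem.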

The other issue I see is differentiating between HTML and plaintext emails. I guess the script would have to check for <a href="..."> tags first and, if any exist, take the http URL from the href attribute rather than from the visible text.
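
One possible way to handle both kinds of body, sketched in Python under the assumption that the body text is already available as a string (HrefCollector and find_urls are hypothetical names): try the HTML route first, and fall back to a plain pattern search only when no href links turn up.

```python
import re
from html.parser import HTMLParser

# Good-enough pattern for bare links in plain text. It will also swallow
# trailing punctuation such as a full stop right after the link.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


class HrefCollector(HTMLParser):
    """Collects http(s) targets of <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith(("http://", "https://")):
                self.hrefs.append(value)


def find_urls(body):
    """Return href targets if the body has any, else bare URLs found in the text."""
    collector = HrefCollector()
    collector.feed(body)
    collector.close()
    if collector.hrefs:
        return collector.hrefs
    return URL_RE.findall(body)
```

For example, find_urls('<a href="http://www.example.com/a">here</a>') yields the href target, while a plain-text body falls through to the pattern search.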

Also, I think it'd be ideal to only perform one whois per message. If multiple links exist within the body, there'd be a painful amount of whois data appended in the email.
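
Limiting the lookups could be as simple as de-duplicating the hosts and stopping at the first one. A small Python sketch, assuming the links have already been collected into a list (hosts_to_query and its limit parameter are invented for this example):

```python
from urllib.parse import urlsplit


def hosts_to_query(urls, limit=1):
    """Return up to `limit` distinct hosts, in the order they first appear."""
    seen = []
    for url in urls:
        host = urlsplit(url).hostname
        if host and host not in seen:
            seen.append(host)
        if len(seen) >= limit:
            break
    return seen


links = [
    "http://a.example.com/x",
    "http://a.example.com/y",
    "http://b.example.org/",
]
print(hosts_to_query(links))
# prints: ['a.example.com']
```

Raising limit to 2 or 3 would allow a few lookups per message without flooding it with whois output.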

I've been looking at the DNSBL script for insight, but I'm not experienced or talented enough with Pocoscript to figure this out.

Any thoughts on this? Unrealistic?
spamsmasher
Drop-in Visitor
 
Posts: 9
Joined: Thu Nov 19, 2009 4:20 am

Post by MarkB » Tue Dec 15, 2009 5:19 am

It's still possible that someone with a knowledge of net tools and plenty of time will drop by here, but in case that never happens, rather than leave your post dangling...

This is a substantial undertaking, and 99% of the effort would fall outside the realm of PocoScript, I think. Here are a few possible homework assignments.

1. Find a program that retrieves whois info, accepts input via command line or file, and outputs its findings to a file. And get it working. Alternatively, find enough info to enable a do-it-yourselfer.
http://www.google.com/search?q=whois%20client

2. Find a regular expression that matches all URLs and only URLs, and extracts the domain name, and is above reproach. It extracts the entire domain name (e.g., bbc.co.au) and nothing but the domain name (e.g., blog.microsoft.com is too much). It is compatible with your chosen programming language. Do this and you will deserve a fancy merit badge.
http://www.google.com/search?q=%22regular+expression%22+match|find+URL
Maybe a good-enough, win-some, lose-some routine in PocoScript would suffice for your purposes if someone had time to write it.

3. Decide what to do with the file. Append it to the message? Good with plain-text messages; sloppy with HTML messages. Attach it to the message, if that is approved for already received messages?
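
For the "good-enough, win-some, lose-some" version of assignment 2, something along these lines might do. It is a Python sketch, not PocoScript, and the tiny suffix table is an assumption made up for illustration; a robust solution would consult the Public Suffix List instead of a hand-written set.

```python
# Illustrative only: a handful of two-level suffixes. "co.au" is listed
# purely because the post above uses bbc.co.au as its example.
TWO_LEVEL_SUFFIXES = {"co.uk", "com.au", "co.nz", "co.jp", "co.au"}


def registrable_domain(host):
    """Guess the registered domain of a host name, or return None."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) < 2:
        return None
    # Keep three labels when the last two form a known two-level suffix,
    # otherwise keep two.
    keep = 3 if ".".join(labels[-2:]) in TWO_LEVEL_SUFFIXES else 2
    if len(labels) < keep:
        return None
    return ".".join(labels[-keep:])


print(registrable_domain("blog.microsoft.com"))
# prints: microsoft.com
print(registrable_domain("www.bbc.co.uk"))
# prints: bbc.co.uk
```

This loses on anything the table does not know about and on raw IP addresses, which is the trade-off the "lose-some" label implies.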

In the scenario I envisage, the role of PocoScript is to hand over a raw message to an external script that supports regular expressions, which in turn calls the whois program. When that program finishes, the PocoScript will retrieve the output file and do something with it.
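
The external-script half of that scenario might look roughly like the Python sketch below. Everything here is an assumption for illustration: the function names, the choice of whois.iana.org as the server, and doing the whois exchange directly over TCP port 43 rather than calling a separate whois program. How PocoScript would hand over the raw message and pick up the result is not shown.

```python
import re
import socket
from email import message_from_string
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def whois_query(domain, server="whois.iana.org", timeout=10):
    """Plain whois exchange: send the query plus CRLF, read until the server closes."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def first_host(raw_message):
    """Return the host of the first http(s) URL in any text part, or None."""
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        try:
            text = payload.decode(part.get_content_charset() or "utf-8", errors="replace")
        except LookupError:
            # Unknown charset label in the message: fall back to UTF-8.
            text = payload.decode("utf-8", errors="replace")
        match = URL_RE.search(text)
        if match:
            return urlsplit(match.group(0)).hostname
    return None


def whois_report(raw_message, lookup=whois_query):
    """Build the text block to append; empty string if the message has no URL."""
    host = first_host(raw_message)
    if host is None:
        return ""
    return "--- whois for %s ---\n%s" % (host, lookup(host))
```

The lookup parameter exists so the network call can be swapped out, for example with a function that shells out to a command-line whois client. In practice the host would probably need trimming to its registered domain before the query, and the chosen server may only return a referral to another whois server.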

Unrealistic? I believe so, yes.
MarkB
Poco Enthusiast
 
Posts: 217
Joined: Mon Aug 09, 2004 1:31 am
Location: Canada

Post by spamsmasher » Tue Dec 15, 2009 8:33 am

MarkB wrote: Unrealistic? I believe so, yes.

I appreciate your insight.
spamsmasher
Drop-in Visitor
 
Posts: 9
Joined: Thu Nov 19, 2009 4:20 am

