now there's an idea
Jan. 14th, 2004 03:11 pm
The redoubtable Dave Zuckerman writes:
I'm not a huge blog fan. Every once in a while, just to be contrarian, I think about doing periodic .plan updates, just so I can tell people who ask if I blog that they'll have to finger me to find out! Bwahahahah!
Now i'm tempted to figure a way to stick my latest LJ entry into my .plan.
No problemo!
Date: 2004-01-14 06:37 pm (UTC)
HTH! HAND! SEND ME A DOLLAR!
Re: No problemo!
Date: 2004-01-14 07:08 pm (UTC)
While I'll grant that "the entire damn page" does indeed include "latest entry", I'm not sure this is what was meant :)
rone - what client do you use? the update.bml web page? Because a client side hack is easiest, but a cron post harvester should be easy enough using Simes' LJ::Simple.
sol.
.
Re: No problemo!
Date: 2004-01-14 07:36 pm (UTC)
The serious way involves an XML parser - the easy way involves the LWP trick with the rss page and split() :)
But I'm not going to do it as an on-the-fly one liner :)
sol.
.
Re: No problemo!
Date: 2004-01-14 09:26 pm (UTC)
Re: No problemo!
Date: 2004-01-14 09:34 pm (UTC)
Then either LJ now includes the entire post in the rss description for everyone (it used to truncate unpaid users), or you have short posts :)
Seems they include the whole post for everyone. Oh huzzah.
If you're serious, consider it done - I have code for much of this at home, so given a weekend it should be doable.
sol.
.
Re: No problemo!
Date: 2004-01-14 10:11 pm (UTC)
Re: No problemo!
Date: 2004-01-14 10:19 pm (UTC)
sol.
.
While you americans sleep ...
Date: 2004-01-15 02:38 am (UTC)
9:35pm@kittling~...perl/rone>roneme -u ronebofh
_now there's an idea_
The redoubtable Dave Zuckerman (http://www.panix.com/~daz/) writes:
Now i'm tempted to figure a way to stick my latest LJ entry into my .plan (http://www.vetmed.auburn.edu/cgi-bin/finger?rone%40ennui.org).
===================================
Now for the *hard* cutting and pasting - stripping the HTML.
sol.
.
munch, munch
Date: 2004-01-15 03:15 am (UTC)
_now there's an idea_
The redoubtable Dave Zuckerman writes:
I'm not a huge blog fan. Every once in a while, just to be
contrarian, I think about doing periodic .plan updates, just so I
can tell people who ask if I blog that they'll have to finger me
to find out! Bwahahahah!
Now i'm tempted to figure a way to stick my latest LJ entry into my
.plan.
===================================
The formatting may need tweaking, but I'm sure you can pipe it to fmt before .planning it.
Oh, and:
9:50pm@kittling~...perl/rone>roneme -t -i2 -u ronebofh
_now there's an idea_
The redoubtable Dave Zuckerman writes:
I'm not a huge blog fan. Every once in a while, just to be
contrarian, I think about doing periodic .plan updates, just so I
can tell people who ask if I blog that they'll have to finger me
to find out! Bwahahahah!
Now i'm tempted to figure a way to stick my latest LJ entry into my
.plan.
===================================
_blogulate_
Bob Mould has a Web log. His new stuff doesn't appeal to me,
unfortunately. Any comments from anyone out there who's a fan of his
old AND his new stuff?
===================================
You'll need:
#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use Getopt::Std;
use XML::Simple;
use HTML::FormatText;
Which I consider pretty standard, but then I do this shit for a living.
Let me know your email address (I'm solitaire@tygger.net), and I'll let you know where to pick up the first cut. I'd like to polish it a little before I release it to the world at large.
Though it's 77 lines of perl. I could just email it to you :)
sol.
.
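The "pipe it to fmt" step mentioned above can also be done inside the script. A minimal sketch, assuming a 72-column .plan is wanted and using the core Text::Wrap module; format_entry is a hypothetical helper name and is not taken from roneme:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Assumption: wrap to 72 columns, roughly what fmt would produce.
$Text::Wrap::columns = 72;

# Hypothetical helper: returns the title line plus the wrapped body.
sub format_entry {
    my ($title, $body) = @_;
    # Collapse existing whitespace so old line breaks don't survive.
    $body =~ s/\s+/ /g;
    $body =~ s/^ | $//g;
    return "_${title}_\n" . wrap('', '', $body) . "\n";
}

my $body = "I'm not a huge blog fan. Every once in a while, just to be "
         . "contrarian, I think about doing periodic .plan updates, just "
         . "so I can tell people who ask if I blog that they'll have to "
         . "finger me to find out! Bwahahahah!";

print format_entry("now there's an idea", $body);
```

The output goes to STDOUT, so it can be redirected into ~/.plan or handed to whatever writes the file.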
Re: No problemo!
Date: 2004-01-14 08:00 pm (UTC)
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
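# get() returns undef on failure, so check before matching.
my $content = get("http://www.livejournal.com/users/ronebofh/");
defined $content or die "can't fetch the journal page";
# Grab the first entry div; this depends on the journal's layout.
if ($content =~ m|<div class="entry">(.*?)</div>|ms)
{
    my $latest_entry = $1;
    #strip tags and crap
    $latest_entry =~ s/<.*?>//gs;
    $latest_entry =~ s/&nbsp;/ /g;
    open my $plan, '>', "$ENV{HOME}/.plan" or die "can't open .plan: $!";
    print $plan $latest_entry;
    close $plan or die "can't close .plan: $!";
}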
__END__
Re: No problemo!
Date: 2004-01-14 08:25 pm (UTC)
It breaks as soon as he changes layout/format. Or at least has the potential to break - some layouts don't even use the div.
Splitting the RSS feed is the way to go.
my $content = get("http://livejournal.com/users/ronebofh/rss");
Then split on "", chop off everything after , de-htmlise as you do, and you're good to go. Though I'd run it through HTML::Parser to expand the entities rather than trusting to , and I'd use an XML parser anyway.
sol.
.
Re: No problemo!
Date: 2004-01-14 08:28 pm (UTC)
"split on <description>", and you want the second element of the array thus produced. Then discard everything after "</description>", etc.
I can't be expected to be thinking right, I'm at work!
sol.
.
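The split-on-<description> trick described above might look roughly like this. It is a sketch only: the feed is a made-up sample standing in for what get() would return, and latest_description and dehtmlise are hypothetical names. In this sample the channel carries its own <description>, so the newest entry is the third element of the split (index 2); with a feed whose channel has none, it would be the second element as stated above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample feed, standing in for the text get() would return.
my $rss = <<'XML';
<?xml version="1.0"?>
<rss version="2.0">
<channel>
  <title>sample journal</title>
  <description>sample journal - LiveJournal.com</description>
  <item>
    <title>first post</title>
    <description>Hello &lt;b&gt;world&lt;/b&gt;</description>
  </item>
  <item>
    <title>older post</title>
    <description>Older entry</description>
  </item>
</channel>
</rss>
XML

# Hypothetical helper: pull the newest item's description out by split().
sub latest_description {
    my ($feed) = @_;
    # Element 0 is everything before the first <description>, element 1
    # is the channel's own description, element 2 starts the newest item.
    my @parts = split /<description>/, $feed;
    my $text = $parts[2];
    return unless defined $text;
    # Discard everything from the closing tag onwards.
    $text =~ s{</description>.*}{}s;
    return $text;
}

# Hypothetical helper: undo the XML escaping, then strip the HTML tags.
# Only a handful of entities are handled; real entries would want a
# proper decoder such as HTML::Entities.
sub dehtmlise {
    my ($text) = @_;
    $text =~ s/&lt;/</g;
    $text =~ s/&gt;/>/g;
    $text =~ s/&quot;/"/g;
    $text =~ s/&amp;/&/g;    # last, so "&amp;lt;" is not decoded twice
    $text =~ s/<[^>]*>//g;
    return $text;
}

print dehtmlise(latest_description($rss)), "\n";
```

Like the div-matching version, this leans on the shape of the text rather than its structure, which is why an XML parser is still the safer route.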
Re: No problemo!
Date: 2004-01-14 09:26 pm (UTC)
Re: No problemo!
Date: 2004-01-14 09:36 pm (UTC)
So a cronned or otherwise triggered screenscraper will in fact be easier than a client hack.
sol.
.