RSS-Spider

Development, Ideas, Issues, problems, ßetas and what not…

Quick link to major company news

Filed under: What Not... — Dave at 10:27 pm on Tuesday, January 31, 2006

Since I’m always online, either programming this site or screwing around in my Ameritrade account, I’ve created a “cheat” page so I can quickly pull up anything that might be in the database about a whole slew of companies.  I’ll be adding more as I get through the chapters in the 100 best companies to invest in in 2006.  But for right now there are 480+ companies at http://www.rss-spider.com/company_list.php

Database backup & purge GONE WILD!

Filed under: Issues, Problems — Dave at 6:30 pm on Sunday, January 29, 2006

Last night we experienced an 8 hour outage during our weekly database backup & purge.  The database of headlines has grown so massive that it took way too long to purge out any old & “spammy” headlines.  The last time we did this purge was back in December and we only experienced a 15 minute outage.  I was up till 6 am waiting for everything to finish.  If you had visited the site during this time you would have been directed over to the my.rss-spider.com page, which was incorrectly listed as our forwarding page.  My.RSS-Spider.com is a project we’re working on which is in its infancy.  In a nutshell, it will allow users to have their own webspace to aggregate RSS feeds from RSS Spider.  So instead of coming and searching for something every time, you can simply go to your my.RSS-Spider.com home page and view all the new feeds matching your search criteria.  Again, this is something that’s way down the road; however, we are planning to have a Beta release in late March.

What was Hot Yesterday!

Filed under: Betas, Development — Dave at 6:23 pm on Sunday, January 29, 2006

Yesterday marked the launch of the Hot Words section of RSS-Spider. What this section does is mash up all the posts from any given day, sort all the words from that day and count the number of times any specific word appears. From there it takes the top 100 words and ranks them in font size order, so the most frequent words show up largest.  So on January 25th the system processed all the documents in the database that had a pubdate of January 24 (this date comes from the RSS feed that the spider pulled) and found that in the top 100 terms used on that day Alito, Bush, Iraq and War all came up…

Clicking on any one of these terms will pull up all the articles stored in the database for that day with that term.

Currently we are only processing English language feeds, but our next step is to add a Hot Words for German users.
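A rough sketch of the counting and font sizing described above, in Python.  This is not the code running on the site: the function name `hot_words`, the pixel range and the simple word-splitting regex are all assumptions made for illustration.

```python
import re
from collections import Counter


def hot_words(texts, top_n=100, min_px=10, max_px=36):
    """Return [(word, count, font_px)] for the top_n most frequent words.

    `texts` is a list of strings, one per post for the day.
    The font size range (min_px..max_px) is a made-up default.
    """
    counts = Counter()
    for text in texts:
        # Lower-case everything and pull out plain words (keeps "don't" whole).
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))

    top = counts.most_common(top_n)
    if not top:
        return []

    highest = top[0][1]
    lowest = top[-1][1]
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match

    # Scale each count linearly between the smallest and largest font size.
    return [
        (word, count, min_px + round((count - lowest) * (max_px - min_px) / span))
        for word, count in top
    ]


if __name__ == "__main__":
    sample = ["Alito hearing: Bush backs Alito", "Bush on Iraq", "Alito vote"]
    for word, count, px in hot_words(sample, top_n=5):
        print(f"{word}: {count} times, {px}px")
```

A linear scale is the simplest choice here; a day dominated by one or two very common words would probably need a logarithmic scale or a stop-word list to look reasonable.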

What’s hot! (Yesterday)

Filed under: Betas — Dave at 1:56 pm on Sunday, January 15, 2006

If you want to see what items are hot in the RSS feeds we’re polling, check out the WhatsHotYesterday.php link under the Beta section.  What we’re doing here is taking all the feeds we’ve spidered that have a post date of yesterday (whatever that might be), mashing the headlines and bodies together, sorting out all the words and then figuring out which words appear the most.

So far we’ve seen Alito pop up a few times on our Friday test, as well as Bush.  Guitar seems to be a big one.  Right now it’s only going to return blogs where the language is EN or English.

Soon we should have a What’s Hot Yesterday database covering every pubdate in our database.
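The selection step described in this post (yesterday's post date, English only, headline and body mashed together) might look something like the Python sketch below.  The field names `headline`, `body`, `pubdate` and `language` are hypothetical; the real database columns are not described here.

```python
from datetime import date, timedelta


def yesterdays_text(posts, today=None):
    """Return one combined headline+body string per English post from yesterday.

    `posts` is a list of dicts with hypothetical keys:
    headline, body, pubdate (a datetime.date) and language.
    """
    today = today or date.today()
    yesterday = today - timedelta(days=1)
    return [
        f"{post['headline']} {post['body']}"
        for post in posts
        if post["pubdate"] == yesterday
        and post["language"].lower() in ("en", "english")
    ]


if __name__ == "__main__":
    posts = [
        {"headline": "Alito hearing", "body": "Day three.",
         "pubdate": date(2006, 1, 14), "language": "EN"},
        {"headline": "Gitarren", "body": "Neu im Handel.",
         "pubdate": date(2006, 1, 14), "language": "DE"},
    ]
    print(yesterdays_text(posts, today=date(2006, 1, 15)))
```

The resulting list of strings is what would then be split into words and counted.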

Top Ten Google Searches for RSS-Spider

Filed under: What Not... — Dave at 10:21 pm on Tuesday, January 10, 2006

I’ve started capturing Google search strings that are being carried over to RSS-Spider in the query string.  They now replace the top 10 searches for the day & top 10 most searched for terms of the last 30 days on the home page.  A bit more logical since 90% of all searches done on this site start out somewhere else…
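For illustration only, here is one way the capture could work in Python, assuming the search terms are read from the `q` parameter of the referring Google URL.  The function name `google_search_terms` and the host check are assumptions, not a description of what the site actually does.

```python
from urllib.parse import urlparse, parse_qs


def google_search_terms(referrer):
    """Return the search phrase from a Google referrer URL, or None.

    Assumes the phrase is in the `q` query parameter, which is where
    Google puts it in its search result URLs.
    """
    parts = urlparse(referrer)
    host = (parts.hostname or "").lower()

    # Accept www.google.com, google.de, www.google.co.uk and so on.
    if "google" not in host.split("."):
        return None

    values = parse_qs(parts.query).get("q")
    if not values:
        return None

    phrase = values[0].strip()
    return phrase or None


if __name__ == "__main__":
    print(google_search_terms("http://www.google.com/search?q=rss+spider&hl=en"))
```

Tallying the returned phrases per day (for example with `collections.Counter`) would then give the top 10 list.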
