RSS-Spider

Development, Ideas, Issues, problems, ßetas and what not…

wordpress hack <u style=’display:none’>

Filed under: Problems — Dave at 1:21 am on Saturday, April 26, 2008

Checking email for this site today I ran across this email from Google Search Quality. At first I thought it was spam, seeing as it was filled with crap about viagra & cialis, but I was shocked to find that this crap WAS on this blog. Well, it seems an older version of WordPress that I was running has a vulnerability allowing someone to update your theme files and post all sorts of CRAP into them with links leading back to their spammy sites. Someone did this since I am a lazy sysadmin and didn’t update WordPress. Broke rule number 2 on the Google webmaster security checklist…

Shame on me…

Dear site owner or webmaster of rss-spider.com/blog,

While we were indexing your webpages, we detected that some of your pages were using techniques that are outside our quality guidelines, which can be found here: http://www.google.com/webmasters/guidelines.html. This appears to be because your site has been modified by a third party. Typically, the offending party gains access to an insecure directory that has open permissions. Many times, they will upload files or modify existing ones, which then show up as spam in our index.

Database backup & purge GONE WILD!

Filed under: Issues, Problems — Dave at 6:30 pm on Sunday, January 29, 2006

Last night we experienced an 8-hour outage during our weekly database backup & purge. The database of headlines has grown so massive that it took way too long to purge out old & “spammy” headlines. The last time we did this purge was back in December and we only experienced a 15-minute outage. I was up till 6 am waiting for everything to finish. If you had visited the site during this time you would have been directed over to the my.rss-spider.com page, which was incorrectly listed as our forwarding page. My.RSS-Spider.com is a project we’re working on which is in its infancy. In a nutshell, what it will do is allow users to have their own webspace to aggregate RSS feeds from RSS Spider. So instead of coming and searching for something every time, you can simply go to your my.RSS-Spider.com home page and view all the new feeds matching your search criteria. Again… this is something that’s way down the road; however, we are planning to have a Beta release in late March.

Speed is an issue

Filed under: Issues, Problems — Dave at 12:22 am on Wednesday, December 21, 2005

I’ve been noticing that sometimes it takes upwards of a minute or more to return a search. This ain’t good. Looking into it I see that MySQL is attempting to do UPDATEs and SELECTs at the same time. The spider is updating the RSS feed links to current time (last polled) while people are using the database. I’ve bundled all the updates into a single query and added LOW_PRIORITY to the updates. Removed all other updates which weren’t doing anything meaningful, i.e. number of times an item was viewed on a returning search page… bah… crap…
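
For anyone curious what "one bundled update" might look like, here is a rough Python sketch, not the spider's actual code. The table name `feeds`, the columns `id` and `last_polled`, and the helper `build_last_polled_update` are all made up for illustration. MySQL spells the modifier `LOW_PRIORITY`, and it only has an effect on storage engines that use table-level locking, such as MyISAM.

```python
def build_last_polled_update(feed_ids, polled_at):
    """Build one batched, low-priority UPDATE for many feeds.

    Hypothetical schema: a `feeds` table with `id` and `last_polled`
    columns. Returns (sql, params) for a DB-API driver that uses the
    "format" paramstyle (%s placeholders).
    """
    if not feed_ids:
        raise ValueError("feed_ids must not be empty")

    # One placeholder per feed id, so every value is passed as a parameter
    # instead of being pasted into the SQL string.
    placeholders = ", ".join(["%s"] * len(feed_ids))

    # LOW_PRIORITY asks MySQL to hold the UPDATE until no clients are
    # reading from the table, so searches are served first.
    sql = (
        "UPDATE LOW_PRIORITY feeds "
        "SET last_polled = %s "
        f"WHERE id IN ({placeholders})"
    )
    params = [polled_at, *feed_ids]
    return sql, params


if __name__ == "__main__":
    sql, params = build_last_polled_update([3, 7, 9], "2005-12-21 00:22:00")
    print(sql)
    print(params)
```

With a real connection you would hand the pair to something like `cursor.execute(sql, params)`. The idea is that the spider issues one statement per polling pass instead of one UPDATE per feed, so it competes with searches far less often.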

Top 10 Searches

Filed under: Problems — Dave at 9:06 pm on Sunday, December 18, 2005

Ok… Top 10 Searches is broken…