<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ovisual.com&#187; Squiddy</title>
	<atom:link href="http://www.ovisual.com/category/works/squiddy/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ovisual.com</link>
	<description>visual thinking ...</description>
	<lastBuildDate>Tue, 29 Jun 2010 20:47:44 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Squiddy Web Crawler launched!</title>
		<link>http://www.ovisual.com/39/</link>
		<comments>http://www.ovisual.com/39/#comments</comments>
		<pubDate>Sun, 18 Jan 2009 00:06:18 +0000</pubDate>
		<dc:creator>ovidiu</dc:creator>
				<category><![CDATA[Squiddy]]></category>
		<category><![CDATA[algorithm]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[endpoints]]></category>
		<category><![CDATA[search data]]></category>
		<category><![CDATA[target data]]></category>
		<category><![CDATA[web crawler]]></category>
		<category><![CDATA[web robot]]></category>
		<category><![CDATA[web spider]]></category>

		<guid isPermaLink="false">http://www.ovisual.com/?p=39</guid>
		<description><![CDATA[I am proud to announce the launch of the Squiddy Web Crawler. This is yet another web crawler (also known as a web spider or web robot), but it stands out in how it works and in the purpose it was created for. This crawler will index and analyze websites based on the [...]]]></description>
			<content:encoded><![CDATA[<p>I am proud to announce the launch of the <a href="http://squiddy.net">Squiddy Web Crawler</a>. This is yet another web crawler (also known as a web spider or web robot), but it stands out in how it works and in the purpose it was created for.<br />
This crawler will index and analyze websites based on criteria generated by an artificial intelligence (AI) &#8230; a secret project of mine :). <span id="more-39"></span><br />
So the resulting data will feed an AI beast that will &#8220;learn&#8221; and interact with the web, constantly changing its algorithm based on how much of the &#8220;known&#8221; is considered relevant.<br />
Because the AI has only some vague goals to follow (like staying active and alive, and looking for interesting, juicy new data), the results cannot be easily predicted. It is actually a machine driven by another machine that is almost out of control.<br />
Usually crawlers are used by people to learn something from the results, the way Google uses its crawlers to index data from the internet so that a human can search it and learn something. Squiddy will instead look for information to be used in the learning process and the evolution of the AI.</p>
<p>This is absolutely an esoteric tool in the hands of a machine that might be willing to overcome its narrow condition.</p>
<p>Enough with the philosophy! Now about the crawler implementation. It has 3 main parts: the control unit, the crawler unit, and a web site that will display some statistics.</p>
<p><strong>The control unit</strong> is in charge of controlling the crawler endpoints, providing an API to control the crawling goals, and structuring and persisting the crawled data. It will be controlled by the AI &#8230; but can also receive goals from other applications.<br />
<strong>The crawler unit</strong> is in charge of downloading the target data based on the goals provided by the control unit. This unit can be distributed across multiple machines and is able to spawn endpoints that download the target data using parallel strategies.<br />
The statistics website (<a href="http://squiddy.net">http://squiddy.net</a>), which is also the homepage for the crawler, will display some cool information about what has been crawled, what is considered interesting by the AI, and much more.</p>
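<p>To make the split between the control unit and the crawler unit a bit more concrete, here is a minimal sketch of how the two could fit together. It is not the actual Squiddy code: the names <code>Goal</code>, <code>ControlUnit</code>, <code>CrawlerUnit</code> and <code>fake_download</code> are all made up for illustration, threads stand in for the parallel endpoints, and the download is faked so that nothing touches the network.</p>

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A crawling goal: a named set of URLs to fetch (hypothetical structure)."""
    name: str
    urls: list


@dataclass
class ControlUnit:
    """Accepts goals through a tiny API and persists crawled data in memory."""
    goals: list = field(default_factory=list)
    store: dict = field(default_factory=dict)

    def add_goal(self, goal: Goal) -> None:
        # In the post, goals come from the AI or from other applications.
        self.goals.append(goal)

    def persist(self, url: str, body: str) -> None:
        # A real control unit would structure the data and write it to storage.
        self.store[url] = body


def fake_download(url: str) -> str:
    """Stand-in for a real HTTP fetch so the sketch runs offline."""
    return f"content of {url}"


class CrawlerUnit:
    """Downloads the target data for each goal using parallel endpoints."""

    def __init__(self, control: ControlUnit, endpoints: int = 4, download=fake_download):
        self.control = control
        self.endpoints = endpoints  # number of worker threads (assumed model)
        self.download = download

    def run(self) -> None:
        with ThreadPoolExecutor(max_workers=self.endpoints) as pool:
            for goal in self.control.goals:
                # pool.map() keeps input order, so zip pairs each URL with its body.
                bodies = pool.map(self.download, goal.urls)
                for url, body in zip(goal.urls, bodies):
                    self.control.persist(url, body)


control = ControlUnit()
control.add_goal(Goal("demo", ["http://example.com/a", "http://example.com/b"]))
CrawlerUnit(control, endpoints=2).run()
print(sorted(control.store))
```

<p>In this sketch the crawler unit only reads goals and hands results back, so several crawler units on different machines could share one control unit, provided the in-memory store were replaced with a real shared one.</p>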
<p>Let&#8217;s hope for some nice achievements from this crawler.</p>
<p>Cheers!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ovisual.com/39/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
