Kimonolabs — getting updates from sites without feeds

Some web sites contain interesting information or updates but do not offer feeds or APIs.  Here we describe how to create a feed for such sites, add the feed to mergeflow as a source, and analyze its content.  We already posted a very short article on this topic a while ago.  Here we go into details.

As an example web site with interesting updates but no feeds or APIs, we selected BayPat. This is the web site of “Bayerische Patentallianz”, a Bavarian government organization that hosts interesting technology offers that come out of Bavarian research institutions (e.g. universities).  They do other things too at BayPat, but here we are interested in their technology offers (cf. http://www.baypat.de/en/technologyoffers; click on the screenshot below to see a larger version):

Screenshot1

Kimonolabs has recently joined Palantir. We were surprised to find Kimonolabs through our own software, and its very easy-to-use tool helped us create a web feed for this page:

Screenshot2

The web feed from kimono will then allow us to add BayPat technology offers (existing ones and new updates) as a source to mergeflow.

Creating the web feed with kimono

After installing the kimono extension in your Chrome browser (we recommend this, as it makes later steps easier), open the web page of interest, in our case www.baypat.de/en/technologyoffers.  Then click on the kimono icon at the top right of your browser.

When the kimono extension has started, choose “title” as the first data type and select the list item headlines from the text below.

Screenshot3

Then add a further data type, “description”, and select the appropriate passages:

Screenshot4
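
For readers who want to see what this point-and-click selection amounts to, here is a minimal Python sketch of pulling a “title” and a “description” out of each list item on a page. It is not how kimono works internally. The sample markup and the class names `offer-title` and `offer-description` are assumptions made up for illustration; they are not BayPat's actual HTML.

```python
from html.parser import HTMLParser

# Hypothetical markup: the class names "offer-title" and "offer-description"
# are assumptions for illustration, not BayPat's actual HTML.
SAMPLE_PAGE = """
<ul>
  <li><h3 class="offer-title">Lens coating</h3>
      <p class="offer-description">A coating for intraocular lenses.</p></li>
  <li><h3 class="offer-title">Sensor array</h3>
      <p class="offer-description">A low-power sensor design.</p></li>
</ul>
"""


class OfferParser(HTMLParser):
    """Collects one {"title", "description"} dict per list item."""

    # Maps a (hypothetical) CSS class to the data type it fills.
    FIELDS = {"offer-title": "title", "offer-description": "description"}

    def __init__(self):
        super().__init__()
        self.offers = []
        self._field = None  # data type currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # A new list item starts a new record.
            self.offers.append({"title": "", "description": ""})
        css_class = dict(attrs).get("class")
        if css_class in self.FIELDS and self.offers:
            self._field = self.FIELDS[css_class]

    def handle_endtag(self, tag):
        if tag in ("h3", "p"):
            self._field = None

    def handle_data(self, data):
        if self._field:
            self.offers[-1][self._field] += data.strip()


def extract_offers(html):
    """Return the list of offers found in an HTML string."""
    parser = OfferParser()
    parser.feed(html)
    return parser.offers


for offer in extract_offers(SAMPLE_PAGE):
    print(offer["title"], "|", offer["description"])
```

The two data types defined in kimono play the same role as the two dictionary keys here: each list item on the page becomes one record with a title and a description.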

 

Now, in order to get all existing technology offers from BayPat, use kimono’s pagination function, and browse all existing pages:

Screenshot5
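
Conceptually, pagination means following the “next page” link until there is none left. The sketch below shows that loop in Python, under stated assumptions: the in-memory `FAKE_SITE`, the page names, and the `fetch_page` helper are hypothetical stand-ins so the example runs without network access, and they say nothing about how kimono or the BayPat site implement paging.

```python
# Hypothetical in-memory "site": each page holds some items and a link to
# the next page (None on the last page). The page names are placeholders.
FAKE_SITE = {
    "page-1": (["Offer A", "Offer B"], "page-2"),
    "page-2": (["Offer C"], "page-3"),
    "page-3": (["Offer D"], None),
}


def fetch_page(url):
    """Stand-in for downloading and parsing one listing page."""
    return FAKE_SITE[url]


def collect_all_items(first_url, fetch):
    """Follow "next" links from first_url until a page has no successor."""
    items = []
    seen = set()  # guards against pagination loops
    url = first_url
    while url is not None and url not in seen:
        seen.add(url)
        page_items, url = fetch(url)
        items.extend(page_items)
    return items


print(collect_all_items("page-1", fetch_page))
```

The `seen` set is a small safety measure: if a site's last page links back to an earlier one, the loop still terminates.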

Then finish your API (which will deliver the web feed) by clicking on “done”.  Choose all settings as shown below:

Screenshot6

After following the link, you can check and edit your newly created API and finally create a web feed by clicking on “rss”:

Screenshot7
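
If you want to check what such a feed contains before adding it as a source, a few lines of Python with the standard library are enough. This is a minimal sketch that assumes a plain RSS 2.0 structure with `title`, `description`, and `link` elements per item; the sample feed and its example.com links are made up for illustration, and the exact fields of a kimono-generated feed may differ.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample; a real feed would be downloaded from the feed URL.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example technology offers</title>
    <item>
      <title>Lens coating</title>
      <description>A coating for intraocular lenses.</description>
      <link>http://example.com/offers/1</link>
    </item>
    <item>
      <title>Sensor array</title>
      <description>A low-power sensor design.</description>
      <link>http://example.com/offers/2</link>
    </item>
  </channel>
</rss>
"""


def read_feed_items(rss_text):
    """Return one dict per <item> element of an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


for entry in read_feed_items(SAMPLE_FEED):
    print(entry["title"], "->", entry["link"])
```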

 

Adding the kimono web feed to mergeflow

Now, add this new web feed to the mergeflow custom repository of your choice (for information on custom repositories, please see http://blog.mergeflow.com/2015/01/custom-repository/):

Screenshot8

 

Analyzing the content

Now you can start analyzing the retrieved documents using mergeflow’s analytics.  Once you add a feed to mergeflow as a source, mergeflow automatically identifies organizations, technologies, locations, and other objects in the content delivered by the feed.  You can, for instance, use a relationship graph to explore how these objects relate to each other (for more information on how to use mergeflow’s relationship graphs, please see http://blog.mergeflow.com/2015/01/relationship-graphs/):

baypat-network

The relationship graph suggests that many of BayPat’s technology offers are related to health care (as evidenced by the high number of “Disease” nodes in the graph).  For instance, one technology offer from the field of ophthalmology…

cataract

…describes a new coating for after-cataract intraocular lenses (cf. http://www.baypat.de/en/technologyoffers?tech_ang=1547):

cataract-tech-offer
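
To give a rough idea of what a relationship graph like the one above encodes, here is a minimal Python sketch that links two objects whenever they appear in the same document and counts how often that happens. This is only an illustration of the general co-occurrence idea, not mergeflow's actual method, and the documents and object names in `DOCUMENTS` are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Invented example data: each set holds the objects found in one document.
DOCUMENTS = [
    {"cataract", "lens coating", "University A"},
    {"cataract", "lens coating"},
    {"University A", "sensor"},
]


def cooccurrence_edges(documents):
    """Count, for every pair of objects, the documents that mention both."""
    edges = Counter()
    for entities in documents:
        # Sorting gives each pair a stable order, so (a, b) and (b, a)
        # are counted as the same edge.
        for pair in combinations(sorted(entities), 2):
            edges[pair] += 1
    return edges


for (left, right), weight in cooccurrence_edges(DOCUMENTS).most_common():
    print(left, "--", right, "weight:", weight)
```

In a graph drawn from these counts, each object is a node and each counted pair is an edge; pairs that co-occur in many documents get heavier edges.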