Wrapidity

51°45’35.5” N,  1°15’30.9” W

 

Our Mission

Wrapidity’s mission is to open up humanity’s greatest resource—the web—to everyone.

For decades, we were promised that the web would become humanity’s greatest database, and that all its data would be available as XML, RDF, or Web 2.0 APIs. Yet what remains is a vast ocean of dark data, hidden beneath the surface in the silos of the deep web. To unearth the insights hidden in this accumulated information, the hidden data (estimated to be 80% of all web data) has to be painstakingly extracted and refined to become amenable to further analysis.

Businesses and data scientists in many verticals have recognized the tremendous value of such insights, but also the punitive cost of collecting them. Data scientists spend around 80% of their time on data collection and preparation, and they really hate doing so.

With Wrapidity’s technology, every business, every policy maker, and every scientist will be able to afford to include relevant web data in their decision models, ultimately leading to better, more fact-driven decisions.

Extract structured data from millions of websites - automatically.

Technological DNA

Knowledge + redundancy + specialised AI.

RESEARCH DNA

$5M top-tier European research grant, leading to dozens of publications in top international AI, database, and web technologies conferences.

TEAM DNA

6-year research team at Oxford University with PhDs from Europe, the Americas, and Asia.


What can Wrapidity do for you?

You think you have an idea how to better match people and jobs? Or how to answer “where are Italian restaurants along this route”? Or how to find out where a competitor is currently building up new operations?

But where do you get the necessary data on current job offers, restaurant menus, or other product offers? That’s where Wrapidity comes in: We quickly provide a highly structured database of all offers or goods you are interested in, whether they come from a few websites or hundreds of them—a database you can use to build better search, better recommenders, or better analytics.

 

This doesn't just help you build your application faster; it also makes possible applications that were previously out of reach even for the internet giants:

  • Answering queries that link data from different domains: a place that's playing the latest Batman movie and where I can get a good burger afterwards.
  • Answering queries that link dynamic and static data: find me a cheap hotel in a low crime area.
  • Answering queries that involve data across many countries or even languages.
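
Once offers from different domains sit in one structured database, a cross-domain query like the first one above reduces to a filter and a join. The following is a minimal Python sketch under stated assumptions: the `Offer` record, its fields, the sample data, and the 1 km and 4.0-rating thresholds are all hypothetical and do not describe Wrapidity's actual schema or API.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Offer:
    """Hypothetical extracted record; field names are illustrative only."""
    name: str
    kind: str          # "cinema" or "restaurant"
    lat: float
    lon: float
    detail: str        # film title for cinemas, cuisine for restaurants
    rating: float = 0.0


def distance_km(a: Offer, b: Offer) -> float:
    """Great-circle (haversine) distance between two offers in kilometres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def movie_then_burger(offers, film, max_km=1.0, min_rating=4.0):
    """Pair cinemas showing `film` with well-rated burger places nearby."""
    cinemas = [o for o in offers if o.kind == "cinema" and o.detail == film]
    burgers = [
        o for o in offers
        if o.kind == "restaurant" and o.detail == "burger" and o.rating >= min_rating
    ]
    return [
        (c.name, b.name)
        for c in cinemas
        for b in burgers
        if distance_km(c, b) <= max_km
    ]


# Made-up sample data standing in for two extracted verticals.
offers = [
    Offer("Cinema A", "cinema", 51.7520, -1.2577, "Batman"),
    Offer("Cinema B", "cinema", 51.5074, -0.1278, "Batman"),
    Offer("Burger X", "restaurant", 51.7530, -1.2590, "burger", 4.5),
    Offer("Burger Y", "restaurant", 51.7525, -1.2580, "burger", 3.0),
]

pairs = movie_then_burger(offers, "Batman")
print(pairs)
```

The join itself is trivial; the hard part, and the part the extraction step has to deliver, is getting both verticals into a shared, structured form with comparable locations.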

How we do it

This is the era of specialized AI: AI judiciously tailored to specific problems such as image recognition, machine translation, or, with Wrapidity, automatic data extraction. Extracting structured data from the web is one of the long-standing challenges in search and knowledge acquisition, and it has withstood repeated attempts to solve it in a generic fashion. With Wrapidity we have developed an object extraction system that exploits extensive metadata about the relevant objects (in the form of both a schema and sample instances). With this approach we outperform existing semi-supervised and unsupervised approaches by a wide margin (> 95% accuracy on a wide range of domains and sites).

Knowledge

How do websites work, and what data is relevant to your problem?

Most of this knowledge is generic, but some of it is task- or vertical-specific and thus needs to be acquired for each task or vertical—e.g., that location is key in real estate. While this acquisition often requires some human supervision, it is only needed once for an entire vertical.

 

Redundancy

Humans think in patterns, and thus most websites follow a common set of conventions for presenting data, e.g., most shops display price information prominently.

Wrapidity exploits redundancy at many levels, whether in the presentation of the data, the actual instances in the same source, or instances shared between sources.
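
As an illustration of the last kind of redundancy, instances shared between sources, the sketch below reconciles conflicting attribute values by majority vote. This is a generic technique chosen for illustration, not a description of Wrapidity's method; the triple format, the normalisation step, and the sample values are assumptions.

```python
from collections import Counter, defaultdict


def reconcile(observations):
    """Merge (entity, attribute, value) triples seen on different sources.

    Returns {entity: {attribute: (winning_value, support)}}, where support
    is the fraction of observations that agree with the winning value.
    """
    votes = defaultdict(Counter)
    for entity, attribute, value in observations:
        # Light normalisation so trivial formatting differences still agree.
        normalised = " ".join(value.lower().split())
        votes[(entity, attribute)][normalised] += 1

    merged = defaultdict(dict)
    for (entity, attribute), counter in votes.items():
        value, count = counter.most_common(1)[0]
        merged[entity][attribute] = (value, count / sum(counter.values()))
    return dict(merged)


# Made-up observations of one restaurant as listed on three sources.
observations = [
    ("luigis", "city", "Oxford"),
    ("luigis", "city", "oxford "),
    ("luigis", "city", "Oxfrod"),      # a typo on one source
    ("luigis", "cuisine", "Italian"),
    ("luigis", "cuisine", "italian"),
]

merged = reconcile(observations)
print(merged)
```

A low support value flags attributes where sources disagree, which is where additional evidence or human review would be most useful.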

 

Specialized AI

Wrapidity has developed a specialized AI for the autonomous exploration and classification of websites and their constituent objects. This AI adapts itself to each website by automatically composing atomic exploration actions, and it relies on specialized entity recognition that considers page context rather than textual context.

Our Team

 

At Wrapidity we combine a passion for solving hard problems for real customers with a drive to push the envelope in technology and science. The technical members of the team have all previously been involved in startups and have built solutions for customers ranging from one-person companies to multinational media conglomerates. Together they have published more than 500 DBLP-listed publications with over 20k cumulative citations.

Tim Furche is Wrapidity’s CTO. He has been extracting data from the web for nearly a decade, starting from one of the most-cited works on XPath during his undergraduate studies. When not extracting data, he loves looking into query languages and large-scale data management. He has managed a number of large-scale research grants (>$12M in total) and teams and currently also holds an appointment as Lecturer at the Department of Computer Science of the University of Oxford.

Dr Tim Furche

Co-Founder, CTO

Giovanni Grasso is Wrapidity’s head of data extraction. In the last few years, he has been working on automating web scraping at large scale, drawing on his background in AI/Knowledge Representation and Reasoning. He currently also holds an appointment as Associate Professor at the University of Calabria.

Dr Giovanni Grasso

Co-Founder, Engineering

Giorgio Orsi is Wrapidity’s head of data engineering and Lecturer in Database Systems at the School of Computer Science of the University of Birmingham. His current research focuses on Big Data wrangling and automated reasoning, in particular large-scale data harvesting from the web. He is a co-investigator of the EPSRC Programme Grant VADA (Value-Added Data Systems—Principles and Architecture).

Dr Giorgio Orsi

Co-Founder, Engineering

Dr Georg Gottlob, FRS

Co-Founder, CSO

Leon Shpilsky

Co-Founder, CEO

Dr Christian Schallhart

Co-Founder

Demonstration

Background

One Page Teaser

Want to get a quick overview of what Wrapidity is about? Then this is the right place for you.

Pitch Slide Set

Quick overview of Wrapidity’s vision and technology. Look out for the case studies, including millions of restaurant locations extracted with 90%+ accuracy.

VLDB 2014, Innovative Systems Paper

Description of the core architecture and adaptive exploration and analysis engine used in Wrapidity.

OXPath 2013 VLDBJ, Best Paper Issue

Wrapidity’s extraction language OXPath, originally published at VLDB 2011 and selected for inclusion into this best paper issue of the VLDB Journal.

Extended background slides

Extended slide set that also illustrates the underlying components and technology.

Wrapper Repair, ICDE 2016

Wrapidity doesn’t just automatically induce wrappers, but also repairs and maintains wrappers with no additional supervision.

Contact Us


Our Office

67-71 Shoreditch High St
London E1 6JJ
United Kingdom