Fact Checking and Analyzing the Web Fact checking and data journalism are currently strong trends. The sheer amount of data at hand makes it difficult even for trained professionals to spot biased, outdated or simply incorrect information. We propose to demonstrate FactMinder, a fact checking and analysis assistance application. Attendees will be able to analyze documents using FactMinder and experience how background knowledge and open data repositories help build insightful overviews of current topics.
|
François Goasdoué, Konstantinos Karanasos, Yannis Katsis, Julien Leblay, Ioana Manolescu and Stamatis Zampetakis
|
MicroFilter: Scalable Real-Time Filtering Of Micro-blogging Content Microblogging systems have become a major trend on the Web. After only 7 years of existence, Twitter, for instance, claims more than 500 million users, with more than 350 billion updates delivered each day. As a consequence, users must today manage potentially very large feeds, resulting in poor data readability and loss of valuable information, while the system must face a huge network load.
In this demonstration, we present and illustrate the features of MicroFilter (MF in the following), an inverted-list-based filtering engine that extends existing centralized microblogging systems with a real-time filtering feature. The proposed demonstration illustrates how the user experience is improved, the impact of filtering on overall system traffic, and how the characteristics of microblogs drove the design of the indexing structures.
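The inverted-list matching idea behind such a filtering engine can be illustrated with a minimal sketch. The class and method names below are our own illustration, not the actual MF implementation; it shows only the core pattern of indexing subscriptions by term and matching each incoming post against them:

```python
from collections import defaultdict

class FilterEngine:
    """Toy inverted-list filter: each subscription is a keyword set,
    and an incoming post is delivered to every subscription whose
    keywords all appear in the post (conjunctive semantics)."""

    def __init__(self):
        self.index = defaultdict(set)   # term -> ids of subscriptions containing it
        self.size = {}                  # subscription id -> number of keywords

    def subscribe(self, sub_id, keywords):
        self.size[sub_id] = len(set(keywords))
        for term in set(keywords):
            self.index[term].add(sub_id)

    def match(self, post):
        terms = set(post.lower().split())
        hits = defaultdict(int)         # matched-keyword count per subscription
        for term in terms:
            for sub_id in self.index.get(term, ()):
                hits[sub_id] += 1
        # deliver when every keyword of the subscription was matched
        return {s for s, n in hits.items() if n == self.size[s]}

engine = FilterEngine()
engine.subscribe("s1", ["database", "cloud"])
engine.subscribe("s2", ["cloud"])
print(engine.match("scaling a cloud database service"))  # matches both subscriptions
```

The point of the inverted list is that a post is compared only against subscriptions sharing at least one term with it, not against all subscriptions.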
|
Ryadh Dahimene and Cedric Du Mouza
|
Processing XML Queries and Updates on Map/Reduce Clusters In recent years, Cloud Computing has emerged, offering new ways to use farms of machines with scalability and elasticity as goals. The most widely used platforms with proven scalability are currently Amazon EC2, Google Map/Reduce and Hadoop. This demo is directly related to the development of Map/Reduce-style execution strategies. We present a prototype enabling the execution of XQuery queries and updates on very large XML documents. The prototype is built on static and dynamic partitioning of the input document, which makes it possible to distribute the computation load over the machines of a Map/Reduce cluster. The demonstration will let attendees run predefined queries and updates on documents valid with respect to the XMark schema, and also submit their own queries and updates.
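Static partitioning of a large XML document can be sketched as follows. This is our own minimal illustration, not the prototype's code: it streams the document and cuts it into chunks of top-level elements so that each chunk could be handed to a different mapper (the actual system additionally supports dynamic partitioning and update propagation):

```python
import xml.etree.ElementTree as ET
from io import BytesIO

def partition_xml(stream, tag, chunk_size):
    """Statically partition an XML document into chunks of at most
    `chunk_size` consecutive `tag` elements, each serialized as a
    string that a mapper could process independently."""
    chunk, chunks = [], []
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            chunk.append(ET.tostring(elem, encoding="unicode"))
            elem.clear()                      # keep memory bounded while streaming
            if len(chunk) == chunk_size:
                chunks.append("".join(chunk))
                chunk = []
    if chunk:
        chunks.append("".join(chunk))         # last, possibly smaller, chunk
    return chunks

# 5 items split into chunks of 2 -> partitions of sizes 2, 2 and 1
doc = b"<site>" + b"".join(b"<item id='%d'/>" % i for i in range(5)) + b"</site>"
parts = partition_xml(BytesIO(doc), "item", 2)
print(len(parts))  # 3
```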
|
Nicole Bidoit-Tollu, Dario Colazzo, Noor Malla, Maurizio Nolé, Carlo Sartiani and Federico Ulliana
|
MEoWS Reader: Top-k News Recommendation, Filtering and Personalization We present MEoWS Reader, a personalized news reader featuring an efficient algorithm for processing a new class of continuous top-k textual queries using dynamic non-homogeneous ranking functions.
|
Nelly Vouzoukidou, Bernd Amann and Vassilis Christophides
|
SQuIF: Subtile Quête d'Informations personnelles issues de Fichiers Nowadays, a growing amount of personal data migrates toward an ever wider variety of digital files stored on computing devices. The digitization of our lives clearly raises the issue of managing personal data held in files. Our goal is to offer users simple techniques to manage their data: to find their way through their numerous files, but also to query the data contained in these files. In this article, we describe the SQuIF system (Subtile Quête d'Informations personnelles issues de Fichiers), which aims to considerably ease personal data management. To this end, we propose a homogeneous representation of the plethora of file types a user must handle: the SQuIF file model. We then detail the architecture of the SQuIF system. Finally, we describe a demonstration scenario for personal data management, with queries over the heterogeneous files and folders of this scenario using an SQL-like declarative language.
|
Sabina Surdu, Vincent Primault and Yann Gripay
|
WaRG: Warehousing RDF Graphs We propose to demonstrate WaRG, a system for performing warehouse-style analytics on RDF graphs. To our knowledge, our framework is the first to keep the warehousing process purely in the RDF format and take advantage of the heterogeneity and semantics inherent to this model.
|
Dario Colazzo, Tushar I. Ghosh, François Goasdoué, Ioana Manolescu and Alexandra Roatis
|
Rule-Based Application Development using Webdamlog We present the WebdamLog system for managing distributed data on the Web in a peer-to-peer manner. We demonstrate the main features of the system through an application called Wepic for sharing pictures between attendees of the SIGMOD conference. Using Wepic, attendees will be able to share, download, rate and annotate pictures in a highly decentralized manner. We show how WebdamLog handles the heterogeneity of the devices and services used to share data in such a Web setting. We exhibit the simple rules that define the Wepic application and show how easily the application can be modified.
|
Serge Abiteboul, Émilien Antoine, Gerome Miklau, Julia Stoyanovich and Jules Testard
|
A Demonstration of an Intelligent Crawler for Web Applications We demonstrate here a new approach to Web archival crawling, based on an application-aware helper that drives crawls of Web applications according to their types (especially, according to their content management systems). By adapting the crawling strategy to the Web application type, one is able to crawl a given Web application (say, a given forum or blog) with fewer requests than traditional crawling techniques. Additionally, the application-aware helper is able to extract semantic content from the crawled Web pages, which results in a Web archive of richer value to an archive user. In our demonstration scenario, we invite a user to compare application-aware crawling to regular Web crawling on the Web site of their choice, both in terms of efficiency and of experience in browsing and searching the archive.
|
Muhammad Faheem and Pierre Senellart
|
CoBRa for optimizing global queries This demonstration presents the CoBRa Optimizer, which allows efficient evaluation of global queries without complete knowledge of the data. This optimizer advantageously applies the Case-Based Reasoning paradigm to the query optimization process.
The CoBRa Optimizer acquires performance knowledge (beyond classical metadata) while evaluating queries and exploits this knowledge to generate execution plans for new, similar queries. The CoBRa Optimizer will be demonstrated by simulating a scenario on our testbed platform supporting network-oriented applications. The internal state of the CoBRa Optimizer will be explored using this simulation platform.
|
Lourdes Martinez, Christine Collet, Christophe Bobineau and Etienne Dublé
|
Evaluating Cooperation in Coauthorship Graphs with Degeneracy Community subgraphs are characterized by dense connections or interactions among their vertices. In this paper, we present a demonstrator that evaluates coauthorship communities by capitalizing on graph degeneracy for weighted graphs. We implement this using the novel fractional core structure, a property not captured by established community evaluation metrics. The fractional core concept was introduced in previous work [6]: based on the k-core concept, which essentially measures the robustness of a community under degeneracy, it extends it to undirected graphs with weighted edges. Fractional cores apply to a weighted graph as a whole, but can also be used to evaluate individuals: in the DBLP coauthorship network, the fractional core of an individual represents the level of collaboration of an author with others of at least the same index. We applied these approaches to large real-world graphs, investigating coauthorship for bibliographic data sets from Computer Science (DBLP) and High Energy Physics (ARXIV.hep-th); our findings are intuitive and we report interesting results and observations regarding collaboration among authors. The application presented here graphically displays the fractional core ranking of authors in the DBLP dataset and provides a simple-to-use interface to query the fractional core ranking of an author and to see his/her closest circle of collaborators of the same or higher index.
NB: demo published in KDD 2012
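For background, the classical k-core numbers that the fractional core generalizes can be computed by iteratively peeling a minimum-degree vertex (the Matula-Beck scheme). The sketch below, on made-up data, is our own illustration and not the authors' code; the fractional core of the paper would replace the plain degree used here by a weighted notion:

```python
def core_numbers(adj):
    """Compute k-core numbers of an undirected graph by repeatedly
    removing a vertex of minimum remaining degree (peeling).
    `adj` maps each vertex to the set of its neighbours."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # local mutable copy
    core = {}
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))
        # core number of v: its remaining degree, but never below
        # the core number of any previously peeled vertex
        core[v] = max(len(adj[v]), max(core.values(), default=0))
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return core

# a triangle (vertices 1-3) with a pendant vertex 4:
# the triangle forms the 2-core, vertex 4 only reaches the 1-core
g = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
print(core_numbers(g))
```

In the coauthorship setting, each vertex is an author and a high core number indicates membership in a densely collaborating group, which is what the demonstrator's ranking exposes.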
|
Christos Giatis, Klaus Berberich, Dimitrios Thilikos and Michalis Vazirgiannis
|
Block-o-Matic: a Web Page Segmentation Tool and its Evaluation In this paper we present our web page segmentation prototype, Block-o-Matic, and its counterpart for manual segmentation, Block-o-Manual. The main goal is to evaluate the correctness of the segmentation algorithm. Building a ground-truth database for evaluation can take days or months depending on the collection size; we address this with our manual segmentation tool, designed to minimize the time needed to annotate blocks in web pages. Both tools implement the same segmentation rules: the manual version uses them to propose blocks to the assessor, while the automatic version uses them for block selection. We present a demonstration scenario with a collection of web pages organized in categories; after annotation, the pages are compared with the automatic segmentation, yielding a score and a visual comparison.
|
Andres Sanoja and Stephane Gancarski
|
DataTour: Location-Based Datastore over a Community Cloud We describe the DataTour system, a prototype Location-Based Datastore over the Orange Community Cloud. DataTour aggregates heterogeneous, non-dedicated storage and computing resources scattered across a network covering a wide geographical area. It provides efficient data access for Location-Based Services (LBS) and optimizes data locality in order to achieve bounded query response times. We present the prototype's design and implementation, then demonstrate our dynamic data placement algorithm on several realistic workloads and under several different strategies for overload decision and data selection.
|
Kun Mi, Bo Zhang, Hubert Naacke, Daniel Stern and Stéphane Gançarski
|
An Intelligent PubSub Filtering System Content syndication has become a popular means for timely delivery of frequently updated information on the Web. It essentially enhances traditional pull-oriented searching and browsing of web pages with push-oriented protocols. In this paradigm, publishers deliver brief information summaries on the Web, called news items, while information consumers subscribe to a number of feeds, seen as information channels, and get informed about the addition of recent items. However, many Web syndication applications imply a tight coupling between feed producers and consumers, and they do not help users find news items with interesting content. This demonstration shows a prototype that integrates the whole process of thin filtering steps over numerous RSS feeds in memory, based on keyword subscriptions. Our system notifies users of items related to a set of keywords through broad-match semantics, partial matching, or diversity/novelty filtering. In this demonstration we discuss how our system integrates the three paradigms through dedicated indexes and a window-based structure for diversity/novelty filtering, and we demonstrate the management of notifications and of global information in our system.
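The combination of keyword matching with window-based novelty filtering can be sketched minimally as follows. The class, the Jaccard threshold and the window size are our own illustrative choices, not the prototype's actual design: a matching item is notified only if it is not too similar to any recently notified item.

```python
from collections import deque

def jaccard(a, b):
    """Jaccard similarity of two term sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

class NoveltyFilter:
    """Keyword subscription with window-based novelty filtering:
    an item must contain every subscription keyword (broad match)
    and must not be a near-duplicate of a recently notified item."""

    def __init__(self, keywords, threshold=0.5, window=100):
        self.keywords = set(keywords)
        self.threshold = threshold
        self.recent = deque(maxlen=window)   # term sets of recently notified items

    def notify(self, item):
        terms = set(item.lower().split())
        if not self.keywords <= terms:       # broad match: all keywords required
            return False
        if any(jaccard(terms, old) >= self.threshold for old in self.recent):
            return False                     # near-duplicate of a recent item
        self.recent.append(terms)
        return True

f = NoveltyFilter(["election", "results"])
print(f.notify("election results announced tonight"))  # True
print(f.notify("election results announced tonight"))  # False (duplicate)
print(f.notify("sports scores tonight"))               # False (keywords missing)
```

Partial matching would relax the subset test to a minimum number of matched keywords; the bounded deque plays the role of the window-based structure mentioned in the abstract.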
|
Zeinab Hmedeh, Cedric Du Mouza and Nicolas Travers
|