WEBIST 2016 Abstracts


Area 1 - Service based Information Systems

Full Papers
Paper Nr: 9
Title:

Academics’ Intention to Adopt SNS for Engagement Within Academia

Authors:

Eleni Dermentzi, Savvas Papagiannidis, Carlos Osorio and Natalia Yannopoulou

Abstract: Although Social Networking Sites (SNS) have become popular among scholars as tools for engagement within academia, there is still a need to examine the motives behind academics’ intentions to adopt SNS. This study proposes and tests a research model based on the Decomposed Theory of Planned Behaviour and Gratifications Theory with a sample of 370 academics around the world in order to address the objective set. Our findings suggest that while attitude and perceived behavioural control are the main drivers of academics’ intentions to adopt SNS for engagement, the effect of social norms on intentions is not significant. In addition, networking needs, perceived usefulness, image, and perceived reciprocity affect attitude, while self-efficacy affects perceived behavioural control. Implications for SNS providers and universities that want to promote and encourage online engagement within their faculties are discussed.

Area 2 - Internet Technology

Full Papers
Paper Nr: 16
Title:

Query and Product Suggestion for Price Comparison Search Engines based on Query-product Click-through Bipartite Graphs

Authors:

Lucia Noce, Ignazio Gallo and Alessandro Zamberletti

Abstract: Query suggestion is a technique for generating alternative queries to facilitate information seeking, and has become an essential feature that commercial search engines provide to web users. In this paper, we focus on query suggestion for price comparison search engines. In this specific domain, suggestions provided to web users need to be properly generated taking into account whether both the searched and the suggested products are still available for sale. To this end, we propose a novel approach based on a slight variant of classical query-URL graphs: the query-product click-through bipartite graph. Such a graph is built using information extracted both from search engine logs and from domain-specific features such as categories and product popularities. Information collected from the query-product graph can be used to suggest not only related queries but also related products. The proposed model was tested on several challenging datasets, and also compared with a recent competing query suggestion approach specifically designed for price comparison engines. Our solution outperforms the competing approach, achieving better results both in terms of relevance of the provided suggestions and coverage rates on top-8 suggestions.
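The click-through bipartite idea can be illustrated with a minimal sketch (the scoring, the availability filter and all names below are illustrative assumptions, not the paper’s actual algorithm):

```python
from collections import defaultdict

def build_click_graph(log):
    """Build the two sides of a query-product click-through bipartite
    graph from (query, product) click log entries."""
    q2p = defaultdict(lambda: defaultdict(int))  # query   -> product -> clicks
    p2q = defaultdict(lambda: defaultdict(int))  # product -> query   -> clicks
    for query, product in log:
        q2p[query][product] += 1
        p2q[product][query] += 1
    return q2p, p2q

def suggest(query, q2p, p2q, available, k=8):
    """Suggest related queries and products for `query`, keeping only
    products still available for sale -- the key domain constraint for
    price comparison engines mentioned in the abstract."""
    product_scores = defaultdict(int)
    query_scores = defaultdict(int)
    for product, c1 in q2p[query].items():
        if product not in available:   # drop products no longer on sale
            continue
        product_scores[product] += c1
        # one more hop in the bipartite graph: queries co-clicking the product
        for other_query, c2 in p2q[product].items():
            if other_query != query:
                query_scores[other_query] += c1 * c2
    top = lambda d: [x for x, _ in sorted(d.items(), key=lambda kv: -kv[1])[:k]]
    return top(query_scores), top(product_scores)
```

Ranking by accumulated click mass over two hops is one simple way to surface both related queries and related products from the same structure.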

Paper Nr: 31
Title:

Increasing Trust Towards eCommerce - Privacy Enhancing Technologies Against Price Discrimination

Authors:

Christos Makris, Konstantinos Patikas and Yannis C. Stamatiou

Abstract: Price discrimination is a recently introduced practice in the domain of eCommerce. It is manifested by the appearance of different prices when the same product is browsed by different prospective buyers, based on their profiles. Thus, for instance, the price of an item may increase the moment it is browsed by a user who comes from a rich neighbourhood or who has performed a series of purchases of expensive items in the past. Price discrimination can lead, in the long run, to a decrease in profits and a loss of clientele, as well as a decrease in people’s trust in eCommerce. In this paper, we propose the deployment of Privacy Enhancing Technologies in order to handle users’ personal information. These technologies empower users to have command over their own privacy by allowing them to reveal only what is absolutely necessary (the minimal disclosure principle), or what they agree to reveal, in order to use a service without exposing any Personally Identifiable Information (PII). Thus, eCommerce services that employ such technologies for handling their clients’ personal data can attract more loyal clients and increase their popularity while, at the same time, suffering minimal loss of client data and company image in the case of a massive customer data theft attack.

Paper Nr: 61
Title:

Methods of Data Processing and Communication for a Web-based Wind Flow Visualization

Authors:

Marc Skutnik, Luigi Lo Iacono and Christian Neuhaus

Abstract: This paper presents methods for the reduction and compression of meteorological data for web-based wind flow visualizations, which are tailored to the flow visualization technique. Flow data sets represent a large amount of data and are therefore not well suited for mobile networks with low data throughput rates and high latency. Using the mechanisms introduced in this paper, an efficient transfer of thinned out and compressed data can be achieved, while keeping the accuracy of the visualized information almost at the same quality level as for the original data.

Paper Nr: 77
Title:

The Spanning Tree based Approach for Solving the Shortest Path Problem in Social Graphs

Authors:

Andrei Eremeev, Georgiy Korneev, Alexander Semenov and Jari Veijalainen

Abstract: Nowadays there are many social media sites with a very large number of users. Users of social media sites and the relationships between them can be modelled as a graph. Such graphs can be analysed using methods from social network analysis (SNA). Many measures used in SNA rely on the computation of shortest paths between the nodes of a graph. There are many shortest path algorithms, but the majority of them suit only small graphs, or work only with road network graphs, which are fundamentally different from social graphs. This paper describes an efficient shortest path search algorithm suitable for large social graphs. The described algorithm extends the Atlas algorithm. The proposed algorithm solves the shortest path problem in social graphs modelling sites with over 100 million users with acceptable response time (50 ms per query), memory usage (less than 15 GB of primary memory) and accuracy (more than 90% of the queries return the exact result).
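The core idea behind spanning-tree-based estimation can be sketched as follows (the BFS tree construction, the choice of roots and the min-over-trees estimate are illustrative assumptions; the actual Atlas extension is considerably more refined):

```python
from collections import deque

def bfs_tree(adj, root):
    """Grow a BFS spanning tree from `root`, recording parent and depth."""
    parent, depth = {root: None}, {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v], depth[v] = u, depth[u] + 1
                queue.append(v)
    return parent, depth

def tree_distance(parent, depth, u, v):
    """Length of the unique tree path between u and v; always an upper
    bound on the true graph distance."""
    d = 0
    while depth[u] > depth[v]:
        u, d = parent[u], d + 1
    while depth[v] > depth[u]:
        v, d = parent[v], d + 1
    while u != v:                     # climb both sides to the common ancestor
        u, v, d = parent[u], parent[v], d + 2
    return d

def estimate_distance(adj, u, v, roots):
    """Minimum tree-path length over several precomputed spanning trees."""
    trees = [bfs_tree(adj, r) for r in roots]
    return min(tree_distance(p, dep, u, v)
               for p, dep in trees if u in dep and v in dep)
```

Precomputing a handful of trees trades exactness for constant-time queries, which is why approaches of this family can answer queries on 100-million-node graphs in milliseconds.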

Area 3 - Web Interfaces

Full Papers
Paper Nr: 85
Title:

Triple-based Sharing of Context-Aware Composite Web Applications for Non-programmers

Authors:

Gregor Blichmann, Carsten Radeck, Robert Starke and Klaus Meißner

Abstract: Composite web applications are a promising way to support the long tail of user needs. While most mashup platforms only support single-user scenarios, CRUISE enables the reconfiguration of multi-user mashups at runtime. Thereby, synchronizing different parts of an application based on black-box components from different vendors poses special challenges for the rights management system. Because we additionally focus on non-programmers as the target group, an adequate user interface concept is needed. To overcome these challenges, we present a triple-based rights management concept as well as corresponding user interface support. It supports fine-grained sharing of whole applications, single components or UI parts of components under configurable permissions. Thereby, users can select semantically compatible components during the collaborative session. The practicability of our concept is validated by a prototypical implementation as well as a user acceptance test.

Area 4 - Internet Technology

Full Papers
Paper Nr: 112
Title:

A Framework for Parameterized Semantic Matchmaking and Ranking of Web Services

Authors:

Fatma Ezzahra Gmati, Nadia Yacoubi Ayadi, Afef Bahri, Salem Chakhar and Alessio Ishizaka

Abstract: The Parameterized Semantic Matchmaking and Ranking Framework (PMRF) is a highly configurable framework supporting parameterized matching and ranking of Web services. The paper introduces the matching and ranking algorithms supported by the PMRF and presents its architecture. It also evaluates the performance of the PMRF and compares it to the iSeM-logic-based and SPARQLent frameworks using the OWLS-TC4 datasets. The comparison results show that the algorithms included in the PMRF perform well overall in comparison to iSeM-logic-based and SPARQLent.

Short Papers
Paper Nr: 19
Title:

The PERICLES Process Compiler: Linking BPMN Processes into Complex Workflows for Model-Driven Preservation in Evolving Ecosystems

Authors:

Noa Campos-López and Oliver Wannenwetsch

Abstract: Understanding and reusing archived digital information after a decade of technological advancements, constant developments and changes to software and hardware is a very complex and time-consuming task. Without clear documentation and records of changes, research on the creation and modification of information systems and their associated data is almost impossible. To ease these problems, the EU-funded project PERICLES follows a model-driven preservation approach, in which the digital ecosystem is modelled and constantly updated to handle changes, generating a holistic record of the information systems involved. To realise this approach, we present in this paper the PERICLES Process Compiler, which consumes descriptive models of changing environments and evolving semantics to generate executable workflows that are capable of re-enacting changes in information systems and mitigating their impact. By this, our contribution narrows the gap between the theory of model-driven preservation and its application in real information system environments.

Paper Nr: 28
Title:

Massive Detailed 3D Geographic Information Collection on the Web

Authors:

Zengshi Huang, Naijie Gu and Jianlin Hao

Abstract: Detailed three-dimensional (3D) geographic data are important for many kinds of spatial analysis and applications. However, professional Computer Aided Design (CAD) tools are essential for constructing 3D models when collecting these data. As a result, collection is limited to desktop computers and professionals in both commercial projects and Volunteered Geographic Information (VGI) projects. This paper presents a new system for detailed 3D geographic information collection through VGI. The system combines a Web3D Geographic Information System (Web3DGIS) and a template-based CAD methodology on the web. Based on Extensible 3D (X3D) and X3DOM, it is applicable to both computers and mobile devices and extends collection to a large number of non-professional VGI contributors. With a methodology to transform the collected data into the international standard CityGML format, massive detailed 3D geographic information collection can be achieved.

Area 5 - Service based Information Systems

Short Papers
Paper Nr: 35
Title:

Interest Assortativity in Twitter

Authors:

Francesco Buccafurri, Gianluca Lax, Serena Nicolazzo and Antonino Nocera

Abstract: Assortativity is the preference for a person to relate to others who are similar in some way. This property has been widely studied in real-life social networks in the past and, more recently, great attention has been devoted to studying various forms of assortativity in online social networks as well, being aware that it does not suffice to apply past scientific results obtained in the domain of real-life social networks. One of the aspects not yet analyzed in online social networks is interest assortativity, that is, the preference for people to share the same interest (e.g., sport, music) with their friends. In this paper, we study this form of assortativity on Twitter, one of the most popular online social networks. After introducing the background theoretical model, we analyze Twitter, discovering that users clearly show interest assortativity. Besides the theoretical assessment, our result points to a number of interesting possible applications.
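A toy indicator of interest assortativity compares the observed share of edges whose endpoints both hold an interest against the chance level implied by that interest’s prevalence (this simple measure is an assumption for illustration; the paper’s theoretical model may differ):

```python
def interest_assortativity(edges, interested):
    """Return (observed, expected) shares of friend pairs sharing an
    interest.  observed > expected suggests assortativity: friends share
    the interest more often than random mixing would predict."""
    nodes = {n for edge in edges for n in edge}
    prevalence = sum(1 for n in nodes if n in interested) / len(nodes)
    observed = sum(1 for u, v in edges
                   if u in interested and v in interested) / len(edges)
    expected = prevalence ** 2        # both endpoints under random mixing
    return observed, expected
```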

Paper Nr: 39
Title:

GO4SOA: Goal-Oriented Modeling for SOA

Authors:

Inaldo Capistrano Costa and José M. Parente de Oliveira

Abstract: Service-oriented architecture (SOA) has become a standard in business integration. In software engineering, several authors propose requirements elicitation from business goals. However, SOA application modeling does not address these goals, causing a gap that can hinder application design. The work outlined in this paper proposes an approach to modeling SOA applications based on business goals. The goals are incorporated as semantic information into the application’s architecture and are preserved until its implementation. Thus, the components that fulfil a particular business goal can be identified from the architecture model, through detailed design, to implementation. The “Purchase Order” case study was selected to verify the proposed approach. The major contribution of this research is the application of business knowledge to improve service descriptions in the application design. The case study indicated that business goals are preserved in the models and the implementation, making it easy to verify, by tracing their features, whether the organization’s goals were addressed as completely as possible.

Area 6 - Internet Technology

Short Papers
Paper Nr: 51
Title:

Evaluating Twitter Influence Ranking with System Theory

Authors:

Georgios Drakopoulos, Andreas Kanavos and Athanasios Tsakalidis

Abstract: A considerable part of the social network analysis literature is dedicated to determining which individuals are to be considered influential in particular social settings. Most established algorithms, such as the Freeman and Katz-Bonacich centrality metrics, place emphasis on various structural properties of the social graph. Although this makes centrality metrics generic enough to be applied in virtually any setting, they are oblivious to the functionality of the underlying social network. This paper examines five social influence metrics designed especially for Twitter and their implementation in a Java client retrieving network information from a Neo4j server. Additionally, a scheme is proposed for evaluating the performance of an influence ranking based on estimating the exponent of a Zipf model fitted to the ranking scores.
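Fitting a Zipf model to a ranking and reading off its exponent can be sketched as a least-squares line fit in log-log space (a common textbook estimator, not necessarily the one used in the paper):

```python
import math

def zipf_exponent(scores):
    """Estimate the exponent s of a Zipf model score(r) ~ C / r**s for a
    descending ranking: fit a line to (log rank, log score) pairs; the
    exponent is minus the slope."""
    ranked = sorted(scores, reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(s) for s in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope
```

A steeper exponent indicates that influence is concentrated in a few top-ranked users, which is one way to compare the behaviour of different influence metrics.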

Paper Nr: 55
Title:

Adaptive Push-based Media Streaming in the Web

Authors:

Luigi Lo Iacono and Silvia Santano Guillén

Abstract: Online media consumption is the main driving force behind the recent growth of the Web. As real-time media in particular is becoming more and more accessible from a wide range of devices, with contrasting screen resolutions, processing resources and network connectivity, a necessary requirement is to provide users with a seamless multimedia experience at the best possible quality, which means being able to adapt to the specific device and network conditions. This paper introduces a novel approach for adaptive media streaming on the Web. In contrast to the pervasive pull-based designs built on HTTP, this paper builds upon a Web-native push-based approach in which both the communication and processing overheads are reduced significantly compared to the pull-based counterparts. In order to maintain these properties when enhancing the scheme with adaptation features, server-side monitoring and control need to be developed as a consequence. Such an adaptive push-based media streaming approach is introduced as the main contribution of this work. Moreover, the obtained evaluation results provide evidence that, with adaptive push-based media delivery, on the one hand, an equivalent quality of experience can be provided at lower cost than by adopting pull-based media streaming; on the other hand, improved responsiveness in switching between quality levels can be obtained at no extra cost.

Paper Nr: 56
Title:

Automated Analysis and Evaluation of Web Applications Design: The CMS-based Web Applications Case Study

Authors:

Vassiliki Gkantouna, Athanasios Tsakalidis and Giannis Tzimas

Abstract: This paper addresses the automated design quality evaluation of Web applications built on a CMS platform by inspecting their conceptual model from the viewpoint of consistent design reuse. We have utilized WebML as the design platform of the proposed methodology, and we attempt to capture design reuse by detecting all the recurrent patterns within the WebML hypertext model of an application. A pattern consists of a core specification, i.e., an invariant composition of WebML elements that characterizes the pattern, and a number of pattern variants which extend the core specification with all the valid modalities in which the pattern composition can start (starting variants) or terminate (termination variants). We have developed a methodology that automatically extracts the hypertext model of a web application, which is subsequently submitted to a pattern-based analysis in order to identify the occurrences of all the incorporated recurrent patterns implying design reuse. Then, we calculate evaluation metrics revealing whether the identified pattern variants are used consistently throughout the application. Using the methodology, designers can detect either effective reusable design solutions consistently used throughout the application model for obtaining certain functionality within the application’s context, or recurrent design constructs causing design inconsistencies and lowering the quality of the final application.

Area 7 - Service based Information Systems

Short Papers
Paper Nr: 57
Title:

Designing Online Service: From State-of-the-Art to a Unified Framework

Authors:

Noor Farizah Ibrahim and Christopher Durugbo

Abstract: Online services are now used extensively all over the world in desktop and mobile applications. These services are driven by technologies that attempt to enhance how people manage their lives in more networked and interconnected ways. The aim of this study is to conduct a systematic review of 103 articles published from 1998 to 2014 by reviewing the research objectives, research findings, methodologies and research gaps of prior online service studies. Since there has been limited focus on online service research as a whole, although several researchers have concentrated specifically on areas such as e-banking and e-commerce, the authors summarize prior research by developing a unified framework, the Design for Online Service (DOS) framework, that synthesizes the findings on success factors (web design, social networking, service provisioning, user involvement and critical success factors) in designing online services. The DOS framework provides a holistic view of the determinants of successful online service design, incorporating a range of factors and their criteria. Overall, these findings provide insights into and a greater understanding of the online service research agenda.

Area 8 - Internet Technology

Short Papers
Paper Nr: 63
Title:

MR-SAT: A MapReduce Algorithm for Big Data Sentiment Analysis on Twitter

Authors:

Nikolaos Nodarakis, Spyros Sioutas, Athanasios K. Tsakalidis and Giannis Tzimas

Abstract: Sentiment analysis on Twitter data has attracted much attention recently. People tend to express their feelings freely, which makes Twitter an ideal source for accumulating a vast amount of opinions towards a wide diversity of topics. In this paper, we develop a novel method to harvest sentiment knowledge in the MapReduce framework. Our algorithm exploits the hashtags and emoticons inside a tweet as sentiment labels, and carries out the classification of diverse sentiment types in a parallel and distributed manner. Moreover, we utilize Bloom filters to compact the storage size of intermediate data and boost the performance of our algorithm. Through an extensive experimental evaluation, we show that our solution is efficient, robust and scalable, and confirm the quality of our sentiment identification.
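The hashtag/emoticon labelling idea can be sketched as a pair of map and reduce functions (the label sets, tokenization and classifier below are illustrative assumptions; the paper’s actual MapReduce pipeline and its Bloom-filter optimization are more elaborate):

```python
from collections import Counter, defaultdict

# Illustrative marker sets; the paper derives labels from real hashtag
# and emoticon usage.
POSITIVE = {"#happy", "#love", ":)"}
NEGATIVE = {"#sad", "#fail", ":("}

def map_phase(tweet):
    """Map: use hashtags/emoticons inside the tweet as (noisy) sentiment
    labels and emit ((word, label), 1) pairs for the remaining tokens."""
    tokens = tweet.lower().split()
    if any(t in POSITIVE for t in tokens):
        label = "pos"
    elif any(t in NEGATIVE for t in tokens):
        label = "neg"
    else:
        return []                     # unlabelled tweet: nothing to emit
    markers = POSITIVE | NEGATIVE
    return [((w, label), 1) for w in tokens if w not in markers]

def reduce_phase(pairs):
    """Reduce: aggregate the emitted pairs into per-word label counts."""
    counts = defaultdict(Counter)
    for (word, label), c in pairs:
        counts[word][label] += c
    return counts

def classify(tweet, counts):
    """Score an unseen tweet against the learned word/label counts."""
    score = Counter()
    for w in tweet.lower().split():
        score += counts.get(w, Counter())
    return score.most_common(1)[0][0] if score else "neutral"
```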

Paper Nr: 70
Title:

Personalized, Context-aware Intermodal Travel Information

Authors:

Christian Samsel, Karl-Heinz Krempels and Gerrit Garbereder

Abstract: The integration of heterogeneous mobility services increases the number of itinerary choices exponentially. To support travelers in selecting such an intermodal itinerary, this work proposes the use of a recommendation system. The developed framework rates intermodal itineraries supplied by an external travel information system based on learned personal preferences and user context (e.g. weather). This rating can be used by the client application (e.g. a mobile app) for sorting or for a five-star rating. The framework realizes a set of interfaces to extract feature data from the user context and the possible itineraries, and applies a combination of item-based and context-based recommendation algorithms. As an evaluation, an online questionnaire (n = 101) applying the framework was conducted to assess the feasibility of the approach. The number of participants preferring the personalized and context-aware itinerary presentation over the traditional departure time-based presentation was significant. Furthermore, it was verified that a mobility self-assessment is suitable as initial training data.

Paper Nr: 71
Title:

Compressing Inverted Files using Modified LZW

Authors:

Vasileios Iosifidis and Christos Makris

Abstract: In this paper, we present a compression algorithm that employs a modification of the well-known Lempel-Ziv-Welch (LZW) algorithm; it creates an index that treats terms as characters, and stores encoded document identifier patterns efficiently. We also equip our approach with a set of pre-processing methods {reassignment of document identifiers, gaps} and post-processing methods {gaps, IPC encoding, GZIP} in order to attain more significant space improvements. We used two different combinations of these discrete steps to see which one maximizes the performance of our modification of the LZW algorithm. Experiments performed on the Wikipedia dataset demonstrate the superiority in space compaction of the proposed technique.
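Treating index entries as the “characters” of an LZW dictionary can be sketched as follows (textbook LZW over an integer gap alphabet; the paper’s modification and its pre/post-processing chain are more involved):

```python
def gaps(doc_ids):
    """Pre-processing: turn a sorted posting list into gaps, which repeat
    across lists far more often than raw identifiers do."""
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]

def lzw_encode(symbols):
    """Textbook LZW, but over an arbitrary integer alphabet (the gaps)
    instead of bytes."""
    alphabet = sorted(set(symbols))
    dictionary = {(s,): i for i, s in enumerate(alphabet)}
    next_code, w, out = len(dictionary), (), []
    for s in symbols:
        ws = w + (s,)
        if ws in dictionary:          # grow the current phrase
            w = ws
        else:                         # emit phrase, learn the extension
            out.append(dictionary[w])
            dictionary[ws] = next_code
            next_code += 1
            w = (s,)
    if w:
        out.append(dictionary[w])
    return out, alphabet

def lzw_decode(codes, alphabet):
    """Inverse transform: rebuild the gap sequence from LZW codes."""
    dictionary = {i: (s,) for i, s in enumerate(alphabet)}
    next_code = len(dictionary)
    w = dictionary[codes[0]]
    out = list(w)
    for c in codes[1:]:
        # the standard LZW special case: code not yet in the dictionary
        entry = dictionary[c] if c in dictionary else w + (w[0],)
        out.extend(entry)
        dictionary[next_code] = w + (entry[0],)
        next_code += 1
        w = entry
    return out
```

Because regularly spaced document identifiers yield long runs of identical gaps, the dictionary quickly learns repeated patterns and the code sequence comes out shorter than the input.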

Area 9 - Service based Information Systems

Short Papers
Paper Nr: 75
Title:

Are Open News Systems Credible? - An Investigation Into Perceptions of Participatory and Citizen News

Authors:

Jonathan Scott, David Millard and Pauline Leonard

Abstract: The growth of the web has led to a shift in the news industry and the emergence of novel news services. Due to the importance of news media in society it is important to understand how these systems work and how they are perceived. Previous work has ranked news systems in terms of their openness to user contribution, noting that the most open systems (such as YouTube) are typically not viewed as news systems at all, despite having most of the same functional characteristics. In this paper we explore whether credibility is an appropriate characteristic to explain this perception by presenting the results of a survey of 79 people regarding their credibility assessments of online news websites. We compare this perceived credibility with the openness of the systems as identified in previous work. Results show that there is a modest but significant correlation between the openness of a news system and its credibility, and suggest that credibility is an appropriate if imperfect explanation of the difference in perception of open and closed news systems.

Area 10 - Internet Technology

Short Papers
Paper Nr: 76
Title:

Identity Use and Misuse of Public Persona on Twitter

Authors:

Dicle Berfin Köse, Jari Veijalainen and Alexander Semenov

Abstract: Social media sites have appeared during the last 10 years and their use has exploded all over the world. Twitter is a microblogging service that currently has 320 million user profiles and over 100 million daily active users. Many celebrities and leading politicians have a verified profile on Twitter, including Justin Bieber, President Obama, and the Pope. In this paper we investigate the ‘hundreds of Putins and Obamas’ phenomenon on Twitter. We collected two data sets in 2015 containing 582 and 6477 profiles that are related to the G20 leaders’ profiles on Twitter. The number of namesakes varied from 5 to 1000 per leader. We analysed in detail various aspects of the Putin- and Erdogan-related profiles. For the former we looked into the language of the profiles, their follower sets, the address in the profile and where the tweets were really sent from. For both profile sets we investigated why the accounts were created. For this, we deduced 12 categories based on the information in the profile and the contents of the sent tweets. The research is exploratory in nature, but we tentatively looked into online identity, communication and political theories that might explain the emergence of these kinds of Twitter profiles.

Area 11 - Service based Information Systems

Short Papers
Paper Nr: 78
Title:

Impact of Online Product Reviews on Purchasing Decisions

Authors:

Efthymios Constantinides and Nina Isabel Holleschovsky

Abstract: Online consumer reviews, product and service recommendations and peer opinions play an increasingly important role in the customer’s decision-making process. The various online product review and recommendation platforms differ in their objectives, function and characteristics. The literature has so far paid little attention to the functional characteristics of these platforms as an element of customer adoption and preference. Given the importance of this form of customer-generated content for business sales and profitability, monitoring and often responding to customer reviews has become a major managerial challenge and an important reputation management issue for business organizations. In order to respond efficiently to customer reviews, companies need to identify consumer review platforms, understand their characteristics and continuously assess their impact on consumer purchasing decisions. This study identifies four main types of online review platforms: retail websites, independent review platforms, video-sharing platforms and personal blogs. These platforms present product reviews in different formats, with emphasis on various review function characteristics. An online survey analyzed consumer opinions about the various platforms and review mechanisms and their impact on consumer buying behavior. The results underline the importance of platform credibility and usability for consumer trust in, and reliance on, reviews as input to the decision-making process.

Area 12 - Web Interfaces

Short Papers
Paper Nr: 106
Title:

Interactive Visualization and Big Data - A Management Perspective

Authors:

Thomas Plank and Markus Helfert

Abstract: This position paper presents a systematic literature review that aims to identify research topics and future research possibilities in the area of interactive visualization of big data from a management perspective. To this end, the authors reviewed journals listed in the Index of Information Systems Journals and by the Computing Research and Education Association, drawn from the databases “EBSCO Business Source Premier”, “Sage Premier” and “Science Direct”, from 2005 to 2015. The authors reviewed 993 abstracts and identified 122 peer-reviewed publications as relevant to the topic. Based on this interdisciplinary collection of research papers, the authors identify the key research topics and derive future research possibilities that need to be pursued.

Area 13 - Internet Technology

Short Papers
Paper Nr: 111
Title:

UML-based Model-Driven REST API Development

Authors:

Davide Rossi

Abstract: In the last few years we have witnessed the expansion of REST APIs as a method to implement machine-to-machine interactions in open distributed systems. Recently, REST APIs can also be found in several B2B and enterprise scenarios that were previously reserved for alternative technologies such as SOAP-based Web Services. Despite this, the development of REST-based solutions has remained mostly inspired by agile approaches, with no or limited explicit modeling artifacts produced during the development process. This clashes with software development methods in which modeling artifacts are expected to be available for all developed software. Another problem is related to the resource-based nature of these APIs, which lack standardized methods to discover and understand their capabilities akin to what object-oriented interfaces can do for objects and services. In this paper we propose a model-driven approach to REST API development; this approach is composed of two main steps: (i) UML modeling of the API using specific profiles and (ii) a model transformation that exploits RAML, a recent RESTful API modeling language, as an intermediate notation that can be used to automatically produce documentation and code for various languages/platforms.

Area 14 - Service based Information Systems

Posters
Paper Nr: 15
Title:

An Overview of Service-Oriented Computing Challenges and Issues

Authors:

Flavio Corradini, Francesco De Angelis, Daniele Faní and Andrea Polini

Abstract: Service compositions allow a high degree of dynamism and flexibility, indispensable properties in today’s software systems, where services can dynamically enter and leave a system during its execution. This paper provides an overview of Service-Oriented Computing research, introducing the issues and engineering challenges in this context. We refer to Service Foundations, Service Composition and Service Management & Monitoring, showing how each of these layers defines a set of constructs, roles and responsibilities. A general overview, without claim of completeness, is given of these aspects, while particular attention is paid to the research problems related to dynamic service compositions, such as choreographies, addressing the role fulfillment and choreography realizability problems.

Area 15 - Internet Technology

Posters
Paper Nr: 17
Title:

XML Labels Compression using Prefix-encodings

Authors:

Hanaa Al Zadjali and Siobhán North

Abstract: XML is the de facto standard for data representation and communication over the web, so there is a lot of interest in querying XML data, and most approaches require the data to be labelled to indicate structural relationships between elements. This is simple when the data does not change but complex when it does. In the day-to-day management of XML databases over the web, it is usual that more information is inserted over time than deleted. Frequent insertions can lead to large labels, which have a detrimental impact on query performance and can cause overflow problems. Many researchers have shown that prefix encoding usually gives the highest compression ratio in comparison to other encoding schemes. Nonetheless, none of the existing prefix encoding methods has been applied to XML labels. This research investigates compressing XML labels via different prefix-encoding methods in order to reduce the occurrence of overflow problems and improve query performance. The paper also presents a comparison of the performance of several prefix-encodings in terms of encoding/decoding time and compressed code size.
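Elias gamma coding is one classic prefix-free encoding of the kind such a comparison might include (chosen here purely as an illustration; the abstract does not specify which prefix encodings the paper compares):

```python
def elias_gamma(n):
    """Elias gamma code for n >= 1: (bit-length - 1) zeros, then n in
    binary.  No codeword is a prefix of another, so codewords can be
    concatenated without separators -- the defining prefix property."""
    assert n >= 1
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary

def elias_gamma_decode(bits):
    """Decode a concatenation of gamma codewords back into integers."""
    out, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":         # count the unary length prefix
            zeros += 1
            i += 1
        out.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return out
```

Small label components cost only a few bits under such a code, which is why prefix encodings suit labels that grow under frequent insertions.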

Paper Nr: 25
Title:

UniQue: An Approach for Unified and Efficient Querying of Heterogeneous Web Data Sources

Authors:

Markku Laine, Jari Kleimola and Petri Vuorimaa

Abstract: Governments, organizations, and people are publishing open data on the Web more than ever before. Consuming the data, however, requires substantial effort from web mashup developers, as they have to familiarize themselves with a diversity of data formats and query techniques specific to each data source. While several solutions have been proposed to improve web querying, none of them covers the aforementioned aspects in a developer-friendly and efficient manner. Therefore, we devised a unified querying (UniQue) approach and a proxy-based implementation that provides a uniform and declarative interface for querying heterogeneous data sources across the Web. Besides hiding the differences between the underlying data formats and query techniques, UniQue heavily embraces open W3C standards to minimize the learning effort required of developers. Pursuing this further, we propose the Unified Query Language (UQL), which combines the expressiveness of CSS Selectors and XPath into a single, flexible selector language. We show that the adoption of UniQue and UQL can effectively streamline web querying, leverage developers’ existing knowledge, and reduce generated network traffic compared to the current state-of-the-art approach.

Paper Nr: 40
Title:

Two Novel Techniques for Space Compaction on Biological Sequences

Authors:

George Volis, Christos Makris and Andreas Kanavos

Abstract: The number and size of genomic databases have grown rapidly in recent years, and with them the number of Internet-accessible databases. There is therefore a need for satisfactory methods for managing this growing information, and considerable effort has been put into this direction. Contributing to this effort, this paper presents two algorithms that reduce the amount of space needed for storing genomic information. Our first algorithm is based on the classic n-grams/2L technique for indexing a DNA sequence and converts the inverted index of that classic algorithm into a more compressed format. Researchers have revealed the existence of repeated and palindromic patterns in the DNA of living organisms; this observation is the main motivation of the technique, which proposes an alternative data structure for handling these sequences. Our experimental results show that our algorithm achieves a more efficient index than the n-grams/2L algorithm and can be adopted by any algorithm based on n-grams/2L. The second algorithm is based on the n-grams technique: perceiving the four symbols of the DNA alphabet as the vertices of a square, it imprints a DNA sequence as a relation between the vertices, sides, and diagonals of that square. The experimental results show that this second idea achieves even better compression of our index structure.
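As a rough illustration of the building blocks this abstract mentions, an n-gram inverted index over DNA sequences and a reverse-complement (palindrome) check might look like the following sketch; this is the plain n-gram idea, not the paper's n-grams/2L structure or its compressed variant:

```python
def ngrams(seq, n):
    """All overlapping substrings of length n of a DNA sequence."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def inverted_index(sequences, n):
    """Map each n-gram to the (sequence id, position) pairs where it occurs."""
    index = {}
    for doc_id, seq in enumerate(sequences):
        for pos, gram in enumerate(ngrams(seq, n)):
            index.setdefault(gram, []).append((doc_id, pos))
    return index

# A DNA "palindrome" equals its own reverse complement (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]
```

Repeated and palindromic patterns mean many n-grams share posting lists, which is what a compressed index representation can exploit.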

Area 16 - Web Interfaces

Posters
Paper Nr: 86
Title:

Utilizing Virtual Communities for Information Retrieval and User Modeling

Authors:

Azza Harbaoui, Sahbi Sidhom, Malek Ghenima and Henda Ben Ghezala

Abstract: The Internet has become the largest library in human history, and having such a large library makes the search process more complicated. Traditional search engines send back the same results to different users even when those users have expressed different information needs and different preferences. A significant part of the difficulty stems from vocabulary problems (polysemy, synonymy, ...). Such problems trigger a strong need to personalize search results based on user preferences. The goal of personalized information retrieval is to generate meaningful results that are interesting to users, based on their profiles. This paper presents a personalized information retrieval approach based on a user profile built from the acquisition of explicit and implicit user data. The proposed approach also includes a semantic-based optimization method for user queries. The system uses user profiles to construct virtual communities and uses the users' navigation data to predict their preferences in order to update these communities.

Area 17 - Internet Technology

Posters
Paper Nr: 94
Title:

Volkswagen Emission Crisis – Managing Stakeholder Relations on the Web

Authors:

Boyang Zhang, Jari Veijalainen and Denis Kotkov

Abstract: Organizations establish their own profiles at social media sites to publish pertinent information to customers and other stakeholders. During a long and severe crisis, multiple issues may emerge in media interaction. Positive responses and prompt interaction from the official account of, e.g., a car manufacturer create clarity and reduce anxiety among stakeholders. This research targets the Volkswagen emission scandal that became public on Sept. 18, 2015. We report its main phases over time based on public web information. To better understand the online interaction and reactions of the company, we scrutinized what information was published on VW's official websites and on its Facebook and Twitter profiles, and how the image of the company developed over time among various stakeholders. To investigate this, Twitter and Facebook data sets were collected, cleaned, and analysed. We also compared this crisis in several respects with the Toyota recall crisis of 2010-2011, caused by sticking accelerator pedals and floor mats, as well as the GM crisis of 2014, caused by faulty ignition switches. Further, we compare our findings with the Malaysia Airlines crises caused by the disappeared flight MH370 and the downed flight MH17.

Area 18 - Web Interfaces

Posters
Paper Nr: 100
Title:

Using Gamification to Enhance User Motivation in an Online-coaching Application for Flexible Workers

Authors:

Sophie Jent and Monique Janneck

Abstract: The number of people who are solo self-employed or experience very flexible, individualized working conditions has grown over the last years. As a consequence, these persons need to design their own working conditions in the sense of ‘job crafting’. We are developing an online coaching application for this target group to convey job design skills, increase well-being, and reduce stress. To enhance user motivation, gamification elements are used in the online coach. In this paper we report on the evaluation of a prototype of the coaching application with different gamification elements, by means of a user test with the target group. The results show that gamification has only a small effect in short-term use, but seems promising in the long term.

Area 19 - Service based Information Systems

Posters
Paper Nr: 102
Title:

Can we Get Some Service Here? - On the Company Transformation from a Software Vendor to a SaaS Provider

Authors:

Aapo Koski and Tommi Mikkonen

Abstract: The software industry is in the middle of a major change in how services provided by all kinds of information systems are offered to users. This change has been initiated by customers who no longer want to carry the same responsibilities and risks they previously did as system owners. Consequently, software vendors need to find a way to change their mind-sets from software developers to service providers, able to constantly satisfy the changed and new needs of their customers. The transformation from licence-based software development to a SaaS offering poses challenges related not only to technical issues but, to a great extent, also to organisational and even mental issues. We reflect on the experiences of this transformation gathered from two software companies and, based on these, present some prerequisites and guidelines for the transformation to succeed. In conclusion, with the SaaS model many of the principles manifested by the agile movement can and should be followed closely, and the advantages gained with the SaaS model are very close to the objectives set by the agile manifesto.

Area 20 - Internet Technology

Posters
Paper Nr: 105
Title:

Performance Gains from Web Performance Optimization - Case Including the Optimization of Webpage Resources in a Comprehensive Way

Authors:

Juha Vihervaara, Pekka Loula and Tommi Tuominen

Abstract: Web performance optimization tries to minimize the time in which web pages are downloaded and displayed in the web browser, which usually also means minimizing the sizes of website resources. By optimizing their websites, organizations can ensure the quality of response times on their websites. This increases visitor loyalty and user satisfaction. A fast website is also important for search engine optimization, and minimized resources cut the energy consumption of the Internet. In spite of the importance of optimization, little research has examined how much the comprehensive optimization of a website can reduce load times and the sizes of web resources. This study presents the results of an optimization effort in which all the resources of a website were optimized. The results obtained were very significant: the download size of the front page was reduced by about 80 percent and the downloading time by about 60 percent, and the server can now handle more than three times as many concurrent users as before.

Paper Nr: 113
Title:

Facebook Posts Text Classification to Improve Information Filtering

Authors:

Randa Benkhelifa and Fatima Zohra Laallam

Abstract: Facebook is one of the most used social networking sites; it is more than a simple website, it is a popular communication tool. Social networking users communicate by exchanging several kinds of content, including free text, images, and video. Today, social media users have a special way of expressing themselves: they have created a new language known as “Internet slang”, which conveys the same meaning using different lexical units. This unstructured text has its own specific characteristics (it is massive, noisy, and dynamic) and requires novel preprocessing methods adapted to those characteristics in order to make classification algorithms effective. Most previous work on social media text classification eliminates stopwords and classifies posts based on their topic (e.g. politics, sport, art). In this paper, we propose to classify them at a lower level into diverse pre-chosen classes using three machine learning algorithms: SVM, Naïve Bayes, and k-NN. To improve our classification, we propose a new preprocessing approach based on stopwords, Internet slang, and other specific lexical units. Finally, we compare the results within and between classifiers.

Paper Nr: 115
Title:

Analysis of Performance of the Routing Protocols Ad Hoc using Random Waypoint Mobility Model Applied to an Urban Environment

Authors:

Liliana Enciso, Pablo Quezada, José Fernandez, Byron Figueroa and Verónica Espinoza

Abstract: A Mobile Ad Hoc Network (MANET) is a dynamic group of interconnected mobile nodes. A routing protocol is used to find routes between the mobile nodes and facilitate communication within the network. The aim of such protocols is to establish a correct and efficient route between a pair of mobile nodes, discovered and maintained with a minimum consumption of bandwidth. This research work presents a performance assessment of six routing protocols: Destination-Sequenced Distance-Vector (DSDV), Optimized Link State Routing (OLSR), Dynamic Source Routing (DSR), Ad Hoc On-Demand Distance Vector (AODV), Zone Routing Protocol (ZRP), and Dynamic MANET On-demand (DYMO). The evaluation defined scenarios with 50, 90, 130, 170, 210, and 250 nodes and parameters such as the number of generated packets, broadcast packets, and node-to-node delay. The simulations and the visualization of the results were carried out in the network simulator NS2 version 2.34 and TraceGraph.

Area 21 - Web Intelligence

Full Papers
Paper Nr: 20
Title:

An Empirical Study of the Effectiveness of using Sentiment Analysis Tools for Opinion Mining

Authors:

Tao Ding and Shimei Pan

Abstract: Sentiment analysis is increasingly used as a tool to gauge people’s opinions on the Internet. For example, sentiment analysis has been widely used to assess people’s opinions on hotels, products (e.g., books and consumer electronics), public policies, and political candidates. However, due to the complexity of automated text analysis, today’s sentiment analysis tools are far from perfect. For example, many of them are good at detecting useful mood signals but inadequate at tracking and inferring the relationships between different moods and different targets. As a result, if not used carefully, the results of sentiment analysis can be meaningless or even misleading. In this paper, we present an empirical analysis of the effectiveness of existing sentiment analysis tools in assessing people’s opinions in five different domains. We also propose several effectiveness indicators that can be computed automatically to help avoid the potential pitfalls of misusing a sentiment analysis tool.

Paper Nr: 21
Title:

Extracting Navigation Hierarchies from Networks with Genetic Algorithms

Authors:

Stefan John, Michael Granitzer and Denis Helic

Abstract: Information networks are nowadays an important source of knowledge, indispensable for our daily tasks. Because of their size, however, efficient navigation can be a challenge. Following the idea of using network hierarchies as guidance in human as well as algorithmic search processes, this work focuses on the creation of optimized navigation hierarchies. Based on an established model of human navigation, decentralized search, we define two quality criteria for network hierarchies and propose a genetic algorithm applying them. We conducted experiments on an information network as well as a social network and analyzed the optimization effectiveness of our approach. Furthermore, we investigated the structure of the resulting navigation hierarchies. We found our algorithm to be well suited to the task of hierarchy optimization and identified distinct structural properties influencing the quality of navigation hierarchies.

Paper Nr: 27
Title:

Crowdsourcing Reliable Ratings for Underexposed Items

Authors:

Beatrice Valeri, Shady Elbassuoni and Sihem Amer-Yahia

Abstract: We address the problem of acquiring reliable ratings of items such as restaurants or movies from the crowd. A reliable rating is a truthful rating from a worker that is knowledgeable enough about the item she is rating. We propose a crowdsourcing platform that considers workers’ expertise with respect to the items being rated and assigns workers the best items to rate. In addition, our platform focuses on acquiring ratings for items that only have a few ratings. Traditional crowdsourcing platforms are not suitable for such a task for two reasons. First, ratings are subjective and there is no single correct rating for an item which makes most existing work on predicting the expertise of crowdsourcing workers inapplicable. Second, in traditional crowdsourcing platforms there is no control over task assignment by the requester. In our case, we are interested in providing workers with the best items to rate based on their estimated expertise for the items and the number of ratings the items have. We evaluate the effectiveness of our system using both synthetic and real-world data about restaurants.
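A minimal sketch of the assignment idea described above, routing items to the workers most expert on them while boosting under-rated items, could look like this; the linear scoring formula and the `penalty` weight are illustrative assumptions, not the platform's actual algorithm:

```python
def assign_items(workers, items, ratings_count, expertise, penalty=0.1):
    """Greedily assign each worker the item they are most expert on,
    discounted by how many ratings the item already has."""
    assignment = {}
    for w in workers:
        best = max(items,
                   key=lambda i: expertise[w][i] - penalty * ratings_count[i])
        assignment[w] = best
        ratings_count[best] += 1  # the new rating counts for later workers
    return assignment
```

Items with many existing ratings are penalized, so the crowd's effort flows toward under-exposed items without ignoring worker expertise.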

Paper Nr: 30
Title:

Verification of Fact Statements with Multiple Truthful Alternatives

Authors:

Xian Li, Weiyi Meng and Clement Yu

Abstract: When people are not sure about certain facts, they tend to use the Web to find the answers. Two problems make finding correct answers on the Web challenging. First, the Web contains a significant amount of untruthful information. Second, there is currently a lack of systems/tools that can verify the truthfulness or untruthfulness of an arbitrary fact statement and also provide alternative answers. In this paper, we propose a method that aims to determine whether a given statement is truthful and to identify alternative truthful statements that are highly relevant to it. Existing solutions consider only statements with a single expected correct answer; in this paper, we focus on statements that may have multiple relevant alternative answers. We first present a straightforward extension of the previous method to handle such statements and show that this simple extension is inadequate. We then present solutions for two types of such statements. Our evaluation indicates that our proposed solutions are very effective.

Paper Nr: 33
Title:

AMBIT-SE: Towards a User-aware Semantic Enterprise Search Engine

Authors:

Giacomo Cabri, Stefano Gaddi and Riccardo Martoglia

Abstract: Search engines are among the most exploited tools both in our everyday life and in our work. In this paper we propose a user-aware semantic enterprise search engine called AMBIT-SE. It is "enterprise" in the sense that it focuses on search within enterprise websites; the "semantic" aspect refers to the fact that it does not rely on exact word matching alone, but also on the meaning of words, by means of synonyms and related terms; finally, to produce query results it also takes into account user information, which turns out to be very useful for improving the search. We explain how our system works and report the results of experiments on different websites.

Paper Nr: 42
Title:

Estimating the Functionality of Mashup Applications for Assisted, Capability-centered End User Development

Authors:

Carsten Radeck, Gregor Blichmann and Klaus Meißner

Abstract: The mashup paradigm allows end users to build their own web applications consisting of several components in order to fulfill specific needs. Thereby, communicating on a non-technical level with non-programmers as end users is crucial. It is also necessary to assist them, for instance, by explaining inter-widget communication and by helping to understand a mashup's functionality. However, prevalent mashup approaches provide no or limited concepts for these aspects. In this paper, we present our proposal for estimating and formalizing the functionality of mashup compositions based on capabilities of components and their communication links. It is the foundation for our end-user-development approach comprising several assistance mechanisms, like presenting the functionality of mashups and recommended composition steps. The concepts are implemented and evaluated by means of example applications and an expert evaluation.

Paper Nr: 53
Title:

Subtopic Ranking based on Hierarchical Headings

Authors:

Tomohiro Manabe and Keishi Tajima

Abstract: We propose methods for generating diversified rankings of subtopics of keyword queries. Our methods are characterized by their awareness of the hierarchical heading structure in documents. The structure consists of nested logical blocks with headings, and each heading concisely describes the topic of its corresponding block. Therefore, hierarchical headings in documents reflect the hierarchical topics referred to in the documents. Based on this idea, our methods score subtopic candidates by matching them against hierarchical headings in documents, giving higher scores to candidates matching hierarchical headings associated with more content. To diversify the resulting rankings, every time our methods adopt the candidate with the best score, they exclude the blocks matching that candidate and re-score all remaining blocks and candidates. According to our evaluation on the NTCIR data set, our methods generated significantly better subtopic rankings than the query completion results of major commercial search engines.

Paper Nr: 64
Title:

Cross-Domain Recommendations with Overlapping Items

Authors:

Denis Kotkov, Shuaiqiang Wang and Jari Veijalainen

Abstract: In recent years, there has been an increasing interest in cross-domain recommender systems. However, most existing works focus on the situation where only users, or users and items, overlap in different domains. In this paper, we investigate whether the source domain can boost the recommendation performance in the target domain when only items overlap. Due to the lack of publicly available datasets, we collect a dataset from two domains related to music, involving both the users' rating scores and the descriptions of the items. We then conduct experiments using collaborative filtering and content-based filtering approaches for validation purposes. According to our experimental results, the source domain can improve the recommendation performance in the target domain when only items overlap. However, the improvement decreases with the growth of non-overlapping items in the different domains.

Paper Nr: 99
Title:

Preference based Filtering and Recommendations for Running Routes

Authors:

Hassan Issa, Amir Guirguis, Shary Beshara, Stefan Agne and Andreas Dengel

Abstract: With the current trend of fitness and health tracking and the quantified self, hundreds of relevant apps and devices are being released to the consumer market. Remarkably, some platforms were created to collect running-route data from these different sources in order to provide better value for users. Such data could be employed in finding running routes based on the user's preferences rather than being limited to the proximity to the user's location. In this work, a classification system for running routes is introduced, considering performance factors, visual factors, and the nature of the route. A content-based recommender system for running routes is built on top of this classification, learning users' preferences from their performance history. The system was evaluated using data from active runners and attained a promising recommendation accuracy averaging 84% across all subject users.

Paper Nr: 109
Title:

A Generic Mapping-based Query Translation from SPARQL to Various Target Database Query Languages

Authors:

Franck Michel, Catherine Faron-Zucker and Johan Montagnat

Abstract: Fostering the development of SPARQL interfaces to heterogeneous databases is a key to efficiently expose legacy data as RDF on the Web. To deal with the variety of modern database formats and query languages, this paper describes a two-step approach to translate a SPARQL query into an equivalent target database query. First, given an xR2RML mapping describing how native database entities can be mapped to RDF, a SPARQL query is translated into a pivot abstract query language independent of the database. In a second step, the pivot query is translated into the target database query language, considering the specific database capabilities. The paper focuses on the first step of the query translation, from SPARQL to a pivot query that takes into account join constraints and SPARQL filters, and embeds conditions entailed by matching SPARQL graph patterns with relevant mappings. It discusses the query optimisations that can be implemented at this level, and briefly describes an application to the case of MongoDB, a NoSQL document store.

Paper Nr: 117
Title:

A Lexicon-based Approach for Sentiment Classification of Amazon Books Reviews in Italian Language

Authors:

Franco Chiavetta, Giosuè Lo Bosco and Giovanni Pilato

Abstract: We present a system aimed at the automatic classification of the sentiment orientation expressed in book reviews written in Italian. The system we have developed is founded on a lexicon-based approach and uses NLP techniques in order to take into account the linguistic relations between terms in the analyzed texts. The classification of a review is based on the average sentiment strength of its sentences, while the classification of each sentence is obtained through a parsing process that inspects, for each term, a window of previous items to detect particular combinations of elements causing inversions or variations of polarity. The score of a single word depends on all its associated meanings, also considering semantically related concepts such as synonyms and hypernyms. The concepts associated with words are extracted from a suitable stratification of linguistic resources, which we adopt to compensate for the lack of an opinion lexicon specifically tailored to the Italian language. The system has been prototyped in Python and tested on a dataset of reviews crawled from Amazon.it, the Italian Amazon website. Experiments show that the proposed system is able to automatically classify both positive and negative reviews, with an average accuracy above 82%.
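The sentence-averaging and negation-window mechanism described above can be sketched roughly as follows; the tiny lexicon, the negator list, and the fixed window size are placeholder assumptions, not the stratified resources the paper actually uses:

```python
NEGATORS = {"non"}  # Italian negation particle (minimal placeholder set)
LEXICON = {"buono": 1.0, "ottimo": 2.0, "noioso": -1.0, "pessimo": -2.0}

def sentence_score(tokens, window=3):
    """Average polarity of opinion words in a tokenized sentence; a negator
    within the preceding `window` tokens inverts a word's polarity."""
    scores = []
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            s = LEXICON[tok]
            if any(t in NEGATORS for t in tokens[max(0, i - window):i]):
                s = -s
            scores.append(s)
    return sum(scores) / len(scores) if scores else 0.0

def review_score(sentences):
    """A review's score is the average over its sentences."""
    return sum(sentence_score(s) for s in sentences) / len(sentences)
```

A review is then labelled positive or negative depending on the sign of `review_score`.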

Short Papers
Paper Nr: 1
Title:

Can e-Commerce Recommender Systems be More Popular with Online Shoppers if they are Mood-aware?

Authors:

Fanjuan Shi and Jean-Luc Marini

Abstract: This paper presents the results of a controlled experiment studying how mood state can affect the usage of an e-commerce recommender system. The authors developed a mood recognition tool to unobtrusively classify online shoppers into a stressed or relaxed mood state. By analyzing their reactions to recommended products while surfing an e-commerce website, the authors draw two conclusions. First, stress negatively impacts the usage of the recommender system. Second, relaxed users are more receptive to recommendations. These findings suggest that a mood recognition tool can help recommender systems find the "right time" to intervene, and that mood-aware recommender systems can enhance marketer-consumer interaction.

Paper Nr: 36
Title:

Customer Perception Driven Product Evolution - Facilitation of Structured Feedback Collection

Authors:

Oleksiy Khriyenko

Abstract: A competitive environment requires product producers and service providers not only to devise effective advertising strategies, but also to carry out comprehensive analysis of their customers in order to understand their needs and expectations. By successfully involving customers in a product/service co-creation process, companies are more likely to increase their future revenue. Customer feedback analysis is widely applied in marketing and product development. Among other challenges (e.g. customer engagement, feedback collection), automating customer feedback analysis is a very demanding task and requires advanced intelligent tools to understand customers' product perception and preferences. Since mining free-text feedback (still the most representative form of the real voice of the customer) is challenging, this work presents an approach towards customer-supported transformation of feedback into structured data. Further analysis and manipulation of semantically enhanced customer feedback and product/service descriptions makes it possible to automatically generate useful changes to existing products, or even a new product description that takes into account the actual needs and preferences of customers.

Paper Nr: 37
Title:

Content-based Title Extraction from Web Page

Authors:

Najlah Gali and Pasi Fränti

Abstract: Web pages are usually designed in a presentation-oriented fashion and therefore contain a large amount of non-informative data such as navigation banners, advertisements, and functional text. For a particular user, only informative data such as the title, main content, and representative images are considered useful. Existing methods for title extraction rely on the structural and visual features of the web page. In this paper, we propose a simpler but more effective method that analyses the content of the title and meta tags with respect to the main body of the page. We segment the title and meta tags using a set of predefined delimiters and score the segments using three criteria: placement in the tag, popularity within all header tags in the page, and position in the URL of the web page. The method is fully automated, template independent, and not limited to any particular type of web page. Experimental results show that the method significantly improves the accuracy (average similarity to the ground-truth title) from 62% to 84%.
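A simplified sketch of the segment-and-score idea, splitting the title tag on delimiters and scoring each segment against the header tags and the URL, might look like this; the delimiter set and the scoring weights are illustrative guesses, not the paper's actual values:

```python
import re

DELIMITERS = r"[|\-:/»·]"  # assumed delimiter set for illustration

def segments(title):
    """Split a <title> string into candidate segments."""
    return [s.strip() for s in re.split(DELIMITERS, title) if s.strip()]

def score_segments(title, headers, url):
    """Score each segment by (a) appearance in header tags and
    (b) appearance in the page URL, with made-up weights."""
    scores = {}
    header_text = " ".join(headers).lower()
    for seg in segments(title):
        s = 0.0
        if seg.lower() in header_text:
            s += 1.0  # segment repeated in an <h1>..<h6> tag
        if seg.lower().replace(" ", "-") in url.lower():
            s += 0.5  # segment echoed in the URL slug
        scores[seg] = s
    return scores
```

The highest-scoring segment would then be returned as the extracted title, discarding site-name boilerplate such as "Example Site".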

Paper Nr: 41
Title:

Twitter Topic Modeling for Breaking News Detection

Authors:

Henning M. Wold, Linn Vikre, Jon Atle Gulla, Özlem Özgöbek and Xiaomeng Su

Abstract: Social media platforms like Twitter have become increasingly popular for the dissemination and discussion of current events. Twitter makes it possible for people to share stories they find interesting with their followers and to write updates on what is happening around them. In this paper we attempt to use topic models of tweets in real time to identify breaking news. Two different methods, Latent Dirichlet Allocation (LDA) and Hierarchical Dirichlet Process (HDP), are tested, both with each tweet in the training corpus as a document by itself and with all the tweets of a unique user regarded as one document; this second approach emulates Author-Topic modeling (AT modeling). The evaluation relies on manual scoring of the accuracy of the modeling by volunteer participants. The experiments indicate that topic modeling on tweets in real time is not suitable for detecting breaking news by itself, but may be useful in analyzing and describing news tweets.

Paper Nr: 72
Title:

Towards a Temporal Extension to OWL 2: A Study based on tOWL Language

Authors:

Deborah Mendes Ferreira and Flavio de Barros Vidal

Abstract: The Semantic Web is becoming a fundamental part of the Web. The amount of data available on the Web is increasing rapidly, and it is very important to find ways to deal with this trend. The Semantic Web can help by giving meaning to this data in a way that lets machines understand what information it contains and process it automatically. In this paper we present a study of the compatibility between the latest version of the Web Ontology Language (OWL 2) and the Temporal Web Ontology Language (tOWL), which is based on the first version of OWL. We analyze which constructs of tOWL can be used and adapted to OWL 2 while maintaining decidability. The current version of OWL lacks the resources to represent complex temporal information, and the main contribution of this work is a new proposal for representing time in OWL 2.

Paper Nr: 74
Title:

Concept-based Semantic Search over Encrypted Cloud Data

Authors:

Fateh Boucenna, Omar Nouali and Samir Kechid

Abstract: Cloud computing is a technology that allows companies and individuals to outsource their data and their applications, in order to take advantage of the power of storage and processing offered by such technology. However, to preserve data privacy, it is crucial that all data be encrypted before being outsourced to the cloud. Moreover, authorized users should be able to recover their outsourced data, a process complicated by the fact that the data are encrypted; traditional information retrieval systems only work over data in the clear. Therefore, dedicated information retrieval systems have been developed to deal with encrypted cloud data. Several kinds of search over cloud data have been proposed in the literature, such as Boolean search, multi-keyword ranked search, and fuzzy search; semantic search, however, is little addressed. In this paper, we propose an approach called SSE-S that supports semantic search in the cloud by using a Wikipedia ontology to understand the meaning of documents and queries while maintaining security and privacy.

Paper Nr: 79
Title:

Enhancing Recommender Systems for TV by Face Recognition

Authors:

Toon De Pessemier, Damien Verlee and Luc Martens

Abstract: Recommender systems have proven their usefulness as a tool to cope with the information overload problem for many online services offering movies, books, or music. Recommender systems rely on identifying individual users and deducing their preferences from the feedback they provide on the content. To automate this user identification and feedback process for TV applications, we propose a solution based on face detection and recognition services. These services output useful information such as an estimation of the age, the gender, and the mood of the person. Demographic characteristics (age and gender) are used to classify the user and cope with the cold start problem. Detected smiles and emotions are used as an automatic feedback mechanism during content consumption. Accurate results are obtained in the case of a frontal view of the face. Head poses deviating from a frontal view and suboptimal illumination conditions may hinder face detection and recognition, especially if parts of the face, such as the eyes or mouth, are not sufficiently visible.

Paper Nr: 92
Title:

Challenges of Serendipity in Recommender Systems

Authors:

Denis Kotkov, Jari Veijalainen and Shuaiqiang Wang

Abstract: Most recommender systems suggest items similar to a user's profile, which results in boring recommendations limited to the preferences the user has already indicated in the system. To overcome this problem, recommender systems should suggest serendipitous items. This is a challenging task, as it is unclear what makes items serendipitous to a user and how to measure serendipity; the concept is difficult to investigate because serendipity includes an emotional dimension and serendipitous encounters are very rare. In this paper, we discuss these challenges and review definitions of serendipity and serendipity-oriented evaluation metrics. The goal of the paper is to guide and inspire future efforts on serendipity in recommender systems.
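One common formalization in this literature treats a recommendation as serendipitous when it is both relevant to the user and unexpected given an obvious baseline; a toy sketch of such a metric follows (the exact definition varies across papers, and this is only one of the variants the survey discusses):

```python
def serendipity(recommended, relevant, expected):
    """Fraction of recommendations that are both relevant (the user likes
    them) and unexpected (not produced by an obvious baseline recommender)."""
    if not recommended:
        return 0.0
    unexpected = [i for i in recommended if i not in expected]
    hits = [i for i in unexpected if i in relevant]
    return len(hits) / len(recommended)
```

The hard part the abstract points to is supplying `relevant` and `expected` in practice: relevance judgments are sparse, and the choice of baseline changes what counts as unexpected.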

Paper Nr: 114
Title:

ViewSameAs: A Novel Link in Instance Matching Process

Authors:

Wafa Ghemmaz and Fouzia Benchikha

Abstract: In recent years, the Web has evolved from a global information space of interlinked documents to a space where both documents and data are linked. To integrate and share data, instance matching has become a fundamental issue, especially with the rapid development of linked data. In this paper, we propose an instance matching approach based on two main processes: the former is based on property classification (IM_PC) and the latter on the ViewSameAs link (IM_VSA). To greatly accelerate the matching process, IM_PC first determines the matching candidates by comparing the discriminative property values. The result is then refined by comparing the description property values. In IM_PC, two links are established: the identity link SameAs and a novel proposed link, ViewSameAs, which keeps track of instances that share similar discriminative property values. Another problem in instance matching arises when instances have different descriptions even though their meanings are similar. This problem is addressed by the IM_VSA process, which tries to obtain more SameAs identity links by clustering the instances matched with ViewSameAs. The clustered instances are modeled as bags.
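The two-stage idea described in this abstract, selecting candidates on discriminative properties and refining on description properties, can be illustrated with a small sketch. The property names, the Jaccard similarity measure, and the threshold below are our illustrative assumptions, not the authors' implementation:

```python
def similarity(a, b):
    """Jaccard similarity over token sets; a stand-in for any string measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def match(inst1, inst2, discriminative, descriptive, threshold=0.8):
    """Return 'SameAs', 'ViewSameAs', or None for a pair of instances."""
    # Stage 1 (candidate selection): compare discriminative property values.
    if not all(similarity(inst1[p], inst2[p]) >= threshold
               for p in discriminative):
        return None  # not even a candidate pair
    # Stage 2 (refinement): compare description property values.
    if all(similarity(inst1.get(p, ""), inst2.get(p, "")) >= threshold
           for p in descriptive):
        return "SameAs"      # identity link
    return "ViewSameAs"      # shares discriminative values only

a = {"title": "the matrix", "year": "1999", "plot": "a hacker discovers reality"}
b = {"title": "the matrix", "year": "1999", "plot": "sci-fi action film"}
print(match(a, b, ["title", "year"], ["plot"]))  # ViewSameAs
```

Pairs linked with ViewSameAs would then be clustered in a second pass to recover additional SameAs identity links.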

Posters
Paper Nr: 23
Title:

Introducing Wild-card and Negation for Optimizing SPARQL Queries based on Rewriting RDF Graph and SPARQL Queries

Authors:

Faisal Alkhateeb

Abstract: In this paper, we extend SPARQL triple patterns to include two operators (negation and wild-card). We define the syntax and the semantics of these operators, in particular when they are used in the predicate position of SPARQL triple patterns. The use of the negation and wild-card operators, and thus their semantics, differs from the literature. We then show that these two operators can be used to enhance the evaluation performance of some SPARQL queries and to add extra expressiveness.

Paper Nr: 32
Title:

Analyzing Social Media Discourse - An Approach using Semi-supervised Learning

Authors:

Álvaro Figueira and Luciana Oliveira

Abstract: The ability to handle large amounts of unstructured information, to optimize strategic business opportunities, and to identify fundamental lessons among competitors through benchmarking are essential skills in every business sector. Currently, there are dozens of social media analytics applications aiming at providing organizations with informed decision making tools. However, these applications rely on providing quantitative information rather than qualitative information that is relevant and intelligible for managers. In order to address these aspects, we propose a semi-supervised learning procedure that discovers and compiles information taken from online social media, organizing it in a scheme that can be strategically relevant. We illustrate our procedure using a case study in which we collected and analysed the social media discourse of 43 organizations operating in the Higher Public Polytechnic Education Sector. During the analysis we created an “editorial model” that characterizes the posts in the area. We describe in detail the training and the execution of an ensemble of classifying algorithms. In this study we focus on the techniques used to increase the accuracy and stability of the classifiers.

Paper Nr: 44
Title:

A Framework for Enriching Job Vacancies and Job Descriptions Through Bidirectional Matching

Authors:

Sisay Adugna Chala, Fazel Ansari and Madjid Fathi

Abstract: There is a huge amount of online data about job descriptions, entered by job seekers and job holders, that can be utilized to give insight into the current state of jobs. Employers also produce a large volume of vacancy data online, which can be exploited to portray the current demand of the job market. When preparing job vacancies, taking into account the information contained in job descriptions, and vice versa, improves the likelihood of a bidirectional match between a job description and a vacancy. To improve the quality of job descriptions and job vacancies, a mediating system is required that connects and supports job designers and employers. In this paper, we propose a framework for an automatic bidirectional matching system that measures the degree of semantic similarity of job descriptions provided by a job seeker, job holder, or job designer against vacancies provided by an employer or job agent. The system provides suggestions to improve both job descriptions and vacancies using a combination of text mining methods.

Paper Nr: 107
Title:

Randomised Optimisation of Discrimination Networks Considering Node-sharing

Authors:

Fabian Ohler, Karl-Heinz Krempels and Christoph Terwelp

Abstract: Because of their ability to store, access, and process large amounts of data, Database Management Systems (DBMSs) and Rule-based Systems (RBSs) are used in many information systems as information processing units. A basic function of an RBS and a function of many DBMSs is to match conditions on the available data. To improve performance, intermediate results are stored in Discrimination Networks (DNs). The resulting memory consumption and runtime cost depend on the structure of the DN. A lot of research has been done in the area of optimising DNs. In this paper, we focus on re-using network parts considering multiple rule conditions and on exploiting the characteristics of equivalences. We present an approach that incorporates the potential of both concepts and balances their application in a randomised fashion. The algorithms developed were implemented for evaluation and yielded promising results. Shortcomings of this approach are discussed, and their removal constitutes our current work.
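The node-sharing concept this line of work builds on can be illustrated with a deliberately simplified sketch (the condition syntax and the rules below are invented for illustration, not the paper's algorithm): when two rules share a common prefix of conditions, the corresponding network nodes, and the intermediate results stored in them, are created once and reused.

```python
def build_network(rules):
    """Map each condition prefix of each rule to a single shared node id."""
    nodes = {}  # condition-prefix tuple -> node id
    for conditions in rules:
        for k in range(1, len(conditions) + 1):
            prefix = tuple(conditions[:k])
            if prefix not in nodes:
                nodes[prefix] = len(nodes)  # create the node only once
    return nodes

r1 = ["person(x)", "age(x) > 18", "city(x) = 'Bonn'"]
r2 = ["person(x)", "age(x) > 18", "employed(x)"]
net = build_network([r1, r2])
# The two rules share their first two condition nodes; only the last differs,
# so the network has 4 nodes instead of 6.
print(len(net))  # 4
```

A real DN optimiser must also decide node order and handle equivalent (not merely identical) condition prefixes, which is where the randomised balancing described in the abstract comes in.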

Paper Nr: 108
Title:

An Enhanced Block Notation for Discrimination Network Optimisation

Authors:

Christoph Terwelp, Karl-Heinz Krempels and Fabian Ohler

Abstract: Because of their ability to efficiently store, access, and process data, Database Management Systems (DBMSs) and Rule-based Systems (RBSs) are used in many information systems as information processing units. A basic function of an RBS and a function of many DBMSs is to match conditions on the available data. To improve performance, intermediate results are stored in Discrimination Networks (DNs). The resulting memory consumption and runtime cost depend on the structure of the DN. A lot of research has been done in the area of optimising DNs. In this paper, we focus on re-using network parts considering multiple rule conditions and on exploiting the characteristics of equivalences. Hence, we present an approach that incorporates the potential of both concepts as an enhancement to previous work.

Paper Nr: 118
Title:

AEMIX: Semantic Verification of Weather Forecasts on the Web

Authors:

Angel-Luis Garrido, María G. Buey, Gema Muñoz and José-Luis Casado-Rubio

Abstract: The main objectives of a meteorological service are the development, implementation and delivery of weather forecasts. Weather predictions are broadcast to society through different channels, e.g. newspapers, television, radio, etc. Today, access through the Web on personal computers and mobile devices stands out. The forecasts, which can be presented in numerical format, in charts, or in written natural language, have a certain margin of error. Providing automatic tools able to assess the precision of predictions makes it possible to improve these forecasts, quantify the degree of success depending on certain variables (geographic areas, weather conditions, time of year, etc.), and focus future work on areas for improvement that increase such accuracy. Despite technological advances, the task of verifying forecasts written in natural language is still performed manually in many cases, which is expensive, time-consuming, and subject to human error. On the other hand, weather forecasts usually follow several conventions in both structure and use of language, which, while not completely formal, can be exploited to increase the quality of the verification. In this paper, we describe a methodology to quantify the accuracy of natural language weather forecasts posted on the Web. Our approach obtains relevant information from weather forecasts by using ontologies to capture and take advantage of these structure and language conventions. It is implemented in a framework that makes it possible to address different types of predictions with minimal effort. Experimental results with real data are promising and, most importantly, allow direct use in a real meteorological service.

Area 22 - Mobile Information Systems

Full Papers
Paper Nr: 24
Title:

Extending Content-Boosted Collaborative Filtering for Context-aware, Mobile Event Recommendations

Authors:

Daniel Herzog and Wolfgang Wörndl

Abstract: Recommender systems support users in filtering large amounts of data to find interesting items such as restaurants, movies or events. Recommending events poses a bigger challenge than recommending items in many other domains. Events are often unique and have an expiration date. Ratings are usually not available before the event date and not relevant after the event has taken place. Content-boosted Collaborative Filtering (CBCF) is a hybrid recommendation technique which promises better recommendations than a pure content-based or collaborative filtering approach. In this paper, CBCF is adapted to event recommendations and extended with context-aware recommendations. For evaluation purposes, this algorithm is implemented in a working Android application we developed. The results of a two-week field study show that the algorithm delivers promising results. The recommendations are sufficiently diversified and users appreciate that the system is context-aware. However, the study revealed that further event attributes should be considered as context factors in order to increase the quality of the recommendations.
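As background, the generic CBCF idea the paper adapts can be sketched as follows: a content-based predictor densifies the sparse user-item rating matrix, and collaborative filtering then runs on the densified matrix. The toy data, the cosine similarity choice, and all names below are our illustrative assumptions, not the paper's event-specific, context-aware algorithm.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def content_predict(user_ratings, features):
    """Predict a rating for every item from the items the user has rated."""
    pred = {}
    for item in features:
        if item in user_ratings:
            pred[item] = user_ratings[item]
            continue
        # Weight known ratings by item-item content similarity.
        w = [(cosine(features[item], features[j]), r)
             for j, r in user_ratings.items()]
        total = sum(s for s, _ in w)
        pred[item] = sum(s * r for s, r in w) / total if total else 0.0
    return pred

def cbcf_predict(ratings, features, user, item):
    """User-based CF over the content-densified ('pseudo') rating matrix."""
    dense = {u: content_predict(r, features) for u, r in ratings.items()}
    items = sorted(features)
    uvec = [dense[user][i] for i in items]
    score, weight = 0.0, 0.0
    for other, drow in dense.items():
        if other == user:
            continue
        s = cosine(uvec, [drow[i] for i in items])
        score += s * drow[item]
        weight += s
    return score / weight if weight else dense[user][item]

features = {"e1": [1, 0], "e2": [1, 0], "e3": [0, 1]}   # event content vectors
ratings = {"alice": {"e1": 5}, "bob": {"e1": 4, "e3": 2}}
print(cbcf_predict(ratings, features, "alice", "e3"))  # 2.0
```

For events, such a content-based fallback is particularly relevant because, as the abstract notes, ratings are unavailable before an event takes place.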

Short Papers
Paper Nr: 82
Title:

Refining a Reference Architecture for Model-Driven Business Apps

Authors:

Jan Ernsting, Christoph Rieger, Fabian Wrede and Tim A. Majchrzak

Abstract: Despite much progress, cross-platform app development frameworks remain a topic of active research. While frameworks that yield native apps are particularly attractive, their spread is very limited. It is apparent that (theoretical) technological superiority needs to be accompanied by profound support for developers and adequate capabilities for maintaining the framework itself. We deem so-called reference architectures to be a major step towards building better cross-platform app development frameworks, particularly if they are based on techniques of model-driven software development (MDSD). In this paper, we describe a refinement of a reference architecture for business apps. We employ the model-driven cross-platform development framework MD2 for this purpose. Its general design has been described extensively in the literature. The framework has a sound foundation in MDSD, yet lacks generator support that fulfils the goals sketched above. After describing the required background, we argue in detail for a suitable reference architecture. While it will be a valuable addition to the MD2 framework, the discussion of our findings also makes a contribution to generative app development in general.

Paper Nr: 88
Title:

Context-aware Query Routing Protocol for P2P Mobile Systems

Authors:

Taoufik Yeferny

Abstract: In a Mobile Ad hoc Network (MANET), mobile peers can communicate without a fixed infrastructure. Each peer can communicate directly with its neighborhood. To communicate with peers outside the transmission range, messages are propagated across multiple hops in the network. Furthermore, devices in MANETs usually have limited resources such as battery power, CPU capacity, memory and bandwidth, so protocols and applications have to be optimized for these resource limitations. Internet-distributed applications like P2P file sharing are also deployed over MANETs (i.e., P2P mobile systems). These applications allow users to search for and share diverse resources over a MANET. Due to the nature of MANETs, P2P mobile systems have raised many new challenges, in particular with regard to the query routing protocol (i.e., the content discovery protocol). To tackle this problem, we introduce a novel context-aware integrated routing protocol for unstructured P2P mobile file sharing systems. Our protocol (i) locates the best peers that share pertinent resources for a user's query; and (ii) guarantees that those peers can be reached by considering MANET constraints. We implemented the proposed protocol and compared its routing efficiency and retrieval effectiveness with another protocol taken from the literature. Experimental results show that our scheme performs better than the other one with respect to accuracy.

Posters
Paper Nr: 83
Title:

Adaptive Augmented Reality in Mobile Applications for Helping People with Mild Intellectual Disability in Ecuador

Authors:

Maritzol Tenemaza, Angélica de Antonio, Jaime Ramírez, Armando Vela and Diego Rosero

Abstract: Adaptive Augmented Reality (A2R) is an emerging technology that can support users in their daily life with useful information for their activities, adapted to the user’s characteristics as well as to the environment and the context in which the activities take place. In Ecuador, mild intellectual disability is being considered as part of government policies. For this reason, we developed an app to help locate people with mild intellectual disability who may at times feel lost and not know how to return home. The app allows caregivers to know where their dependents are at all times. We adopted the A2R approach for developing this app, so we had to model both user needs and user interests. Apart from the User Model, we also present the other A2R models required.