WEBIST 2019 Abstracts


Area 1 - Internet Technology

Full Papers
Paper Nr: 12
Title:

The Web Computer and Its Operating System: A New Approach for Creating Web Applications

Authors:

Sergejs Kozlovičs

Abstract: Web applications require not only more sophisticated infrastructure than traditional single-PC applications, but also a different way of thinking, where network-specific aspects have to be considered. In this paper, we introduce the web computer concept, which factors out network-related issues and provides the illusion of a single computer with directly attached CPUs, memory, and I/O devices. By assuming the web computer and its open operating system (webAppOS) as a target platform for web applications, developers can preserve the same level of thinking as when developing classical desktop applications. With this approach, which corresponds to the physiology of the human brain, web applications can be created faster. In addition, the proposed web computer specification can be viewed as a standardized environment for web applications, analogous to the Java Virtual Machine.

Paper Nr: 23
Title:

The Vehicle Data Value Chain as a Lightweight Model to Describe Digital Vehicle Services

Authors:

Christian Kaiser, Andreas Festl, Gernot Pucher, Michael Fellmann and Alexander Stocker

Abstract: Digitalization has become an important driver of innovation in the automotive industry. For instance, the Quantified Self-movement has recently started spreading to the automotive domain, resulting in the provision of novel digital vehicle services for various stakeholders such as individual drivers and insurance companies. In this direction, a growing number of ICT start-ups from outside Europe have entered the market. Their digital vehicle services are grounded on the availability of vehicle Big Data. Hence, to better understand and capture this ongoing digital transformation, we introduce the Vehicle Data Value Chain (VDVC) as a lightweight model to describe and examine digital vehicle services. Furthermore, we classify current digital vehicle services offered by four start-ups and five car manufacturers by applying the VDVC, thereby identifying commonalities and differences within three crucial steps: data generation, acquisition, and usage. Additionally, we apply the VDVC to describe a digital mobility service provided by a European industry consortium. This exemplary application serves to evaluate the VDVC and show its general applicability in a practical context. We end our paper with a brief conclusion and an outlook on various current activities of standardization organizations, the European Commission and car manufacturers related to the future of vehicle services.

Paper Nr: 27
Title:

Agent-based Web Supported Simulation of Human-robot Collaboration

Authors:

André Antakli, Torsten Spieldenner, Dmitri Rubinstein, Daniel Spieldenner, Erik Herrmann, Janis Sprenger and Ingo Zinnikus

Abstract: In recent years, the production industry has increasingly deployed hybrid teams of workers and robots to make processes more flexible. The production environment of the future will include hybrid teams in which workers cooperate more tightly with robots and virtual agents. The virtual validation of such teams will require simulation environments in which various safety and productivity issues can be evaluated. In this paper, we present a framework for 3D simulation of hybrid teams in production scenarios based on an agent framework that can be used to evaluate critical properties of the planned production environment and the dynamic assignment of tasks to team members. The framework is embedded in a web-based distributed infrastructure that models and provides the involved components (digital human models, robots, visualization environment) as resources. We illustrate the approach with a use case in which a human-robot team works together in an aircraft manufacturing scenario.

Paper Nr: 32
Title:

Arabic Twitter User Profiling: Application to Cyber-security

Authors:

Rahma Basti, Salma Jamoussi, Anis Charfi and Abdelmajid Ben Hamadou

Abstract: In recent years, we have witnessed a rapid growth of social media networking and micro-blogging sites such as Twitter. On these sites, users provide a variety of data such as their personal details, interests, and opinions. However, the shared data is not always truthful. Social media users often hide behind a fake profile and may use it to spread rumors or threaten others. To address this, different methods and techniques have been proposed for user profiling. In this article, we use machine learning for user profiling in order to predict the age and gender behind a user profile and to assess whether it is a dangerous profile, using the users’ tweets and features. Our approach uses several stylistic features, including character-based, word-based, and syntax-based ones. Moreover, the topics of interest of a user are included in the profiling task. We obtained the best accuracy levels with SVM: 73.49% for age, 83.7% for gender, and 88.7% for dangerous profile detection.
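
The abstract's exact feature set is not published here, so the sketch below only illustrates what character-, word-, and syntax-based stylistic features typically look like; all feature names are invented for illustration. A vector like this would then be fed to an SVM classifier.

```python
import re

def stylistic_features(tweet: str) -> dict:
    """Illustrative character-, word-, and syntax-based features for
    stylometric user profiling (feature names are hypothetical)."""
    words = tweet.split()
    return {
        # character-based
        "n_chars": len(tweet),
        "n_digits": sum(c.isdigit() for c in tweet),
        "n_upper": sum(c.isupper() for c in tweet),
        # word-based
        "n_words": len(words),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        # syntax-based (rough proxies)
        "n_punct": len(re.findall(r"[!?.,;:]", tweet)),
        "n_hashtags": len(re.findall(r"#\w+", tweet)),
        "n_mentions": len(re.findall(r"@\w+", tweet)),
    }

feats = stylistic_features("Great game today!!! #football @ref")
```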

Paper Nr: 34
Title:

HoT: Unleash Web Views with Higher-order Templates

Authors:

Fernando M. Carvalho and Luis Duarte

Abstract: Over the past decades, template views have been the most widely used approach to building dynamic HTML pages. Simply put, a template engine (such as JSP, Handlebars, Thymeleaf, and others) generates HTML by merging templates with given data models. Yet, this process may become impractical for large data sets, since HTML resolution is postponed until all data become available to the engine. This behavior results in a poor user experience, preventing the browser from rendering the end user interface. In this paper we introduce the concept of higher-order templates (HoT), provided in the Java implementation of HtmlFlow, which allows HTML to be resolved on demand as data becomes available. This lets the browser render the user interface incrementally, in line with the availability of the data. Finally, we also show some advantages of HtmlFlow over state-of-the-art front-end frameworks such as ReactJS.

Paper Nr: 36
Title:

Of the Utmost Importance: Resource Prioritization in HTTP/3 over QUIC

Authors:

Robin Marx, Tom De Decker, Peter Quax and Wim Lamotte

Abstract: Not even five years after the standardization of HTTP/2, work is already well underway on HTTP/3. This latest version is necessary to make optimal use of that other new and exciting protocol: QUIC. However, some of QUIC’s unique characteristics make it challenging to keep HTTP/3’s functionalities on par with those of HTTP/2. In particular, the efforts to adapt the prioritization system, which governs how multiple resources can be multiplexed on a single QUIC connection, have led to some difficult-to-answer questions. This paper aims to help answer some of those questions by being the first to provide experimental evaluations and result comparisons for 11 different possible HTTP/3 prioritization approaches in a variety of simulation settings. We present some non-trivial insights, discuss advantages and disadvantages of various approaches, and provide results-backed actionable advice to the standardization working group. We also help foster further experimentation by contributing our complete HTTP/3 implementation, results dataset and custom visualizations to the community.

Short Papers
Paper Nr: 11
Title:

Automatic Extraction of Legal Citations using Natural Language Processing

Authors:

Akshita Gheewala, Chris Turner and Jean-Rémi de Maistre

Abstract: The accessibility of legal documents to the different actors of the judicial system needs to be ensured for the implementation of a strong international rule of law. The gap of such accessibility is being addressed by the Jus Mundi multilingual search-engine for International Law. The data updated on this platform is qualified by skilled lawyers. However, the interconnection of references within such documents, is a key feature for lawyers since, a major part of the legal research is analysing such citations to support their arguments. The process of interconnecting such references can prove to be expensive as well as time-consuming, if completed manually. Hence, the purpose of this research is to automatically extract such legal citations within international law, using Natural Language Processing (NLP), enabling the interconnectivity of documents on Jus Mundi. This study also discusses and addresses research gaps within this subject, especially in the domain specific to International Law. The method followed to achieve the automation is building an adaptable model through Regular-Expression based annotation language named JAPE (Java Annotation Patterns Engine). This set of automatically extracted links are then to be integrated with the search engine, having direct implication in the enablement of smoother navigation, making the law more accessible. This research also contributes to the state of the art bringing closer the eventual use of NLP in applications used to interact with International Law documents.
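
JAPE grammars run inside the Java-based GATE framework and are far richer than a single regular expression; the Python sketch below only illustrates the pattern-based idea behind citation extraction. The citation pattern and sample text are invented for illustration, not taken from the paper.

```python
import re

# Hypothetical pattern for citations like "Article 25(1) of the ICSID Convention".
CITATION = re.compile(
    r"\bArticle\s+\d+(?:\(\d+\))?\s+of\s+the\s+[A-Z][\w\s]*?"
    r"(?:Convention|Treaty|Agreement)"
)

def extract_citations(text: str) -> list:
    """Return every substring matching the citation pattern."""
    return CITATION.findall(text)

sample = ("The tribunal relied on Article 25(1) of the ICSID Convention and "
          "on Article 31 of the Vienna Convention when interpreting consent.")
cites = extract_citations(sample)
```

A production system would layer many such patterns over tokenizer and gazetteer annotations, which is exactly what JAPE's phases provide.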

Paper Nr: 15
Title:

Full Stack Web Development Teaching: Current Status and a New Proposal

Authors:

Anna Petrikoglou and Theodore H. Kaskalis

Abstract: The main purpose of this effort is to present a brand-new environment for practising some of the most broadly used – both client- and server-side – web technologies. It is a web-based, freely accessible educational platform, which provides a user-friendly interface, illustrative graphics and supporting material as well. Full stack development platforms are rarely found online, as most of them are oriented towards either front-end or back-end development and focus on specific programming languages without offering an overview of actual, integrated projects. This research also involves evaluating existing teaching methods, scanning and comparing some of the most popular educational web platforms and, furthermore, discovering simple techniques and efficient approaches for reaching valuable programming resources for both students and self-learners. The paper places particular emphasis on the recognition of the applications’ key features and the variety of programming tools that promote learning and skill enhancement. Moreover, it discusses the roles of tutors and learners, while suggesting a learning path for novice developers. Given that computer science courses often require considerable practice, this study aims at encouraging active, self-motivated and self-paced learning.

Paper Nr: 22
Title:

Microblog Sentiment Prediction based on User Past Content

Authors:

Yassin Belhareth and Chiraz Latiri

Abstract: Analyzing massive, noisy and short microblogs is a very challenging task, where traditional sentiment analysis and classification methods are not easily applicable due to the inherent characteristics of social media content. Sentiment analysis, also known as opinion mining, is a mechanism for understanding the natural disposition that people have towards a specific topic. It is therefore important to consider the user context, which usually indicates that microblogs posted by the same person tend to have the same sentiment label. One of the main research issues is how to predict Twitter sentiment regarding a topic on social media. In this paper, we propose a sentiment mining approach based on sentiment analysis and supervised machine learning principles applied to tweets extracted from Twitter. The originality of the suggested approach is that classification does not rely on the tweet text itself to detect polarity, but on the user’s past text content. Experimental validation is conducted on a tweet corpus taken from SemEval 2016 data. These tweets cover several topics and are annotated in advance with sentiment polarity. We collected the past tweets of each author in the collection. As an initial experiment in predicting a user’s sentiment on a topic based on their past content, the results obtained seem acceptable and could be improved in future work.

Paper Nr: 26
Title:

Efficient Shortest Path Routing Algorithms for Distributed XML Processing

Authors:

Ye Longjian, Hiroshi Koide, Dirceu Cavendish and Kouichi Sakurai

Abstract: This paper analyses the problem of efficiently routing XML documents on a network whose nodes are capable of distributed XML processing. The goal of our study is to find network paths along which transmitting XML documents results in a high likelihood that a large portion of each document is processed within the network, decreasing the amount of XML processing left at the document's arrival at the destination site. We propose several routing algorithms for single-route and multipath routing and evaluate them in a distributed XML network simulation environment. We show the benefits of the proposed XML routing algorithms compared with the widespread minimum-hop routing strategy of the Internet.
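
The idea of pulling shortest paths through processing-capable nodes can be sketched as a Dijkstra variant in which hops into such nodes are discounted. The cost model, discount factor, and toy topology below are invented for illustration; they are not the paper's actual algorithms.

```python
import heapq

def best_route(graph, capable, src, dst, discount=0.5):
    """Dijkstra variant: edges into XML-processing-capable nodes cost less,
    so cheapest paths favor nodes that can share the processing work."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w * (discount if v in capable else 1.0)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    # reconstruct the path by walking predecessors back from dst
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path)), dist[dst]

graph = {
    "A": [("B", 1.0), ("C", 1.0)],
    "B": [("D", 1.0)],
    "C": [("D", 1.0)],
    "D": [],
}
# With C and D capable of XML processing, the route via C wins over B.
path, cost = best_route(graph, capable={"C", "D"}, src="A", dst="D")
```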

Paper Nr: 42
Title:

A Blockchain-based Application to Protect Minor Artworks

Authors:

Clara Bacciu, Angelica Lo Duca and Andrea Marchetti

Abstract: A new emerging trend concerns the implementation of services and distributed applications through blockchain technology. A blockchain is an append-only database, which guarantees security, transparency and immutability of records. Blockchains can be used in the field of Cultural Heritage to protect minor artworks, i.e. artistically relevant works not as famous as masterpieces. Minor artworks are subject to counterfeiting, theft and natural disasters because they are not as well protected as famous artworks. This paper describes a blockchain-based application, called MApp (Minor Artworks application), which lets authenticated users (private people or organizations) store information about their artworks in a secure way. The use of blockchain produces three main advantages. Firstly, artworks cannot be deleted from the register, thus preventing thieves from removing records associated with stolen objects. Secondly, artworks can be added and updated only by authorized users, thus preventing counterfeiting of object descriptions. Finally, records can be used to preserve the memory of artworks in case of destruction caused by a natural disaster.
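
The append-only property the abstract relies on can be shown with a toy hash chain: each record embeds the hash of its predecessor, so deleting or altering an entry breaks every later link. This is a sketch of the property a real blockchain gives MApp, not of MApp itself; all class and field names are invented.

```python
import hashlib
import json

def _hash(block: dict) -> str:
    # Canonical JSON so the same block always hashes the same way
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

class ArtworkRegister:
    """Toy append-only register of artwork records (illustrative only)."""
    def __init__(self):
        self.chain = [{"prev": "0" * 64, "data": "genesis"}]

    def add(self, artwork: dict):
        self.chain.append({"prev": _hash(self.chain[-1]), "data": artwork})

    def verify(self) -> bool:
        # Every record must point at the hash of the record before it
        return all(self.chain[i]["prev"] == _hash(self.chain[i - 1])
                   for i in range(1, len(self.chain)))

reg = ArtworkRegister()
reg.add({"title": "Madonna col Bambino", "owner": "parish inventory 12"})
reg.add({"title": "San Rocco altarpiece", "owner": "parish inventory 12"})
ok_before = reg.verify()                    # chain is consistent
reg.chain[1]["data"]["owner"] = "thief"     # tampering with an old record...
ok_after = reg.verify()                     # ...breaks the chain
```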

Paper Nr: 53
Title:

Discovering Emotions through the Building of a Linguistic Resource

Authors:

Manuela Angioni and Franco Tuveri

Abstract: Specific linguistic resources related to the affective sphere, syntactically annotated and distinctive for each language, are important for discovering terms or phrases associated with emotions in order to detect expressed emotions. The paper proposes the initial version of a linguistic resource for the Italian language, mapped onto WordNet, where each concept whose meaning falls into the sphere of emotions is enriched with a category, which better specifies the type of emotion expressed by the term, and with a polarity value indicating whether the emotion is positive or negative. The resource is based on the model of emotions proposed by Robert Plutchik and has been developed, within a national Work-School Alternation project, in collaboration with high school students. The work has a twofold value: on the one hand, the development of a linguistic resource; on the other, the educational and didactic aspect of the students’ involvement. Working on the analysis of literary texts, with the task of elaborating and defining the emotions described, the students, assisted by their teachers and two researchers, had to confront their feelings and talk more freely about their affective states, recognizing the emotions and giving them a name.

Paper Nr: 10
Title:

Improving the Latency of Python-based Web Applications

Authors:

António Esteves and João Fernandes

Abstract: This paper describes the process of optimizing the latency of Python-based Web applications. The case study used to validate the optimizations is an article sharing system, which was developed in Django. Memcached, Celery and Varnish enabled the implementation of additional performance optimizations. The latency of operations was measured, before and after the application of the optimization techniques. The optimization of the application was performed at various levels, including the transfer of content across the network and the back-end services. HTTP caching, data compression and minification techniques, as well as static content replication using Content Delivery Networks, were used. Partial update of the application’s pages on the front-end and asynchronous processing techniques were applied. The database utilization was optimized by creating indexes and by taking advantage of a NoSQL solution. Memory caching strategies, with distinct granularities, were implemented to store templates and application objects. Furthermore, asynchronous task queues were used to perform some costly operations. All of the aforementioned techniques favorably contributed to the Web application’s latency decrease. Since Django operates on the back-end, and optimizations must be implemented at various levels, it was necessary to use other tools.
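
One of the memory-caching strategies mentioned above can be sketched as a small in-process cache with expiry. This is an illustrative pure-Python stand-in; the paper itself relies on Memcached and related tools rather than code like this, and all names below are invented.

```python
import functools
import time

def ttl_cache(seconds: float):
    """Decorator caching results in memory for a limited time,
    sparing repeated slow database/template work (toy sketch)."""
    def decorator(fn):
        store = {}
        @functools.wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and now - hit[1] < seconds:
                return hit[0]          # fresh cached value: skip the work
            value = fn(*args)
            store[args] = (value, now)
            return value
        return wrapper
    return decorator

calls = {"n": 0}

@ttl_cache(seconds=60)
def render_article(article_id):
    calls["n"] += 1                    # stands in for a slow DB query + render
    return f"<article id={article_id}>"

render_article(7)
render_article(7)                      # second call is served from the cache
```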

Paper Nr: 18
Title:

A Methodology for Experimental Evaluation of a Software Assistant for the Development of Safe and Economically Viable Software

Authors:

Alina Khayretdinova and Michael Kubach

Abstract: Software developers of IT security solutions very often focus on the privacy and security of the product while neglecting other important aspects of development, such as socio-economics and usability, that are crucial for the product's success on the modern market. To address this problem, the CUES project developed a software assistant that takes an interdisciplinary approach. The assistant guides developers of IT security solutions through the entire software development process by helping to identify problems and suggesting effective solutions from the fields of (a) usability and user experience, (b) socio-economics, (c) IT security, and other disciplines. In this paper, we propose a method to evaluate the assistant under conditions that are close to reality: the assessment of the software assistant is carried out through two case studies, in each of which two student teams have the task of developing security-related software that will also be attractive to users and the market. One of the student teams in each case study was supported by the assistant, whereas the other was not. The teams supported by the assistant performed better.

Paper Nr: 30
Title:

Evaluating the RESTfulness of “APIs from the Rough”

Authors:

Arne Koschel, Irina Astrova, Maximilian Blankschyn, Dominik Schöner and Kevin Schulze

Abstract: Nowadays, REST is the dominant architectural style of choice, at least for newly created web services. So-called RESTfulness has thus become a catchword for web applications that aim to expose parts of their functionality as RESTful web services. But are those web services indeed RESTful? This paper examines the RESTfulness of ten popular RESTful APIs (including Twitter and PayPal). For this examination, the paper defines REST, its characteristics, and its pros and cons. Furthermore, Richardson's Maturity Model is presented and utilized to analyse the selected APIs regarding their RESTfulness. As an example, a simple RESTful web service is provided as well.

Paper Nr: 60
Title:

CMS-oriented Modeling Languages: An Attempt to Assist Model-driven Development in CMS Domain

Authors:

Vassiliki Gkantouna and Giannis Tzimas

Abstract: Nowadays, Content Management Systems (CMSs) are widely used as the underlying development platform for building complex Web applications. However, despite their widespread use, existing MDWE methodologies have focused mainly on traditional Web applications, and thus they cannot support the model-driven development of CMS-based Web applications. Given that MDWE methodologies are driven by the expressiveness of the modeling languages used within their context, the failure of existing MDWE methodologies to support the automated development of CMS-based Web applications is probably caused by the absence of modeling languages able to capture the particular development context of CMS platforms. To address this problem, we propose a new genre of modeling languages, called CMS-oriented modeling languages, which are defined over the specific development context of CMS platforms. We provide a general framework to support their definition in three main stages: the analysis of the target CMS platform, the creation of its domain model, and the formal definition of the CMS-oriented modeling language. In this way, the proposed framework supports the definition of CMS-oriented modeling languages, which can lay the foundation for MDWE methodologies for CMS-based Web applications, thus enabling model-driven development in the CMS domain.

Area 2 - Mobile and NLP Information Systems

Full Papers
Paper Nr: 5
Title:

Examining the Privacy Vulnerability Level of Android Applications

Authors:

Georgia M. Kapitsaki and Modestos Ioannou

Abstract: Mobile applications are often granted access to various data available on the mobile device. Android applications provide the notion of permissions to let developers define the data their applications require to function properly. However, through accessing these data, applications may gain direct or indirect access to sensitive user data. In this paper, we address the detection of privacy vulnerabilities in Android mobile applications via an analysis that is based mainly on the use of Android permissions. Different aspects of the application are analyzed in order to draw conclusions, offering an aggregated view of permission analysis in the form of a penalty score, a feature missing in previous approaches that analyze permission use in Android. Our work is supported by a web application prototype, App Privacy Analyzer, that allows users to upload an application and view the respective analysis results, comparing them with other applications uploaded in previous uses of the system. This approach can be useful for security and privacy analysts and developers who wish to examine the privacy vulnerability level of their Android applications, but also for end users with technical expertise. We have used the tool to analyze 800 Android applications and discuss the results and the observed permission use.

Paper Nr: 25
Title:

A New Data Structure for Processing Natural Language Database Queries

Authors:

Richard A. Frost and Shane Peelar

Abstract: Natural Language Query Interfaces (NLQIs) have once again captured the public imagination, but developing them has proven to be non-trivial. One way is by using a Compositional Semantics (CS) to compute the answer to a query directly from the meanings of its parts. The query is treated as an expression of a formal language and interpreted directly with respect to a database, which provides meanings for words, the basic components. The meanings of compound phrases in the query, and the answer to the query itself, are computed from their constituent words and phrases using semantic rules that are applied according to the query’s syntactic structure. Montague Semantics (MS), which is a type of CS, has been used in various NLQIs previously. MS accommodates common and proper nouns, adjectives, conjunction and disjunction, intransitive and binary transitive verbs, quantifiers, and intensional and modal constructs. MS does not provide an explicit denotation for n-ary transitive verbs, nor an explanation of how to handle prepositional phrases. By adding events to MS and by introducing a new data structure, transitive verbs and prepositional phrases can be accommodated, as well as other NL features that are often considered to be non-compositional.
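
The event-based idea can be illustrated compositionally: if a verb denotes a set of events (each a record of thematic roles), then a prepositional phrase is simply another filter over that set, and n-ary verbs need no special case. This toy sketch is not the authors' data structure; the mini "database", role names, and query are all invented.

```python
# A verb denotes a set of events; each event maps roles to fillers.
events = [
    {"verb": "discover", "agent": "hall", "theme": "phobos", "with": "telescope"},
    {"verb": "discover", "agent": "galileo", "theme": "io", "with": "telescope"},
    {"verb": "discover", "agent": "kuiper", "theme": "miranda", "with": "camera"},
]

def verb(name):
    """Denotation of a verb: the events it describes."""
    return [e for e in events if e["verb"] == name]

def prep(role, value):
    """A PP like 'with a telescope' composes by filtering the event set."""
    return lambda evs: [e for e in evs if e.get(role) == value]

def who(evs):
    """Collect the agents of a set of events."""
    return sorted({e["agent"] for e in evs})

# "Who discovered something with a telescope?"
answer = who(prep("with", "telescope")(verb("discover")))
```

Because every PP is just a function from event sets to event sets, any number of them can be stacked, which is the compositional behavior the paper attributes to adding events.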

Short Papers
Paper Nr: 63
Title:

Interpreting the Results from the User Experience Questionnaire (UEQ) using Importance-Performance Analysis (IPA)

Authors:

Andreas Hinderks, Anna-Lena Meiners, Francisco D. Mayo and Jörg Thomaschewski

Abstract: The User Experience Questionnaire (UEQ) is a common and validated method to measure the User Experience (UX) of a product or service. In recent years, such questionnaires have established themselves as a way to measure various aspects of UX. In addition to the questionnaire, an evaluation tool is usually offered so that the results of a study can be interpreted in the light of the questionnaire. As a rule, the evaluation consists of preparing the data and comparing it with a benchmark. Often this interpretation of the data is not sufficient, as it only evaluates the current user experience. However, it is desirable to determine exactly where there is a need for action. In our article we present an approach that evaluates the results from the UEQ using importance-performance analysis (IPA). The aim is to create another possibility to interpret the results of the UEQ and to derive recommendations for action from them. In a first study with 219 participants, we validated the presented approach with YouTube and WhatsApp. The results show that the IPA provides additional insights from which further recommendations for action can be derived.

Paper Nr: 74
Title:

A Framework for Context-dependent User Interface Adaptation

Authors:

Stephan Kölker, Felix Schwinger and Karl-Heinz Krempels

Abstract: Mobile information systems are operated in a large variety of different contexts – especially during intermodal journeys. Every context has a distinct set of properties, so the suitability of user interfaces differs in various contexts. But currently, the representation of information on user interfaces is hard-coded. Therefore, we propose a dynamic adaptation of user interfaces to the context of use to increase the value of an information system to the user. The proposed system focuses on travel information systems but is designed in a way that it is generalizable to other application domains. The adaptation system works as an independent service that acts as a broker in the communication between an application and the user. This service transforms messages between a user- and a system-oriented representation. The context of use, the device configuration, and user preferences affect the calculated user-oriented representation.

Paper Nr: 41
Title:

Consumers’ Cognitive, Emotional and Behavioral Responses towards Background Music: An EEG Study

Authors:

Athanasios Gkaintatzis, Rob van der Lubbe, Kalipso Karantinou and Efthymios Constantinides

Abstract: The physical environment affects individuals emotionally, behaviorally and cognitively. Servicescapes or atmospherics research studies the effect of environmental stimulation on consumers. Environmental stimuli affect consumers’ attention and evoke emotions such as pleasure and arousal; these emotional responses in turn affect consumers’ behavioral responses, such as approach and avoidance tendencies towards the environment. Hence, the level of arousal-nonarousal and pleasure-displeasure experienced by a consumer, along with other intervening variables such as momentary mood and stimulus screening ability, will determine his/her approach-avoidance responses towards environmental stimuli. The paper studies atmospherics with neuromarketing and conventional marketing research methods: specifically, using electroencephalography (EEG) and surveys, it focuses on the effect of background music on consumers’ arousal, pleasure, attention and approach/avoidance tendencies. The results are expected to have significant academic relevance for both the servicescapes/atmospherics and the neuromarketing/consumer neuroscience research streams, as well as managerial implications.

Area 3 - Service Based Information Systems, Platforms and Eco-Systems

Full Papers
Paper Nr: 28
Title:

Social Media Advertising: The Role of the Social Media Platform and the Advertised Brand in Attitude Formation and Purchase Intention

Authors:

Maria Madlberger and Lisa Kraemmer

Abstract: Social media provide not only platforms for individual users for personal communication, but also serve as media for the dissemination of promotional messages. Recently, the role of social media platforms as earned media has been in focus, but paid advertising still plays a significant role in this respect. This study seeks to deepen the understanding of consumer behavior in this context by addressing consumer attitude towards social media advertising on two levels, i.e., general attitude towards advertising on a social media platform and specific attitude towards an individual advertisement on the social media platform. The study proposes and empirically tests a research model that shows a significant impact of platform enjoyment on general attitude towards advertising as well as brand familiarity as an antecedent of specific attitude towards an advertisement on the platform. Further, a significant impact of both levels of attitude on purchase intention could be shown. The findings stress the importance of a clear conceptualization of consumer attitude on different levels and highlight the relevance of paid advertising in social media.

Paper Nr: 69
Title:

The Missing Link between Requirements Engineering and Architecture Design for Information Systems

Authors:

Karl-Heinz Krempels, Fabian Ohler and Christoph Terwelp

Abstract: Methodologies for the design of software architectures based on identified system requirements differ in procedure, phases, and artifacts. Furthermore, the consideration of functional and quality requirements during the design process leads to diverse software architecture models, varying in their adaptivity to new or changed user or quality requirements. The paper discusses a novel methodology for the design of software architectures for information systems based on user and quality requirements. It differs from methodologies discussed in the literature in its comprehensible and traceable deduction of a domain model for the software system architecture from user requirements. The methodology was evaluated and adapted iteratively in many R&D projects.

Short Papers
Paper Nr: 13
Title:

Future CMS for e-Business: Will Microservices and Containerization Change the Game?

Authors:

Carina Landerer and Philipp Brune

Abstract: Content Management Systems (CMS) are widely adopted commodity applications today. Well-established implementations, both commercial and open source, e.g. WordPress or Drupal, have existed for many years. In consequence, research on CMS has strongly declined since its peak more than a decade ago. However, in recent years new trends for building and running large-scale web applications have emerged, such as Node.js, microservice architectures and containerization, while most established CMS still use a traditional monolithic architecture. This raises the question of how these emerging technologies will challenge established CMS implementations. Therefore, in this paper a microservice architecture for an exemplary e-business CMS is proposed, and a proof-of-concept implementation is described. The approach is evaluated with respect to its feasibility using a qualitative empirical study. Results indicate that the approach is well suited for building more state-of-the-art CMS in the near future, which are likely to challenge the position of traditional monolithic implementations.

Paper Nr: 72
Title:

An Application of OSSpal for the Assessment of Open Source Project Management Tools

Authors:

Hugo Carvalho de Paula and Jorge Bernardino

Abstract: Projects are a necessity within any competitive business, and as the execution of complex projects becomes the norm, so grows the need for advances in project management. The use of project management tools is key towards taming said complexity. There are many such tools available; the current challenge resides in picking the right one. In this paper, we evaluate three different tools - OpenProject, dotProject, and Odoo - using the OSSpal methodology.

Paper Nr: 75
Title:

Self-sovereign Management of Privacy Consensus using Blockchain

Authors:

Francesco Buccafurri and Vincenzo De Angelis

Abstract: In this paper, we propose a solution implementing a self-sovereign approach to managing privacy consensus in an open domain. The idea is to allow the user to set her policies in a single place, in such a way that she keeps full control over her personal data. The goal is achieved by combining blockchain with Attribute-Based Encryption and Proxy Re-encryption. Blockchain is also used to notarize critical actions in order to obtain accountability and non-repudiation.

Paper Nr: 37
Title:

Evaluating RuleCore as Event Processing Network Model

Authors:

Irina Astrova, Arne Koschel, Sebastian Kobert, Jan Naumann, Tobias Ruhe and Oleg Starodubtsev

Abstract: Our work is motivated primarily by the lack of standardization in the area of Event Processing Network (EPN) models. We identify general requirements for such models. These requirements encompass the possibility to describe events in the real world, to establish temporal and causal relationships among the events, to aggregate the events, to organize the events into a hierarchy, to categorize the events into simple or complex, to create an EPN model in an easy and simple way and to use that model ad hoc. As the major contribution, this paper applies the identified requirements to the RuleCore model.

Paper Nr: 39
Title:

Do We Really Need Another Blockchain Framework? A Case for a Legacy-friendly Distributed Ledger Implementation based on Java EE Web Technologies

Authors:

Philipp Brune

Abstract: Cryptocurrencies, blockchain technology and smart contracts could fundamentally change the way financial products and financial services are implemented and operated. While many frameworks for implementing such blockchain applications already exist, these are usually written in languages either considered “fancy” today, such as Go, or traditionally used for system software, such as C++. On the other hand, core business applications, e.g. in financial services, are typically implemented on enterprise platforms such as Java Enterprise Edition (EE) and/or in COBOL. Therefore, to improve the integration of blockchain technology in such applications, in this paper we argue in favor of a legacy-friendly distributed ledger solution by introducing QWICSchain, an implementation built on web services using established open-source enterprise technologies such as Java EE and PostgreSQL. It supports the parallel execution of transactions on the blockchain and in existing legacy applications, thus enabling the blockchain-based modernization of existing IT infrastructures.

Paper Nr: 50
Title:

Evaluating Open Source Project Management Tools using OSSPal Methodology

Authors:

Antonio Oliveira and Jorge Bernardino

Abstract: One of the major differences between a successful project and a failed one is project management ability. Project management leads to better alignment of projects with the business strategy, so that companies can reduce their costs, accelerate product development, and focus on meeting their customers’ needs. To help with that, project management tools are highly recommended, since they ease planning, scheduling, resource allocation, communication and documentation tasks. In this paper, we assess three popular open source project management tools, OpenProject, Orangescrum and ProjectLibre, with the help of the OSSPal methodology. This study can help project managers and programmers choose an adequate, current, high-quality and affordable tool for their projects.

Paper Nr: 62
Title:

Evaluating GitLab, OpenProject, and Redmine using QSOS Methodology

Authors:

André Vicente and Jorge Bernardino

Abstract: Measuring and planning all aspects and variables of a project is extremely important for its success. Therefore, having the right tool is essential. To evaluate project management tools, we can use several methodologies that allow choosing the best tools according to our criteria. QSOS is one of the methodologies that allows a more weighted choice. In this paper, we evaluate the popular open source project management tools GitLab, OpenProject, and Redmine using the QSOS methodology.

Area 4 - Web Interfaces

Full Papers
Paper Nr: 16
Title:

SmartMobility, an Application for Multiple Integrated Transportation Services in a Smart City

Authors:

Cristian Lai, Francesco Boi, Alberto Buschettu and Renato Caboni

Abstract: In this paper, we present SmartMobility, an application for multimobility information services in a smart city that exploits an ecosystem of IoT devices. The application, designed for a real case study, is extremely heterogeneous in terms of IoT devices and implements a wide range of services for citizens. It aims at reducing the traffic generated by private vehicles in the city and at helping drivers heading towards high-traffic areas by presenting real-time mobility data from different sources. The experiments carried out in this study evaluated the behaviour of the application under different configurations, allowing us to understand how the experience varies with a large number of devices and services, particularly in terms of the mobility alternatives offered to the final user. The research findings showed that the bus transportation service is the most common one, while carsharing and bikesharing are not widespread and must be improved.

Paper Nr: 54
Title:

Automatic Reuse of Prototypes in Software Engineering: A Survey of Available Tools

Authors:

A. Sánchez-Villarín, A. Santos-Montaño and J. G. Enríquez

Abstract: The use of prototypes as an excellent mechanism for communication between software users has been fully accepted in the literature. Both the academic and business worlds agree on their use as a software technique, primarily to capture requirements and as a means of communication with the user. However, prototypes are often developed very quickly or without the collaboration of the users. This is one of the main reasons why the real power of prototypes is not exploited in software development. The initial hypothesis of this research is that this problem occurs because prototypes are regarded as disposable milestones. This paper analyzes whether there are adequate tools on the market to help development teams reuse prototypes as a mechanism for starting and expediting new software development projects. The present study surveys different prototyping tools and, although this is a preliminary evaluation, shows the shortcomings to be addressed in the future development and research of solutions.

Short Papers
Paper Nr: 40
Title:

The Problem of Finding the Best Strategy for Progress Computation in Adaptive Web Surveys

Authors:

Thomas M. Prinz, Jan Plötner and Anja Vetterlein

Abstract: Web surveys are typical web information systems. As part of the interface of a survey, progress indicators inform the participants about their state of completion. Different studies suggest that the progress indicator in web surveys has an impact on the dropout and answer behaviour of the participants. Therefore, researchers should invest some time in finding the right indicator for their surveys. However, the calculation of the progress is sometimes more difficult than expected, especially in surveys with many branches. Current related work explains the progress computation in such cases based on different prediction strategies. However, the performance of these strategies varies across surveys. In this position paper, we explain how to compare those strategies. The chosen Root Mean Square Error measure allows selecting the best strategy. But experiments with two large real-world surveys show that there is no single best strategy for all of them. It depends highly on the structure of the surveys, and sometimes even the best known strategy produces bad predictions. Dedicated research should find solutions for these cases.

Paper Nr: 57
Title:

Visualizing Learners’ Navigation Behaviour using 360 Degrees Interactive Videos

Authors:

Alisa Lincke, David L. Prieto, Romain C. Herault, Elin-Sofie Forsgärde and Marcelo Milrad

Abstract: The use of 360-degrees interactive videos for educational purposes in the medical field has increased in recent years, as has the use of virtual reality in general. Learners’ navigation behaviour in 360-degrees interactive video learning environments has not yet been thoroughly explored. In this paper, a dataset of interactions generated by 80 students working in 16 groups while learning about patient trauma treatment using 360-degrees interactive videos is used to visualize learners’ navigation behaviour. Three visualization approaches for exploring users’ navigation paths and patterns of interaction with the learning materials were designed and implemented, and are presented and discussed. The visualization tool was developed to explore the issues above, and it provides a comprehensive overview of the navigation paths and patterns. A user study with four experts in the information visualization field has revealed the advantages and drawbacks of our solution. The paper concludes by providing some suggestions for improving the proposed visualizations.

Paper Nr: 29
Title:

Interfaces of the Agriculture 4.0

Authors:

Letizia Bollini, Alessio Caccamo and Carlo Martino

Abstract: The introduction of information technologies in the environmental field is impacting and changing even a traditional sector like agriculture. Nevertheless, Agriculture 4.0 and data-driven decisions should meet user needs and expectations. The paper presents a broad theoretical overview, discussing both the strategic role of design applied to Agri-tech and the issue of user interface and interaction as enabling tools in the field. In particular, the paper suggests rethinking the HCD approach, moving to a Human-Decentered Design approach that brings together user, technology and environment, and stresses the importance of calm technologies as a way to place the farmer not as a final target and passive spectator, but as an active part of the process of mitigation and appropriation in the transition from traditional cultivation methods to the 4.0 one.

Paper Nr: 64
Title:

Using InDesign Tool to Develop an Accessible Interactive EPUB 3: A Case Study

Authors:

Barbara Leporini, Luca Minardi and Gregorio Pellegrino

Abstract: EPUB (Electronic Publication) is a format primarily used for digital books. EPUB 3 is based on the Open Web Platform: HTML, CSS and JavaScript. In short, EPUB content can be considered a web interface. Thus, existing accessibility guidelines and techniques can be applied in order to obtain an accessible eBook. In our study, we investigate whether popular editing tools are sufficiently mature to support authors in designing interactive and multimedia EPUB 3 content that is fully accessible via assistive technologies such as screen readers. To this purpose, in this work the Adobe InDesign tool has been used to prepare an interactive EPUB 3 prototype as a case study. Our aim was to analyze whether the InDesign tool was able to support authors in producing a fully accessible interactive EPUB 3. The results revealed that accessibility is not fully guaranteed directly by the tool: some additional steps are required of the authors or professional operators.

Area 5 - Web Intelligence

Full Papers
Paper Nr: 1
Title:

Towards Predicting Mentions to Verified Twitter Accounts: Building Prediction Models over MongoDB with Keras

Authors:

Ioanna Kyriazidou, Georgios Drakopoulos, Andreas Kanavos, Christos Makris and Phivos Mylonas

Abstract: Digital influence and trust are central research topics in social media analysis with a plethora of applications ranging from social login to geolocation services and community structure discovery. In the evolving and diverse microblogging sphere of Twitter, verified accounts reinforce digital influence through trust. These typically correspond either to an organization, to a person of high social status, or to netizens who have been consistently proven to be highly influential. This conference paper presents a framework for estimating the probability that the next mention of any account will be to a verified account, an important metric of digital influence. At the heart of this framework lies a convolutional neural network (CNN) implemented in Keras over TensorFlow. The training features are extracted from a dataset of tweets regarding the presentation of the Nobel Prize in Literature to Bob Dylan, collected with the Twitter Streaming API and stored in MongoDB. In order to demonstrate the performance of the CNN, the results obtained by applying logistic regression to the same training features are shown in the form of statistical metrics computed from the corresponding contingency matrices, which are obtained using the pandas Python library.

Paper Nr: 55
Title:

CATI: An Active Learning System for Event Detection on Microblogs’ Large Datasets

Authors:

Gabriela Bosetti, Előd Egyed-Zsigmond and Lucas O. Ono

Abstract: Today, there are plenty of tools and techniques to perform text- or image-based classification of large datasets, targeting different levels of user expertise and abstraction. Specialists usually collaborate in projects by creating ground-truth datasets and do not always have deep knowledge of Information Retrieval. This article presents a full platform for the assisted binary classification of very large collections of textual and text-and-image documents. Our goal is to enable human users to classify collections of several hundred thousand documents in an assisted way, within a humanly acceptable number of clicks. We propose a graphical user interface based on several classification assistants: text- and image-based event detection, Active Learning (AL), a search engine, and rich visual metaphors to visualize the results. We also propose a novel query strategy in the context of Active Learning, considering the top unlabeled bi-grams and duplicated (e.g. re-tweeted) content in the target corpus to classify. These contributions are supported not only by a tool whose code is freely accessible but also by an evaluation of the impact of the aforementioned methods on the number of clicks needed to reach a stable level of accuracy.

Short Papers
Paper Nr: 8
Title:

Domain Specific Grammar based Classification for Factoid Questions

Authors:

Alaa Mohasseb, Mohamed Bader-El-Den and Mihaela Cocea

Abstract: Classifying questions is the first step towards retrieving accurate answers in any question answering system. Factoid questions are considered the most challenging type of question to classify. In this paper, a framework has been adapted for question categorization and classification. The framework consists of three main features: grammatical features, domain-specific features, and grammatical patterns. These features help in preserving and utilizing the structure of the questions. Machine learning algorithms were used for the classification process, and experimental results show that these features helped in achieving a good level of accuracy compared with state-of-the-art approaches.

Paper Nr: 9
Title:

Unsupervised Topic Extraction from Twitter: A Feature-pivot Approach

Authors:

Nada A. GabAllah and Ahmed Rafea

Abstract: Extracting topics from textual data has been an active area of research with many applications in our daily life. Digital content is increasing every day, and recently it has become the main source of information in all domains. Organizing and categorizing related topics from this data is a crucial task to get the best benefit out of this massive amount of information. In this paper we present a feature-pivot based approach to extract topics from tweets. The approach is applied to a Twitter dataset in Egyptian dialect from four different domains. We compare our results to a document-pivot based approach and investigate which approach performs better at extracting the topics in the underlying datasets. By applying a t-test on the recall, precision, and F1 measure values for both approaches on different datasets from different domains, we confirmed our hypothesis that the feature-pivot approach performs better in extracting topics from Egyptian dialect tweets in the datasets in question.

Paper Nr: 17
Title:

Automated Analysis of Job Requirements for Computer Scientists in Online Job Advertisements

Authors:

Joscha Grüger and Georg J. Schneider

Abstract: The paper presents a concept and a system for the automatic identification of skills in German-language job advertisements. The identification process is divided into Data Acquisition, Language Detection, Section Classification and Skill Recognition. Online job exchanges served as the data source. To identify the part of a job advertisement containing the requirements, different machine-learning approaches were compared. Skills were extracted based on a POS template. For the classification of the found skills into predefined skill classes, different similarity measures were compared. The identification of the part of a job advertisement containing the requirements works with the pre-trained LinearSVC model for 100% of the tested job advertisements. Extracting skills is difficult because skills can be written in different ways in the German language, especially since the language allows the ad-hoc creation of compounds. For the extraction of skills, POS templates were used. This approach worked for 87.33% of the skills. The combination of a fastText model and Levenshtein distance achieved a correct assignment of skills to skill classes for 75.33% of the recognized skills. The results show that extracting required skills from German-language job ads is complex.

Paper Nr: 19
Title:

Applying Heuristic and Machine Learning Strategies to Product Resolution

Authors:

Oliver Strauß, Ahmad Almheidat and Holger Kett

Abstract: In order to analyze product data obtained from different web shops a process is needed to determine which product descriptions refer to the same product (product resolution). Based on string similarity metrics and existing product resolution approaches a new approach is presented with the following components: a) extraction of information from the unstructured product title extracted from the e-shops, b) inclusion of additional information in the matching process, c) a method to compute a product similarity metric from the available data, d) optimization and adaption of model parameters to the characteristics of the underlying data via a genetic algorithm and e) a framework to automatically evaluate the matching method on the basis of realistic test data. The approach achieved a precision of 0.946 and a recall of 0.673.

Paper Nr: 33
Title:

Suicidal Profiles Detection in Twitter

Authors:

Atika Mbarek, Salma Jamoussi, Anis Charfi and Abdelmajid Ben Hamadou

Abstract: About 800,000 people commit suicide every year, and detecting suicidal people remains a challenging issue, as mentioned in a number of suicide studies. With the increased use of social media, people talk about their suicide plans or attempts publicly on these networks. This paper addresses the problem of suicide prevention by detecting suicidal profiles in social networks, specifically Twitter. First, we analyse profiles from Twitter and extract various features, including account features related to the profile and features related to the tweets. Second, we introduce our method based on machine learning algorithms to detect suicidal profiles using Twitter data. Then, we use a profile dataset consisting of people who have already committed suicide. Experimental results verify the effectiveness of our approach in terms of recall and precision in detecting suicidal profiles. Finally, we present a Java-based prototype of our work that shows the detection of suicidal profiles.

Paper Nr: 46
Title:

Model-based Integration of Unstructured Web Data Sources using Graph Representation of Document Contents

Authors:

Radek Burget

Abstract: Unstructured or semi-structured documents on the web are often used as a medium for publishing structured, domain-specific data which is not available from other sources. Integrating such documents as a data source into a standard information system is still a challenging problem because of the very loose structure of the input documents and the usually missing semantic annotation of the published data. In this paper, we propose an approach to data integration that exploits the domain model of the target information system. First, we propose a graph-based model of the input document that allows the contained data to be interpreted in different alternative ways. Further, we propose a method of aligning the document model with the target domain model by evaluating all possible mappings between the two models. Finally, we demonstrate the applicability of the proposed approach on a sample domain of public transportation timetables and present the preliminary results achieved with real-world documents available on the web.

Paper Nr: 48
Title:

Detecting Political Bias Trolls in Twitter Data

Authors:

Soon A. Chun, Richard Holowczak, Kannan N. Dharan, Ruoyu Wang, Soumaydeep Basu and James Geller

Abstract: Ever since Russian trolls have been brought to light, their interference in the 2016 US Presidential elections has been monitored and studied. These Russian trolls employ fake accounts registered on several major social media sites to influence public opinion in other countries. Our work involves discovering patterns in these tweets and classifying them by training different machine learning models such as Support Vector Machines, Word2vec, Google BERT, and neural network models, and then applying them to several large Twitter datasets to compare the effectiveness of the different models. Two classification tasks are utilized for this purpose. The first one is used to classify any given tweet as either troll or non-troll tweet. The second model classifies specific tweets as coming from left trolls or right trolls, based on apparent extreme political orientations. On the given data sets, Google BERT provides the best results, with an accuracy of 89.4% for the left/right troll detector and 99% for the troll/non-troll detector. Temporal, geographic, and sentiment analyses were also performed and results were visualized.

Paper Nr: 59
Title:

A Novel Query Language for Data Extraction from Social Networks

Authors:

Francesco Buccafurri, Gianluca Lax, Lorenzo Musarella and Roberto Nardone

Abstract: Online Social Networks (OSNs) represent an important source of information since they manage a huge amount of data that can be used in many different contexts. Moreover, many people create and manage more than one social profile in the different available OSNs. The combination and extraction of the data contained in OSNs can produce a huge amount of additional information regarding both a single person and the overall society. Consequently, data extraction from multiple social networks is a topic of growing interest. There are many techniques and technologies for data extraction from a single OSN, but there is a lack of simple query languages which can be used by programmers to retrieve data, correlate resources and integrate results from multiple OSNs. This work describes a novel query language for data extraction from multiple OSNs and the related supporting tool to edit and validate queries. With respect to existing languages, the designed language is general enough to include the variety of resources managed by the different OSNs. Moreover, thanks to the support of the editing environment, the language syntax can be customised by programmers to express searching criteria that are specific to a social network.

Paper Nr: 4
Title:

SATALex: Telecom Domain-specific Sentiment Lexicons for Egyptian and Gulf Arabic Dialects

Authors:

Amira Shoukry and Ahmed Rafea

Abstract: Given the scarcity of Arabic sentiment lexicons, especially for the Egyptian and Gulf dialects, together with the fact that a word’s sentiment depends mostly on the domain in which it is used, we present SATALex, a two-part sentiment lexicon covering the telecom domain for the Egyptian and Gulf Arabic dialects. The Egyptian sentiment lexicon contains close to 1.5 thousand Egyptian words and compound phrases, while the Gulf sentiment lexicon contains close to 3.5 thousand Gulf words and compound phrases. The presented lexicons were developed iteratively; in each iteration, manual annotators analyzed tweets in the corresponding dialect to extract as many domain-specific words as possible and measure their effect on classification performance. The result is lexicons that are more focused on and related to the telecom domain than any translated or general-purpose sentiment lexicon. To demonstrate the effectiveness of these lexicons and how directly they can impact the task of sentiment analysis, we compared their performance to one of the biggest publicly available sentiment lexicons (WeightedNileULex) using the Semantic Orientation (SO) approach on telecom test datasets, one for each dialect. The experiments show that using the SATALex lexicons improved the results over the publicly available lexicon.

Paper Nr: 24
Title:

Knowledge Discovery from Log Data Analysis in a Multi-source Search System based on Deep Cleaning

Authors:

Fatma Z. Lebib, Hakima Mellah and Abdelkrim Meziane

Abstract: In a multi-source search system, understanding users’ interests and behaviour is essential to improve the search and adapt the results according to each user profile. The interesting information characterizing the users can be hidden in large log files, and it must be discovered, extracted and analyzed to build an accurate user profile. This paper presents an approach which analyzes the log data of a multi-source search system using web usage mining techniques. The aim is to capture, model and analyze the behavioural patterns and profiles of users interacting with this system. The proposed approach consists of two major steps: the first step, “pre-processing”, eliminates unwanted data from the log files based on predefined cleaning rules, and the second step, “processing”, extracts useful data on users’ previous queries. In addition to the conventional cleaning process that removes irrelevant data from the log file, such as accesses to multimedia files, error codes and accesses by Web robots, deep cleaning is proposed, which analyzes the query structure of the different sources to further eliminate unwanted data. This accelerates the processing phase. The generated data can be used for personalizing user-system interaction, filtering information and recommending appropriate sources for the needs of each user.

Paper Nr: 71
Title:

Power Plants Failure Reports Analysis for Predictive Maintenance

Authors:

Vincenza Carchiolo, Alessandro Longheu, Vincenzo di Martino and Niccolo Consoli

Abstract: The shift from reactive to predictive maintenance greatly improves asset management, especially for complex systems with high business value. This is particularly true of power plants, whose functioning is mission-critical. In this work, an NLP-based analysis of failure reports in power plants is presented, showing how they can be effectively used to implement predictive maintenance aimed at reducing unplanned downtime and repair time, thus increasing operational efficiency while reducing costs.

Paper Nr: 77
Title:

Emotion Recognition from Speech: A Survey

Authors:

Georgios Drakopoulos, George Pikramenos, Evaggelos Spyrou and Stavros J. Perantonis

Abstract: Emotion recognition from speech signals is an important field in its own right as well as a mainstay of many multimodal sentiment analysis systems. The latter may include a broad spectrum of modalities which are strongly associated with consciously or subconsciously communicating human emotional state, such as visual cues, gestures, body postures, gait, or facial expressions. Typically, emotion discovery from speech signals not only requires considerably less computational complexity than other modalities, but in the overwhelming majority of studies the inclusion of the speech modality also increases the accuracy of the overall emotion estimation process. The principal algorithmic cornerstones of emotion estimation from speech signals are Hidden Markov Models, time series modeling, cepstrum processing, and deep learning methodologies, the latter two being prime examples of higher-order data processing. Additionally, the best-known datasets which serve as emotion recognition benchmarks are described.