EXPLORING MANUAL CORRECTION AS A SOURCE OF USER FEEDBACK IN PAY-AS-YOU-GO INTEGRATION

Nurzety Aqtar Ahmad Azuan, Nurzety Aqtar (2021) EXPLORING MANUAL CORRECTION AS A SOURCE OF USER FEEDBACK IN PAY-AS-YOU-GO INTEGRATION. Doctoral thesis, THE UNIVERSITY OF MANCHESTER.

Abstract

Current practice in data integration typically requires extensive upfront effort, such as defining a schema mapping, before a useful result can be produced. The dataspace approach was introduced to minimise the initial design process of data integration by automating the parts of the task that need technical know-how and by taking an incremental approach, driven by user feedback, to the components that require domain knowledge. An imperfect integration of data is constructed quickly and at minimal cost. This approach is promising, but open problems remain. One issue is that the continuous cycle of feedback in a dataspace may not immediately produce the expected results, even after feedback has been obtained from the user. Because the dataspace may use the feedback to fix its underlying structure (such as matching), the outcome of the feedback is invisible to the end user. As a result, the end user chooses to fix the integration results manually, outside the confines of the dataspace, and the opportunity for the dataspace to gather feedback and learn is lost. In this thesis, we aim to leverage the manual correction effort that users of data, such as data scientists, spend on improving query results. This dissertation demonstrates how manual correction can be used to infer feedback values without requiring extra effort from the user. The thesis proposes a general framework for manual correction as another source of implicit feedback. The framework aims to determine the potential of manual correction and to identify how it can fit into dataspace settings. We then explore areas of dataspace integration that can maximise the inferred value extracted from manual correction. First, we demonstrate an approach that uses manual correction to infer feedback values for schema mapping. To evaluate the proposed method, we compare it with an existing practice that uses explicit feedback for schema mapping improvement. Next, we explore the use of manual corrections as example pairs for automated format transformation tasks. We devise three strategies for assessing the proposed approach, and we evaluate these strategies using existing format transformation tools. Lastly, we investigate ways to reuse the iterative effort of manual correction when dealing with changing data sets, through a case study on a real-world database (UniProtKB). Our approach to tackling changes in data involves several tasks, such as detecting and storing changes and re-applying changes inferred from previous manual corrections. We evaluate our approach against existing work that also deals with a real-world database. The results from all three pieces of work confirm that a manual correction approach is cost-efficient for the specific areas of data integration that we explored.

Item Type: Thesis (Doctoral)
Subjects: Q Science > QA Mathematics
Depositing User: Encik Mohd Zulkarnain Hassan bin Mohd Zainudin
Date Deposited: 04 Oct 2024 15:02
Last Modified: 04 Oct 2024 15:02
URI: https://repositori.mohe.gov.my/id/eprint/14
