Development of a Lossy Online Mouse Tracking Method for Capturing User Interaction with Web Browser Content
Development of a lossy online mouse tracking method for capturing user interaction with web content. from Fajar Purnama
Declaration of Authorship
I, Fajar PURNAMA , declare that this thesis titled, “Development of a Lossy Online Mouse Tracking Method for Capturing User Interaction with Web Browser Content” and the work presented in it are my own. This thesis is based on few of my publications and I hereby confirmed that I have permission to reuse them:
Though people are confined inside their houses due to COVID-19, they are forced to continue their activities online. The demand for tools to monitor these activities increases for example, making sure students reads materials, and examiners does not cheat during online examinations. Unfortunately, conventional web logs cannot monitor those kinds of activities. One monitor tool is mouse tracking that tracks the actions of the mouse cursor that includes clicks, movements, and scrolls, which covers the majority of online users’ interaction to the browser contents. Though mouse tracking is promising, very few implemented this tool because (1) previous mouse tracking tools requires desktop installations which is bothersome to the users and (2) the rumors that mouse tracking generates big data such as the saying a swipe from left to right generates a megabyte of data. This thesis tackles those problem by building a mouse tracking server application that is easily installable and does not require users to install any additional applications other than the web browser. The application was implemented in an overseas quiz session between National University of Mongolia and Kumamoto University where the amount of data generated was also investigated. This thesis also contributes to a lossy online mouse tracking method that can greatly reduce the amount of data generated. Finally, some visualization of the mouse tracking data are shown and possible application such as online examination cheating prevention and force reading of term of service are discussed.
My first gratitude would be to my supervisor Prof. Tsuyoshi Usagawa for taking care of me for five years starting from my Master’s program until the end of my Doctoral program. His deeds are almost immeasurable because without him, Kumamoto University, and The Ministry of Education, Culture, Sports, Science and Technology Japan, my currently best five years of my life may not be possible. I would like to thank my reviewers Prof. Kohichi Ogata, Prof. Kenichi Sugitani, Prof. Masahiko Nishimoto, and Prof. Masayoshi Aritsugi for their time in reviewing this thesis. I greatly thank my friend Alvin Fungai as the co-founder for this topic where without him, the topic of this thesis would have been different and I may be late in finishing this thesis because I have tried doing other topics and found to be much more difficult or just does not suit me. The critical development phase of this research was thanks to the Computer Algorithm class by Prof. Masayoshi Aritsugi and all of the participating members that included Hendarmawan, Hamidullah Sokout, Alhafiz Akbar Maulana, and Sari Dewi where in those moments that I decided this topic as my Doctoral thesis. The implementation and data were thanks to Dr. Otgontsetseg Sukhbaatar, Prof. Lodoiravsal Choimaa, and the students in School of Engineering and Applied Sciences, National University of Mongolia where without them, this topic may not make it to two international journal publication and may prevent the completion of this thesis. Lastly, I would like to thank my mother Linawati, father Teddy Junianto, and Ni Nyoman Sri Indrawati for their daily support.
Table of Contents
List of Figures
List of Tables
Thanks to the development of information communication technology (ICT), humanity lives in convenience. It is no longer necessary to spend much effort to seek information. Whereas in the past, people needs to travel to libraries to seek books, buy newspapers to get the latest news, gather in a community to hear the latest rumors, or even start a pilgrimage to find a master. Nowadays, most information are available in the Internet. With ownerships of portable computer devices that can connect to the Internet from anywhere becoming mainstream, anyone can search for their desired information (Dentzel, 2013).
The Internet is not only an open massive source of information where anyone can publish, but also a tool for distant activities. People can interact with each other without meeting through text, voice, or video messages regardless the time and place. More people do not go to shop but order items through online shopping. In some countries like Indonesia, they develop an application that can order variety of services online (Azzuhri et al., 2018) such as meal delivery service, calling house cleaners, calling therapist, etc.
Due to the recent COVID-19 pandemic that occurred early February 2020, most regions are in a lockdown where people are to stay away from each other (mostly asked to stay at home) to prevent the spread of infection. Even school closes, most governments around the world have temporarily closed educational institutions in an attempt to contain the spread of the COVID-19 pandemic (UNESCO, 2020). All forms of activities are recommended to be done online which includes educational activities where courses are switched from face to face to online. The basic of online course that is known today is materials provided online, online text discussion forum, a feature to submit assignments online, online quiz session, (Linawati, Wirastuti, and Sukadarmika, 2017) and the features to analyze and evaluate students' performance. For interactivity, people prefer to join live streaming videos, webinars, online game sessions, interactive online programming, etc.
Unfortunately, conventional web analytic does not measure up to how teachers examine or analyze students during face to face private tutors. Teachers normally able to examine students' attention, emotion, and motivation during studying in real-time, but conventional web analytic does not provide such features for online education. This reason is especially true for a very crucial educational activity which is examination. Security is very tight for face to face examinations to prevent dishonest behavior but this is not true for online examinations today. This is why most educational institute implements blended (Paturusi, Chisaki, and Usagawa, 2012) learning which is a mixed face to face course and online course than implementing full online course. This applies to anything online, not only with education, for example during shopping, shop owners are able to identify the interest of their customers face to face and act accordingly. The simplest example people can see whether someone is skimming or pay close attention during reading when face to face. In online reading, people normally cannot know whether the viewer is actually reading the materials or not. An example crucial demand is reading detection of agreements or terms of services. Most people scrolls down and accept the terms of services without actually reading them.
The lack of data for online analytic can actually be solved by eye tracking, mouse tracking, and all other online monitoring techniques in real-time. Although these techniques were introduced in the early 20th century, they are still rarely implemented. One of the main reasons is the huge data generated by these techniques which is too much for most administrators and analyzers to handle (Leiva and Huang, 2015). This connects to the next reason that the previous applications only suit academia and does not suit wide implementation. For eye tracking is that the hardware are intrusive where users usually have to wear googles. Though non-intrusive ones exists but they are most likely expensive. Mouse tracking are non-intrusive and no cost because in default they are available in every computer where no additional hardware is needed. However, the previous application are only suited in laboratory where they are installed offline in each computer and not online. This thesis tackles that problem.
This thesis proposes a new preprocessing based on demand method specifically for online mouse tracking. It is a method that allows the implementer to determine the data they need before implementation. Amongst those data, the geometrical data (x and y mouse coordinates) are the largest one generated. Most of the time, implementer do not need all the data. Therefore, the data generation along with resource usage can be reduced if they choose the region of interest beforehand. In summary, by summarizing the coordinates into areas, the data generated can be reduced which will also reduce the resource usage.
1.6 Benefit and Significance
1.7 Thesis Structure
Other than the introduction, this thesis contains four more chapters. The second chapter is online mouse tracking implementation and investigation where this chapter discusses the implementation of online mouse tracking in any website and browser, and the amount of data generated. The third chapter is online mouse tracking resource usage reduction methods where known methods, real-time implementation, and the novel method of preprocessing based on demand is discussed. The fourth chapter is the depth levels of web logs and educational data which emphasizes mouse tracking logs as deeper level data than conventional educational data logs. The last chapter is conclusion and future work.
2 Online Mouse Tracking Implementation and Investigation
2.1 System Overview
2.1.1 Mouse Tracking in Web Development
The core of mouse tracking in web development is Domain Object Model (DOM) which is an Application Programming Interface (API) for Hypertext Markup Language (HTML) and Cross Markup Language (XML). It defines the logical structure of documents and the way a document is accessed and manipulated. Supposed a simple HTML page with the codes on Table 2.1, the DOM structure can be represented on Figure 2.2. With the Document Object Model, programmers can build documents, navigate their structure, and add, modify, or delete elements and content. Anything found in an HTML or XML document can be accessed, changed, deleted, or added using the Document Object Model, with a few exceptions. DOM is designed to be used with any programming language. Currently, it provides language bindings for Java and ECMAScript (an industry-standard scripting language based on JS and JScript) (Wood et al., 1998).
Table 2.1 A web page code in simple HTML that contains html, head, title, body, p, and footer tags (Purnama and Usagawa, 2020)
The implementation of mouse tracking is based on DOM events, specifically mouse, touch, and User Interface (UI) events which are actions that occur as a result of the user's mouse actions or as a result of state change of the user interface or elements of a DOM tree (Pixley et al., 2000). In this thesis jQuery is used to access the DOM API and receive information that are related to mouse, touch, and UI events. The following list shows the mouse events utilized in this thesis:
There are many DOM events that are not implemented by the application in this thesis. However, they maybe implemented in the future if they are found to be useful. But for now, the following DOM events other than mouse events are worth considering and are implemented:
After implementing the DOM events, the information is processed by adding important labels. The first labels are time information such as the date of the received information and duration by calculating the difference between the current and previous received events. The second labels are the place information such as the category, page, post, course, course content, or if those information are not available then the default information is the Uniform Resource Locator (URL). More in-dept place information are the areas or sections of the page, and the deepest of them all are the coordinates of the page. The third label is the identity label if available and permitted such as the name, email address, ip address, and location of the user.
2.1.2 Online Mouse Tracking System
The author developed an online mouse tracking application implementable on any website where the code is open source on GitHub (Purnama, 2019). It is written in HTML, Cascading Style Sheets (CSS), JS, jQuery, and PHP. The mouse tracking code can either be implemented on client side shown on Figure 2.3 or server side shown on Figure 2.4. The difference is that the client side can capture anything including all the web page that the user visits while the server side can only capture the events that happen on the server's website.
Figure 2.5 shows a more detailed server side implementation. The mouse, touch, and UI DOM events in the previous subsection are written in JS and jQuery and are placed on the representation side which is the website along with the HTML and CSS. The order the online mouse tracking in Figure 2.5 are:
For the server application, the advantage is that client does not need to install additional application, just browse the website and mouse tracking runs automatically but the disadvantage is that it cannot track outside of the website however it can still tell whether users' are leaving the page or not. For the server hardware depends on the amount of users that the administrators want to handle and as for the hardware specification used in this thesis is discussed on the next section. For the software, a standard web server is enough such as a server equipped with Apache2, PHP, and MySQL. For the installation, the author made it easy that all that are needed are to download the codes and install. In this thesis, the mouse tracking server application was implemented on an Learning Management System (LMS) called Moodle which is used to handle online courses. The mouse tracking codes are rearranged as a Moodle plugin where the author made a block and theme plugin for the Moodle shown on Figure 2.4. For usage, online choose one form of the plugin, either block or theme. The installation is also easy shown on Figure 2.6 where the process are only download, upload the plugin to Moodle, and install.
2.1.3 Privacy Policies
Privacy policies should be disclosed to the users during any form of data gathering. In the European Union (EU) is more strict that cookie policies should be separated from the privacy policies. By disclosing privacy policies, not only being in compliance with the laws and regulations, but build trusts with the users as well (PrivacyPolicies.com, 2020).
Based on how mouse tracking is executed which more details are illustrated in Figure 2.5, users actually have full control over the mouse tracking process and they can stop the process anytime but they are usually unaware because the mouse tracking runs in the background. They would have to thoroughly inspect the background area to see the running mouse tracking and most users do not attempt to perform this task because they do not feel bothered by the process. This is the reason why mouse tracking is considered non-intrusive.
2.2 Network Data Transmitted by One Click
Leiva and Huang, 2015 stated that a mouse swipe from left to right can generate hundreds of cursor coordinates and a mouse activity over a minute can generate 1 MB (megabyte) of data. Huang, White, and Dumais, 2011 conducted a massive scale mouse tracking on Microsoft’s Bing search engine but in the middle of the experiment, they have to reduce the sampling rate because the data size was simply too much. Those two references are the only scientific record found that complains about the problem of huge data generated by mouse tracking. This shows that data generated and the resource usage are not officially investigated. Therefore, an implementation followed by investigation were conducted by Purnama et al., 2020b.
2.2.1 Peer to Peer Experiment
The one click Peer-to-Peer (P2P) experiment is an experiment that measures the amount of data transmitted from the client to server when the user performs one click shown on Figure 2.8. This experiment greatly helps the investigation because the result can be used to predict the data cost mathematically. However, the result is dependent on the application, as time passes people may find ways to reduce the data.
The online mouse tracking application was installed on the author’s Moodle server. The resource costs were then measured. The data rate of the network was measured using a tool called Wireshark. The server is an Ubuntu 18.04 Long Term Service (LTS) server equipped with an Intel(R) Core(TM) i7-6800K Central Processing Unit (CPU) @ 3.40 Giga Hertz (GHz) (with SSE4.2) CPU, 32 Giga Byte (GB) of DDR4 Random Access Memory (RAM), 10 Tera Byte (TB) of hard drive, and an allocated 2 Mega Byte per second (MBps) network.
2.2.2 Data Generation Estimation for Implementation Plan
The result on Table 2.2 showed that one click generates around 3-4 kilo Byte (kB) of transmission data. In other words, the mouse tracking application generates around 3-4 kB when one event occurs. The size depends on the metadata where in this case the size greatly increases when date and URL are included because they contain many characters.
If the administrator can estimate the amount of users and the average amount of events generated by users, then the administrator can estimate the amount of data to be generated. Rheem, Verma, and Becker, 2018 states that a very high activity is around 70 events per second. Based on Figure 2.9, expect a worst case scenario that a user generates a data rate of 210-280 kilo Byte per second (kBps).
2.3 Overseas Online Mouse Tracking Implementation
2.3.1 Quiz Details
An online quiz session was conducted on the 3rd of January 2019 between approximately 12:00 and 14:30 Japan standard time. There were 2 sessions, with each session lasting approximately an hour and including 20 and 21 students (41 total students participating) from the School of Engineering and Applied Sciences, National University of Mongolia accessing the Moodle server at the Human Interface and Cyber Communication Laboratory, Kumamoto University. The map illustration is shown on Figure 2.10.
The quiz is a part of a mid-term exam of Microprocessor and Interfacing Techniques course for sophomore and junior year students in Department of Electronics and Communication Engineering, National University of Mongolia. The quiz is on https://md.hicc.cs.kumamoto-u.ac.jp. Figure 2.11 shows a screenshot of the Moodle log file and Figure 2.12 shows a screenshot of students grade of the quiz session. The detailed anonymous log files are published in Mendeley Data (Purnama et al., 2020a). The internet protocol (IP) address of the students for example “184.108.40.206” can be tracked by geo-location that it originates from Mongolia and “https://md.hicc.cs.kumamoto-u.ac.jp” which can be nslookup as “220.127.116.11” originates from Japan.
2.3.2 Amount of Data Generated
The screenshot of mouse tracking log can be seen in Figure 2.13. Based on the data shared in Mendeley (Purnama et al., 2020a), the majority of the events are mouse movements and scrolls. That is because each change that occurred in either on the mouse cursor or scroll positions are captured. Rapid mouse movements or scrolls will generate large amount of data and how much depends on the capabilities of the computer. Theoretically, if the mouse cursor travels a distance of 1000 pixels than the number of mouse movement events generated are 1000, and if the scroll distance from top to bottom is 1000 pixels than the number of scroll events generated are 1000. In short, the capturing of geometrical data which is the x and y coordinates of the mouse cursor and scroll is the cause of the huge data generation. Also, the affect is multiplied to the amount of labels attached such as the user's identity that did the events, the place, and the time of the event occurrences. Just removing the URL label can save a lot of data space.
During the quiz session, Figure 2.14 shows that a student is capable of generating a total over 20000 events which is over 80 Mega Byte (MB) transmission data. This means that student had to upload 80 MB of data at the end of the student's mouse tracking session in each page. According to Ookla, 2020 the global average network speed is 9.3 MBps downlink and 3.9 MBps uplink. This means there exist countries with the average network speed below that. Although nowadays are common for university size institutions to have network speed over 100 MBps, those resources are usually already allocated for many things. For example, the author's laboratory was only given 2 MBps network speed, meaning the mouse tracking session can flood the network. This explains why administrators are reluctant in implementing online mouse tracking. Imagine how much data can be generated if online mouse tracking is implemented by the whole university daily and full time.
The amount of mouse tracking data compared to page view and other conventional web analytic were almost incomparable. Table 2.3 shows that the moodle log and grade of the quiz session were only a few kilobytes while mouse tracking log is already over a hundred megabytes. In that table is also shown other logs that required long duration and many users to reach the amount of data that mouse tracking log has. While a few hard drive storage are enough to store conventional web and educational logs, many more hard drive storage are needed to store mouse tracking logs.
3 Online Mouse Tracking Resource Saving Methods
It is unfortunate that the online mouse tracking resource usages are too much for regular people to implement daily and full time except for special occasions only such as examinations. The ones who can implement online mouse tracking daily and full time are big institutions such as Amazon and Google. Therefore, on this chapter is discussed the novel method of this thesis to reduce the resource usage of online mouse tracking.
3.1 Existing Methods
Existing methods to reduce mouse tracking data transmission are common sense and popular methods where most of them were discussed by Purnama et al., 2020b. They are:
3.2 Real-Time Online Mouse Tracking
The conventional data transmission method is to transmit the data as a single package at the end of each mouse tracking session. Based on Figure 2.14, this conventional transmission method floods the 2 MBps network. The author anticipated this and implemented real-time transmission (Purnama et al., 2020b) method avoiding often 2 MBps flood which was reduced to data rate of average around 100 kBps. Although the average data rate is 100 kBps, Figure 3.1 shows many spikes where the difference between average and maximum is large which indicates that there were moments of high activities. The highest spike is around 800 kBps. The spikes are not only pointing upward but pointing downward as well which indicates that there are also moments of low activities. Overall, the standard deviation is high where there were times when activities were high and activities were low, thus precise data usage can be difficult to predict.
The difference between offline mouse tracking, online mouse tracking, and real-time online mouse tracking can be described on Figure 3.2. While offline mouse tracking stores the data in each of the users' computers, online mouse tracking transmits the data to the server. While conventional online mouse tracking stacks the data until the end of every session before transmitting as a single package, real-time online mouse tracking transmits the data immediately after an event occurs every time. Real-time online mouse tracking helps in reducing the probability of bottleneck as illustrated on Figure 3.3. This helps to balance the transmission load.
3.3 Lossy Online Mouse Tracking
3.3.1 Three Mouse Tracking Preprocessing and Transmission Method
In the end of Chapter 2, it is known that the capturing of geometrical data which are the x and y coordinates of the occurred events and the time stamping of each events are the largest contribution to the data size. If the geometrical data can be reduced then the data size can be reduced as well. Based on many example mouse tracking data analysis, there are three possible cases illustrated on Figure 3.4:
By knowing the geometrical data that the analysers wants, the storage and transmission cost can be reduced by applying preprocessing and modifying the transmission method based on Figure 3.5. The default one is the real-time online mouse tracking where the event information is immediately sent to the server at the moment it occurred. For the summarized event amount, only the amounts of events are recorded excluding the place and time of occurrence. It is discouraged to update the event amount in real-time because that will cost data on the network. Instead, it is best to utilize the conventional transmission method where the final event amount value is sent only once at the end of each session (refer to Figure 3.2 online mouse tracking transmission not in real-time). Unfortunately, there are still some potential problems to this conventional transmission method implementation where if the user ends the session in haste, the time may not be enough to retrieve the mouse tracking from the client to the server and potentially losing the data. For ROI mouse tracking, the amount of events are accumulated when the mouse cursor is still within a specific area. When the mouse cursor moves to a new area, the event amount information of the previous area is sent to the server, and the process repeats. There is still a limit in determining and labelling web page areas. Usually, it is done manually by the analyzers but this way is very labor and time consuming. It is possible to determine and label areas automatically using offset DOM event, but not in a smart way where it depends on the layout of the web page. After the areas are determined for the ROI mouse tracking, the transmission method is a hybrid of conventional and real-time where the mouse cursor enters an area and accumulates the event amounts, then the result is transmitted after the mouse cursor leaves the area, and the process repeats upon entering a new area.
3.3.2 Three Mouse Tracking Preprocessing and Transmission Simulations
Since the author did not have another mouse tracking experiment opportunity, a simulation is conducted based on the previous mouse tracking experiment on Figure 3.6. It is possible to replay the scenario because the date of each events during the mouse tracking session was captured. However, there was a limit at that time that half of the students are using different time zone format which was difficult to simulate and half of the students are excluded leaving only 23 students.
Additionally in this thesis, the author simulate the mouse tracking on a single board computer Raspberry Pi 3 to sympathize with those that are in limited connectivity region where the method of mouse tracking quiz session is locally illustrated in Figure 3.7. Also, it is interesting to see how much the Raspberry Pi 3 can handle mouse tracking simulation in terms of CPU and RAM.
Five mouse tracking simulations are performed on a quiz page with a size or dimension of 1920x1080 pixels:
3.3.3 Three Mouse Tracking Preprocessing and Transmission Results
The result is that a great reduction in data size is achieved by sacrificing some geometrical data for ROI mouse tracking and all geometrical data for summarized event amount shown on Table 3.1. Surprisingly on the user side, the script total execution time on the browser was also reduced shown on Figure 3.8. The transmission cost was also reduced shown by the reduced data rate on Figure 3.9 which is also in parallel to the server's CPU and RAM usage.
The Raspberry Pi's CPU is not strong enough to handle the default mouse tracking simulation of around 20 users where the CPU often reach 100% usage. Even the RAM usage is abnormally high over hundreds of MB. However, it is able to handle ROI mouse tracking and summarized event amount method. This shows how useful the data reduction method are.
Among the three mouse tracking method, the summarized event amount method is the maximum resource reduction because all the geometrical and time data are excluded or simply only consist of one area. Theoretically, the amount of query is reduced to one per mouse tracking session. For the ROI mouse tracking, does not necessary always result in large resource reduction like the result in this thesis. Theoretically, it depends on the area division of the web page. The smaller the division, the larger the area, the larger the resource reduction, and vice versa. By performing more division, the areas become smaller, the resource usage becomes larger, and eventually the area will become as small is coordinates if areas are kept being divided which will become the same as default mouse tracking.
3.3.4 Synchronization for Hand Carry Server Quiz
The teacher may decide to conduct the quiz locally using hand carry server illustrated in Figure 3.7 for various limited connectivity reasons such as expensive or unstable Internet connection. If the log data is only for the teacher to use, then all is well, but if it is for institutional use, the teacher may have to synchronize the data to the institution's server. It will be wise to use incremental synchronization method illustrated on Figure 3.10 to reduce data especially for large data like mouse tracking log.
There are two ways to perform incremental synchronization. The first one is to store the data in Structured Query Language (SQL) which is mostly used in database applications. SQL stores the data in form of table and to update is just sending new rows from the teacher's database to the institution's database. Most log data are in unidirectional incremental/addition fashion which is why SQL is mostly used. However, if the update is more than just incremental such as correction where there are deletion and modification than it is more complicated for SQL to handle (Purnama, Usagawa, Ijtihadie, et al., 2016). The most popular algorithm to handle this update is the rsync algorithm illustrated on Figure 3.11. Example use case are when teacher forgot to exclude private data when privacy is a concern and accidentally upload to the server. In this case, the teacher would want to remove the private data in each query where rsync can save resource cost. Though, this is less likely to occur. A more realistic case is a teacher needed to update their quiz contents from the server where the update is made of addition, deletion, and relocation.
4 The Depth Levels of Logs
Back in Chapter 1, it was emphasized that conventional web logs and educational data have a limitation regarding to the information that they can derive. Mostly, it was about how those conventional logs could not capture the users or students behavior online. Eye and mouse tracking solves that problem by capturing how the students interact. It took some time for the author to understand and conceptualize the meaning behind those repeating statements about what conventional log data cannot tell while eye and mouse tracking log can tell. It turns out to be that the depth level of those logs are different where eye and mouse tracking logs belong to a deeper level than conventional logs.
This thesis defines six depth level of web logs from browser content point of view shown on Figure 4.1. Most analyzers do not know that there are deeper level of logs. Most tools do not generate data in deeper level than web page level logs. The web log depth levels converted to educational data can be illustrated on Figure 4.2. Most educational tools only generate logs up to course content level which are mostly how many time the students attempts the activity and what grade they received. This chapter discussed the three deepest log levels and explained how mouse tracking belongs to the deepest log level.
4.1 Web page / Course Content Level Logs
4.1.1 Conventional Web Logs and Educational Data
The conventional web logs belongs up to the web page level log. They are mainly page views which shows that a web page from a certain website and category have been viewed (Bluehost, 2016). Additional metadata can be attached to the page view:
As page view belongs up to the third deepest level log, there is a limit how much it can tell no matter how hard it is analyzed. For example, page view cannot tell how a user is reading a content such as whether the user is skimming or reading in detail. The limit is that page view cannot capture activities that occurred in specific area of the web page. In education, there are four popular logs that are used by teachers which are materials the student read, assignments submitted, topics discussed in forum, and quiz or exams grades. Unfortunately just as conventional web logs, conventional educational data can only tell what activities the students are doing and its duration but cannot tell how the students attempts those activities which can be more emphasized on Figure 4.3. In other words, it can identify a certain extent of what, when, where, and who but cannot identify deeper and how the viewer interacts with the contents (Purnama et al., 2016) (Purnama, Fungai, and Usagawa, 2016).
4.1.2 Amount of Interactions
Although the summarized event amount of mouse tracking is on the depth level of web pages or course contents, it is still not widely known by analyzers. DOM events can tell many other interactions users does on the web page. The simplest of them are knowing how much interaction the user does such as how many clicks, how many touch, how much mouse movements, how much scrolls, how much zoom in and zoom out, how many copy and paste, how many times the keyboard was pressed, and etc. Table 4.1 shows that the Mongolian students attempting the quiz session took at average 1368 seconds, performed at average 175 left clicks, 8 middle clicks, 11004 mouse movements, and 4158 scrolls.
Knowing the amount of DOM event occurrence on a web page may give a hint whether the web page fulfills its purpose or not. For example, a web page designed based on game theory are bound to be interactive where if there are less events such as clicks, movements, etc, may show that the users does not engage on the web page, whereas if the web page is designed for reading and there are many events, then there must be something wrong. The author expect high amount of DOM event done by the students because they are attempting a quiz where they need to perform many clicks to choose an answer, and need to perform many movements to read the questions carefully and maybe reviewing some questions. If there is no problem with the web page then there can be problems with the users. A study showed by Rodrigues et al., 2013 that high amount of events generated by a user can indicate that the user is stressed. Theoretically, there should be a common sense of how much a user should generate events within a certain amount of duration.
4.1.3 Web Page or Course Content Inactivity
Web page or course content inactivity is another DOM mouse event feature that analyzers does not know. In page view, the duration can be counted on visited web page but it cannot tell whether the users are actually in the web page the whole time because they can just open another tab and leave the previous ones open. With mouse DOM events, it is possible to distinguish the amount of active and inactive time of users within a web page. The inactivity is indicated when the mouse cursor leaves the web page for opening another tab or doing other activities and when the mouse cursor re-enters the web page, the status will show active again.
In Table 4.1, the amount of inactivity queries of each student are provided, and in Figure 4.4, the amount of inactivity in time domain are plotted. They showed that all the students does not always stay in the quiz page which opens the possibility that they are seeking information from outside source to answer the quiz better such as searching for answers in search engines and messaging friends online. The amount of inactivities could be exagerated due to system limitation reasons such as slow mouse leaves generates more inactivities query than fast mouse leaves. However, the system design still ensures that no inactivities queries will be generated if the mouse does not leaves the quiz area.
Aside from capturing inactivities, capturing highlight, copy, cut, and paste can help in detecting dishonest behaviors. An alarm system can be developed to inform the examiners when such events occurred. For important exams such as certifications, stricter systems can be implemented such as immediately failing the test when the mouse cursor leaves the exam illustrated on Figure 4.5.
4.2 Area Level Logs
Area level logs are logs showing activities within areas of the web page or course contents. This can be done by either or combination of capturing the mouse cursor position, the touch location, the scroll bar position, or tracking the eye ball position. Then capturing the date and time of the events that occurs in those positions. The ROI mouse tracking provides these kinds of information. The amount activity in each area for this thesis is based on the total amount of events.
The most popular analysis of area level logs are heatmap visualization. There are many indications that can be derived from heat maps. For example on a high activity or duration area, may indicate that users are interested in the area. If not, then they may have trouble with the area whether trouble in understanding the content, questions that are too difficult for example on Figure 4.6 that question three receives the most attention which may indicates difficulties, or there was design problems that results in unnecessary efforts on users to capture the information. On the other hand if the area has low activity or duration may indicate that the users are not interested, the design is not well enough to capture the users' attention, or the question in the quiz is simply too easy.
Figure 4.7 shows an even more detailed heatmap where the visualization was split into 10 minute intervals. Just from a glance it can be seen that the high activity time is the 30th, 90th, and 160th minute, they took a break on the 130th minute, and they finished on the 230th minute. Another interesting information is that they did not bother much with the last question, maybe whether they are too easy or they just want finish quickly because they are too tired.
Figure 4.8 shows another detailed heatmap regarding to the amount of activities done by each students on each area. The heatmap seems to vary to not showing much similarities between each students however, there are some. There can be seen a common correlation on question 13 that there are high activities and looking at the grade/score distribution in Figure 4.9, many students got the answer wrong which maybe common evidence that the question is too difficult for them that they had to take more effort in it. An opposite case is on question 6 where there are low activities but many students got the answer wrong which can lead the analyzer to wonder whether question is a trick question. Another similar case with strong similarity found between Figure 4.8 and Figure 4.9 that students did very little activity on the last question and but most the students got the answer wrong. Unlike question 6, it may not be a trick question but a difficult question because the score allocation is high. There maybe two possibilities where the first possibility is that the students ran out of time and since it is the last question, they may answer randomly, and the second possibility is that the students are lazy and/or tired that when they reach the last question that is difficult, they answer randomly because they may just wanted to finish the quiz quickly, giving up on the last question.
Those indications can be useful in many ways. For example, if the indications shows that users are not paying attention to areas which are intended to be emphasize by content creators then there needs design fixing or content revision. In education, the heat map can be useful to profile the students. It can then be followed by a guidance system that can automatically detects the students interest which the guidance system can guide the students in many ways such as linking to related resource, suggesting students their career path, grouping them with relevant community, etc. The profile can also be used in a stricter way where the teachers gives assignments to students about reading a context and the system will detect whether the students have sufficiently paid enough attention to the context or not.
Additionally there are some analyzers that counts the amount of mouse entering and leaving the area which is known as the mouse flow. In quiz sessions, it is normal to find many mouse flows because students tends to review or revisit the questions whether to double check or because they previously skipped them. On the other hand, for a website that is meant to guide or share information, many mouse flows may indicate problems for the website such as the users maybe confused in finding the information they need thus searching tirelessly (Hsu, Chang, and Liu, 2018).
A possible application is force reading illustrated on Figure 4.10, for example making sure the students read the agreement to tracked before exam and users read the term of service. The administrator can configure the variables such as the reading duration and amount of activities and areas. Simply, if the user did not read enough the area, then the user cannot pass and must read enough of the defined passage.
4.3 Coordinate Level Logs
The coordinate level are the deepest level logs. The coordinate values can either be based on document, screen, or windows perspective. This is the log that the default mouse tracking generates (Purnama et al., 2020b). It is overwhelming but contains the most information where this is the log that most analyzers should want to keep. The more shallow level such as the area level log can be derived from the coordinate level log and it is unidirectional where the vice versa is not possible (Purnama and Usagawa, 2020). The most popular analysis is to draw a mouse trajectory. If the time when the mouse cursor lands on the coordinates are recorded, then it is possible to replay what the users did.
An example visualization that can be drawn from the mouse tracking data is the mouse click trajectory in Figure 4.11. It shows a user highlighting a text which can indicate that a user is paying a attention to that text or attempts to copy that text to save in the user's note or to paste in the search engine to find more information about the text. The amount of highlights the students did was also summarized on Table 4.1 and showed that either the students who highlights gets high or low grade and not average grade. The speculation is that the questions they highlight are too difficult for them and either they succeeded in finding the answers on other sites or failed. Unfortunately, the copy and paste events were not implemented at that time. In fact, it is because the author found this highlighting that motivates the author to add copy, paste, and other DOM events into the mouse tracking application.
Although mouse tracking logs are part of the deepest level logs there is still a limit of how much the mouse cursor and scroll position can indicate because certain events does not necessary have to occur on those positions. For example, reading is based on the eye gaze and typing may occur not far from the mouse and scroll position but not necessarily exactly on those position. Each of these logs alone will not make the best logs but a combination of them. Combining conventional web logs or educational data with mouse tracking and eye tracking may provide a complete log.
5 Conclusion and Future Work
The author wrote an online mouse tracking application suitable for public implementation and implemented during a quiz session at the Human Interface and Cyber Communication Laboratory, Kumamoto University on the 3rd of January 2019 between approximately 12:00 and 14:30 Japan standard time. The amount of data generated by mouse tracking was investigated during the implementation and found that the cause of huge data generation is the capturing of geometrical data or coordinates of each event. Aside from existing solutions to reduce data, this thesis also implemented and discussed real-time transmission system in mouse tracking data retrieval helps distribute the network's burden across the time domain. The main novelty of this thesis is the select-able geometrical online mouse tracking method where there are possible cases that not all the geometrical data are required. The method allows summarizing of coordinates into areas or deleting the coordinates if they are not necessary. The results showed great reduction in storage and transmission costs. However, the method is lossy because the process is irreversible. Rich mouse tracking data were obtained and in this thesis a new concept of log dept level was discussed with example analysis that include click visualization and activity heatmap which help in identifying the interaction between the students' and the quiz page.
5.2 Future Work
The real-time transmission is not the best solution. A better method is to upgrade the real-time transmission method by integrating smart transmission method where the client can detect the traffic of the network and determine the optimal time for queuing and transmission. Although the select-able geometrical mouse tracking data method works perfectly, there are still problems with execution. If all of the geometrical data are excluded, the most efficient time to transmit the data is only once which is when the user leaves the page. However, the problem lies with the browser where there is currently no way to force the user to wait before the transmission process finishes, leaving potential problem of data loss. The problem for ROI tracking is that it cannot perform smart area determination and labelling. Normally, they are performed by humans. Therefore, one solution is to develop an artificial intelligence for this matter in the future. Finally, this doctoral thesis is only limited to mouse tracking with one type of activity which is examination. There are a various activities such as passage reading, e-commerce, entertainment, Geo-visualization reading, search engine, social media, etc which are open for future work.
Appendix A Data
A.1 Quiz Areas
A.2 Full Quiz Page Heatmap
Appendix B Copyrights
Below are the publications reused in this thesis that does not require copyright clearance:
Below are the publications reused in this thesis that requires copyright clearance and obtained:
Permission No.: 20GB0052
IEICE hereby grant permission for the use of the material requested above on condition that their requirements are as follows: