In the current era of data-driven decision-making, organizations increasingly rely on visual analytics tools to uncover hidden patterns and anomalies in their data. This project addresses the VAST Challenge 2021 Mini-Challenge 2, which centers on the analysis of employee behavioral data from GAStech, a fictional natural gas company. The objective of this research is to develop a visual analytics tool capable of identifying and highlighting unusual patterns in employee movements, access logs, and interactions that could indicate insider threats, security breaches, or other anomalies.
The challenge provides a rich dataset that includes GPS movement data, keycard access records, and travel logs over a simulated period of time. By combining machine learning techniques with visual analytics, this project seeks to build an interactive tool that allows users to explore employee behavior in real time, detect deviations from normal behavior, and flag potential security concerns.
Subtle indicators of this kind may not seem significant when analyzed in isolation. Using visual analytics, however, the tool overlays GPS heat maps, network graphs of employee interactions, and access frequency histograms to highlight John Smith's anomalous behavior. Users can explore these visualizations to understand how changes in movement patterns and access logs correlate with potential security risks.
Through this approach, the research demonstrates the value of interactive visual analytics in handling complex, multivariate datasets and delivering actionable insights for organizational security. The results of this study not only provide a solution to the Mini-Challenge but also contribute to the broader field of insider threat detection and employee monitoring using visual analytics.
Introduction
The thesis focuses on developing a practical visual analytics tool to detect and interpret unusual behavior patterns among employees of GAStech, a fictional company featured in the VAST Challenge 2021 Mini-Challenge 2. The goal is to enhance organizational security and operational efficiency by analyzing employee behavior through multiple structured datasets, including credit/debit card transactions, loyalty cards, car assignments, and GPS tracking data, combined with geospatial maps.
The tool leverages advanced visualization techniques and behavioral analytics to uncover anomalies such as deviations from routines, unusual spending, or suspicious movements. The approach integrates machine learning and data analysis with interactive visualizations to help analysts detect subtle patterns that rule-based systems might miss.
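As a rough illustration of two of the anomaly types mentioned above (activity outside routine hours and unusual spending), the following minimal sketch flags transactions by time of day and by amount. It is not the method used in the thesis. The field names (`timestamp`, `location`, `price`), the 07:00 to 22:00 window, and the 3.5 cutoff on a median-based score are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import time
from statistics import median


def flag_time_anomalies(transactions, start=time(7, 0), end=time(22, 0)):
    """Return transactions whose time of day falls outside [start, end].

    The window is an assumed definition of "standard hours".
    """
    return [t for t in transactions
            if not (start <= t["timestamp"].time() <= end)]


def flag_amount_anomalies(transactions, threshold=3.5):
    """Return transactions whose price is unusual for their location.

    Uses a modified z-score based on the median absolute deviation (MAD),
    which is less sensitive to the outliers it is trying to find than a
    mean/standard-deviation score. The threshold is an assumption.
    """
    by_location = defaultdict(list)
    for t in transactions:
        by_location[t["location"]].append(t)

    flagged = []
    for rows in by_location.values():
        prices = [r["price"] for r in rows]
        med = median(prices)
        mad = median([abs(p - med) for p in prices])
        if mad == 0:
            # All (or most) prices identical: the score is undefined, skip.
            continue
        for r in rows:
            score = 0.6745 * abs(r["price"] - med) / mad
            if score > threshold:
                flagged.append(r)
    return flagged
```

Grouping by location before scoring means a large purchase is compared with other purchases at the same venue rather than with all spending, so an expensive supplier invoice does not make every coffee purchase look normal by comparison.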
A thorough literature review highlights the importance of visual analytics in insider threat detection, movement data visualization, anomaly detection (using clustering, rule-based methods, and time-series analysis), and workforce behavior monitoring.
The methodology involves preprocessing and integrating the datasets, which contain detailed employee transactions and GPS movements in the fictional city of Abila on the island of Kronos. The project emphasizes clean, scalable data handling and a user-friendly interface that supports intuitive data exploration.
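One way the integration of transactions and GPS movements could work is a nearest-in-time join: each transaction is matched to the GPS fix of a vehicle that is closest to the transaction timestamp, within a tolerance. The sketch below is an assumption about how such a step might look, not the thesis's actual pipeline. The record layout, the `car_id` field on transactions (which presumes the transaction has already been attributed to a vehicle), and the five-minute tolerance are hypothetical.

```python
from bisect import bisect_left
from datetime import timedelta


def nearest_fix(fixes, when, tolerance=timedelta(minutes=5)):
    """Return the GPS fix closest in time to `when`, or None.

    `fixes` is a list of (timestamp, lat, lon) tuples sorted by timestamp.
    None is returned when no fix lies within `tolerance` of `when`.
    """
    times = [f[0] for f in fixes]
    i = bisect_left(times, when)
    # Only the fix just before and the fix at/after `when` can be nearest.
    candidates = fixes[max(i - 1, 0):i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda f: abs(f[0] - when))
    return best if abs(best[0] - when) <= tolerance else None


def attach_positions(transactions, fixes_by_car):
    """Add `lat`/`lon` to each transaction from its car's nearest GPS fix.

    Transactions without a sufficiently close fix get lat/lon of None,
    so gaps in the GPS record stay visible instead of being hidden.
    """
    enriched = []
    for t in transactions:
        fix = nearest_fix(fixes_by_car.get(t["car_id"], []), t["timestamp"])
        lat, lon = (fix[1], fix[2]) if fix else (None, None)
        enriched.append({**t, "lat": lat, "lon": lon})
    return enriched
```

Keeping unmatched transactions with empty coordinates, rather than dropping them, is a deliberate choice in this sketch: a purchase with no nearby GPS fix can itself be worth investigating.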
The resulting visual analytics tool is designed with usability and clarity in mind, featuring a dashboard for overview visualizations and detailed transaction and spatial analyses. The system aims to provide clear, unbiased, and actionable insights to support security investigations and improve organizational risk management.
Conclusion
The anomaly detection capabilities developed for the GAStech data, particularly for time-based and amount-based anomalies, proved effective in enhancing transaction security within the gas industry. The integration of transaction and GPS data was a valuable asset, providing a holistic view of employee activities and transactions. The discussion highlighted the successful identification of time-based anomalies (irregular transactions during non-standard hours) and amount-based anomalies (unusual transaction amounts).
Enhanced Data Alignment: Future efforts will focus on further refining data alignment and addressing any remaining discrepancies in transaction records. Continuous improvement in timestamp synchronization will contribute to the overall accuracy of anomaly detection.
Advanced Anomaly Detection Models: Exploring advanced anomaly detection models and techniques will be essential for improving the platform's sensitivity to subtle irregularities. This includes leveraging machine learning algorithms to enhance anomaly identification.