Authors: N Deepak, Joy Patel, Likith M J, Mihika Srivastava, Dr. Hema Jagadish
Know Your Customer (KYC) is a crucial regulatory obligation for banks and financial institutions to record customer information before providing financial services. However, traditional KYC processes are slow, costly, and prone to errors. To address these challenges, this paper proposes a solution that uses blockchain technology, smart contracts, Ganache, and Advanced Encryption Standard (AES) encryption in the KYC process. The solution leverages the immutability and transparency of the blockchain to create a secure and efficient KYC process. Smart contracts approve and secure the KYC process, eliminating intermediaries and reducing time and cost. Ganache is used to test and deploy the smart contracts, while AES encryption protects the confidentiality and integrity of customer data. The proposed architecture is a proof-of-concept system that can serve as a basis for further development and implementation in the financial industry and other regulated sectors. By combining these technologies, the proposed solution offers a secure, decentralized, and efficient KYC process while preserving the privacy of customers.
Know Your Customer (KYC) is the process by which financial institutions collect and verify information about the identity and address of prospective customers and borrowers. It is a regulatory requirement that involves due diligence to authenticate clients' identities, and it helps prevent the misuse of banking services. Banks are responsible for carrying out KYC when opening accounts and are also obliged to update their customers' KYC information regularly. KYC procedures can be manual, time-consuming, and duplicated across institutions. Every company must establish the identity of the parties it deals with, and this is particularly important for financial institutions. KYC protocols arose from this need, helping corporations establish the identity of their business partners. They provide a thorough investigation and give financial institutions and the banking sector a measure of assurance. Typically, this is a lengthy and meticulous process in which specific documents are submitted and several forms of scrutiny and validation take place.
KYC primarily helps financial institutions prevent identity theft, money laundering, and terrorist financing, and helps them profile and screen out runaway creditors. In the conventional KYC system, each bank carries out its own identity verification independently, so every user is scrutinized separately by each organization or government entity. Every time a customer wishes to open a new account at a bank, the customer has to go through the whole KYC process from scratch, so time is wasted verifying the same identity again. KYC records also need to be updated regularly, for example when a phone number or address changes. This involves a lot of redundancy and manual effort.
In a nutshell, the following are the problems with current KYC:
Blockchain is a Distributed Ledger Technology (DLT): a decentralized register in which a sequence of digital records acts as a virtual ledger. Each new record is a block that is appended and linked to the chain. All participants hold a copy of the blockchain, which lets them verify transactions and expose any illicit or suspicious ones. The blockchain framework makes it possible to aggregate data from different service providers into a tamper-resistant, append-only database, removing the need for third-party authentication of information. Each block is secured with a cryptographic hash function such as SHA-256 (used by Bitcoin) or Keccak-256 (used by Ethereum), and each block stores the hash of its predecessor, so altering one record invalidates every later block. Hashing protects integrity only; the confidentiality of sensitive data comes from encrypting it separately. A user's confidential data can be accessed only with their private key, which remains strictly in their possession. The result is a system in which a user needs to undergo the KYC process only once to prove their identity. By eliminating intermediaries and avoiding repeated KYC procedures across multiple banks, the blockchain approach is a durable and efficient alternative. Placing KYC information on the blockchain allows financial institutions to achieve better compliance outcomes, improve efficiency, and enhance the customer experience. With thorough background checks and verifications in place, banks can flag suspicious customers and thereby inform all banks or entities in the network. This gives other banks the chance either to investigate the customer further or to refuse credit or loans to the flagged entity. This protects not just one bank but all the financial institutions that are part of the blockchain network.
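The way hashing links blocks together can be illustrated with a minimal sketch. It is not the system described in this paper: it assumes SHA-256 from Python's hashlib as the hash function, and the names block_hash, append_block, and is_valid, as well as the sample payloads, are hypothetical. In a real deployment the payload would hold encrypted customer data rather than plain values.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumed placeholder "previous hash" for the first block


def block_hash(index, prev_hash, payload):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a new block that points at the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["prev_hash"], block["payload"])
        if block["hash"] != recomputed:
            return False
    return True


ledger = []
append_block(ledger, {"customer": "C-001", "kyc_status": "verified"})
append_block(ledger, {"customer": "C-002", "kyc_status": "flagged"})
print(is_valid(ledger))  # True

ledger[0]["payload"]["kyc_status"] = "flagged"  # tamper with an old record
print(is_valid(ledger))  # False
```

Because every participant can rerun a check like is_valid on its own copy, a modified record is detectable without a trusted third party.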
II. RELATED WORK
Researchers have proposed various machine learning (ML) models to automate the detection of money laundering activities. Mohannad Alkhalili, Mahmoud H. Qutqut et al. developed a model using Support Vector Machines (SVM) that achieved higher accuracies in predicting transaction decisions. Ashwini Kumar et al. utilized big data analytics techniques and the Naive Bayes classification method, achieving an accuracy score of 0.8125 in detecting money laundering transactions. David Macedo et al. focused on document segmentation in online customer identification and proposed the use of the U-Net model for text detection. They also suggested model optimization using Octave Convolutions to improve computational efficiency.
Moreover, Kishore Singh et al. emphasized the effectiveness of data visualization techniques, particularly link analysis, in identifying suspicious activities and detecting money laundering. Jose-de-Jesus Rocha-Salazar et al. introduced a new method for AML and terrorism financing detection based on typologies described in Financial Action Task Force reports, which reduced false positives and improved accuracy compared to previous rule-based methods.
Ricardo Azevedo Araujo proposed an incentive-based approach to combating money laundering in which financial institutions play a crucial role by reporting suspicious activities. However, the problem of confidential information and hidden selection limits the effectiveness of the regulation.
Ismail Alarab et al. proposed the use of graph neural networks to predict illicit behavior in the Bitcoin blockchain. By combining Graph Convolutional Networks (GCN) with linear layers, they achieved better performance in detecting illicit transactions.
Amr Ehab Muhammed Shokry et al. focused on identifying covert patterns, syndicates, and transactions related to money laundering for combating terrorism financing. Their unsupervised machine learning methodology showed promising results in identifying similarities and hidden patterns across transactions and suspicious accounts.
Rasmus Ingemann Tuffveson Jensen et al. conducted a comprehensive literature review on anti-money laundering (AML) in banks. They proposed a standardized terminology and identified the shortage of public datasets as a major challenge in the scientific literature on statistical and machine learning methods for AML.
Joana Lorenz et al. addressed the challenge of limited labels for detecting money laundering through cryptocurrencies. They introduced an active learning approach that can perform as well as a fully supervised method with just 5% of the labels, mimicking real-world scenarios where limited labels are available.
In addition to machine learning, blockchain technology shows promise in addressing AML challenges. Yue Shi et al. examined the potential of blockchain technology in cybersecurity, highlighting benefits such as enhanced security and data integrity. They proposed an enhanced access control strategy using attribute-based encryption. Joe Abou Jaoude et al. conducted a literature review on the applications of blockchain technology and emphasized its decentralized nature, which removes the need for trust among stakeholders and enables trustless transactions.
Regarding KYC, Prakash Chandra Mondal et al. presented a method for secure and seamless financial access through dynamic KYC-based transaction authorization.
This approach reduces the risk of key theft and the need for additional hardware, making it cost-effective. KYC blockchain offers significant advantages by leveraging blockchain's security, anonymity, and data integrity features. By storing customer identification information securely on the blockchain, it ensures trustless and tamper-resistant KYC processes, reducing the risk of identity fraud and enhancing compliance with AML regulations.
In conclusion, machine learning approaches, such as SVM, Naive Bayes, and graph neural networks, demonstrate promise in automating the detection of money laundering activities.
These models have shown higher accuracies in predicting transaction decisions and identifying suspicious patterns. Nevertheless, there are certain limitations to consider.
One drawback is the reliance of supervised learning approaches on historical labeled data, which limits their effectiveness against sophisticated money launderers who constantly adapt their techniques. This points to the need for unsupervised machine learning techniques, as highlighted by Prof. Dr. Nevine Makram Labib et al.
Unsupervised learning can discover new patterns and detect the accounts and groups involved in money laundering, reducing the risk of false positives. Future research should focus on comparing and proposing unsupervised machine learning techniques specifically for combating terrorism financing.
Another challenge is the computational intensity of some models, such as the U-Net model proposed by David Macedo et al., which may not be practical for deployment on mobile devices with limited computational capabilities. Optimizing models using techniques like Octave Convolutions can help address this issue.
Furthermore, the effectiveness of AML regulations and activities relies heavily on the willingness and ability of financial institutions to report suspicious activities, as highlighted by Ricardo Azevedo Araujo. The problem of confidential information and hidden selection can hinder the regulation's effectiveness and create a barrier to combating money laundering.
In the realm of blockchain technology, KYC blockchain presents an opportunity to enhance AML efforts. By securely storing customer identification information on the blockchain, KYC processes become trustless and tamper-resistant. This reduces the risk of identity fraud and improves compliance with AML regulations. However, further research is needed to explore the practical implementation and scalability of KYC blockchain solutions.
In summary, machine learning algorithms have exhibited promise in automating AML processes and detecting money laundering activities. Unsupervised learning approaches and optimization techniques can enhance their effectiveness. Additionally, leveraging blockchain technology, particularly KYC blockchain, can significantly improve the security and efficiency of AML efforts. Continuous research and innovation in these areas are crucial to stay ahead of evolving money laundering tactics and safeguard the integrity of the global financial system.
III. PROBLEM DEFINITION
V. SYSTEM DESIGN
A KYC utility system based on blockchain technology will enable the financial and banking sectors to decentralize the process of identity verification. Currently, the data is collected and stored in a centralized system, such as a repository. With the introduction of blockchain solutions to handle the KYC process, data will be available on a decentralized network and can therefore be accessed by third parties directly once permission has been given. The blockchain-based KYC system will also offer better data security by ensuring that data is accessed only after confirmation or permission is received from the relevant authority.
This will eliminate unauthorized access and give individuals greater control over their own data. The ledger will provide a historical record of all documents shared and all compliance activities undertaken for each client. Blockchain technology also helps identify entities that attempt to fabricate false histories. Within the limits of data protection regulations, the data stored on the blockchain is immutable and can be analyzed to uncover anomalies that point directly to illicit conduct.
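The permission-gated access and the historical log described above can be sketched in a few lines. This is an illustration in plain Python rather than the smart contract itself; the class KycRecord, its methods, and the bank identifiers are hypothetical names chosen for the example, and the stored bytes stand in for data that would already be encrypted.

```python
from dataclasses import dataclass, field


@dataclass
class KycRecord:
    """One customer's KYC entry with an access list and an append-only log."""
    customer_id: str
    encrypted_data: bytes
    authorized_banks: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def grant(self, bank_id):
        """Customer grants a bank permission to read the record."""
        self.authorized_banks.add(bank_id)
        self.audit_log.append(("grant", bank_id))

    def revoke(self, bank_id):
        """Customer withdraws a previously granted permission."""
        self.authorized_banks.discard(bank_id)
        self.audit_log.append(("revoke", bank_id))

    def read(self, bank_id):
        """Return the encrypted data only if the bank is authorized."""
        allowed = bank_id in self.authorized_banks
        # Every attempt is logged, whether or not it succeeds.
        self.audit_log.append(("read" if allowed else "denied", bank_id))
        if not allowed:
            raise PermissionError(f"{bank_id} may not read {self.customer_id}")
        return self.encrypted_data


record = KycRecord("C-001", b"<ciphertext>")
record.grant("BANK_A")
print(record.read("BANK_A"))

record.revoke("BANK_A")
try:
    record.read("BANK_A")
except PermissionError as err:
    print("blocked:", err)

print(record.audit_log)
```

Logging denied attempts alongside successful reads is a design choice made here so that the log can also reveal entities probing for data they are not entitled to see.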
The system includes a built-in feature called BANK_RATING, which is updated automatically and requires no manual input. It is a score assigned to a bank to evaluate its performance and reliability, and it indicates the bank's credibility and trustworthiness in the financial industry. In the context of KYC records stored on the blockchain, the bank rating is incremented or decremented based on the KYC processes the bank conducts and stores. When a bank successfully completes a KYC process for a customer and stores the verified information on the blockchain, it demonstrates its commitment to compliance and risk management, so each completed and verified KYC process contributes positively to the bank's rating. The more KYC processes the bank completes successfully, the higher its rating can rise. A high rating indicates that the bank has a robust and efficient KYC process in place that adheres to regulatory mandates and mitigates the risks linked to fraudulent activity. On the other hand, if the bank fails to perform the KYC process accurately or has problems with data integrity, its rating may be decremented. Incomplete KYC information, detected fraud, or violations of regulatory guidelines all affect the rating negatively. A low rating signals potential weaknesses in the bank's compliance procedures and raises concerns about its reliability and its ability to handle customer data securely.
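A minimal sketch of how such an automatic rating could be updated is given below. The paper does not state the step sizes or the starting value, so the increment of 1, the decrement of 1, and the initial score of 0 are assumptions, as are the names Bank and record_kyc.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    name: str
    bank_rating: int = 0  # assumed starting score
    kyc_count: int = 0    # number of successfully verified KYC processes

    def record_kyc(self, verified: bool) -> int:
        """Adjust the rating after a KYC attempt and return the new value."""
        if verified:
            self.kyc_count += 1
            self.bank_rating += 1  # assumed reward for a verified KYC
        else:
            self.bank_rating -= 1  # assumed penalty for a failed or invalid KYC
        return self.bank_rating


bank = Bank("Bank A")
for outcome in [True, True, False, True]:
    bank.record_kyc(outcome)

print(bank.bank_rating)  # 2
print(bank.kyc_count)    # 3
```

Because the update happens inside the same call that records the KYC outcome, no separate manual step is needed to keep the rating current.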
Advanced Encryption Standard (AES) is a widely used encryption algorithm that is considered one of the most secure encryption methods available today. It is a symmetric block cipher, meaning that the same key is used for both encryption and decryption of data. AES was selected by the U.S. National Institute of Standards and Technology (NIST) in 2001 as the standard for securing sensitive government information and is now used in a wide range of applications, including the financial industry. In the context of KYC blockchain, AES encryption can be used to secure the personal and sensitive information of customers that is stored on the blockchain. When a customer provides personal information to the KYC service provider, that information can be encrypted using AES before it is stored on the blockchain. Overall, the use of AES encryption in KYC blockchain provides an additional layer of security and keeps sensitive information protected at rest and in transit. The process of AES encryption involves several steps:
In the KYC Blockchain System, customer data is encrypted and decrypted with the Advanced Encryption Standard (AES) when it is sent to and retrieved from the KYC network. Symmetric cryptography such as AES was chosen over asymmetric cryptography such as Rivest-Shamir-Adleman (RSA) for the following technical reasons:
Overall, while asymmetric cryptography can provide better security in certain situations, the efficiency, simplicity, and compatibility of symmetric cryptography may make it more appropriate for a KYC blockchain system.
VIII. EVALUATION AND RESULTS
When it comes to secure communication of sensitive data, choosing the right encryption algorithm is crucial. AES, Data Encryption Standard (DES), and Triple-DES are widely used symmetric cryptographic algorithms that provide encryption and decryption of sensitive data. In order to determine which algorithm is most suitable for a particular use case, it is important to compare their performance characteristics.
This can involve analyzing factors such as security, speed, key length, and compatibility, as well as conducting experiments to determine the encryption and decryption time of each algorithm. By comparing AES, DES, and 3DES, one can make an informed decision about which algorithm to use based on the specific security requirements, performance characteristics, and compatibility of their system or use case.
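Key length, one of the factors listed above, can be compared directly from the published parameters of the three ciphers. The sketch below simply tabulates those standard parameters and computes the relative size of the key spaces; the dictionary layout and the helper name keyspace are illustrative choices, not part of the experiments reported in this paper.

```python
# Standard parameters of the ciphers being compared (sizes in bits).
CIPHERS = {
    "DES": {"key_bits": 56, "block_bits": 64},
    "Triple-DES (3 keys)": {"key_bits": 168, "block_bits": 64},
    "AES-128": {"key_bits": 128, "block_bits": 128},
    "AES-192": {"key_bits": 192, "block_bits": 128},
    "AES-256": {"key_bits": 256, "block_bits": 128},
}


def keyspace(key_bits: int) -> int:
    """Number of possible keys for a given key length."""
    return 2 ** key_bits


for name, params in CIPHERS.items():
    print(
        f"{name:20s} key = {params['key_bits']:3d} bits, "
        f"block = {params['block_bits']:3d} bits, "
        f"keys = 2^{params['key_bits']}"
    )

# How many times larger the AES-128 key space is than the DES key space.
ratio = keyspace(128) // keyspace(56)
print(ratio == 2 ** 72)  # True
```

The nominal 168-bit key of three-key Triple-DES overstates its strength: a meet-in-the-middle attack reduces its effective security to roughly 112 bits, and both DES variants keep the smaller 64-bit block.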
A. Experiment Setup
“Tech Company Fundings” is the dataset used to compare AES, DES, and Triple-DES in terms of their execution time (both encryption and decryption time). The dataset contains up-to-date information about tech company funding across the globe. It covers funding from January 2020 onward and contains 3575 company funding records. The data attributes include:
From the attributes, it is clear that the dataset contains different data types such as string, integer, URL, and date. All of these can be used to show how AES, DES, and Triple-DES encrypt and decrypt data. It is also important to note that no dataset of actual KYC details is available, because such data contains sensitive customer information and must be kept confidential at all costs. This is why we chose this dataset to validate the cryptographic technique used in the KYC system, i.e., AES.
B. Testing Framework
C. Script Testing
Two types of scripts were written to analyze the execution time of the AES, DES, and Triple-DES algorithms under different levels of memory usage. The execution steps are the same for both, as shown below:
A. More Memory Consumption
The values in Table 2 are the total times taken to encrypt and to decrypt all 3575 rows of data. Based on these values, AES had the shortest execution time. Figure 14 shows a line graph in which yellow indicates AES, green indicates DES, and blue indicates Triple-DES. The X-axis represents the number of rows of data encrypted, and the Y-axis represents the time taken to encrypt them in milliseconds (ms). The graph shows that AES takes the least time to execute, followed by DES and then Triple-DES. The same ordering is observed for decryption when a similar test is run.
B. Less Memory Consumption
The values in Table 3 are the total times taken to encrypt and to decrypt all 3575 rows of data. Based on these values, AES again had the shortest execution time. Figure 15 shows a stacked bar graph in which yellow indicates AES, green indicates DES, and blue indicates Triple-DES. The X-axis represents the number of rows of data encrypted, and the Y-axis represents the time taken to encrypt them in milliseconds (ms). The graph shows that AES takes the least time to execute, followed by DES and then Triple-DES, even with lower memory consumption. The same holds for decryption. In conclusion, AES outperforms DES and Triple-DES in execution speed in both the higher-memory and lower-memory tests, and it also offers stronger security. DES performs better than Triple-DES but not as well as AES. Based on these results, AES is a suitable choice of cipher for securing the KYC Blockchain.
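The kind of measurement reported above can be reproduced with a small timing harness. The sketch below is not the authors' test script: the function names, the synthetic rows, and the key are hypothetical, and xor_stand_in is a toy placeholder used only so that the example runs with the standard library. To benchmark AES, DES, or Triple-DES, the placeholder would be replaced by encrypt and decrypt functions from a cryptography library.

```python
import time
from itertools import cycle


def xor_stand_in(data: bytes, key: bytes) -> bytes:
    """Toy placeholder, NOT a real cipher; XOR is its own inverse."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def time_cipher(encrypt, decrypt, rows, key):
    """Return (encrypt_ms, decrypt_ms) for a list of byte strings."""
    start = time.perf_counter()
    encrypted = [encrypt(row, key) for row in rows]
    encrypt_ms = (time.perf_counter() - start) * 1000

    start = time.perf_counter()
    decrypted = [decrypt(blob, key) for blob in encrypted]
    decrypt_ms = (time.perf_counter() - start) * 1000

    # Guard against timing a cipher that does not round-trip correctly.
    if decrypted != rows:
        raise ValueError("decrypted rows do not match the originals")
    return encrypt_ms, decrypt_ms


# Synthetic stand-in for the 3575 dataset rows, serialized to bytes.
rows = [f"company-{i},2020-01-01,{i * 1000}".encode("utf-8") for i in range(3575)]

enc_ms, dec_ms = time_cipher(xor_stand_in, xor_stand_in, rows, b"demo-key")
print(f"encrypt: {enc_ms:.2f} ms, decrypt: {dec_ms:.2f} ms")
```

Timing encryption and decryption separately, and over the same rows for every algorithm, keeps the comparison between ciphers like-for-like.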
The proposed system is an enhanced and dynamic KYC system, built on blockchain technology, which reduces the expenses associated with conventional KYC procedures and allows these expenses to be shared proportionally among stakeholders. Through smart contracts, users can securely store their personal information on the blockchain via banks and grant or revoke access to it, which makes the system decentralized and enables a distributed data storage architecture. The use of blockchain technology with AES cryptography in a KYC system provides a secure and tamper-resistant way of storing personal information. The system allows authorized parties to add, modify, and view KYC data easily. Customer ratings and bank ratings add a further layer of trustworthiness and credibility. The BANK_RATING feature is a distinctive aspect of the system that gives banks an incentive to conduct accurate and reliable KYC processes. It encourages banks to comply with regulatory requirements and mitigate the risks associated with fraudulent activities, which in turn improves their credibility and trustworthiness. Overall, a KYC blockchain system that uses AES as its cryptographic technique can provide a secure and efficient way to manage customer information and improve the regulatory compliance and risk management practices of financial institutions.
[1] “Investigation of Applying Machine Learning for Watchlist-Filtering in Anti-Money Laundering” by Mohannad Alkhalili, Mahmoud H. Qutqut, and Fadi Almasalha, 2021, IEEE
[2] “Anti-Money Laundering Detection using Naïve Bayes Classifier” by Ashwini Kumar, Sanjoy Das, and Vishu Tyagi, 2020, IEEE
[3] “A Fast Fully Octave Convolutional Neural Network for Document Image Segmentation” by Ricardo Batista das Neves Junior, Luiz Felipe Vercosa, David Macedo, and Byron Leite Dantas Bezerra, 2020, IEEE
[4] “Money laundering and terrorism financing detection using neural networks and an abnormality indicator” by Jose-de-Jesus Rocha-Salazar, Maria-Jesus Segovia-Vergas, and Maria-del-Mar Camacho-Minano, 2020, Elsevier
[5] “Anti-Money Laundering: Using data visualization to identify suspicious activity” by Kishore Singh and Peter Best, 2019, Elsevier
[6] “Combating money laundering with machine learning – applicability of supervised-learning algorithms at cryptocurrency exchanges” by Eric Pettersson Ruiz and Jannis Angelis, 2019, JMC
[7] “Survey of Machine Learning Approaches of Anti-money Laundering Techniques to Counter Terrorism Finance” by Prof. Dr. Nevine Makram Labib, Prof. Dr. Muhammed Abu Rizka, and Amr Ehab Muhammad Shokry, 2020, IEEE
[8] “Assessing the efficiency of the anti-money laundering regulation: an incentive-based approach” by Ricardo Azevedo Araujo, 2008, JMC
[9] “Transaction Authorization from Know Your Customer (KYC) Information in Online Banking” by Prakash Chandra Mondal, Rupam Deb, and Mohammad Nurul Huda, 2016, IEEE Conference
[10] “Competence of Graph Convolutional Networks for Anti-Money Laundering in Bitcoin Blockchain” by Ismail Alarab, Simant Prakoonwit, and Mohamed Ikbal Nacer
[11] “Counter Terrorism Finance by Detecting Money Laundering Hidden Networks Using Unsupervised Machine Learning Algorithm” by Amr Ehab Muhammed Shokry, Mohammed Abo Rizka, and Nevine Makram Labib, 2020, ICT
[12] “Fighting Money Laundering with Statistics and Machine Learning” by Rasmus Ingemann Tuffveson Jensen and Alexandros Iosifidis, 2022, IEEE
[13] “Machine Learning Methods to Detect Money Laundering in the Bitcoin Blockchain in the Presence of Label Scarcity” by Joana Lorenz, Maria Ines Silva, David Aparicio, Joao Tiago Ascensao, and Pedro Bizarro, 2020
[14] “From Bitcoin to Cybersecurity: A Comparative Study of Blockchain Application and Security Issues” by Fangfang Dai, Yue Shi, Nan Meng, Liang Wei, and Zhiguo Ye, 2017, IEEE
[15] “Blockchain Applications – Usage in Different Domains” by Joe Abou Jaoude and Raafat George Saade, 2019, IEEE
Copyright © 2023 N Deepak, Joy Patel, Likith M J, Mihika Srivastava, Dr. Hema Jagadish. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.