The flow of fake (counterfeit) goods into the United States is a serious problem. These fake products like designer clothes, electronics, medicine, and car parts not only break the law but also put people’s safety at risk and cost the U.S. economy billions of dollars every year (Ocean Tomo, 2024; U.S. Customs and Border Protection [CBP], 2023). As online shopping and global trade grow, old-fashioned inspection methods are no longer enough to catch the large number of fake goods being shipped in (ProCogia, 2025; Garcia-Cotte et al., 2024).
This study explores how data analytics can help detect these fake products. We used a dataset of 5,000 shipments that were flagged by U.S. Customs between 2022 and 2024. Using that data, we built computer models logistic regression and random forest classifiersto predict which shipments were likely to be fake. The models used many types of information, such as the country the shipment came from, its weight and value, how often the vendor ships items, and even image analysis using computer vision tools (Pulfer et al., 2022; Garcia-Cotte et al., 2024).
Our results showed that using these advanced tools made a big difference. The accuracy of detecting fake goods increased from 68% (with old methods) to 91% with our analytics-based approach. We also found a strong connection (r = .82, p < .001) between the risk scores predicted by our model and the actual fake shipments that were seized.
These findings show that analytics can greatly improve how customs officers do their jobs. By helping them focus on the most suspicious shipments, analytics saves time, uses resources wisely, and helps protect the country (Leppard Law, 2024; Alhabash et al., 2023). This research supports the idea that government agencies and international trade partners should use more machine learning and business intelligence tools to fight fake goods on a larger scale
Introduction
The import of counterfeit goods into the U.S. is rising, especially in categories like clothing, electronics, and even medical supplies. In 2023, clothes and accessories made up 26% of all counterfeit seizures by U.S. Customs and Border Protection (CBP). Fake goods harm the global economy—causing over $500 billion in losses—and pose serious health and safety risks, as seen during the COVID-19 pandemic with fake N95 masks and medical devices.
Traditional methods used by customs—manual inspections, basic warning signs, and watchlists—are no longer sufficient due to today’s massive shipment volumes, complex supply chains, and increased online shopping. To address these challenges, governments and companies are turning to advanced analytics and AI. Tools like Convolutional Neural Networks (CNNs) detect fake logos and packaging with up to 99% accuracy, while cloud-based systems such as Amazon Redshift and Snowflake analyze shipment patterns and assign risk scores in real time.
To support these efforts, this study proposes a new counterfeit-detection system using logistic regression and random forest models, enhanced with powerful features such as vendor history, image anomaly scores, packaging irregularities, and country-of-origin risk. A dataset of 5,000 flagged international shipments (2022–2024) was used, including data on shipment value, weight, past violations, packaging issues, and image-based AI scores.
The models were trained on 70% of the data and tested on 30%, evaluated using precision, recall, and F1-score. Results showed strong performance, with steady improvement from 2022 to 2024 due to better features, continuous updates, and more data. Random forest analysis revealed that vendor seizure history, image anomaly score, and origin risk flags were the most influential predictors. A risk-decile analysis showed that higher model risk scores corresponded strongly with actual counterfeit seizures.
A key finding was a very strong Pearson correlation (r = 0.82, p < .001) between model risk scores and real seizure outcomes, proving that the system reliably identifies high-risk shipments. This supports smarter inspection strategies, allowing customs to focus on the top-risk 30% of shipments while catching most counterfeit goods.
Ultimately, the analytic framework demonstrates a major improvement in detection: model precision increased from 68% to 91% over the study period. This shows that combining AI, image analysis, and behavioral data creates a scalable, effective tool for strengthening border security and protecting global trade.
Conclusion
This study gives strong real-world proof that using a complete analytic system one that combines shipment details, seller behavior, image checks, and machine learning can greatly improve how the U.S. detects fake goods at the border.
Here are the main parts of this approach:
1) Basic Shipment Information (Descriptive Metadata)
This includes details like where the shipment came from, how much it\'s worth, and its size or weight. These facts help give early clues about whether a shipment could be risky (U.S. Trade Representative, 2024).
2) Vendor Behavior Tracking (Behavioral Profiling)
By looking at a seller’s past like how often their shipments were caught with fakes the model can spot repeat offenders.
This type of tracking helps the system learn what fake sellers typically do (Ocean Tomo, 2024; Alhabash et al., 2023).
3) Image Checks for Problems (Image Anomaly Detection)
Using AI-powered image tools, the system looks for packaging or label issues that might be too small or subtle for human inspectors to catch. This helps detect even high-quality counterfeits that look very real (Garcia-Cotte et al., 2024).
4) Predictive Modeling (Machine Learning)
The system useslogistic regression(which shows why a decision was made) and random forest models (which handle complex patterns). Together, they create a risk score for each shipment. These scores matched very well with actual fake shipments that were caught (r = 0.82, p < .001), showing the model works (ProCogia, 2025).
This matters on a bigger scale because:
Better Use of Resources: Customs officers can now focus on the most dangerous shipments and let low-risk ones pass through faster, saving time and effort (Leppard Law, 2024).
Teamwork Between Government and Tech Companies: This system lets federal agencies work with private tech firms to create real-time detection tools that stay up to date with the latest counterfeiting tricks (Pulfer et al., 2022).
Protecting the U.S. Economy: Fake goods cost the country over $600 billion a year in lost sales and taxes. Using smart analytics like this helpsprotect American businesses and jobs (Ocean Tomo, 2024).
References
[1] Alhabash, S., Kononova, A., Huddleston, P., Moldagaliyeva, M., & Lee, H. (2023). Global Anti-Counterfeiting Consumer Survey 2023. Michigan State University.
[2] Garcia Cotte, H., Mellouli, D., Rehman, A., Wang, L., & Stork, D. G. (2024). Deep neural network based detection of counterfeit products from smartphone images. ArXiv, 2410.05969. https://arxiv.org/abs/2410.05969
[3] Leppard Law. (2024). How technology aids in detecting counterfeit products. https://leppardlaw.com
[4] Ocean Tomo. (2024). The Impact of Counterfeit Goods in Global Commerce. https://oceantomo.com
[5] ProCogia. (2025). Counterfeit detection and legal compliance: Optimizing data analytics with Redshift. https://procogia.com
[6] Pulfer, B., Belousov, Y., Tutt, J., Chaban, R., Taran, O., &Voloshynovskiy, S. (2022). Anomaly localization for copy detection patterns through print estimations. ArXiv, 2209.15625. https://arxiv.org/abs/2209.15625
[7] U.S. Trade Representative. (2024). 2023 Review of Notorious Markets for Counterfeiting and Piracy. https://ustr.gov