CAMELON: A System for Crime Metadata Extraction and Spatiotemporal Visualization from Online News Articles: When Crime News Becomes a Crucial Safety Database

The rising number of daily crime-related news articles, both in Thailand and internationally, has led to increasing anxiety in daily life. This raises the question, “In an era where threats are closer than we think, what tools can make us feel safer from the crimes around us?” Dr. Siripen Pongpaichet, Assistant Dean for Information and Systems at the Faculty of Information and Communication Technology (ICT), Mahidol University, and her research team recognized this concern and developed CAMELON.

CAMELON (Crime and Accident Monitoring: An Estimation from Large-Scale Online News Articles) is a senior project developed by ICT Mahidol alumni (Batch 17) of the Bachelor’s program in Information and Communication Technology (ICT International Program). The development team includes Mr. Kantapong Matangkarat, Mr. Chancheep Mahacharoensuk, and Mr. Pattadon Singhajan, with Dr. Siripen Pongpaichet as the advisor.

CAMELON is a web-based platform that gathers crime news from across Thailand and visualizes it to reflect safety levels in different localities. It employs a tool called the Criminometer, which measures crime levels based on multiple factors such as crime type, frequency, severity, and local population, with data sourced from credible online news outlets. The data is presented through interactive crime maps, including pin maps, heat maps, and choropleth maps.
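The article does not publish the Criminometer formula, so the following is only a minimal sketch of how the listed factors (crime type, frequency, severity, and local population) could be combined into one number. The severity weights, the function name `crime_index`, the per-100,000 scaling, and the population figures are all illustrative assumptions, not values taken from CAMELON.

```python
from collections import defaultdict

# Hypothetical severity weights per crime type (not from the CAMELON paper).
SEVERITY = {"murder": 5.0, "assault": 3.0, "robbery": 2.0}


def crime_index(incidents, population):
    """Return severity-weighted incidents per 100,000 residents, by province.

    incidents:  iterable of (province, crime_type) pairs
    population: dict mapping province -> number of residents
    """
    totals = defaultdict(float)
    for province, crime_type in incidents:
        # Frequency enters through the sum; unknown crime types get weight 1.0.
        totals[province] += SEVERITY.get(crime_type, 1.0)
    return {p: totals[p] / population[p] * 100_000 for p in totals}


# Toy input: placeholder incidents and round, made-up population figures.
incidents = [
    ("Bangkok", "robbery"),
    ("Bangkok", "murder"),
    ("Chiang Mai", "assault"),
]
population = {"Bangkok": 5_000_000, "Chiang Mai": 1_000_000}
scores = crime_index(incidents, population)
```

Normalizing by population keeps a large city from looking more dangerous only because it generates more news; a per-province score of this kind is also the sort of value a choropleth map would color by.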

“The inspiration behind CAMELON stemmed from frequent exposure to crime news on television and social media. The research team noticed crime maps in international systems, like those for Chicago and Los Angeles, which detail crime incidents geographically. They realized that a similar tool could enhance personal safety awareness in Thailand and assist authorities and policymakers by providing multi-dimensional, real-time data for decision-making.”

CAMELON operates through three main parts. The first is data collection: the system extracts information from online news sources covering crimes such as robbery, assault, and murder. The collected data is then processed into metadata, and the news content is analyzed by a machine learning model that automatically predicts the crime type. Once the crime categories are identified, the research team uses this information to assess the severity of the incidents. Throughout the process, the research team carefully handles sensitive data and ensures compliance with the Personal Data Protection Act (PDPA).
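The article does not say which machine learning model CAMELON uses. As a rough illustration of automatic crime-type categorization, the sketch below trains a small multinomial naive Bayes classifier on bag-of-words counts. The technique, the function names `train` and `classify`, and the English toy headlines are assumptions chosen for brevity; a production system would need a much larger labeled corpus and proper tokenization of the source language.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count words per label from (text, label) training pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def classify(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy headlines, for illustration only.
samples = [
    ("gunman robbed a gold shop", "robbery"),
    ("thief robbed a convenience store", "robbery"),
    ("man stabbed and killed his neighbor", "murder"),
    ("body found after fatal shooting", "murder"),
]
model = train(samples)
predicted = classify(model, "armed thief robbed a bank")
```

The predicted label could then feed a severity lookup, matching the order described above: categorize first, assess severity second.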

The second part involves preparing in-depth information for analysis. Here the research team received support from Assoc. Prof. Dr. Suppawong Tuarob, Instructor of the Computer Science Academic Group and Head of the Machine Intelligence and Knowledge Engineering (MIKE) research group, along with Asst. Prof. Dr. Thanapon Noraset, Assistant Dean for Academic Affairs. Their collaboration focuses on developing Machine Learning and Natural Language Processing capabilities for CAMELON. This stage breaks news content down into multiple aspects of a crime event, such as the location of the incident, the perpetrator, the act committed, the victim, and the timeline of events. This allows for more detailed data analysis, such as identifying all incidents that occurred in Bangkok in January. Additionally, an Application Programming Interface (API) was developed as a critical mechanism for integrating the three main components.
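To make the idea of structured crime metadata concrete, here is a minimal sketch of what one extracted record and the "Bangkok in January" query might look like. The `CrimeEvent` class, its field names, the `events_in` helper, and the sample records are hypothetical; the article does not describe CAMELON's actual schema or API.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CrimeEvent:
    """One crime event extracted from a news article (hypothetical schema)."""
    crime_type: str
    province: str
    district: Optional[str]      # may be missing when the article is vague
    perpetrator: Optional[str]
    action: str
    victim: Optional[str]
    occurred_on: date


def events_in(events: List[CrimeEvent], province: str, month: int) -> List[CrimeEvent]:
    """Return the events that occurred in the given province and calendar month."""
    return [
        e for e in events
        if e.province == province and e.occurred_on.month == month
    ]


# Invented sample records, for illustration only.
events = [
    CrimeEvent("robbery", "Bangkok", "Bang Rak", None,
               "snatched a bag", "a tourist", date(2023, 1, 14)),
    CrimeEvent("assault", "Bangkok", None, None,
               "attacked a man", "a man", date(2023, 2, 2)),
    CrimeEvent("robbery", "Chiang Mai", None, None,
               "robbed a shop", "a shop owner", date(2023, 1, 20)),
]
january_bangkok = events_in(events, "Bangkok", 1)
```

Keeping `district` optional reflects that some articles only name a province, so a record can still be stored and mapped at a coarser level.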

The final part focuses on data presentation. The research team analyzed system usage patterns and designed the User Experience (UX) and User Interface (UI) to meet the needs of the target user group. The aim is to provide users with essential information for each area, along with the ability to filter specific data. The presentation includes maps and graphs that make the data easy to understand.

The challenges encountered in this study included the large volume and redundancy of data, which prolonged the process of collecting high-quality data. Additionally, the extracted details from news articles were sometimes incomplete or inaccurate, resulting in less precise event location monitoring. In some cases, the data was limited to the provincial level and could not pinpoint districts or sub-districts. Consequently, time was needed to improve the system’s ability to accurately extract information and to train the Machine Learning model to classify crimes more effectively.

Dr. Siripen, the project leader, shared her vision for the future of this work as follows:

“Our target groups include the general public, who wish to stay aware when traveling, and law enforcement officers, who can use the system to monitor the areas under their jurisdiction. Additionally, we aim to incorporate data from police officers’ fieldwork since many crimes reported to the police do not make it into the news. Monthly or yearly summaries provided by law enforcement are invaluable. Another future target group is policymakers at the national level. With CAMELON’s insights into high-risk areas, they could develop plans and policies to assist communities in those regions. Looking ahead, if we can enable real-time data input and visualization for the general public, police officers, and policymakers within CAMELON, it could significantly reduce crime rates and help individuals better prepare for potential risks in specific areas.”

Visit the CAMELON website: https://camelon-project.web.app

Download the publication CAMELON: A System for Crime Metadata Extraction and Spatiotemporal Visualization from Online News Articles: https://ieeexplore.ieee.org/abstract/document/10424974
