Mapping crime descriptions to law articles using deep learning
Publication of Kenniscentrum Creating 010
M. Vink, C.P.M. Netten, M.S. Bargh, S. van den Braak, R. Choenni | Conference contribution | Publication date: 15 September 2020
In the operational systems of the Dutch Public Prosecution Service, data about criminal cases are registered. This information is used to generate crime statistics or other management information as input for policymaking. A key element for these statistics is the crime type of a case, which is normally deduced from the registered law articles of each case. However, the quality of these registered law articles has shortcomings. Additional data describing the crime could be useful to enhance the quality of these law articles. In this paper, we investigate the possibility of mapping additional descriptions of the crime to the formal notations of law articles using a deep learning neural network approach called sequence-to-sequence learning. We describe the characteristics of the data and carry out a number of experiments on these data. Subsequently, we compare two approaches: a) one-hot encoding for the words in a sentence and b) pre-trained word embeddings. The results show that the mapping of crime descriptions to law articles works reasonably well: a measured accuracy of 91% is reached, and if some issues in the dataset were resolved, the performance could be even higher.
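To make the two input representations compared above concrete, the following minimal sketch contrasts them on a toy example. It is an illustration under stated assumptions, not the authors' implementation: the vocabulary, the 3-dimensional vectors, and the helper names `tokenize`, `one_hot_encode` and `embed` are all hypothetical, and real pre-trained embeddings would be loaded from a trained model rather than written by hand.

```python
# Toy vocabulary; the words are hypothetical and not taken from the paper's dataset.
VOCAB = ["theft", "of", "a", "bicycle", "<unk>"]
WORD_TO_INDEX = {word: i for i, word in enumerate(VOCAB)}

# Stand-in for pre-trained word embeddings: made-up 3-dimensional vectors.
EMBEDDINGS = {
    "theft": [0.9, 0.1, 0.0],
    "of": [0.0, 0.2, 0.1],
    "a": [0.0, 0.1, 0.2],
    "bicycle": [0.2, 0.8, 0.3],
    "<unk>": [0.0, 0.0, 0.0],
}


def tokenize(sentence):
    """Lowercase the sentence and split it on whitespace."""
    return sentence.lower().split()


def one_hot_encode(sentence):
    """Approach a): one sparse vector per word, as long as the vocabulary."""
    vectors = []
    for word in tokenize(sentence):
        index = WORD_TO_INDEX.get(word, WORD_TO_INDEX["<unk>"])
        vector = [0] * len(VOCAB)
        vector[index] = 1
        vectors.append(vector)
    return vectors


def embed(sentence):
    """Approach b): one dense vector per word, looked up in a fixed table."""
    return [EMBEDDINGS.get(word, EMBEDDINGS["<unk>"]) for word in tokenize(sentence)]


if __name__ == "__main__":
    description = "Theft of a bicycle"
    print(one_hot_encode(description))
    print(embed(description))
```

In either case, the resulting sequence of vectors would be the input to the encoder of a sequence-to-sequence model. One-hot vectors grow with the vocabulary size and carry no notion of similarity between words, whereas embedding vectors have a fixed, smaller dimension and can place related words close together.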