Project author: TechSang

Project description: Bi-LSTM CRF, CNN, Word2vec, BERT, RoBERTa
Primary language: Jupyter Notebook
Repository URL: git://github.com/TechSang/Chinese-Legal-Entity-Recognition.git
Created: 2020-05-09T12:58:09Z
Project page: https://github.com/TechSang/Chinese-Legal-Entity-Recognition

License: Apache License 2.0


Chinese-Legal-Entity-Recognition

Introduction

With the development of artificial intelligence, many industries, including intelligent legal services, have entered a new round of technological change. Although a great deal of international research has been done in this field, the gap in public attention to law-related tasks between China and the West, together with the differences between language families, means that the field is still in its infancy in China. To address the low accuracy and unsatisfactory output of traditional methods, this paper aims to design a Chinese legal entity extraction system based on pre-trained models.

Dataset

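Data from the Chinese AI and Law Challenge ("法研杯", CAIL2018), whose tasks are charge prediction, law article recommendation, and prison term prediction.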

Structure

The project workflow has three parts: data processing, data labelling, and model building.
Flow Chart

  • Data processing section
    Dataset cleaning and formatting, plus legal item extraction and completion.
    Flow Chart
  • Data labelling section
    Entities are tagged by traversal comparison and word-similarity comparison.
    Flow Chart
  • Model building section
    Includes Bi-LSTM CRF, convolutional-network-based, and pretrained-model-based models.
    Flow Chart
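
As a rough illustration of the data labelling step, the sketch below shows one way a traversal comparison could turn a sentence into character-level BIO tags by scanning it against a lexicon of known legal terms. This is an assumption about the general idea, not the repository's actual code: the function name `tag_by_traversal`, the lexicon contents, and the `CRIME` and `LAW` labels are all hypothetical.

```python
def tag_by_traversal(text, lexicon):
    """Assign a BIO tag to every character of `text`.

    `lexicon` maps an entity string to a type label. Both the entries and
    the labels used below are hypothetical examples, not the project's own.
    """
    tags = ["O"] * len(text)
    # Try longer entries first so a short term never pre-empts a longer one.
    entities = sorted((e for e in lexicon if e), key=len, reverse=True)
    i = 0
    while i < len(text):
        for ent in entities:
            if text.startswith(ent, i):
                label = lexicon[ent]
                tags[i] = "B-" + label
                for j in range(i + 1, i + len(ent)):
                    tags[j] = "I-" + label
                i += len(ent)  # skip past the matched entity
                break
        else:
            i += 1  # no entry matches at this position
    return tags


# Hypothetical lexicon: "盗窃罪" (the crime of theft), "刑法" (criminal law).
lexicon = {"盗窃罪": "CRIME", "刑法": "LAW"}
text = "依照刑法判处盗窃罪"
tags = tag_by_traversal(text, lexicon)
for char, tag in zip(text, tags):
    print(char, tag)
```

Exact matching of this kind only finds terms that appear verbatim in the lexicon, which is presumably why the project pairs it with a word-similarity comparison for near matches.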