Chinese Journal of Computers
  Title: Process Mining: An Extended α-Algorithm to Discovery Duplicate Tasks
  Authors: LI Jia-Fei, LIU Da-You, YANG Bo
  Address: (College of Computer Science & Technology, Jilin University, Changchun 130012)
(Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012)
  Year: 2007
  Issue: No.8 (1436-1445)
  Abstract & Background
Abstract: Based on the α-algorithm, an improved process mining algorithm called α** is presented, together with a proof of its correctness. First, the properties of duplicate tasks are analyzed using machine learning techniques, and several theorems for judging duplicate tasks are given with their proofs. Then, all the duplicate tasks in the workflow log are discovered and identified using these theorems. Finally, the workflow net is extracted from the identified log using the α-algorithm and fine-tuned to obtain the resulting workflow model containing duplicate tasks. Experiments illustrate the validity of the α**-algorithm, and the experimental results show that α** is more efficient than the existing algorithm for mining duplicate tasks.

Keywords: process mining; workflow mining; duplicate tasks; Petri nets; workflow nets

Background: Process mining aims at a more fine-grained analysis of processes based on event logs. Its goal is to extract information about processes from these logs. Most research in process mining focuses on mining heuristics based primarily on binary ordering relations between the events in a workflow log. Much work has been done on using heuristics to distill a process model from event logs, and much valuable progress has been made in this domain. However, all the existing heuristic-based mining algorithms have limitations, and there are still many challenging problems that they cannot handle. Duplicate tasks are one of them.
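The binary ordering relations mentioned above can be sketched in a few lines. The following is a minimal illustration of the standard α-algorithm relations (directly-follows, causal, parallel, unrelated), not code from the paper; the function name `ordering_relations`, the string symbols, and the sample log are assumptions made for this example.

```python
from itertools import product


def ordering_relations(log):
    """Derive the binary ordering relations of the alpha-algorithm from a log.

    `log` is a list of traces; each trace is a sequence of task names.
    (Function name and return format are illustrative assumptions.)
    """
    tasks = sorted({task for trace in log for task in trace})
    # a > b: b directly follows a in at least one trace
    follows = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}

    relations = {}
    for a, b in product(tasks, repeat=2):
        forward = (a, b) in follows
        backward = (b, a) in follows
        if forward and not backward:
            relations[(a, b)] = "->"  # causal: a is followed by b, never the reverse
        elif forward and backward:
            relations[(a, b)] = "||"  # parallel: both orders are observed
        elif not forward and not backward:
            relations[(a, b)] = "#"   # unrelated: never adjacent
        else:
            relations[(a, b)] = "<-"  # reverse causal
    return relations


# Hypothetical log: B and C run in parallel between A and D; E is an alternative path.
log = [("A", "B", "C", "D"), ("A", "C", "B", "D"), ("A", "E", "D")]
rel = ordering_relations(log)
```

Here `rel[("A", "B")]` is the causal relation, `rel[("B", "C")]` is the parallel relation, and `rel[("B", "E")]` is the unrelated relation. Mining heuristics of this kind see only task labels, which is why two different tasks that share a label are hard to tell apart.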
The α-algorithm constructs the final process model as a WF-net, a subclass of Petri nets. This algorithm has been proven correct for a large class of processes, but like most other techniques it has problems dealing with duplicate tasks. In this paper, by combining machine learning techniques with the α-algorithm, a new algorithm called α** that can deal with duplicate tasks is proposed. First, the properties of duplicate tasks are analyzed using machine learning techniques, and several theorems for judging duplicate tasks are given with their proofs. Then, all the duplicate tasks in the workflow log are discovered and identified using these theorems. Finally, the workflow net is extracted from the identified log using the α-algorithm and fine-tuned to obtain the resulting workflow model containing duplicate tasks. Experiments illustrate the validity of the α**-algorithm, and the experimental results show that α** is more efficient than the existing algorithm for mining duplicate tasks.
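To see why duplicate tasks are a problem and what identifying them in the log buys, consider the rough sketch below. It does not reproduce the paper's theorems: the positions of the duplicate occurrences are supplied by hand, and the helper names (`directly_follows`, `relabel`, `strip_suffix`) and the `#2` suffix convention are hypothetical choices made for this illustration.

```python
def directly_follows(log):
    """Return every pair (a, b) where b directly follows a in some trace."""
    return {(a, b) for trace in log for a, b in zip(trace, trace[1:])}


def relabel(log, duplicates):
    """Give identified duplicate occurrences a distinct label.

    `duplicates` maps (trace_index, position) to a new label. In the paper
    this information would come from the duplicate-task theorems; here it
    is a hand-written, hypothetical input.
    """
    return [
        tuple(duplicates.get((i, j), task) for j, task in enumerate(trace))
        for i, trace in enumerate(log)
    ]


def strip_suffix(label):
    """Map a relabeled task back to its original name after mining."""
    return label.split("#", 1)[0]


# Hypothetical log: task A is executed before B and again after B.
log = [("A", "B", "A", "C")]

# Without identification, both A > B and B > A are observed, so a miner
# based on ordering relations would treat A and B as parallel.
before = directly_follows(log)

# After marking the second occurrence as a separate task, the relations
# describe a plain sequence A, B, A#2, C.
identified = relabel(log, {(0, 2): "A#2"})
after = directly_follows(identified)
```

In this sketch `before` contains both `("A", "B")` and `("B", "A")`, while `after` contains only `("A", "B")`, `("B", "A#2")`, and `("A#2", "C")`. A model mined from the identified log can then be adjusted by mapping `A#2` back to `A` with `strip_suffix`, giving two transitions that carry the same task name.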
This algorithm can be used in various domains, e.g., governmental agencies, municipalities, hospitals, and ERP systems. It will improve the efficiency and performance of discovering processes with duplicate tasks.
This work is supported by the NSFC Major Research Program, Basic Theory and Core Techniques of Non Canonical Knowledge (No. 60496321); the National Natural Science Foundation of China (Nos. 60373098, 60573073, 60503016); the National High-Tech Research and Development Plan of China (No. 2006AA10Z245); the Major Program of the Science and Technology Development Plan of Jilin Province (No. 20020303); the Science and Technology Development Plan of Jilin Province (No. 20030523); and the European Commission under grant No. TH/Asia Link/010 (111084).