The Concept of a Sequential Pattern and Its Mining Algorithm
Source journal: 中南大学学报(自然科学版) (Journal of Central South University, Natural Science Edition), 2001, Issue 4
Authors: Li Hong (李宏), Chen Songqiao (陈松乔)
Pages: 425-427
Keywords: time-series pattern; mining algorithm; frequent itemset; independent maximal sequence
Key words:sequential pattern; data mining algorithm; litemset; maximal sequence
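Summary: This paper introduces the form and concept of a time-series (sequential) pattern and discusses the associated mining algorithm. The pattern is applied both to the analysis of purchasing behavior with temporal relations, in order to reveal the sequential information behind that behavior, and to the analysis of other time-related events. The mining algorithm consists of the following parts: building the frequent itemsets (litemsets), processing and transforming the data, generating candidate subsequences, verifying them to obtain the sets of sequences of length 2, 3, …, and selecting from these the independent maximal sequences, which are the result sought. An example is used to point out how the algorithm differs from the traditional Aprioriall algorithm. The results indicate that this kind of sequential pattern has broad application prospects in fields such as network communication and meteorological analysis.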
Abstract: This paper introduces the concept of the sequential pattern in KDD and discusses an algorithm for mining it. Sequential pattern discovery is used to analyze buying behavior with temporal relations in order to reveal the sequential information behind it. The algorithm consists of the following parts: the litemset is built, the data are processed and transformed, candidate subsequences are generated, sequence sets of different lengths are obtained by verification, and the independent maximal sequences selected from them are the result. An example is given to point out the difference between the Aprioriall algorithm and ours. The results show that the sequential pattern has wide application prospects in domains such as network communication, weather analysis, and the stock market.
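The abstract names the stages of the mining procedure but gives no details. The Python sketch below is one possible reading of those stages, not the authors' algorithm, whose differences from Aprioriall are not spelled out here. It rests on several assumptions: support is counted once per customer, candidates are produced by a generic Apriori-style join, sequences of length 1 are included when selecting maximal sequences, and all function names and the toy data are hypothetical.

```python
from itertools import combinations

def find_litemsets(sequences, min_support):
    """Stage 1: itemsets found in a transaction of at least min_support customers."""
    counts = {}
    for seq in sequences:
        seen = set()  # count each itemset at most once per customer
        for transaction in seq:
            items = sorted(transaction)
            for size in range(1, len(items) + 1):
                seen.update(frozenset(c) for c in combinations(items, size))
        for itemset in seen:
            counts[itemset] = counts.get(itemset, 0) + 1
    return {s for s, n in counts.items() if n >= min_support}

def transform(sequences, litemsets):
    """Stage 2: map each transaction to the litemsets it contains; drop empty ones."""
    result = []
    for seq in sequences:
        mapped = [{l for l in litemsets if l <= t} for t in seq]
        result.append([m for m in mapped if m])
    return result

def supports(customer, candidate):
    """True if the candidate's litemsets occur, in order, in distinct transactions."""
    pos = 0
    for litemset in candidate:
        while pos < len(customer) and litemset not in customer[pos]:
            pos += 1
        if pos == len(customer):
            return False
        pos += 1
    return True

def is_contained(small, big):
    """True if each itemset of small is a subset of a later-and-later itemset of big."""
    pos = 0
    for itemset in small:
        while pos < len(big) and not itemset <= big[pos]:
            pos += 1
        if pos == len(big):
            return False
        pos += 1
    return True

def mine_maximal_sequences(sequences, min_support):
    sequences = [[frozenset(t) for t in seq] for seq in sequences]
    litemsets = find_litemsets(sequences, min_support)
    customers = transform(sequences, litemsets)
    level = {(l,) for l in litemsets}  # frequent sequences of length 1
    frequent = set(level)
    while level:
        # Stage 3: join sequences of length k-1 into candidates of length k.
        candidates = {a + (b[-1],) for a in level for b in level
                      if a[1:] == b[:-1]}
        # Stage 4: verify each candidate by counting supporting customers.
        level = {c for c in candidates
                 if sum(supports(cu, c) for cu in customers) >= min_support}
        frequent |= level
    # Stage 5: keep only sequences not contained in another frequent sequence.
    return {s for s in frequent
            if not any(s != o and is_contained(s, o) for o in frequent)}

# Toy data: one time-ordered list of transactions (item sets) per customer.
data = [
    [{30}, {90}],
    [{10, 20}, {30}, {40, 60, 70}],
    [{30, 50, 70}],
    [{30}, {40, 70}, {90}],
    [{90}],
]
result = mine_maximal_sequences(data, min_support=2)
for seq in sorted(result, key=lambda s: [sorted(i) for i in s]):
    print([sorted(i) for i in seq])
# prints:
# [[30], [40, 70]]
# [[30], [90]]
```

Enumerating every subset of each transaction is only practical for small baskets; a level-wise frequent-itemset search would replace `find_litemsets` on realistic data.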