
ISSA is a system that integrates different structural mining tasks (only the sequential case for the moment) using a common unifying framework: a Galois lattice adapted to sequential data. Once the Galois lattice has been characterized, partial orders, frequent sequential patterns, or a new notion of association rules with order can be computed simply by traversing its nodes.
Download
WINDOWS Executable
Software in beta testing, without warranty of any kind.
Building the utilities in UNIX/LINUX
To install these utilities, follow these steps:
- ./configure
- make
- make install
After doing these three steps, you can call issa from anywhere, because it will be on your path. To uninstall:
- make uninstall
ISSA utilities
ISSA is divided into two main executables, fcs and lat, which are called by the script named issa.
Fcs executable
Fcs is a program that calculates the frequent closed sequences of the input database over a minimum support threshold. This utility generates two output files: one with information about the frequent closed sequences found in the database (*), and another with the same filename as the output, ending in '.lat'. This last file is used to feed the lat executable.
(*) IMPORTANT: This output is only intermediate information about the process; please use lat with the '.lat' output to get the complete results.
Lat executable
The lat executable generates a lattice from the frequent closed sequences found in the first phase. This utility has several options, which are listed when the program lat is run without any parameters.
Lat generates three different outputs; the name of each file is obtained by changing the extension of the input filename.
The one with the text-mode information about the request has the same name as the input file, but with .lat changed to .
The other two files are in GraphML format and can be visualized with yEd.
A file with the extension '.sec.graphml' contains the lattice of the frequent closed sequences, built from the closed sequences over the threshold specified in the fcs utility. A file with the extension '.po.graphml' contains a lattice in which each node is built as a partial order.
As mentioned, these two files can be visualized with yEd. To do so, set the following options once the files are open:
Go to Layout->Hierarchical; there, set Layout->Orientation to 'Left to Right' and the Grouping to 'Layout Groups', and apply. After this, go to Layout->Orientation again, set it to 'Top to bottom' and the Grouping to 'Fix contents to group', and apply again. In this way the lattice recovers its original top-to-bottom form, and the closed posets in the nodes stay fixed as we want them to be.
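The GraphML outputs can also be inspected without a graphical viewer. The following is only a minimal sketch in Python, assuming the files use the standard GraphML namespace; the function name summarize_graphml is hypothetical and not part of ISSA, and the attributes that lat attaches to nodes and edges are not examined here. In a '.po.graphml' file, nodes nested inside groups are counted together with the top-level ones.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace (assumed to be the one used by the lat output).
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def summarize_graphml(source):
    """Return (node_count, edges) for a GraphML file path or file-like object.

    edges is a list of (source_id, target_id) pairs in document order.
    Nested nodes (for example inside group nodes) are counted as well.
    """
    root = ET.parse(source).getroot()
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = [
        (edge.get("source"), edge.get("target"))
        for edge in root.findall(".//g:edge", GRAPHML_NS)
    ]
    return len(nodes), edges


# Hypothetical usage, with a made-up output name:
# count, edges = summarize_graphml("myoutput.sec.graphml")
```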
Issa executable
Issa is the unified executable that produces the Galois lattice in a single step, using fcs and lat. Executing issa this way:
- issa <myinput> <output> <threshold>
is equivalent to executing:
- fcs <myinput> <output> <threshold>
- lat <output>.lat onlylat
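For scripted runs, the two steps can be chained from Python. This is a sketch only, assuming that fcs and lat are on the path after installation and that they take exactly the arguments listed above; the helper names issa_commands and run_issa_steps are hypothetical, and the threshold is passed through as given without interpreting it.

```python
import subprocess


def issa_commands(input_path, output_name, threshold):
    """Build the two command lines that the issa script is described as running."""
    return [
        ["fcs", input_path, output_name, str(threshold)],
        ["lat", f"{output_name}.lat", "onlylat"],
    ]


def run_issa_steps(input_path, output_name, threshold):
    """Run fcs and then lat, stopping at the first step that fails."""
    for command in issa_commands(input_path, output_name, threshold):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
```

Keeping the command construction separate from the execution makes it possible to print or log the commands before anything is run.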
The ISSA input file format
The file format used by fcs is called the 'issa file format'; its structure is as follows.
Structure of the format
Note: This is an example of the structure of an ISSA dataset with N sequences.
<list of item1 of sequence1>
<list of item2 of sequence1>
..
0
<list of item1 of sequence2>
<list of item2 of sequence2>
..
0
..
..
..
0
<list of item1 of sequenceN>
..
<list of itemM of sequenceN>
0
0
Structure of the list of items
<list of itemI of sequenceJ> := <item1> <item2> ... <itemN>
Note: An item is a string that cannot be "0", to avoid confusion with the separating character.
Learning the format
This "list of items" is a set of items, separated by spaces, that occupy the same position in the sequence. The itemset ends with a line break, which can be either the UNIX/LINUX one (LF) or the DOS one (CR+LF). If the next line of the file is a 0, it marks the end of the sequence; otherwise the line corresponds to another "list of items".
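As a minimal sketch, a dataset could be written in this format with a small Python helper. The function name format_issa and the in-memory representation (a list of sequences, each a list of itemsets, each a list of item strings) are assumptions made for illustration and are not part of ISSA. The final extra 0 follows the structure shown earlier, where the file closes with two consecutive 0 lines.

```python
def format_issa(sequences, newline="\n"):
    """Serialize sequences to text in the ISSA file format.

    sequences: list of sequences; each sequence is a non-empty list of
    itemsets; each itemset is a non-empty list of item strings.
    """
    lines = []
    for sequence in sequences:
        if not sequence:
            # An empty sequence would be read as the end-of-file marker.
            raise ValueError("a sequence must contain at least one itemset")
        for itemset in sequence:
            if not itemset:
                raise ValueError("an itemset must contain at least one item")
            for item in itemset:
                if not item or item == "0" or any(ch.isspace() for ch in item):
                    raise ValueError(f"invalid item: {item!r}")
            lines.append(" ".join(itemset))  # one itemset per line
        lines.append("0")  # end of this sequence
    lines.append("0")  # extra 0 closing the file
    return newline.join(lines) + newline


# Two sequences: <(a b)(c)> and <(d)>
print(format_issa([[["a", "b"], ["c"]], [["d"]]]), end="")
```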
The end of the file is marked by an empty sequence, that is, a line with a 0 ending the last sequence followed by a line with another 0. For the moment, ISSA does not perform rigorous parsing to detect errors, so if the file is not correct, the results of issa will be erroneous.
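Since ISSA does not validate its input, it may help to check a file before running fcs. The following Python sketch reads the format described above and raises an error on the problems it can detect. The function name parse_issa and the specific checks (blank lines, a "0" mixed with other items, a missing end marker, content after the end marker) are assumptions for illustration; they are stricter than anything the text states about ISSA itself.

```python
def parse_issa(text):
    """Parse ISSA-format text into a list of sequences.

    Each sequence is a list of itemsets, and each itemset is a list of
    item strings. Raises ValueError if the text does not follow the format.
    """
    sequences = []
    current = []   # itemsets of the sequence being read
    ended = False  # True once the end-of-file marker has been seen
    # splitlines() accepts both UNIX/LINUX and DOS line endings.
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if ended:
            if line:
                raise ValueError(f"line {lineno}: content after end of file marker")
            continue
        if line == "0":
            if current:
                sequences.append(current)  # 0 closes the current sequence
                current = []
            else:
                ended = True  # empty sequence: end of the file
        elif not line:
            raise ValueError(f"line {lineno}: blank line")
        else:
            items = line.split()
            if "0" in items:
                raise ValueError(f"line {lineno}: '0' cannot be used as an item")
            current.append(items)
    if not ended:
        raise ValueError("missing end of file marker (two consecutive 0 lines)")
    return sequences
```

Returning nested lists keeps the sketch independent of any particular data structure; a caller that only wants validation can ignore the return value.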