| 1 | +Overview of Graph Pattern Mining (GPM) in Galois |
| 2 | +================================================================================ |
| 3 | + |
| 4 | +This is the Pangolin framework [1] for efficient and flexible graph mining. |
| 5 | +It uses the bliss library [2][3] for graph isomorphism checks. |
| 6 | +The license for this library is in the bliss directory: |
| 7 | +note that **it does not use the same license as the rest of Galois**. |
| 8 | +To run Pangolin applications, see ../lonestarmine/README.md |
| 9 | +for details. |
| 10 | + |
| 11 | +[1] Xuhao Chen, Roshan Dathathri, Gurbinder Gill, Keshav Pingali, |
| 12 | +Pangolin: An Efficient and Flexible Graph Mining System on CPU and GPU, VLDB 2020. |
| 13 | + |
| 14 | +[2] Bliss: A tool for computing automorphism groups and canonical |
| 15 | +labelings of graphs. http://www.tcs.hut.fi/Software/bliss/, 2017. |
| 16 | + |
| 17 | +[3] Tommi Junttila and Petteri Kaski. 2007. Engineering an efficient |
| 18 | +canonical labeling tool for large and sparse graphs. In Proceedings |
| 19 | +of the Meeting on Algorithm Engineering & Experiments, 135-149. |
| 20 | + |
| 21 | +INPUT |
| 22 | +=========== |
| 23 | + |
| 24 | +We support four input graph formats: **gr**, **txt**, **adj**, **mtx**. |
| 25 | +For unlabeled graphs, we use the **gr** graph format, the same format used by other Galois benchmarks. |
| 26 | +**Make sure that the graph is symmetric and contains no self-loops or redundant edges**. |
| 27 | +If not, use the convert tool in tools/graph-convert/ to convert the graph. |
| 28 | +We use the **adj** format for labeled graphs, as do Arabesque and RStream. |
| 29 | +An **adj** file describes a vertex-labeled graph in the following layout: |
| 30 | + |
| 31 | +``` |
| 32 | +# <num vertices> <num edges> |
| 33 | +<vertex id> <vertex label> [<neighbour id1> <neighbour id2> ... <neighbour id n>] |
| 34 | +<vertex id> <vertex label> [<neighbour id1> <neighbour id2> ... <neighbour id n>] |
| 35 | +... |
| 36 | +``` |
| 37 | + |
| 38 | +We currently do not support graphs with edge labels. |
| 39 | +Vertex ids are expected to be sequential integers between 0 and (total number of vertices - 1). |
| 40 | +For testing, we have prepared a test graph **citeseer** in $GALOIS_HOME/lonestarmine/test_data. |
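
The **adj** layout and the input requirements above can be checked before running a miner. The following is a minimal Python sketch, not part of Pangolin: `parse_adj` and `check_graph` are hypothetical helper names, and the sketch assumes that the header line literally starts with `#`, that vertex labels are integers, and that the square brackets in the layout only mark the neighbour list as optional rather than appearing in the file.

```python
from typing import Dict, List, Tuple


def parse_adj(text: str) -> Tuple[Dict[int, int], Dict[int, List[int]]]:
    """Parse adj-format text into (labels, neighbours), both keyed by vertex id."""
    labels: Dict[int, int] = {}
    neighbours: Dict[int, List[int]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Skip blank lines and the "# <num vertices> <num edges>" header.
            continue
        fields = [int(tok) for tok in line.split()]
        vid, label, nbrs = fields[0], fields[1], fields[2:]
        labels[vid] = label
        neighbours[vid] = nbrs
    return labels, neighbours


def check_graph(neighbours: Dict[int, List[int]]) -> List[str]:
    """Return a list of violations of the input requirements (empty if none)."""
    problems: List[str] = []
    n = len(neighbours)
    # Vertex ids must be the sequential integers 0 .. n-1.
    if sorted(neighbours) != list(range(n)):
        problems.append("vertex ids are not sequential from 0 to n-1")
    for v, nbrs in neighbours.items():
        if v in nbrs:
            problems.append(f"self-loop at vertex {v}")
        if len(set(nbrs)) != len(nbrs):
            problems.append(f"redundant edge at vertex {v}")
        for u in nbrs:
            # A symmetric graph lists every edge in both directions.
            if v not in neighbours.get(u, []):
                problems.append(f"edge {v}->{u} has no reverse edge")
    return problems


# Illustrative input only: a labeled triangle, not taken from the test data.
sample = "# 3 3\n0 1 1 2\n1 1 0 2\n2 2 0 1\n"
labels, neighbours = parse_adj(sample)
print(labels)
print(check_graph(neighbours))
```

If `check_graph` reports problems, the graph should be cleaned up (for example with the convert tool mentioned above) before it is used as input.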
| 41 | + |
| 42 | +BUILD |
| 43 | +=========== |
| 44 | + |
| 45 | +1. Run cmake in the build directory: |
| 46 | + |
| 47 | +`cd build; cmake -DUSE_PANGOLIN=1 ../` |
| 48 | + |
| 49 | +To enable GPU mining, use: |
| 50 | + |
| 51 | +`cmake -DUSE_PANGOLIN=1 -DUSE_GPU=1 ../` |
| 52 | + |
| 53 | +2. Run make: |
| 54 | + |
| 55 | +`cd <BUILD>/lonestar/experimental/fsm; make -j` |
| 56 | + |
| 57 | +RUN |
| 58 | +=========== |
| 59 | + |
| 60 | +The following are a few example command lines. |
| 61 | + |
| 62 | +- `$ ./tc_mine gr $GALOIS_HOME/lonestarmine/test_data/citeseer.csgr -t 28` |
| 63 | +- `$ ./kcl gr $GALOIS_HOME/lonestarmine/test_data/citeseer.csgr -k=3 -t 28` |
| 64 | +- `$ ./motif gr $GALOIS_HOME/lonestarmine/test_data/citeseer.csgr -k=3 -t 56` |
| 65 | +- `$ ./fsm adj $GALOIS_HOME/lonestarmine/test_data/citeseer.sadj -k=2 -ms=300 -t 28` |
| 66 | + |
| 67 | +PERFORMANCE |
| 68 | +=========== |
| 69 | + |
| 70 | +Please see the paper [1] for details. |
| 71 | + |
| 72 | +CITATION |
| 73 | +========== |
| 74 | + |
| 75 | +Please cite the following paper if you use Pangolin: |
| 76 | + |
| 77 | +``` |
| 78 | +@article{Pangolin, |
| 79 | + title={Pangolin: An Efficient and Flexible Graph Mining System on CPU and GPU}, |
| 80 | + author={Xuhao Chen and Roshan Dathathri and Gurbinder Gill and Keshav Pingali}, |
| 82 | + journal = {Proc. VLDB Endow.}, |
| 83 | + issue_date = {August 2020}, |
| 84 | + volume = {13}, |
| 85 | + number = {8}, |
| 86 | + month = aug, |
| 87 | + year = {2020}, |
| 88 | + numpages = {12}, |
| 89 | + publisher = {VLDB Endowment}, |
| 90 | +} |
| 91 | +``` |
| 92 | + |