README.md (+13 -4)
@@ -1,7 +1,7 @@
 Brisera
 =======
 
-A Python implementation of a distributed seed and reduce algorithm (similar to BlastReduce and CloudBurst) that utilizes RDDs (resilient distributed datasets) to perform fast iterative analyses and dynamic programming without relying on chained MapReduce jobs.
+A Python implementation of a distributed seed and reduce algorithm (similar to BlastReduce and CloudBurst) that utilizes RDDs (resilient distributed datasets) to perform fast iterative analyses and dynamic programming without relying on chained MapReduce jobs.
 
 Quick Start
 -----------
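
The description touched by this hunk names a distributed seed and reduce algorithm without showing its shape. As a rough illustration only, the single-machine sketch below assumes it resembles the seed-and-extend pattern that CloudBurst is known for: index the reference by k-mer, look up exact seed hits for each read, then count mismatches across the whole read. The names `kmers` and `seed_and_extend` and their parameters are hypothetical and are not taken from the `brisera` module. A Spark version would presumably key reference and read seeds by k-mer and bring them together in RDDs instead of a local dictionary.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield (kmer, offset) for every length-k window of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k], i


def seed_and_extend(reference, reads, k, max_mismatches):
    """Return sorted (read_index, reference_start, mismatches) alignments.

    Hypothetical helper for illustration; not part of brisera.
    """
    # "Map" side: key every reference position by the k-mer that starts there.
    ref_index = defaultdict(list)
    for kmer, pos in kmers(reference, k):
        ref_index[kmer].append(pos)

    alignments = set()
    for read_id, read in enumerate(reads):
        # Take non-overlapping seeds from the read.
        for off in range(0, len(read) - k + 1, k):
            seed = read[off:off + k]
            # "Reduce" side: every reference hit sharing this seed is a
            # candidate; extend it to the full read and count mismatches.
            for pos in ref_index.get(seed, ()):
                start = pos - off
                if start < 0 or start + len(read) > len(reference):
                    continue  # read would hang off the reference
                window = reference[start:start + len(read)]
                mismatches = sum(a != b for a, b in zip(read, window))
                if mismatches <= max_mismatches:
                    alignments.add((read_id, start, mismatches))
    return sorted(alignments)
```

For example, `seed_and_extend("ACGTACGTTAGC", ["CGTTAG"], 3, 0)` reports a single exact alignment of read 0 at reference offset 5. A set is used for the results because two different seeds from the same read can point at the same alignment.
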
@@ -20,9 +20,18 @@ To install the required dependencies:
 
 The code for Brisera is found in the `brisera` Python module. This module must be available to the Spark applications (e.g. able to be imported), either by running the Spark applications locally in the working directory that contains `brisera` or by using a virtual environment (recommended). To install `brisera` and all dependencies, use the `setup.py` script:
 
-$ python setup.py install
+$ python setup.py install
 
-But note that you will still have to have access to the Spark applications that are in the `apps/` directory - don't delete them out of hand!
+But note that you will still have to have access to the Spark applications that are in the `apps/` directory - don't delete them out of hand!
+
+Usage
+-----
+
+To read a burst sequence file (e.g. `fixtures/cloudburst/100k.br`) in order to compare results from CloudBurst to Brisera, you can use the `read_burst.py` Spark application as follows:
+This will write out each record (or chunk) from the sequence file to a text file on disk.
 
 Other Details
 -------------
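
The Usage section added in this hunk describes dumping each record of a sequence file to text with `read_burst.py`. As a minimal sketch of how such a job could look in PySpark, and not the repository's actual script, the block below reads a Hadoop sequence file with `SparkContext.sequenceFile` and writes one line per record with `saveAsTextFile`. The helper names `format_record` and `read_burst`, the tab-separated layout, and the hex rendering of binary values are all assumptions made for illustration.

```python
import sys


def format_record(key, value):
    """Render one (key, value) pair as a tab-separated text line.

    Assumption: binary values are shown as hex so the output stays printable.
    """
    if isinstance(value, (bytes, bytearray)):
        value = bytes(value).hex()
    return "{}\t{}".format(key, value)


def read_burst(sc, in_path, out_path):
    """Write every record of the sequence file at in_path as text under out_path.

    Hypothetical helper; `sc` is an existing SparkContext.
    """
    records = sc.sequenceFile(in_path)  # RDD of (key, value) pairs
    lines = records.map(lambda kv: format_record(kv[0], kv[1]))
    lines.saveAsTextFile(out_path)      # one part-file per partition


if __name__ == "__main__" and len(sys.argv) == 3:
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setAppName("read_burst_sketch"))
    read_burst(sc, sys.argv[1], sys.argv[2])
    sc.stop()
```

Run as a script, it expects an input path and an output directory as its two arguments. Note that `saveAsTextFile` writes a directory of part-files rather than a single file, and that the key and value types seen in Python depend on how the Writable classes stored in the sequence file are converted.
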
@@ -37,4 +46,4 @@ Brisera means to "explode" or to "burst" in Swedish. Since I'm reworking CloudBu
 
 1. X\. Li, W. Jiang, Y. Jiang, and Q. Zou, “Hadoop Applications in Bioinformatics,” in Open Cirrus Summit (OCS), 2012 Seventh, 2012, pp. 48–52.
 
-1. R\. K. Menon, G. P. Bhat, and M. C. Schatz, “Rapid parallel genome indexing with MapReduce,” in Proceedings of the second international workshop on MapReduce and its applications, 2011, pp. 51–58.
+1. R\. K. Menon, G. P. Bhat, and M. C. Schatz, “Rapid parallel genome indexing with MapReduce,” in Proceedings of the second international workshop on MapReduce and its applications, 2011, pp. 51–58.