Advanced User Manual
Who is this tutorial for?
This tutorial is for users who will be swapping between available extractors or integrating new ones. It provides in-depth instructions on changing the workflow files and ….
You should be comfortable using:
- The JX language.
- CCTools.
- Docker containers.
- iRODS.
- HPC.
- More.
Why use PhytoOracle?
PhytoOracle is a scalable, modular data pipeline that reduces processing times. If you are looking for a pipeline that gives you the flexibility to add new extractors or develop your own, this pipeline is for you.
Swapping extractors
PhytoOracle makes it easy to swap between extractors. To begin swapping, edit the Makeflow file as follows:
Editing our workflow
- Begin by stating the rules, including the command and the inputs/outputs of your extractor.
{
"rules": [
{
"command" : "/bin/echo hello world > output.txt",
"outputs" : [ "output.txt" ],
"inputs" : [ ]
}
]
}
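Rules become a workflow once one rule consumes another rule's output: Makeflow orders jobs by matching each rule's inputs against the outputs of the other rules. A minimal sketch, extending the rule above with a hypothetical second rule (the `wc` step and `count.txt` are illustrative only and not part of PhytoOracle):

```json
{
  "rules": [
    {
      "command" : "/bin/echo hello world > output.txt",
      "outputs" : [ "output.txt" ],
      "inputs"  : [ ]
    },
    {
      # Hypothetical follow-up step: count the words written by the first rule
      "command" : "wc -w output.txt > count.txt",
      "outputs" : [ "count.txt" ],
      "inputs"  : [ "output.txt" ]
    }
  ]
}
```

Because `count.txt` lists `output.txt` as an input, the second job is not started until the first has finished. Stating inputs and outputs explicitly is what lets Makeflow work out this ordering.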
Defining your values
- Define elements of your workflow:
{
"define": {
# symbol definitions go here
},
"rules": [
# Rules you created above go here
]
}
- Continuing from the example in "Editing our workflow", the message can be moved into a definition:
{
"define":{
"message" : "hello world!"
},
"rules": [
{
"command": "/bin/echo " +message+ " > output-from-define.txt",
"outputs": [ "output-from-define.txt" ],
"inputs": [ ],
}
]
}
Running your workflow
- Save the workflow as define-hello.jx. You can now run it locally:
$ makeflow --jx define-hello.jx
parsing define-hello.jx...
local resources: 4 cores, 7764 MB memory, 2097151 MB disk
max running local jobs: 4
checking define-hello.jx for consistency...
define-hello.jx has 1 rules.
starting workflow....
submitting job: /bin/echo hello world! > output-from-define.txt
submitted job 1376
job 1376 completed
- Then check the contents of the output file:
$ cat output-from-define.txt
hello world!
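The same define mechanism is one way to make an extractor swappable: keep the container image in a symbol and reference that symbol from the command. A rough sketch, in which the image name `example/my-extractor:latest`, the file names, and the assumption that the container takes a file path as its argument and writes its result to standard output are all hypothetical:

```json
{
  "define": {
    # Hypothetical values: replace with your own extractor image and input file
    "EXTRACTOR_IMAGE" : "example/my-extractor:latest",
    "INPUT_FILE"      : "input.dat"
  },
  "rules": [
    {
      # Mount the working directory into the container and run the extractor on the input
      "command" : "docker run --rm -v $(pwd):/data -w /data " + EXTRACTOR_IMAGE + " " + INPUT_FILE + " > extractor-output.txt",
      "inputs"  : [ INPUT_FILE ],
      "outputs" : [ "extractor-output.txt" ]
    }
  ]
}
```

With this layout, swapping extractors would only require changing `EXTRACTOR_IMAGE` in the define section, provided the new container accepts the same arguments. Whether Docker is available where the jobs run depends on your system.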
Creating Multiple Jobs
Workflows let you run your analysis code as many separate jobs. The example below shows how to string multiple jobs together:
- Write your job and generate multiple instances of it:
{
"rules": [
{
"command" : "python ./example.py --parameter " + N + " > output." + N + ".txt",
"inputs" : [ "example.py" ],
"outputs" : [ "output." + N + ".txt" ]
} for N in [1, 2, 3]
]
}
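As a sanity check, the comprehension above should be equivalent to writing the three rules out by hand, with N replaced by 1, 2 and 3 in turn. A sketch of the expanded form, assuming the command string is built as `"python ./example.py --parameter " + N + " > output." + N + ".txt"`:

```json
{
  "rules": [
    {
      "command" : "python ./example.py --parameter 1 > output.1.txt",
      "inputs"  : [ "example.py" ],
      "outputs" : [ "output.1.txt" ]
    },
    {
      "command" : "python ./example.py --parameter 2 > output.2.txt",
      "inputs"  : [ "example.py" ],
      "outputs" : [ "output.2.txt" ]
    },
    {
      "command" : "python ./example.py --parameter 3 > output.3.txt",
      "inputs"  : [ "example.py" ],
      "outputs" : [ "output.3.txt" ]
    }
  ]
}
```

The three jobs do not depend on one another, so Makeflow is free to run them in parallel.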
- Stitch the results together with one more rule:
{
"command" : "/bin/cat " + join(["output.1.txt","output.2.txt","output.3.txt"], " ") + " > output.all.txt",
"inputs" : [ "output." + N + ".txt" for N in [1,2,3] ],
"outputs" : [ "output.all.txt" ]
}
- Or you could factor out the definition of the list and the range to the define section of the workflow as follows:
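A sketch of what such a factored-out workflow could look like. The symbol names `RANGE` and `FILELIST` are chosen here for illustration, and the sketch assumes JX's built-in `range` function, where `range(1,4)` yields 1, 2 and 3:

```json
{
  "define": {
    # Illustrative symbol names: the job indices and the per-job output files
    "RANGE"    : range(1,4),
    "FILELIST" : [ "output." + N + ".txt" for N in RANGE ]
  },
  "rules": [
    {
      "command" : "python ./example.py --parameter " + N + " > output." + N + ".txt",
      "inputs"  : [ "example.py" ],
      "outputs" : [ "output." + N + ".txt" ]
    } for N in RANGE,
    {
      "command" : "/bin/cat " + join(FILELIST, " ") + " > output.all.txt",
      "inputs"  : FILELIST,
      "outputs" : [ "output.all.txt" ]
    }
  ]
}
```

With the list and range defined once, changing `range(1,4)` to a larger range would scale both the per-file rules and the stitch rule without further edits.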
- Understand the JX language.
- Rule > command > inputs/outputs, explicitly stated.
- Swap with your Docker container.
- Variables in the Makeflow file.