About TERRA-REF

Modern agriculture has made great progress in reducing hunger and poverty and in improving food security and nutrition, but it still faces tremendous challenges in the coming decades. To accelerate plant breeding, we need novel high-throughput phenotyping (HTP) approaches that advance our understanding of how genotype relates to phenotype. The Transportation Energy Resources from Renewable Agriculture Phenotyping Reference Platform (TERRA-REF) is one such program; it aims to transform plant breeding by using remote sensing to quantify plant traits. The TERRA-REF project provides a data and computation pipeline responsible for collecting, transferring, processing, and distributing large volumes of crop sensing and genomic data.

Gantry Sensors

The LemnaTec Scanalyzer Field System is a high-throughput phenotyping field-scanning robot that moves autonomously and continuously collects images of the crops beneath it. Sensors and cameras attached to the robot's 30-ton steel gantry collect different sets of data. This diverse array of sensors allows researchers to collect large datasets that can yield biological insight into how environments affect phenotypes and into the overall relationship between genotypes (genes) and phenotypes (characteristics). Below are three sensors specific to this project:

Field Scanning Imaging Sensors

INSERT IMAGE HERE

Stereo RGB Camera

The stereo RGB camera captures images from above, which enables researchers to determine canopy cover (the spread of plants), the number of plants, and similar measures.

3D Laser Scanner (LIDAR)

A 3D scanner that captures the architecture of plants, such as leaf angles and shapes.

PSII Fluorescence Response Camera

A camera that allows researchers to understand how efficient plants are at photosynthesizing.

Transformer: Metadata Cleaner

This transformer is specialized to clean Gantry metadata so that sensor extractors can process it more easily. To do this, the transformer also needs its own transformer_class, a transformer instance that handles its special parameters.

Parameters

There are two additional parameters for this transformer: sensor and userid.

  • The sensor parameter is required and identifies the sensor that the metadata is associated with.
  • The userid parameter is optional and allows additional identification information to be stored with the cleaned metadata.
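As a minimal sketch of how these two parameters could be declared, the example below uses Python's argparse. The flag spellings (--sensor, --userid), the build_parser helper, and the sample values are assumptions made for illustration; the transformer's actual command line may differ.

```python
import argparse


def build_parser():
    """Build a parser for the two extra parameters (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Clean Gantry metadata (hypothetical command line)"
    )
    # Required: names the sensor that the metadata belongs to.
    parser.add_argument("--sensor", required=True,
                        help="sensor the metadata is associated with")
    # Optional: extra identification stored with the cleaned metadata.
    parser.add_argument("--userid", default=None,
                        help="identifier to store with the cleaned metadata")
    return parser


# Example values only; "stereoTop" and "jdoe" are placeholders.
args = build_parser().parse_args(["--sensor", "stereoTop", "--userid", "jdoe"])
print(args.sensor, args.userid)
```

Omitting --sensor makes argparse exit with a usage error, which mirrors the "required" wording above, while --userid simply defaults to None.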

Sensor Extractors

Stereo 3D RGB Extractors

Canopy cover extractor

This extractor processes binary stereo images and generates plot-level percentage canopy cover traits for BETYdb.

Input

  • Evaluation is triggered whenever a file is added to a dataset

  • The following data must be present in the dataset:

    • _left.bin image
    • _right.bin image
    • dataset metadata for the left+right capture dataset; can be attached as Clowder metadata or included as a metadata.json file
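The input requirements above can be sketched as a simple readiness check. This is only an illustration: the function name has_required_inputs and the has_clowder_metadata flag are hypothetical, and the real extractor may detect its inputs differently.

```python
def has_required_inputs(filenames, has_clowder_metadata=False):
    """Return True when a dataset holds everything the extractor needs.

    filenames: names of the files currently in the dataset.
    has_clowder_metadata: True if the dataset metadata is already attached
    in Clowder, so no metadata.json file is needed (assumed flag).
    """
    names = list(filenames)
    has_left = any(name.endswith("_left.bin") for name in names)
    has_right = any(name.endswith("_right.bin") for name in names)
    # Metadata may be attached in Clowder or shipped as a metadata.json file.
    has_metadata = has_clowder_metadata or any(
        name.endswith("metadata.json") for name in names
    )
    return has_left and has_right and has_metadata


# Placeholder file names for illustration.
print(has_required_inputs(["a_left.bin", "a_right.bin", "a_metadata.json"]))
print(has_required_inputs(["a_left.bin"]))
```

Because evaluation is triggered each time a file is added, a check like this would return False until the last required file arrives.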

Output

  • A CSV file with canopy coverage traits will be added to the original dataset in Clowder
  • The configured BETYdb instance will have canopy coverage traits inserted
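To make "percentage canopy cover" concrete, here is a minimal sketch that computes the share of pixels marked as plant and writes it as one CSV row. How pixels are classified as plant is not described here, so the mask is taken as given; the function names and the CSV column names are assumptions and do not reflect the BETYdb schema.

```python
import csv
import io


def canopy_cover_percent(mask):
    """Percentage of pixels flagged as plant in a 2D mask of booleans."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask contains no pixels")
    plant = sum(1 for row in mask for pixel in row if pixel)
    return 100.0 * plant / total


def traits_csv(plot_name, percent):
    """Render one plot-level trait as CSV text (column names are assumed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["plot", "canopy_cover"])
    writer.writerow([plot_name, f"{percent:.2f}"])
    return buffer.getvalue()


# A 2x2 mask with one plant pixel gives 25% cover.
mask = [[True, False], [False, False]]
print(traits_csv("example plot", canopy_cover_percent(mask)))
```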

Full field mosaic stitching extractor

This extractor takes a day of stereo BIN files and creates tiled JPG/TIFF images as well as a map HTML page.

Input

  • Currently this should be run on ROGER as a job. The date is the primary parameter.

3D Scanner Extractors

PLY to LAS conversion extractor

This extractor converts PLY 3D point cloud files into LAS files. The LAS file will be placed in the same directory as the PLY file.

Input

  • Evaluation is triggered whenever a file is added to a dataset
  • Checks whether the file is a .PLY file

Output

  • The dataset containing the .PLY file will get a corresponding .LAS file
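The file check and the "same directory" rule can be sketched with pathlib. The helper names are hypothetical, the lower-case .las suffix is an assumption, and the conversion itself is deliberately left out because the library used for it is not stated here.

```python
from pathlib import Path


def is_ply(path):
    """True if the file name has a .ply extension (case-insensitive)."""
    return Path(path).suffix.lower() == ".ply"


def las_path_for(ply_path):
    """Return the LAS path next to the PLY file, in the same directory."""
    ply = Path(ply_path)
    if not is_ply(ply):
        raise ValueError(f"not a PLY file: {ply}")
    # Only the suffix changes, so the directory and base name are preserved.
    return ply.with_suffix(".las")


# Placeholder path for illustration.
print(las_path_for("scans/example.ply"))
```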

TERRA-REF Pipeline Bottlenecks

Computing:

  • Processing the data requires large amounts of computation
  • Each step in the TERRA-REF pipeline requires interaction with a database
  • RabbitMQ lacks workflow features
  • Complex dependencies

Development:

  • Monitoring and reprocessing are time-intensive
  • Difficult to add new algorithms
  • Not clear how to reuse and adapt components

Solution

Establish a generalized workflow that includes a template extractor, which will lower the barrier to entry for contributors and reduce the effort required of developers.
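One way to picture a template extractor is a small base class that fixes the check-then-process flow and leaves only two methods for a contributor to fill in. This is a sketch under assumptions: the class and method names are hypothetical and are not taken from the TERRA-REF code base.

```python
from abc import ABC, abstractmethod


class TemplateExtractor(ABC):
    """Hypothetical template: subclasses supply check() and process()."""

    @abstractmethod
    def check(self, filenames):
        """Return True if the dataset contains what this extractor needs."""

    @abstractmethod
    def process(self, filenames):
        """Do the work and return a list of output file names."""

    def run(self, filenames):
        # Shared flow: skip datasets that are not ready, otherwise process.
        if not self.check(filenames):
            return None
        return self.process(filenames)


class PlyToLasExtractor(TemplateExtractor):
    """Example subclass that only maps names; no real conversion is done."""

    def check(self, filenames):
        return any(name.lower().endswith(".ply") for name in filenames)

    def process(self, filenames):
        return [name[:-4] + ".las" for name in filenames
                if name.lower().endswith(".ply")]


print(PlyToLasExtractor().run(["example.ply", "notes.txt"]))
```

With this shape, adding a new algorithm means writing one subclass rather than reimplementing the surrounding plumbing.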

Intro to CCTools

The Cooperative Computing Tools (CCTools) help design and deploy scalable applications that run on hundreds or thousands of machines at once. Work Queue, a component of CCTools, is a framework for building large master-worker applications that span thousands of machines drawn from clusters, clouds, and grids.
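The master-worker pattern mentioned above can be illustrated on a single machine with Python's standard library. This is not the Work Queue API: in Work Queue the workers are separate processes on remote machines, whereas the sketch below uses local threads purely to show the shape of the pattern. The function names are hypothetical.

```python
import queue
import threading


def worker(tasks, results):
    """Pull tasks until a None sentinel arrives, pushing each result back."""
    while True:
        item = tasks.get()
        if item is None:
            break
        task_id, func, arg = item
        results.put((task_id, func(arg)))


def run_master(func, inputs, num_workers=4):
    """Master: hand out tasks, wait for the workers, collect results in order."""
    inputs = list(inputs)
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for task_id, arg in enumerate(inputs):
        tasks.put((task_id, func, arg))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so every thread exits
    for thread in threads:
        thread.join()
    collected = {}
    while not results.empty():
        task_id, value = results.get()
        collected[task_id] = value
    return [collected[i] for i in range(len(inputs))]


print(run_master(lambda x: x * x, [1, 2, 3]))
```

The sketch assumes every task succeeds; a production master would also need to handle failed or lost tasks, which is part of what a framework such as Work Queue is meant to take care of.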

CCTools Read the Docs

Concept Maps