Flywheel

From CNI Wiki
Revision as of 19:30, 20 June 2020 by imported>Wandell

The CNI uses Flywheel for data and computational management. The Flywheel system runs in the cloud, gives users access to their data, and runs several data processing steps by default. These include transferring data from the scanner to the system, converting DICOM data to NIfTI, and certain other functions.

This page describes Flywheel features that are specific to the CNI.

  • General Flywheel documentation can be found at this site.
  • A 5-minute video overview of Flywheel can be found on YouTube.

The CNI Flywheel site: cni.flywheel.io

Authentication

Authentication to Flywheel requires a valid (full-account) SUNetID and is handled through Google.

New users should inquire with CNI staff for account creation.

Data Organization

Flywheel captures data directly from the MR scanner via its reaper technology. As the scanner produces DICOM (and P-File) data, those data are automatically packaged and uploaded to Flywheel. For a typical session, acquisition data will already be in Flywheel before the session has ended.

The Flywheel reaper determines where to place data in the hierarchy using information either entered into the MR console by the user or obtained from the DICOM data.

Flywheel Hierarchy

Flywheel organizes data into the following hierarchy:

  • Group
  • Project
  • Subject
  • Session
  • Acquisition

The first three levels of hierarchy are set by the user at the time of data acquisition via a "Flywheel sort string". This string is entered into the Patient ID field on the scanner console.

Sort String

The Flywheel sort string is entered into the Patient ID field on the scanner console and has the following format:

 <subject_label>@<group>/<project>
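
As a rough illustration of how the three parts of a sort string relate to each other, the sketch below splits one with POSIX parameter expansion. `parse_sort_string` and the sample value `s001@cni/testproject` are hypothetical; this is not how the reaper itself is implemented, and it assumes a well-formed string containing exactly one "@" and one "/".

```shell
# parse_sort_string is a hypothetical helper, not part of Flywheel.
# It splits <subject_label>@<group>/<project> into its three parts.
parse_sort_string() {
    sort_string=$1
    subject=${sort_string%%@*}   # text before the first "@"
    rest=${sort_string#*@}       # text after the first "@"
    group=${rest%%/*}            # text before the first "/"
    project=${rest#*/}           # text after the first "/"
    printf 'subject=%s group=%s project=%s\n' "$subject" "$group" "$project"
}

parse_sort_string "s001@cni/testproject"
```

Running this prints `subject=s001 group=cni project=testproject`.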

Groups and Projects

When the reaper cannot associate a dataset with a group (i.e., no group is provided, or the group does not exist), it sends those data to the "unknown" group, which only Site Admins can access. Similarly, when the reaper cannot associate the data with a project, they go to an "Unsorted" project (case sensitive).

Thus, for a given piece of data coming in through a reaper:

 No Group, No Project --> group_id = "unknown", project = "Unsorted"
 No Group, Project --> group_id = "unknown", project = <Project>
 Group, No Project --> group_id = <Group>, project = "Unsorted"
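
The fallback table above can be sketched as a small shell function. `route` is a hypothetical helper written only to mirror the three rows; it checks whether a group or project was supplied, not whether it already exists in Flywheel.

```shell
# route is a hypothetical helper that mirrors the fallback table.
#   $1 = group (may be empty), $2 = project (may be empty)
route() {
    group=${1:-unknown}      # an empty group falls back to "unknown"
    project=${2:-Unsorted}   # an empty project falls back to "Unsorted"
    printf 'group_id=%s project=%s\n' "$group" "$project"
}

route "" ""              # no group, no project
route "" "testproject"   # no group, project
route "cni" ""           # group, no project
```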

Key Takeaways

  • If a project does not already exist within an existing group, even if that project is provided by the user during data collection, those data will land in the target group's "Unsorted" project.
  • If the "Unsorted" project does not already exist within a known group, it will be created by the reaper (or API) during ingestion.
  • If the "unknown" group does not exist on an instance, it will be created by the API.

Tags and Session Labels

Within the AdditionalPatientHistory section on the console, you can set the session label and tags using the following format (without the "<>"):

 label_<your_desired_session_label> 

and

 tag_<tag_name>
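
As a small worked example of the two formats, the sketch below fills in the placeholders. The values `baseline` and `pilot` are hypothetical and chosen only for illustration.

```shell
# "baseline" and "pilot" are hypothetical values, not CNI conventions.
session_label="baseline"
tag_name="pilot"

# Substitute the values for the <...> placeholders (the "<>" are dropped).
label_entry="label_${session_label}"
tag_entry="tag_${tag_name}"

printf '%s\n' "$label_entry"
printf '%s\n' "$tag_entry"
```

This prints `label_baseline` and `tag_pilot`, which is the text that would be typed into the console field.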

Protected Health Information

Protected Health Information (PHI) is considered High Risk Data according to the Stanford Data Classification Guidelines. PHI is any information that can be used to identify an individual and that relates to their past, present, or future health. By law, this information must be stored only in encrypted form and transmitted only through secure means. However, for research data intended for publication, PHI can be anonymized so that it is no longer considered "protected" and can therefore be released without harm. See: https://med.stanford.edu/irt/security/hipaa.html. Even though CNI is not a healthcare facility, we treat human subjects as if they were patients.

IT IS THE RESEARCHER'S RESPONSIBILITY TO ENSURE THAT THE DATA ACQUIRED AT CNI CONTAIN NO PHI.

PHI in MR DICOM images

Any information in the “header” (DICOM tags) that can identify the human subject is PHI (Protected Health Information).

This includes:

  • name (or initials)
  • address
  • Social Security Number
  • phone number
  • anything else that clearly identifies the human subject

Never use any such identifiers on scanners!

In particular, be careful about what you enter in these fields:

  • Subject code (part of Patient ID)
  • Exam description
  • Series description

Data Processing

Data are processed according to a given Project's "Gear Rules". For help with Gear Rules, please reach out to Michael Perry.

Data Migration from NIMS

Historical data, preserved in NIMS, are available for migration to Flywheel on an as-needed basis. Please inquire with Michael Perry for more information.

Downloading Data

There are several ways to download data from Flywheel, including the web UI, the command line interface (CLI), and the Flywheel SDK (which is available for both Python and MATLAB). The links below can help you get started with the various methods of export:



  • Download with the Flywheel CLI
    • See Downloading and Sherlock sections below


  • Flywheel SDK
    • MATLAB example of tar download
    • Python example of tar download


Downloading data with the CLI (Tips and Best Practices)

This section focuses on a few tips for using the CLI to download your data. For a complete overview of the Flywheel CLI, including how to get started, please visit the Flywheel CLI Documentation.

Download only the files you need

When downloading data from Flywheel using the CLI you can greatly speed up your downloads by excluding data types which are not needed for your analysis.


Example: Exclude P-File and DICOM data from a container download:
Most users do not need to download the raw scanner files (P-Files) or raw DICOM data. You can exclude certain data types from your downloads by using the `-e` flag with your CLI download, like so:

 fw download "cni/testproject/subject1/session1" -e pfile -e dicom

This tells the CLI to exclude any pfile and dicom files in the container. Note that you can use consecutive -e flags to exclude multiple data types.


Example: Download only NIfTI, BVEC, and BVAL files:
Most users are only interested in the data that will be input to their analysis pipelines. This is most often limited to three data types (nifti, bvec, and bval). You can use the following command with multiple include flags (`-i`) to accomplish exactly that:

 fw download "cni/testproject/subject1/session1" -i nifti -i bvec -i bval

This will generate an archive (.tar) file containing the requested hierarchy with only those files you explicitly need.
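
When the list of data types changes from one download to the next, the repeated flags can be generated rather than typed by hand. The sketch below is one way to do that; `build_download_cmd` is a hypothetical helper that only prints the command for review and does not run `fw`. It uses only the `-i` and `-e` flags shown in the examples above.

```shell
# build_download_cmd is a hypothetical helper. It prints (does not run) an
# fw download command, repeating the given flag once per data type.
#   $1 = flag to repeat (-i to include, -e to exclude)
#   $2 = source path
#   remaining arguments = data types
build_download_cmd() {
    flag=$1
    src=$2
    shift 2
    cmd="fw download \"$src\""
    for data_type in "$@"; do
        cmd="$cmd $flag $data_type"
    done
    printf '%s\n' "$cmd"
}

build_download_cmd -e "cni/testproject/subject1/session1" pfile dicom
build_download_cmd -i "cni/testproject/subject1/session1" nifti bvec bval
```

The two calls print the same exclude and include commands given in the examples above.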


Use quotes

Often your source path (that is, the group/project/subject/session string that describes the location of your data in Flywheel) will contain one or more spaces or special characters. To address that location correctly with the CLI, put quotes around the source path, like so:

 fw download "cni/testproject/subject1/session1"
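
To see why the quotes matter, the sketch below counts how many arguments the shell passes to a command with and without them. `count_args` is a hypothetical helper; the path, which contains a space in "T1w 1mm", is used only as sample input.

```shell
# count_args is a hypothetical helper that prints how many separate
# arguments the shell handed to it.
count_args() {
    printf '%s\n' "$#"
}

# Unquoted: the shell splits the path at the space into two arguments.
count_args test/Unsorted/s001/18591/T1w 1mm

# Quoted: the whole path arrives as a single argument.
count_args "test/Unsorted/s001/18591/T1w 1mm"
```

The first call prints 2 and the second prints 1. Without quotes, the CLI would receive a truncated source path followed by a stray extra argument.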


Download a single file with the CLI

If you have navigated to a container and want to download only a single file from it, the best way to do that is to use the CLI with the 'files' spec filter.


Example: Download a single NIfTI file from an acquisition container:

 fw download "test/Unsorted/s001/18591/T1w 1mm/files/18591_13_1.nii.gz"

The important bit here is the inclusion of "files" prior to the file name.
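
The path layout can be sketched as a small function that joins a container path and a file name with the "files" component between them. `file_path` is a hypothetical helper that only builds the string; it does not contact Flywheel.

```shell
# file_path is a hypothetical helper.
#   $1 = container path, $2 = file name
file_path() {
    printf '%s/files/%s\n' "$1" "$2"
}

file_path "test/Unsorted/s001/18591/T1w 1mm" "18591_13_1.nii.gz"
```

This prints `test/Unsorted/s001/18591/T1w 1mm/files/18591_13_1.nii.gz`, the same source path used in the example above.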


Using Sherlock and other Remotes

For users who wish to transfer data to other remote compute resources, like Sherlock, we suggest using the Flywheel Command Line Interface (CLI). The CLI allows you to perform many tasks, one of which is downloading data, with the ability to restrict to only certain data types (e.g., NIfTI files only) or exclude specific data types (e.g., no P-Files or DICOMs).


Downloading and installing the CLI on Sherlock

To download and install the CLI on Sherlock we will use the `wget` tool, then `unzip` to extract the CLI resource package, and finally modify our `.bashrc` file to add the fw binary as an alias in our environment.


Step 1: Get the URL for the CLI package, which can be found on the "Profile" page within the Flywheel interface. To grab the URL, find the CLI section on the Profile page, right-click the Linux CLI Download link, and choose "Copy Link Address" (or similar). Once you have the download link address, move to the next step.


Step 2: Log in to Sherlock and run the `wget` command, using the URL from Step 1 to download the CLI package.

 mkdir -p flywheel/cli
 cd flywheel/cli
 wget https://storage.googleapis.com/flywheel-dist/cli/7.1.0/fw-linux_amd64.zip


Step 3: Unpack the CLI archive and clean up the downloaded package.

 unzip fw-linux_amd64.zip
 mv linux_amd64/fw .
 rmdir linux_amd64
 rm -f fw-linux_amd64.zip


Step 4: Modify your `.bashrc` file to add the fw CLI command to your environment, and source it to make the alias active. Note that this only needs to be done once.

 echo "alias fw='$HOME/flywheel/cli/fw'" >> "$HOME/.bashrc"
 source "$HOME/.bashrc"
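
Note that bash does not expand aliases in non-interactive shells by default, so an alias defined in `.bashrc` may not be available inside a batch script. A minimal alternative sketch, assuming the binary lives in `$HOME/flywheel/cli` as in the steps above, is to put that directory on `PATH` instead. The variable name `FW_DIR` is an assumption made for this example.

```shell
# Assumes the fw binary was placed in $HOME/flywheel/cli (see Step 3).
FW_DIR="$HOME/flywheel/cli"

# Prepend the directory to PATH only if it is not already listed.
case ":$PATH:" in
    *":$FW_DIR:"*) ;;               # already on PATH, nothing to do
    *) PATH="$FW_DIR:$PATH" ;;
esac
export PATH
```

These lines can be placed in `.bashrc` or at the top of a job script. The `case` check keeps `PATH` from growing each time the file is sourced.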


Step 5: Once the above steps are complete, you should be able to log in using the CLI and use it as described in the official documentation: https://flywheelio.zendesk.com/hc/en-us/sections/360001596834-Command-Line-Interface. The best way to do this is to navigate to your profile page in Flywheel, make sure that you have generated an API Key, and use the login command text that is provided for you there.

 fw login <your API key>


Support

Michael Perry can help with many, if not all, issues related to Flywheel. If you need further support for Flywheel-specific issues not related to the CNI, please email support@flywheel.io.