Flywheel: Difference between revisions
imported>Wandell |
Revision as of 23:40, 6 November 2020
The CNI uses Flywheel for data and computational management. Data from the scanner are automatically uploaded to the Flywheel database. The Flywheel system runs on the Google Cloud Platform. Users download data either with a web browser or with one of the other methods provided by Flywheel (e.g., the command-line interface or the software development kit).
At the CNI, as data come into Flywheel they are automatically processed according to rules determined by the CNI staff. For example, upon ingest PHI data are removed, DICOM and raw PFile data are converted to NIfTI format, and metadata about the scan parameters are read and inserted into the database. Some labs may configure the initial processing of the data differently for individual projects. For help with customization, groups can contact Flywheel directly or ask Michael Perry (see below).
This page describes Flywheel features that are specific to the CNI. You can learn more about Flywheel using the links below:
- A 5 minute video overview of Flywheel is available on YouTube
- Manual pages and basic material about the system are maintained by Flywheel at this site.
- Links to videos and webinars are on the Flywheel Docs site
Log in to CNI's Flywheel site here: cni.flywheel.io
Authentication
Authentication to Flywheel at the CNI requires a valid (full-account) SUNetID.
New users should ask the CNI staff to create an account.
Data upload
Flywheel captures data directly from the MR scanner via its Connector technology. The scanner produces DICOM (and P-File) data, and those data are automatically packaged and uploaded to Flywheel. For a typical session, series data will be in Flywheel before the session has ended.
Flywheel Sort String (Good)
The Flywheel Connector determines where to place data in the hierarchy using information either entered into the MR console by the user or obtained from the DICOM data.
To make sure the data you collect is sent to the correct Flywheel project for your lab, you must enter a "Flywheel sort string" on the console.
Enter this string into the Patient ID field on the scanner console.
The string format is:
<subject_label>@<group>/<project>
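The format above can be checked before a scan with a small script. The following is a minimal sketch in Python; the names `SortString`, `parse_sort_string`, and `is_complete` are hypothetical helpers for illustration and are not part of Flywheel or the Connector.

```python
from typing import NamedTuple, Optional


class SortString(NamedTuple):
    """Parts of a '<subject_label>@<group>/<project>' string (illustrative only)."""
    subject_label: str
    group: Optional[str]
    project: Optional[str]


def parse_sort_string(patient_id: str) -> SortString:
    """Split a Patient ID entry into subject label, group, and project.

    Missing parts are returned as None. This mirrors the documented format;
    it is an assumption, not the Connector's actual parsing code.
    """
    subject, at_sign, routing = patient_id.strip().partition("@")
    if not at_sign:
        # No '@' present: only a subject label was entered.
        return SortString(subject, None, None)
    group, _, project = routing.partition("/")
    return SortString(subject, group or None, project or None)


def is_complete(sort_string: SortString) -> bool:
    """True when subject label, group, and project are all present."""
    return bool(sort_string.subject_label and sort_string.group and sort_string.project)


parsed = parse_sort_string("s001@cni/testproject")
print(parsed, is_complete(parsed))
```

A string such as `s001@cni` parses with `project` set to `None`, so `is_complete` returns `False` and the entry can be corrected before scanning.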
Flywheel Sort String (Bad)
If you do not enter the sort string correctly, the data will still be sent to Flywheel, but they will not be routed to the correct project. Instead, they will be assigned to the "unknown" group or the "Unsorted" project, depending on what Flywheel can determine.
- If the data are sent to the unknown group, you must ask the Site Admins (Michael, Laima) to retrieve your data.
- If the data are sent to the "Unsorted" project (case sensitive), you can find them and move them yourself.
Thus, for a given piece of data coming in through a Connector:
No Group, No Project --> group_id = "unknown", project = "Unsorted"
No Group, Project --> group_id = "unknown", project = <Project>
Group, No Project --> group_id = <Group>, project = "Unsorted"
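The fallback rules listed above can be expressed as a short function. This is a sketch in Python of the documented behavior only; `route` is a hypothetical name, and the real Connector may apply further checks (for example, whether the group actually exists) that are not described here.

```python
from typing import Optional, Tuple


def route(group: Optional[str], project: Optional[str]) -> Tuple[str, str]:
    """Return the (group_id, project) that incoming data would be assigned to.

    A missing group falls back to "unknown" and a missing project falls back
    to "Unsorted", following the rules listed in this section.
    """
    return (group or "unknown", project or "Unsorted")


print(route(None, None))
print(route(None, "testproject"))
print(route("cni", None))
```

When both values are supplied, the function returns them unchanged, which corresponds to a correctly entered sort string.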
Session Labels and Tags
In some cases, users want to control the name of the session rather than have Flywheel use the default name.
You can control the session name using the AdditionalPatientHistory section on the console.
Set the session label and tags using the following format (without the "<>"):
label_<your_desired_session_label>
Flywheel also lets you tag a session to make it easier to find with its search function.
You can set a tag by inserting this string in the AdditionalPatientHistory section.
tag_<tag_name>
Protected Health Information
Protected Health Information (PHI) is considered High Risk Data according to the Stanford Data Classification Guidelines. PHI is defined as any information that can be used to identify an individual and may relate to their past, present, or future health. By law, this information must be (a) stored in encrypted form and (b) transmitted only through secure means.
Anonymized research data for publication can be shared without harm. See: //med.stanford.edu/irt/security/hipaa.html. Although CNI participants are not medical patients, we treat human subject data with the same PHI status as if they were patients. Specifically, data uploaded to Flywheel are stripped of information in these fields:
- Patient ID field
- Date of Birth
- Medical Record Number
- Subject Name (first and last)
PIs, please note that we do look for PHI. Over the years we have found that users sometimes insert names, phone numbers, and other information that might identify the subject. When we find such information, we delete it. The most likely places that RAs and others insert this information are these fields:
- Subject code (part of Patient ID)
- Exam description
- Series description
Thus, we ask all CNI users: Never use any PHI identifiers when entering data into the scanner!
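Labs that want to check their own entries before scanning could run a simple screen over the free-text fields named above. The sketch below is in Python and is purely illustrative: the function name `flag_possible_phi` and the two patterns are assumptions, they catch only obvious phone numbers and slash-formatted dates, and they do not represent the checks the CNI staff actually perform.

```python
import re
from typing import Dict, List

# Hypothetical patterns for illustration; a real screen would need to be far more thorough.
PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def flag_possible_phi(fields: Dict[str, str]) -> Dict[str, List[str]]:
    """Return, per field name, the labels of any patterns found in its text."""
    flagged: Dict[str, List[str]] = {}
    for name, value in fields.items():
        hits = [label for label, pattern in PATTERNS.items() if pattern.search(value)]
        if hits:
            flagged[name] = hits
    return flagged


entries = {
    "Exam description": "call 650-555-1234",
    "Series description": "T1w 1mm",
}
print(flag_possible_phi(entries))
```

A screen like this cannot detect names, so it supplements the rule above rather than replacing it.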
At present, to describe the PHI methods in your publications, you might find this summary paragraph helpful.
Data at Stanford's Center for Cognitive and Neurobiological Imaging are securely transferred from the MR scanner directly to a data management system (Flywheel.io) that is running within a Google platform space that is approved for research data. Prior to transfer, the data file headers are stripped of fields that may contain subject information (patient id, DOB, MRN, name fields). These procedures meet the Stanford standard for anonymized research data for publication, and the resulting data can be shared without harm. See: //med.stanford.edu/irt/security/hipaa.html.
Note to CNI users: Flywheel is planning a substantial enhancement of the anonymization protocol so that each lab will be able to control the details of its own anonymization protocol. We will send a note describing the capabilities when the feature is released, which is expected in the first quarter of 2021.
Data Processing on Upload
At upload time data are processed according to a given Project's "Gear Rules". For help with Gear rules please reach out to Michael Perry.
Downloading Data
There are several ways to download data from Flywheel, including the web UI, the command line interface (CLI), and the Flywheel SDK (which is available for both Python and MATLAB). The links below can help you get started with the various methods of export:
UI Downloads
Flywheel SDK
Flywheel CLI
CLI: Tips and Best Practices for CNI Users
This section focuses on a few tips for using the CLI to download your data. For a complete overview of the Flywheel CLI, including how to get started, please look through the Flywheel CLI Documentation.
Tip: Download only the files you need
When downloading data from Flywheel using the CLI you can greatly speed up your downloads by excluding data types which are not needed for your analysis.
Example: Exclude pfile and DICOM data from a container download:
Most users do not need to download the raw scanner files (PFILES) or raw DICOM data. You can exclude certain data types from your downloads by using the `-e` flag with your CLI download, like so:
fw download "cni/testproject/subject1/session1" -e pfile -e dicom
This tells the CLI to exclude any pfile and dicom files in the container. Note that you can use consecutive -e flags to exclude multiple data types.
Example: Download only NIfTI, BVEC, and BVAL files:
Most users are only interested in the data that will be input to their analysis pipelines. This is most often limited to three data types (nifti, bvec, and bval). You can use the following command with multiple include flags (`-i`) to accomplish exactly that:
fw download "cni/testproject/subject1/session1" -i nifti -i bvec -i bval
This will generate an archive (.tar) file containing the requested hierarchy with only those files you explicitly need.
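When downloads are scripted, the include and exclude flags shown above can be assembled programmatically. The following Python sketch uses a hypothetical helper, `build_download_command`, and relies only on the `fw download` flags (`-i`, `-e`) described in this section; whether combining both flag types in one call is meaningful is not covered here.

```python
import shlex
from typing import Iterable, List


def build_download_command(
    source_path: str,
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> List[str]:
    """Build an argument list for 'fw download' with repeated -i / -e flags."""
    cmd = ["fw", "download", source_path]
    for data_type in include:
        cmd += ["-i", data_type]
    for data_type in exclude:
        cmd += ["-e", data_type]
    return cmd


cmd = build_download_command(
    "cni/testproject/subject1/session1",
    include=["nifti", "bvec", "bval"],
)
# shlex.join (Python 3.8+) renders the list as a shell-safe command line.
print(shlex.join(cmd))
```

Passing the returned list directly to `subprocess.run` avoids shell quoting problems, provided the `fw` binary itself (not just a shell alias) is reachable on your `PATH`.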
Example: Using quotes
Your source path (the group/project/subject/session string that describes the location of your data in Flywheel) will often contain one or more spaces or special characters. To address that location properly using the CLI, put quotes around the source path, like so:
fw download "cni/testproject/subject1/session1"
Example: Download a single file with the CLI
If you have navigated to a container and want to download just one file from it, the best way is to use the CLI with the 'files' spec filter.
Example: Download a single NIfTI file from an acquisition container:
fw download "test/Unsorted/s001/18591/T1w 1mm/files/18591_13_1.nii.gz"
The important bit here is the inclusion of "files" prior to the file name.
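The single-file path and its quoting can also be generated in a script. This Python sketch assumes the `<container>/files/<filename>` layout shown in the example above; `file_source_path` is a hypothetical helper, not part of the Flywheel CLI or SDK.

```python
import shlex


def file_source_path(container_path: str, filename: str) -> str:
    """Join a container path and a file name with the 'files' segment between them."""
    return f"{container_path.rstrip('/')}/files/{filename}"


path = file_source_path("test/Unsorted/s001/18591/T1w 1mm", "18591_13_1.nii.gz")
# shlex.quote adds quotes only when needed, e.g. for the space in "T1w 1mm".
print("fw download " + shlex.quote(path))
```

Because the acquisition label contains a space, `shlex.quote` wraps the whole path in single quotes, which has the same effect as the double quotes used in the examples in this section.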
Using Sherlock and other Computers
We suggest that you use the Flywheel Command Line Interface (CLI) to transfer data to other compute resources, like Sherlock.
To download and install the CLI on Sherlock we use the `wget` tool, then `unzip` to extract the CLI resource package, and finally modify our `.bashrc` file to add the fw binary as an alias in our environment.
Step 1: Get the URL for the CLI package, which can be found on the "Profile" page within the Flywheel interface. To grab the URL, find the CLI section on the Profile page, right-click the Linux CLI Download link, and choose "Copy Link Address" (or similar). Once you have the download link address move to the next step.
Step 2: Log in to Sherlock and run the `wget` command, using the URL from Step 1 to download the CLI package.
mkdir -p flywheel/cli
cd flywheel/cli
wget //storage.googleapis.com/flywheel-dist/cli/<version>/fw-linux_amd64.zip
Step 3: Unpack the CLI archive and clean up the downloaded package.
unzip fw-linux_amd64.zip
mv linux_amd64/fw .
rmdir linux_amd64
rm -f fw-linux_amd64.zip
Step 4: Modify your `.bashrc` file to add the fw CLI command to your environment, and source it to make the alias active. Note that this only needs to be done once.
echo -e "alias fw='$HOME/flywheel/cli/fw'" >> $HOME/.bashrc
source $HOME/.bashrc
Step 5: Once the above steps are complete, you should be able to log in using the CLI and use it as described in the official documentation: //flywheelio.zendesk.com/hc/en-us/sections/360001596834-Command-Line-Interface. The best way to do this is to navigate to your profile page in Flywheel, make sure that you have generated an API key, and use the login command text that is provided for you there.
fw login <your API key>
Data Migration from NIMS
Before there was Flywheel, there was NIMS. Data have been preserved in NIMS and they can be migrated to Flywheel on an as-needed basis. Please inquire with Michael Perry for more information.
Support
Michael can help with most CNI Flywheel issues. If you need support for Flywheel issues not related to the CNI, please email Flywheel's help line: support@flywheel.io.