Are you doing enough to ensure your data is secure, protected, and ready to be shared when you've finished your research? You can work backwards: identify where your data will end up (your university's repository, a specialized repository, or your funder's repository). Review its data submission forms and take notes on which formats are accepted, what metadata must be included, and what services are offered. Then review your existing protocols and procedures to ensure a seamless workflow.
Common Issues | Best Practices |
Lab instruments, software, and file types are proprietary; data can only be read or analyzed on the original instrument | Use the machines and software most common in your discipline |
Lab instruments and software are designed to run on older operating systems | Keep copies of the OS and software; be aware of compatibility issues caused by software updates |
Machines and software are not backwards compatible | Avoid upgrading software systems |
Exported/converted data cannot be manipulated (data loss) | Save both original, uncompressed data files and exported data |
Manufacturer mergers and acquisitions can lead to discontinued/unsupported products | |
Data documentation and organization | |
Common Issues | Best Practices |
Inconsistently labeled files and folders | Use file naming conventions |
Poor version control | Use version control software, or dates in file/folder names |
Files saved in multiple locations | Use permanent data identifiers to prevent duplication |
Metadata is missing crucial information | Use disciplinary metadata standards |
No team protocols or procedures | Document workflow, build a data dictionary, write data documentation procedures |
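The file naming and versioning practices above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the pattern `project_description_YYYY-MM-DD_vNN.ext` and the function names `build_filename` and `follows_convention` are hypothetical, not a standard prescribed by any repository. Adapt the pattern to whatever your team documents.

```python
import re
from datetime import date

# Assumed convention (hypothetical): project_description_YYYY-MM-DD_vNN.ext
NAME_PATTERN = re.compile(
    r"[a-z0-9-]+_[a-z0-9-]+_\d{4}-\d{2}-\d{2}_v\d{2,}\.[a-z0-9]+"
)


def _clean(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, when: date,
                   version: int, ext: str) -> str:
    """Build a file name that follows the assumed convention."""
    extension = ext.lstrip(".").lower()
    return (f"{_clean(project)}_{_clean(description)}_"
            f"{when.isoformat()}_v{version:02d}.{extension}")


def follows_convention(filename: str) -> bool:
    """Return True if an existing file name matches the assumed convention."""
    return NAME_PATTERN.fullmatch(filename) is not None


# Example:
# build_filename("Soil Study", "pH readings", date(2024, 3, 5), 2, "csv")
# returns "soil-study_ph-readings_2024-03-05_v02.csv"
```

Putting the ISO date and a zero-padded version number in the name keeps files sorted chronologically in an ordinary directory listing, which is a lightweight fallback when version control software is not available.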
Data storage | |
Common Issues | Best Practices |
Digital data are stored on local hard drives and not backed up | Back up 3 copies of everything (original + external/local + external/remote) |
Data are stored in a cloud service owned by a private company | Use both hard drive and cloud storage |
Machines are set to overwrite data to clear space | Change settings or increase storage capacity |
Files are kept on common drives | Establish access levels for files and folders |
Specimens, samples, etc. are not secure; paper lab notebooks can be damaged or lost | Have policy on keeping hard copies/specimens physically secure |
Data have to be shared with remote collaborators/rotating lab personnel | Train personnel on their roles in creating, storing, and taking responsibility for data security |
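A backup is only useful if the copy matches the original. The sketch below copies one file to several backup directories and compares SHA-256 checksums after each copy. The function names `sha256_of` and `back_up` are hypothetical, and the backup directories stand in for whatever local and remote locations your lab actually uses (for example, a mounted external drive and a synced network share).

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original: Path, backup_dirs: list[Path]) -> list[Path]:
    """Copy `original` into each backup directory and verify every copy.

    Raises OSError if a copy's checksum differs from the original's.
    """
    expected = sha256_of(original)
    copies = []
    for directory in backup_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / original.name
        shutil.copy2(original, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Calling `back_up(Path("raw.dat"), [Path("/mnt/external"), Path("/mnt/remote")])` with two illustrative mount points would give the three copies suggested in the table: the original plus one local and one remote backup.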
Common Issues | Best Practices |
Sensitive data are not encrypted or anonymized | Anonymize data using randomly generated IDs (not IDs based on subject or experiment characteristics) |
Patient consent forms do not cover re-use, re-purposing, or sharing | Obtain permission from participants to make data publicly available |
Copyrighted material is used without permission to distribute derivatives | Review all relevant federal and state laws; seek copyright permissions |
Certain data cannot be deposited without violating patient privacy | Encrypt and store data for verification purposes only |
Certain data cannot be shared for national security reasons/trade secrets/patents | Encrypt and store data for verification purposes only |
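The random ID practice from the table above might look like the following minimal sketch. It assumes subjects are identified by a single string, and the function name `assign_random_ids` and the `P-` prefix are hypothetical. The IDs come from Python's `secrets` module, so they carry no information about the subject or the experiment.

```python
import secrets


def assign_random_ids(subjects: list[str], prefix: str = "P") -> dict[str, str]:
    """Map each distinct subject identifier to a random, unrelated ID.

    The returned lookup table re-identifies subjects, so it should be
    stored separately from the data and protected (for example, encrypted).
    """
    key: dict[str, str] = {}
    used: set[str] = set()
    for subject in subjects:
        if subject in key:
            continue  # the same subject always keeps the same ID
        new_id = f"{prefix}-{secrets.token_hex(4)}"
        while new_id in used:  # regenerate in the unlikely event of a collision
            new_id = f"{prefix}-{secrets.token_hex(4)}"
        used.add(new_id)
        key[subject] = new_id
    return key
```

Replacing names with random IDs removes only the direct identifier. Other fields, such as dates or locations, may still need review before the data can be considered anonymized.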
Common Issues | Best Practices |
Researcher keeps the only copy of the data | Make copies of the data |
Others can access the data only by personal request (researcher gets to vet uses) | Place in an institutional or disciplinary repository so it can be discovered and accessed |
Data are shared, but without the metadata or instructions that others need to understand or re-use them | Include workflows and metadata |
Process for accessing data is complicated or confusing | Set permissions while depositing data |
Placing your work in a public repository comes with added benefits, like fixity checks, metadata assistance, format migration, permissioning, backups, and increased discoverability.