University Libraries

Research Data Management Basics

This guide covers practical tips and best practices for managing project data.

Document & Describe

Documentation describes your research project and the data you have collected, generated, or analyzed.
Documenting your data:

  • enables you and others to understand your data in detail,
  • allows other researchers to find, use, and properly cite your data. 

It is critical to begin documenting your data at the very start of your research project, even before data collection begins. Doing so makes documentation easier and reduces the likelihood that you will forget details of your data later.

Documentation is often produced in a readme.txt file, a data dictionary, or a codebook.

Below are some resources that provide guidelines for creating these documents. At minimum, store documentation in a readme.txt file or the equivalent, together with the data files.  

Data documentation can include information such as:

  • Title of dataset, creation date.
  • Investigator names, keywords.
  • Purpose of study, research questions, hypotheses.
  • File formats, content, size, relationship among files. 
  • Data source, provenance, copyright permissions. 
  • Data identifiers (DOI, URI).
  • Information on confidentiality, access & use conditions.
  • Variable names and description.
  • Explanation of codes and classification schemes.
  • Software used (including version).
  • Sampling techniques, methodology, experimental protocols. 
  • Equipment/instrument settings. 
  • Software syntax, code. 
  • Associated presentations/manuscripts/articles.
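As an illustration only, a few of the items above can be assembled into a plain-text readme with a short script. This is a minimal sketch: the function name build_readme, the field labels, and all sample values are hypothetical and should be replaced with details from your own project.

```python
def build_readme(title, created, investigators, purpose, software, files, variables):
    """Return plain-text readme content covering a subset of the items listed above."""
    lines = [
        f"Title of dataset: {title}",
        f"Creation date: {created}",
        f"Investigators: {', '.join(investigators)}",
        f"Purpose of study: {purpose}",
        f"Software used: {software}",
        "",
        "Files:",
    ]
    # One line per data file: name, then a short description of its content.
    lines += [f"  {name}: {desc}" for name, desc in files.items()]
    lines += ["", "Variables:"]
    # One line per variable: name, then its description and units or codes.
    lines += [f"  {name}: {desc}" for name, desc in variables.items()]
    return "\n".join(lines) + "\n"


# Hypothetical example values, for illustration only.
readme_text = build_readme(
    title="Example stream survey",
    created="2024-01-15",
    investigators=["Jane Doe", "John Roe"],
    purpose="Measure seasonal change in water temperature",
    software="Python 3.12",
    files={"survey.csv": "one row per site visit"},
    variables={"temp_c": "water temperature in degrees Celsius"},
)
print(readme_text)
```

Saving the returned text as readme.txt in the same folder as the data files keeps the documentation together with the data, as recommended above.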

Metadata is data about your data, and it is also a form of documentation. When writing metadata, researchers can choose among various metadata standards, which are often tailored to a particular file format or discipline. For example, Dublin Core, maintained by the Dublin Core Metadata Initiative (DCMI), is a simple metadata standard used to describe a variety of resources (texts, images, bibliographic records, etc.).
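To make the idea concrete, here is a minimal sketch of a Dublin Core style record written as XML with Python's standard library. The element names (title, creator, date, format, identifier, rights) come from the Dublin Core element set and its namespace; the wrapper element "metadata", the function name, and all sample values are assumptions made for illustration. A repository may expect a different wrapper or a richer schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core element names and values to an XML string."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical example values, for illustration only.
record_xml = dublin_core_record({
    "title": "Example stream survey",
    "creator": "Doe, Jane",
    "date": "2024-01-15",
    "format": "text/csv",
    "identifier": "example-dataset-001",
    "rights": "CC BY 4.0",
})
print(record_xml)
```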

Guide to Writing Metadata (USGS)

Metadata and describing data (Cornell RDMSG)

Directory of disciplinary metadata (DCC) includes metadata standards for biology, earth science, physical science, social science and humanities, and general research data.

Selected metadata standards are grouped by discipline: General; Geosciences; Life Sciences; Social Sciences & Humanities.

When documenting and sharing research data, it is essential to remove direct and indirect identifiers from your data. Below are some guides that describe these identifiers and methods for removing them.

For general guidelines, see Guide to Social Science Data Preparation and Archiving from ICPSR.
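As a rough illustration of the idea, the sketch below drops direct identifiers and coarsens indirect ones in a small table. It is an assumption-laden example, not a complete de-identification method: the column names, the choice of which columns count as direct or indirect identifiers, and the generalization rules (10-year age bands, 3-digit postal prefixes) are all hypothetical, and real projects should follow guidance such as the ICPSR guide above.

```python
# Hypothetical column names treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "email"}


def deidentify(rows):
    """Drop direct identifiers and generalize two indirect ones (age, zip)."""
    cleaned = []
    for row in rows:
        # Remove columns that identify a person on their own.
        out = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        # Replace exact age with a 10-year band, e.g. 34 becomes "30-39".
        low = int(out.pop("age")) // 10 * 10
        out["age_band"] = f"{low}-{low + 9}"
        # Keep only the first three characters of the postal code.
        out["zip3"] = out.pop("zip")[:3]
        cleaned.append(out)
    return cleaned


# Hypothetical example record, for illustration only.
raw_rows = [
    {"name": "Jane Doe", "email": "jane@example.org",
     "age": "34", "zip": "12345", "score": "7"},
]
clean_rows = deidentify(raw_rows)
print(clean_rows)
```

Generalizing values rather than deleting them keeps the data usable for analysis while making it harder to single out an individual by combining fields.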