George Mason University | University Libraries

Working with Data

What you need to know for Data Management and Data Wrangling

File Formats

File Extensions

Common File Types

What kind of file(s) do I have? Look at the file extension. Starting from the top of the table, find the first row that matches any of your files.

| If you have one of these... | File Type | What to do |
| --- | --- | --- |
| .zip, .tar, .gz, .rar | Zip File | Unzip it; see the "Zip Files" tab for instructions |
| .sav, .por, .dta, .xpt, .stc, .sas7bdat | Software-specific | See the box "Converting Data" |
| .sps, .do, .dct, .sas, .R | Syntax files, used to import | See the tab "Setup Files" |
| .xls, .xlsx, .ods, .mdb, .dbf | Spreadsheet or Database | Most software can import these |
| .csv, .tsv, .tab | Delimited Text (e.g., ASCII) | Most software can import these |
| .txt, .dat | Other Text (e.g., ASCII) | Look at the documentation |
| .xml, .json, .html | Structured Text or Markup | Look at the documentation |
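
The "first matching row" lookup can also be done programmatically when you have many files to sort through. The following is a minimal Python sketch, not part of the original guide: the names FILE_TYPE_ROWS and classify_files are hypothetical, and the rows simply mirror the table in the same top-to-bottom order.

```python
from pathlib import Path

# Rows mirror the table, in the same top-to-bottom order.
# Extensions are stored in lowercase so the match is case-insensitive.
FILE_TYPE_ROWS = [
    ({".zip", ".tar", ".gz", ".rar"}, "Zip File"),
    ({".sav", ".por", ".dta", ".xpt", ".stc", ".sas7bdat"}, "Software-specific"),
    ({".sps", ".do", ".dct", ".sas", ".r"}, "Syntax file"),
    ({".xls", ".xlsx", ".ods", ".mdb", ".dbf"}, "Spreadsheet or Database"),
    ({".csv", ".tsv", ".tab"}, "Delimited Text"),
    ({".txt", ".dat"}, "Other Text"),
    ({".xml", ".json", ".html"}, "Structured Text or Markup"),
]


def classify_files(filenames):
    """Return the file type of the first table row matching any file, or None."""
    # Path.suffix keeps only the last extension, so "data.tar.gz" yields ".gz".
    extensions = {Path(name).suffix.lower() for name in filenames}
    for row_extensions, file_type in FILE_TYPE_ROWS:
        if extensions & row_extensions:
            return file_type
    return None


print(classify_files(["survey.dta", "codebook.txt"]))  # prints "Software-specific"
```

Because the rows are checked in order, a download that contains both a .dta file and a .txt codebook is treated as software-specific data rather than plain text.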

Zip Files

Files are compressed, or zipped, to save storage space and/or to bundle several files together. 

Instructions on how to unzip files:

Even though you can see what is in the .zip file, you must unzip the file before you can open the individual files in the archive.

If the extension is ".tar.gz", you will likely need to use 7-Zip or another program to unzip it. Unzip it twice: once to remove the .gz compression, and a second time to extract the files from the .tar archive.
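
If you would rather script the extraction than use a desktop tool, the Python standard library can handle both cases. This is a minimal sketch under stated assumptions: the function name extract_archive is hypothetical, and the archive is assumed to come from a source you trust, since extracting untrusted archives can overwrite files.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .zip, .tar, or .tar.gz archive into dest_dir.

    Returns the list of member names found in the archive.
    """
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(dest_dir)
            return zf.namelist()

    if tarfile.is_tarfile(archive_path):
        # Mode "r:*" detects the compression automatically, so a .tar.gz
        # file is decompressed and unpacked in a single step.
        with tarfile.open(archive_path, "r:*") as tf:
            tf.extractall(dest_dir)
            return tf.getnames()

    raise ValueError(f"Not a zip or tar archive: {archive_path}")
```

A call such as extract_archive("download.tar.gz", "unzipped") (both names are placeholders) would leave the individual files in the "unzipped" folder. Unlike a two-pass desktop tool, tarfile removes the .gz layer and unpacks the .tar in one call.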

Setup Files

In some cases, you will need to download data in ASCII (text) format and use a setup file (typically .sps, .do or .sas) to get it into your software. 

Many statistical packages will not run a setup file unless you turn off the Windows default setting that hides file extensions.

Comma Separated Values (CSV) is a useful file format for spreadsheets. It can be opened easily in programs like Excel.
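
Delimited text can also be read directly in code. As a small illustration (the sample data and the function name read_delimited are made up for this sketch), Python's built-in csv module parses comma- or tab-separated text into rows keyed by the header line:

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of dict rows keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Made-up sample data: a header line followed by two records.
sample = "id,name,score\n1,Ana,90\n2,Ben,85\n"
rows = read_delimited(sample)
print(rows[0]["name"])  # prints "Ana"
```

Every value comes back as a string, so numeric columns such as score need to be converted before analysis. For tab-delimited files (.tsv, .tab), pass delimiter="\t".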

Converting Data (SPSS, Stata, SAS, R)

Data File Extensions

  • SPSS: .sav (.por is the SPSS "portable" format)
  • Stata: .dta (for .dct, see "Setup Files")
  • SAS: .sas7bdat, .xpt (for .stc files, use PROC CIMPORT)

What software do you want to use?

SPSS or SAS

Both SPSS and SAS can open files from, and save files to, SPSS, Stata, and SAS formats without additional steps.

Stata

  • from SPSS: On Windows, you can open .sav files with usespss (see the "SPSS to Stata: Using Stata" tab).
  • from SAS: For SAS Transport files (.xpt), use import sasxport.
  • Alternatively, you can use Stat/Transfer or R. See tabs in this box for more info.

Stat/Transfer

Data Services has a program called Stat/Transfer that can convert data files from one software format to another.

  • Stop by the Data Services Lab or email datahelp@gmu.edu for assistance.
  • Stat/Transfer is also available in the Founders Hall computer lab (Arlington Campus).

SPSS to Stata: Using Stata

Best option, but WINDOWS ONLY

1. Open Stata on a Windows computer

2. In the Command window, submit the following commands:

set more off
net from http://radyakin.org/transfer/usespss/beta
net install usespss
usespss


3. Select the SPSS .sav file in the dialog box that will pop up. 

4. Save the data file [in Stata format]

SPSS to Stata: Using R

1. Download the installation file from CRAN ( pdf with details )

2. Install then open R  ( screenshots )

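3. At the prompt (>), paste the following lines and press Enter:

# The foreign package provides read.spss() and write.dta()
library(foreign)
# Opens a file dialog; choose the SPSS .sav file
file <- file.choose()
spss <- read.spss(file)
data <- as.data.frame(spss)
# Carry the SPSS variable labels over so write.dta() can store them
attr(data, "var.labels") <- as.vector(attr(spss, "variable.labels"))
# Replace only the trailing .sav extension with .dta
write.dta(data, sub("\\.sav$", ".dta", file, ignore.case = TRUE))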


4. Select the SPSS .sav file in the dialog box that will pop up.

5. Look for the Stata (.dta) file in the same directory. 
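
The converted file keeps the original name and folder; only the extension changes. A minimal Python sketch of that naming rule follows (the function name stata_output_path and the example path are hypothetical):

```python
from pathlib import PurePath


def stata_output_path(source_path):
    """Return the path of the .dta file written next to the source file."""
    # with_suffix() swaps only the final extension, so a name such as
    # "my.saved.sav" becomes "my.saved.dta".
    return PurePath(source_path).with_suffix(".dta")


print(stata_output_path("data/survey.sav").name)  # prints "survey.dta"
```

Swapping only the final extension matters for file names that contain ".sav" elsewhere in the name, where a plain find-and-replace could change the wrong part.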

Notes:

  • User Missing values are kept as is and not converted to Stata missing values.
  • String variables are "encoded" into labeled numeric variables.
  • String variables with labels do not transfer properly. 

SAS to Stata: Using R

1. Download the installation file from CRAN ( pdf with details )

2. Install then open R  ( screenshots )

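3. At the prompt (>), paste the following lines and press Enter:

# Install haven (needed only once); it provides read_sas() and write_dta()
install.packages("haven")
library(haven)
# Opens a file dialog; choose the SAS .sas7bdat file
file <- file.choose()
sas <- read_sas(file)
# Replace only the trailing .sas7bdat extension with .dta
write_dta(sas, sub("\\.sas7bdat$", ".dta", file, ignore.case = TRUE))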


4. Select the SAS .sas7bdat file in the dialog box that will pop up.

5. Look for the Stata (.dta) file in the same directory.