The Office of Research Computing focuses on High Performance Computing (HPC) to deal with large amounts of data.
E-mail: orchelp@gmu.edu
       (use @gmu.edu address)
"Big data can be described in terms of data management challenges that – due to increasing volume, velocity and variety of data – cannot be solved with traditional databases."
With the widespread use of technologies such as social media and e-commerce comes a large amount of data. In fact, so much data is now generated at such a rapid pace that it is impossible to store and analyze it all through traditional means (e.g., relational databases, basic statistics). Instead, the "Big Data" that is generated essentially gets dumped into giant repositories to be dealt with later. A substantial degree of skill is required to sift through this data (usually distributed across multiple computers working in parallel) and detect patterns within it (i.e., data mining).
Below are some generalities that may be useful for evaluating your data file. These terms do not have widely shared definitions, so they are just guidelines.
For reference, most personal computers these days (2018) have a 500GB-1TB hard drive and 4-8GB of RAM (sometimes 16GB).
* Most statistical software loads the entire dataset into RAM. SAS is a notable exception, which is one reason it is popular in government and finance.
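As a rough illustration of the RAM guideline above, the following Python sketch compares a data file's size on disk with a share of the machine's RAM. The function names, the 8GB default, and the 50% headroom factor are assumptions chosen for this example, not established thresholds. A dataset often takes more space in memory than on disk, so treat the result as a first estimate only.

```python
import os

GB = 1024 ** 3  # bytes in one gigabyte (binary convention)


def size_fits(size_bytes, ram_gb=8, headroom=0.5):
    """Return True if size_bytes is within the usable share of RAM.

    ram_gb and headroom are illustrative assumptions: headroom=0.5
    reserves half of RAM for the operating system and other programs.
    """
    budget_bytes = ram_gb * GB * headroom
    return size_bytes <= budget_bytes


def file_fits_in_ram(path, ram_gb=8, headroom=0.5):
    """Apply the same check to a file on disk, using its on-disk size."""
    return size_fits(os.path.getsize(path), ram_gb, headroom)
```

For example, `size_fits(6 * GB, ram_gb=8)` returns `False`, which suggests that a 6GB file is likely too large to analyze comfortably on a computer with 8GB of RAM using software that loads the whole dataset into memory.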
Copyright © George Mason University