This research guide provides a list of external data repositories and registries to support research in Science, Technology, Engineering, Mathematics and Medicine.

- The **first five headings** in the table of contents to the left of this page provide general information and assistance for using datasets.
- The **remainder of the headings** provide links to web sites with datasets relating to a specific subject area.

Data files are typically provided in **ASCII, SAS, SPSS or Stata format**. Using the associated metadata, users import the data into a spreadsheet or statistical software program for analysis. **Certain data manipulation and data analysis skills are required to download and use these datasets.**

**Other useful datasets may be found in the Library Guide: Datasets for Research in Social Science**

Understanding some **basic terminology** will help you to determine whether or not you need statistics, data or both.

**Statistics** are data that have already been analyzed and processed to produce information in an easy-to-read format such as charts, tables, and graphs. An example is the Statistical Abstract of the United States. If you're looking for a quick number, it's best to start with statistics.

**Data** are typically raw observations that need to be manipulated using software. Data can be quantitative, qualitative, spatial, etc. The difference between data and statistics can be confusing because, in everyday language, the two terms are often used interchangeably.

**Numeric Data** are data made up of numbers. They are processed using statistical software such as SPSS, Stata, or SAS.

**Qualitative Data** are data that describe a property or attribute. Examples of qualitative data include interviews, case studies, and comments collected on a questionnaire.

**More Terminology:**

**Codebook** provides information on the structure, contents, and layout of a data file.
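To make the role of a codebook concrete, here is a minimal Python sketch of how a layout taken from a codebook might be used to read a fixed-width ASCII data file. The variable names, column positions, and records are all invented for illustration; a real codebook will specify its own layout.

```python
# Hypothetical codebook layout: variable name -> (start column, end column, type).
# Columns are 1-based and inclusive, as codebooks commonly print them.
CODEBOOK = {
    "respondent_id": (1, 4, int),
    "age": (5, 6, int),
    "state": (7, 9, str),
}

# Invented fixed-width ASCII records, one respondent per line.
RAW_LINES = [
    "000134QLD",
    "000258NSW",
    "000307VIC",
]


def parse_record(line, codebook):
    """Slice one fixed-width line into named, typed fields."""
    record = {}
    for name, (start, end, cast) in codebook.items():
        # Convert 1-based inclusive columns to a Python slice.
        text = line[start - 1:end].strip()
        record[name] = cast(text)
    return record


records = [parse_record(line, CODEBOOK) for line in RAW_LINES]
for rec in records:
    print(rec)
```

Without the codebook, the line `000134QLD` is just a string of characters; the layout is what says which columns hold the identifier, the age, and the state.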

**Data Archive** preserves and makes accessible research data. Some examples are ICPSR, CPANDA, and CIESIN.

**Microdata** are data at the lowest level of observation, such as individual answers to questions. For example, the U.S. Census Bureau's Public-Use Microdata Samples (PUMS files) are data sets of individual housing unit responses to census questions.

**Primary Data** are data you collect directly in your own research study, using instruments such as surveys and observations.

**Raw Data** are the actual observations that are made when the data are collected.

**Secondary Data** are data from a research study conducted by someone else. Usually, when you are asked to locate statistics on a topic, you are using secondary data. An example of secondary data is the set of statistics from the Census of Population and Housing.

**Summary Data** are data that have been processed or summarized (see **statistics**). For example, the tables you read when using statistical sources are summary data.
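The relationship between microdata and summary data can be illustrated with a short Python sketch. The household records below are invented, and the grouping by a `region` field is only an assumption about how such a file might be organized.

```python
from collections import defaultdict
from statistics import mean

# Invented microdata: one row per household (the lowest level of observation).
MICRODATA = [
    {"household": 1, "region": "North", "persons": 3},
    {"household": 2, "region": "North", "persons": 5},
    {"household": 3, "region": "South", "persons": 2},
    {"household": 4, "region": "South", "persons": 4},
    {"household": 5, "region": "South", "persons": 6},
]


def summarize(rows):
    """Collapse household-level rows into one summary row per region."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["region"]].append(row["persons"])
    return {
        region: {"households": len(values), "mean_persons": mean(values)}
        for region, values in groups.items()
    }


summary = summarize(MICRODATA)
for region, stats in summary.items():
    print(region, stats)
```

The five individual rows are microdata; the two-row result, with a count and an average per region, is the kind of summary data found in a published statistical table.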

**Time Series** is a sequence of data points ordered in time, usually recorded at regular intervals.
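As a small illustration, a time series can be represented in Python as a list of (date, value) pairs. The monthly values below are invented; the sketch simply orders the points by date and computes the change from one observation to the next.

```python
from datetime import date

# Invented monthly observations: (observation date, measured value).
SERIES = [
    (date(2020, 3, 1), 9.0),
    (date(2020, 1, 1), 10.0),
    (date(2020, 4, 1), 15.0),
    (date(2020, 2, 1), 12.0),
]


def period_changes(series):
    """Sort the points by date and return the change between consecutive points."""
    ordered = sorted(series)
    return [
        (later_date, later_value - earlier_value)
        for (_, earlier_value), (later_date, later_value) in zip(ordered, ordered[1:])
    ]


changes = period_changes(SERIES)
for when, delta in changes:
    print(when.isoformat(), delta)
```

The ordering step matters: the changes are only meaningful once the observations are arranged in time order.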

This research guide identifies electronic datasets to support statistical research in the science, technology, engineering, medicine and mathematics fields.

- Last Updated: Mar 17, 2021 4:15 PM
- URL: https://libguides.library.qut.edu.au/STEMdatasets

Subjects:
Engineering / Aerospace engineering, Engineering / Civil engineering, Engineering / Electrical engineering, Engineering / Mechanical engineering, Engineering / Medical engineering, Information technology / Computer science, Information technology / Information systems, Mathematics, Science / Biological sciences, Science / Chemistry, Science / Earth science, Science / Physics

Tags:
data, data mining, data repositories, datasets, engineering, mathematics, maths, science, technology


Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Australia License.

QUT acknowledges the Traditional Owners of the lands where QUT now stands.