Data integration - dataset joining by common variable: Creating a Single Table using Mutiple Datasets
Other features illustrated: Aggregate data multi-variable recode, computational calculation
Background: Different datasets may have variables whose value sets are defined alike(comparable). Examples include demographic variables (e.g. gender, race, marital status) and geographic variables (e.g. FIPS state, FIPS county). Co-tabulation allows you to tabulate information from the different datsets side by side using the comparable variable as a hinge (one of the dimensions).
Goals: 1) Estimate the number of people per gas station in each county in Maryland. 2) Add the number of households without vehicles to the table.
Other features illustrated: Using percent buttons, sorting on a column, creating a timeseries graph from a table cell(s), creating a table that averages across time, creating a table that shows data over time, turning off/on Automatic Totals, cross-variable explosion, nesting vs. not nesting
Background: In microdata datasets, you can create new variables with values that are based on the combinations of multiple variables' values.
Goals: 1) Determine the percentage of people with and without health care coverage by age groups. 2) Examine health insurance coverage numbers over time.
Other features illustrated: Viewing underlying records, clearing one dimension, changing universe from the spreadsheet
Background: If a dataset has address point information that has been defined correctly, points can be mapped on top of a thematic map from a separate dataset.
Goals: 1) Show the locations of all the public schools in DC on top of a thematic map showing the median household income by tracts from SF3. 2) Show how you can view the underlying records from the spreadsheet to see all the information for the universe of the selected cell(s).
Other features illustrated: Summing calculation, hiding columns, creating column spanners, viewing hidden columns, IF conditions
Background: This example shows how you can use ranking and apply weights to those ranks in order to answer questions like "What is the best county for ...?"
Goals: 1) Create a table that will rank the best county in the US in which to live, based on our 3 criteria - commute time, median household income, and median housing value. 2) Limit our rankings to counties that meet our population size criteria - between 50,000 and 150,000 people.
Other features illustrated: Advanced SQL
Background: You can have your table only show results for the records that meet your criteria. For example, if you want to see all the counties within a select number of states that have a population within a certain size, you can get those listed in a table and include additional information from the dataset.
Goals: 1) Create a listing of all the counties in Alabama, Georgia, and Florida that have a population of less than 75,000 people. 2) Also show the actual population estimate and the median household income for those counties.
Other features illustrated: Column number formatting (decimals shown)
Background: The DataFerrett spreadsheet typically does row and column calculations since calculations normally apply to all cells in the row or column equally. However, sometimes you want to create a calculation that uses one specific cell as part of the formula and not the entire column or row.
Goal: Create a table that calculates the percentage of drivers within commute time ranges for a set of counties, while still showing the numbers of drivers within each commute time.
Other features illustrated: Column formulas and IF conditions
Background: The DataFerrett spreadsheet allows you to calculate medians (or any percentile) on-the-fly by creating your own bins (microdata) or utilizing income distribution counts (aggregate data), and using standard formulas to calculate the linear percentile you have defined within a given distribution.
Goal: Create a table that calculates the median income for age groups (for the population 15 and over) using the March CPS supplement (Annual Social and Economic Supplement).