Title: | Estimate and Visualize Voting Blocs' Partisan Contributions |
---|---|
Description: | Functions to combine data on voting blocs' size, turnout, and vote choice to estimate each bloc's vote contributions to the Democratic and Republican parties. The package also includes functions for uncertainty estimation and plotting. Users may define voting blocs along a discrete or continuous variable. The package implements methods described in Grimmer, Marble, and Tanigawa-Lau (2022) <doi:10.31235/osf.io/c9fkg>. |
Authors: | Justin Grimmer [aut],
Will Marble [aut] |
Maintainer: | Cole Tanigawa-Lau |
License: | GPL (>= 3) |
Version: | 0.1.1.9000 |
Built: | 2025-02-13 03:28:26 UTC |
Source: | https://github.com/coletl/blocs |
Selected columns from the American National Election Studies' 2020 cumulative data file. The final column is an example of the three-valued variable for voting behavior, to be passed to the 'dv_vote3' argument.
anes
A data frame with 68,224 rows and 13 columns:
election year
respondent identifier
survey weight
respondent race
respondent gender
respondent education level
respondent age
respondent's voter turnout
respondent's presidential vote
flag indicating Democratic presidential vote choice
flag indicating Republican presidential vote choice
Three-valued voting behavior DV coded as follows: -1 for Democrat vote choice, 0 for third-party vote, 1 for Republican vote choice, and NA for no vote.
https://electionstudies.org/data-center/anes-time-series-cumulative-data-file/
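The three-valued coding described above can be sketched in base R. This is a minimal illustration on hypothetical toy data; the column names `voted`, `vote_pres`, and `vote_pres3` are assumptions made for this example and are not taken from the data set documentation.

```r
# Hypothetical toy data: turnout flag and presidential vote choice.
toy <- data.frame(
  voted     = c(1, 1, 1, 0),
  vote_pres = c("democrat", "republican", "other", NA)
)

# Code -1 for a Democratic vote, 1 for a Republican vote,
# 0 for a third-party vote, and NA for respondents who did not vote.
toy$vote_pres3 <- ifelse(
  toy$voted == 0, NA,
  ifelse(toy$vote_pres == "democrat", -1,
         ifelse(toy$vote_pres == "republican", 1, 0))
)

print(toy)
```

A column built this way is the kind of variable the 'dv_vote3' argument expects.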
Validator for class vbdf
check_vbdf(x, tolerance = sqrt(.Machine$double.eps))
x |
object to check |
tolerance |
tolerance used when checking range of probability estimates |
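To illustrate what a tolerance-based range check on probability estimates might look like, here is a minimal base R sketch. The helper `in_unit_range()` is hypothetical and written only for this example; it is not the package's validator and may differ from what check_vbdf actually does.

```r
# Hypothetical helper: TRUE if every non-missing value lies in [0, 1],
# allowing a small numerical tolerance at both ends.
in_unit_range <- function(p, tolerance = sqrt(.Machine$double.eps)) {
  all(p >= -tolerance & p <= 1 + tolerance, na.rm = TRUE)
}

in_unit_range(c(0, 0.5, 1 + 1e-12))  # tiny overshoot from floating-point error
in_unit_range(c(0.2, 1.01))          # clearly outside the unit interval
```

The default mirrors the `sqrt(.Machine$double.eps)` value shown in the usage above, which is roughly 1.5e-8 on typical platforms.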
Run kde for weighted density estimation of x at n_points evenly spaced points between min and max.
estimate_density(x, min, max, n_points = 100, w = NULL, ...)
x |
numeric vector or matrix |
min |
numeric vector giving the lower bound of evaluation points for each variable in |
max |
numeric vector giving the upper bound of evaluation points for each variable in |
n_points |
number of evaluation points (estimates) |
w |
vector of weights. Default uses uniform weighting. |
... |
further arguments to pass to kde |
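The idea of weighted density estimation on an evenly spaced grid can be sketched with base R. Note the substitution: this example uses `stats::density()` rather than kde, so it only approximates the behavior described above. The simulated data and the bandwidth choice are assumptions.

```r
set.seed(1)

# Simulated stand-ins for a continuous bloc variable and survey weights.
x <- runif(200, min = 18, max = 90)
w <- runif(200)
w <- w / sum(w)  # density() expects weights that sum to 1

# Evaluate the weighted kernel density at 100 evenly spaced points
# between the chosen lower and upper bounds.
dens <- density(x, bw = bw.nrd0(x), weights = w, from = 18, to = 90, n = 100)

grid <- data.frame(x = dens$x, density = dens$y)
head(grid)
```

Passing a numeric bandwidth keeps the bandwidth selection explicit, since automatic selectors in `density()` do not take the weights into account.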
Constructor for class vbdf
new_vbdf(x, bloc_var = character(), var_type = c("discrete", "continuous"))
x |
a data.frame |
bloc_var |
character vector naming the variables to define voting blocs |
var_type |
string, the type, discrete or continuous |
Constructor for vbdf summaries
new_vbsum(x, bloc_var, var_type, summary_type, resamples)
x |
data.frame of uncertainty summary |
bloc_var |
string, the name of the variable that defines the voting blocs |
var_type |
string, the type of variable, discrete or continuous |
summary_type |
string, the type of summary |
resamples |
numeric, the number of bootstrap resamples |
A vbsum object
Define voting blocs along a continuous variable and estimate their partisan vote contributions.
vb_continuous( data, data_density = data, data_turnout = data, data_vote = data, indep, dv_vote3, dv_turnout, weight = NULL, min_val = NULL, max_val = NULL, n_points = 100, boot_iters = FALSE, verbose = FALSE, tolerance = sqrt(.Machine$double.eps), ... )
data |
default data.frame to use as the source for density, turnout, and vote choice data. |
data_density |
data.frame of blocs' composition/density data. Must
include any columns named by |
data_turnout |
data.frame of blocs' turnout data. Must include any
columns named by |
data_vote |
data.frame of blocs' vote choice data. Must include any
columns named by |
indep |
string, column name of the independent variable defining continuous voting blocs. |
dv_vote3 |
string, column name of the dependent variable in |
dv_turnout |
string, column name of the dependent variable flagging
voter turnout in |
weight |
optional string naming the column of sample weights. |
min_val |
numeric vector of the same length as |
max_val |
numeric vector of the same length as |
n_points |
scalar, number of points at which to estimate density. See [estimate_density]. |
boot_iters |
integer, number of bootstrap iterations for uncertainty
estimation. The default |
verbose |
logical, whether to print iteration number. |
tolerance |
tolerance used when checking range of probability estimates |
... |
further arguments to pass to kde for density estimation. |
a vbdf data.frame with columns for the resample, bloc variable,
and, for each resample-bloc combination, four estimates:
probability density, turnout, Republican vote choice conditional on turnout,
and net Republican votes.
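The role of bootstrap iterations in uncertainty estimation can be sketched in base R. This is a generic row-level nonparametric bootstrap on simulated data, offered as an assumption about the general approach; the package's resampling scheme may differ, and all names below are hypothetical.

```r
set.seed(7)

# Simulated survey: a turnout flag and a sampling weight per respondent.
toy <- data.frame(
  voted  = rbinom(300, size = 1, prob = 0.6),
  weight = runif(300, min = 0.5, max = 2)
)

boot_iters <- 200

# For each iteration, resample rows with replacement and
# recompute the weighted turnout rate.
boot_est <- vapply(seq_len(boot_iters), function(i) {
  idx <- sample(nrow(toy), replace = TRUE)
  weighted.mean(toy$voted[idx], toy$weight[idx])
}, numeric(1))

# Percentile interval across the bootstrap resamples.
quantile(boot_est, probs = c(0.025, 0.975))
```

Each resample yields one estimate, which is why the output described above carries a column identifying the resample.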
Use vbdf output to calculate differences in blocs' net Republican vote contributions.
vb_difference( vbdf, estimates = grep("prob|pr_turnout|pr_votedem|pr_voterep|cond_rep|net_rep", names(vbdf), value = TRUE), sort_col = "year", tolerance = sqrt(.Machine$double.eps) )
vbdf |
data.frame holding the results of voting bloc analyses. |
estimates |
character vector naming the column(s) in |
sort_col |
character vector naming the column(s) in |
tolerance |
tolerance used when checking range of probability estimates |
A vbdf object, plus two types of columns: for each column named in estimates, a column named diff_* containing the difference in each estimate across sort_col values; and comp, which contains a string tag for the rows compared (e.g., 2020-2016).
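The shape of the output described above (a diff_* column plus a comp tag) can be mimicked in base R. This is a sketch on hypothetical results, assuming the difference is taken as the later value minus the earlier one within each bloc, which the 2020-2016 tag suggests but the text does not state outright.

```r
# Hypothetical bloc-level results for two election years.
res <- data.frame(
  year    = c(2016, 2020, 2016, 2020),
  bloc    = c("a", "a", "b", "b"),
  net_rep = c(0.10, 0.13, -0.20, -0.25)
)
res <- res[order(res$bloc, res$year), ]

# Within each bloc, difference each estimate across consecutive years.
res$diff_net_rep <- ave(res$net_rep, res$bloc,
                        FUN = function(v) c(NA, diff(v)))

# Tag each differenced row with the pair of years compared.
res$comp <- ave(as.character(res$year), res$bloc,
                FUN = function(y) c(NA, paste(y[-1], y[-length(y)], sep = "-")))

print(res)
```

The first row of each bloc has no earlier year to compare against, so its difference and tag are NA in this sketch.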
Define voting blocs along a discrete variable and estimate their partisan vote contributions.
vb_discrete( data, data_density = data, data_turnout = data, data_vote = data, indep, dv_vote3, dv_turnout, weight = NULL, boot_iters = FALSE, verbose = FALSE, check_discrete = TRUE )
data |
default data.frame to use as the source for density, turnout, and vote choice data. |
data_density |
data.frame of blocs' composition/density data. Must
include any columns named by |
data_turnout |
data.frame of blocs' turnout data. Must include any
columns named by |
data_vote |
data.frame of blocs' vote choice data. Must include any
columns named by |
indep |
string, column name of the independent variable defining discrete voting blocs. |
dv_vote3 |
string, column name of the dependent variable in |
dv_turnout |
string, column name of the dependent variable flagging
voter turnout in |
weight |
optional string naming the column of sample weights. |
boot_iters |
integer, number of bootstrap iterations for uncertainty
estimation. The default |
verbose |
logical, whether to print iteration number. |
check_discrete |
logical, whether to check if |
A vbdf object.
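How bloc size, turnout, and vote choice might combine into a net Republican contribution can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the toy data, the column names, and the product formula (bloc share times turnout rate times the Republican-minus-Democratic margin among voters) are all assumptions.

```r
# Hypothetical toy data: bloc label, turnout flag, and three-valued vote
# (-1 Democratic, 0 third party, 1 Republican, NA for non-voters).
toy <- data.frame(
  race  = c("a", "a", "a", "a", "b", "b", "b", "b"),
  voted = c(1, 1, 0, 1, 1, 0, 0, 1),
  vote3 = c(1, -1, NA, 1, -1, NA, NA, -1)
)

# Share of the sample in each bloc.
bloc_share <- prop.table(table(toy$race))

# Turnout rate within each bloc.
turnout <- tapply(toy$voted, toy$race, mean)

# Mean of the three-valued vote among voters, which equals the
# Republican share minus the Democratic share within each bloc.
margin_voters <- tapply(toy$vote3, toy$race, mean, na.rm = TRUE)

# Net Republican contribution per bloc under the assumed product formula.
net_rep <- as.vector(bloc_share) * as.vector(turnout) * as.vector(margin_voters)
names(net_rep) <- names(turnout)
net_rep
```

In this toy data, bloc "a" contributes a small positive net Republican share and bloc "b" a larger negative one, because every voter in "b" chose the Democratic candidate.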
Plot the summary of a voting bloc analysis
vb_plot( data, x_col = get_bloc_var(data), y_col, ymin_col, ymax_col, discrete = length(unique(data[[x_col]])) < 20 )
data |
a |
x_col |
string naming the column that defines voting blocs. |
y_col |
string naming the column of point estimates. |
ymin_col |
string naming the column to plot as the lower bound of the confidence interval. |
ymax_col |
string naming the column to plot as the upper bound of the confidence interval. |
discrete |
logical indicating whether voting blocs are defined along a discrete (not continuous) variable. |
a ggplot object
Summarize uncertainty for a vbdf object. The analysis must have been run with bootstrap iterations. vb_uncertainty is an alias for vb_summary.
vb_summary( object, type = c("discrete", "continuous", "binned"), estimates = grep("prob|pr_turnout|pr_votedem|pr_voterep|cond_rep|net_rep", names(object), value = TRUE), na.rm = FALSE, funcs = c("mean", "median", "low", "high"), low_ci = 0.025, high_ci = 0.975, bin_col, tolerance = sqrt(.Machine$double.eps) )
vb_uncertainty( object, type = c("discrete", "continuous", "binned"), estimates = grep("prob|pr_turnout|pr_votedem|pr_voterep|cond_rep|net_rep", names(object), value = TRUE), na.rm = FALSE, funcs = c("mean", "median", "low", "high"), low_ci = 0.025, high_ci = 0.975, bin_col, tolerance = sqrt(.Machine$double.eps) )
object |
a |
type |
a string naming the type of independent variable summary. Use
|
estimates |
character vector naming columns for which to calculate uncertainty estimates. |
na.rm |
logical indicating whether to remove |
funcs |
character vector of summary functions to apply to
|
low_ci |
numeric. If you include the string |
high_ci |
numeric. If you include the string |
bin_col |
character vector naming the column(s) that define the bins. Used only when |
tolerance |
tolerance used when checking range of probability estimates |
A summary object with additional columns for each combination of estimates and funcs.
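The kind of summary described above, one column per combination of estimate and summary function, can be sketched in base R. The simulated bootstrap results and the output column names are assumptions; the quantile levels follow the 0.025 and 0.975 defaults shown in the usage.

```r
set.seed(42)

# Simulated bootstrap output: 500 resamples for each of two blocs.
boot <- data.frame(
  resample = rep(1:500, times = 2),
  bloc     = rep(c("a", "b"), each = 500),
  net_rep  = c(rnorm(500, 0.1, 0.02), rnorm(500, -0.2, 0.03))
)

# Summarize each bloc's estimates across resamples.
summ <- do.call(rbind, lapply(split(boot, boot$bloc), function(d) {
  data.frame(
    bloc           = d$bloc[1],
    net_rep_mean   = mean(d$net_rep),
    net_rep_median = median(d$net_rep),
    net_rep_low    = unname(quantile(d$net_rep, 0.025)),
    net_rep_high   = unname(quantile(d$net_rep, 0.975))
  )
}))

print(summ)
```

The low and high columns together form a 95 percent percentile interval for each bloc.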
Create a vbdf object holding bloc-level estimates of composition, turnout,
and/or vote choice. This function is mostly for internal use, but you may
want it to create a vbdf object from your own voting bloc analysis.
A valid vbdf object can be used in [vb_difference] and [vb_plot].
vbdf( data, bloc_var, var_type = c("discrete", "continuous"), tolerance = sqrt(.Machine$double.eps) )
data |
data.frame of voting-bloc results to convert to a |
bloc_var |
string, the name of the variable that defines the voting blocs |
var_type |
string, the type of variable, discrete or continuous |
tolerance |
tolerance used when checking range of probability estimates |
A vbdf object.
Weighted frequency table or proportions
wtd_table( ..., weight = NULL, na.rm = FALSE, prop = FALSE, return_tibble = FALSE, normwt = FALSE )
... |
vectors of class factor or character, or a list/data.frame of such vectors. |
weight |
optional vector of weights. The default uses uniform weights of 1. |
na.rm |
logical, whether to remove NA values. |
prop |
logical, whether to return proportions or counts. Default returns counts. |
return_tibble |
logical, whether to return a tibble or named vector. |
normwt |
logical, whether to normalize weights such that they sum to 1. |
a vector or tibble of counts or proportions by group
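A weighted frequency table of this kind can be approximated in base R. This sketch uses `tapply()` on hypothetical vectors and is not the package's implementation; it shows the difference between weighted counts and weighted proportions.

```r
# Hypothetical group labels and weights.
grp <- c("a", "b", "a", "c")
wt  <- c(1, 2, 3, 4)

# Weighted counts: sum the weights within each group.
counts <- tapply(wt, grp, sum)

# Weighted proportions: divide by the total weight.
props <- counts / sum(counts)

counts
props
```

With uniform weights of 1, the weighted counts reduce to the ordinary frequencies returned by `table()`.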