Package 'prcr' reference manual

Title:	Person-Centered Analysis
Description:	Provides an easy-to-use yet adaptable set of tools to conduct person-centered analysis using a two-step clustering procedure. As described in Bergman and El-Khouri (1999) <DOI:10.1002/(SICI)1521-4036(199910)41:6%3C753::AID-BIMJ753%3E3.0.CO;2-K>, hierarchical clustering is performed to determine the initial partition for the subsequent k-means clustering procedure.
Authors:	Joshua M Rosenberg [aut, cre], Jennifer A Schmidt [aut], Patrick N Beymer [aut], Rebecca R Steingut [ctb]
Maintainer:	Joshua M Rosenberg <[email protected]>
License:	MIT + file LICENSE
Version:	0.2.1
Built:	2025-03-14 05:00:07 UTC
Source:	https://github.com/jrosen48/prcr

Create profiles of observed variables using two-step cluster analysis

Description

Create profiles of observed variables using two-step cluster analysis

Usage

create_profiles_cluster(
  df,
  ...,
  n_profiles,
  to_center = FALSE,
  to_scale = FALSE,
  distance_metric = "squared_euclidean",
  linkage = "complete"
)

Arguments

`df`	data.frame with two or more columns of continuous variables
`...`	unquoted variable names separated by commas
`n_profiles`	The specified number of profiles to be found for the clustering solution
`to_center`	Boolean (TRUE or FALSE) for whether to center the raw data with M = 0
`to_scale`	Boolean (TRUE or FALSE) for whether to scale the raw data with SD = 1
`distance_metric`	Distance metric to use for hierarchical clustering; "squared_euclidean" is default but more options are available (see ?dist)
`linkage`	Linkage method to use for hierarchical clustering; "complete" is default but more options are available (see ?hclust)
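
As a rough illustration of what the centering, scaling, distance, and linkage options correspond to in base R, the sketch below uses `scale()`, `dist()`, and `hclust()` on the built-in `mtcars` data. It assumes that "squared_euclidean" means the squared Euclidean distance, computed here by squaring the output of `dist()`; how prcr prepares the data internally is not documented in this manual.

```r
# Sketch only: base-R counterparts of the preprocessing options (not prcr internals).
x <- as.matrix(mtcars[, c("mpg", "hp", "wt")])

# Counterpart of to_center = TRUE and to_scale = TRUE:
# every column ends up with M = 0 and SD = 1
x_std <- scale(x, center = TRUE, scale = TRUE)

# Assumed meaning of "squared_euclidean": Euclidean distances, squared
d_sq <- dist(x_std, method = "euclidean")^2

# The linkage method is passed to hclust() through its `method` argument
hc <- hclust(d_sq, method = "complete")
```

Other values accepted by `dist(method = )` and `hclust(method = )` are listed on their respective help pages.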

Details

Function to create a specified number of profiles of observed variables using a two-step (hierarchical and k-means) cluster analysis.
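
The general two-step technique can be sketched in base R as follows. This is an illustration under stated assumptions, not the prcr source code: the built-in `iris` data stands in for `df`, and the names `initial_partition` and `start_centers` are hypothetical.

```r
# Sketch of a two-step (hierarchical, then k-means) cluster analysis in base R.
x <- as.matrix(iris[, 1:4])   # built-in data standing in for `df`
n_profiles <- 3

# Step 1: hierarchical clustering on squared Euclidean distances,
# cut into the requested number of groups
hc <- hclust(dist(x)^2, method = "complete")
initial_partition <- cutree(hc, k = n_profiles)

# Centroids of the hierarchical clusters (one row per cluster)
start_centers <- apply(x, 2, function(col) tapply(col, initial_partition, mean))

# Step 2: k-means started from those centroids instead of random centers
km <- kmeans(x, centers = start_centers, iter.max = 50)

table(km$cluster)   # number of observations assigned to each profile
```

Starting k-means from the hierarchical centroids makes the solution reproducible, since no random starting centers are drawn.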

Value

A list containing the prepared data, the output from the hierarchical and k-means cluster analysis, the r-squared value, raw clustered data, processed clustered data of cluster centroids, and a ggplot object.

Examples

d <- pisaUSA15
m3 <- create_profiles_cluster(d, 
                              broad_interest, enjoyment, instrumental_mot, self_efficacy,
                              n_profiles = 3)
summary(m3)

Identifies potential outliers

Description

Identifies potential outliers

Usage

detect_outliers(df, return_index = TRUE)

Arguments

`df`	data.frame (or tibble) with variables to be clustered; all rows must be complete cases (no missing values)
`return_index`	Boolean (TRUE or FALSE) for whether to return only the row indices of the possible multivariate outliers; if FALSE, then all of the output from the function (including the indices) is returned

Details

Possible multivariate outliers are identified based on Hadi's (1994) procedure. To do: add an argument to `create_profiles_cluster()` to remove these outliers.

Value

Either the row indices of possible multivariate outliers or all of the output from the function, depending on the value of `return_index`
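
For orientation, the sketch below shows a much simpler, classical screen for multivariate outliers based on Mahalanobis distances. It is explicitly not Hadi's (1994) procedure and not the implementation behind `detect_outliers()`; the 97.5% chi-squared cutoff and the name `outlier_index` are illustrative assumptions.

```r
# Sketch only: a classical Mahalanobis-distance screen, NOT Hadi's (1994)
# procedure and NOT the code used by detect_outliers().
x <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
stopifnot(all(complete.cases(x)))   # complete cases are required

# Squared Mahalanobis distance of each row from the column means
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Flag rows beyond the 97.5th percentile of the chi-squared reference
# distribution (the cutoff is an illustrative choice)
cutoff <- qchisq(0.975, df = ncol(x))
outlier_index <- which(d2 > cutoff)   # row indices of flagged observations
```

Like `return_index = TRUE`, the sketch ends with row indices only, which can then be used to inspect or drop the flagged rows.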

Estimates R^2 (r-squared) values for a range of number of profiles

Description

Estimates R^2 (r-squared) values for a range of number of profiles

Usage

estimate_r_squared(
  df,
  ...,
  to_center = FALSE,
  to_scale = FALSE,
  distance_metric = "squared_euclidean",
  linkage = "complete",
  lower_bound = 2,
  upper_bound = 9,
  r_squared_table = TRUE
)

Arguments

`df`	data.frame with two or more columns of continuous variables
`...`	unquoted variable names separated by commas
`to_center`	Boolean (TRUE or FALSE) for whether to center the raw data with M = 0
`to_scale`	Boolean (TRUE or FALSE) for whether to scale the raw data with SD = 1
`distance_metric`	Distance metric to use for hierarchical clustering; "squared_euclidean" is default but more options are available (see ?dist)
`linkage`	Linkage method to use for hierarchical clustering; "complete" is default but more options are available (see ?hclust)
`lower_bound`	the smallest number of profiles in the range of number of profiles to explore; defaults to 2
`upper_bound`	the largest number of profiles in the range of number of profiles to explore; defaults to 9
`r_squared_table`	if TRUE (default), then a table, rather than a plot, is returned

Details

Estimates the R^2 value of the two-step cluster solution for each number of profiles from `lower_bound` to `upper_bound`.

Value

A list containing a ggplot2 object and a tibble for the R^2 values
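
A minimal base-R sketch of computing R^2 over a range of profile counts is shown below. It assumes that R^2 is the between-cluster sum of squares divided by the total sum of squares, as reported by `kmeans()`; the manual does not state how prcr defines it, and the built-in `iris` data stands in for `df`.

```r
# Sketch only: R^2 for a range of profile counts using base R.
# Assumption: R^2 = between-cluster SS / total SS from kmeans().
x <- as.matrix(iris[, 1:4])
lower_bound <- 2
upper_bound <- 9

# One hierarchical solution, cut at each candidate number of profiles
hc <- hclust(dist(x)^2, method = "complete")

r_squared <- sapply(lower_bound:upper_bound, function(k) {
  partition <- cutree(hc, k = k)
  # Hierarchical centroids are the k-means starting points
  centers <- apply(x, 2, function(col) tapply(col, partition, mean))
  km <- kmeans(x, centers = centers, iter.max = 50)
  km$betweenss / km$totss
})

data.frame(n_profiles = lower_bound:upper_bound, r_squared = round(r_squared, 3))
```

A table like this is typically read by looking for the number of profiles beyond which R^2 improves only marginally.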

Student questionnaire data with four variables from the 2015 PISA for students in the United States

Description

Student questionnaire data with four variables from the 2015 PISA for students in the United States

Usage

pisaUSA15

Format

Data frame with columns

CNTSTUID: international student ID
SCHID: international school ID

...

Source

http://www.oecd.org/pisa/data/

Return plot of profile centroids

Description

Return plot of profile centroids

Usage

plot_profiles(d, to_center = F, to_scale = F)

Arguments

`d`	summary data.frame output from create_profiles_cluster()
`to_center`	whether to center the data before plotting
`to_scale`	whether to scale the data before plotting

Details

Returns ggplot2 plot of cluster centroids

Value

A ggplot2 object

Prints details of prcr cluster solution

Description

Prints details of prcr cluster solution

Usage

## S3 method for class 'prcr'
print(x, ...)

Arguments

`x`	A 'prcr' object
`...`	Additional arguments

Details

Prints details of a prcr cluster solution

Concise summary of prcr cluster solution

Description

Concise summary of prcr cluster solution

Usage

## S3 method for class 'prcr'
summary(object, ...)

Arguments

`object`	A 'prcr' object
`...`	Additional arguments

Details

Prints a concise summary of prcr cluster solution
