R/r-group-summary.R
RGroupSummary.Rd
Before and after running ACE models, it is important to examine the characteristics of the different groups. When the ACE is estimated with an SEM using multiple groups, it is even more important. Groups may contain too few subjects to have a well-behaved covariance matrix.
If a group's covariance matrix is not Positive Definite (or it's misbehaving in some other way), it's typically recommended to exclude that group from the SEM.
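The check described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `is_group_usable` and its argument names are hypothetical, and the data are made up. For a 2x2 covariance matrix, which is positive semidefinite by construction, a determinant comfortably above zero is enough to indicate positive definiteness.

```r
# Hypothetical helper (not part of NlsyLinks): decides whether the 2x2
# covariance matrix of two outcome vectors has a determinant above a threshold.
is_group_usable <- function(o1, o2, determinant_threshold = 1e-5) {
  complete <- !is.na(o1) & !is.na(o2)      # keep pairs with both outcomes observed
  if (sum(complete) < 2L) return(FALSE)    # covariance is undefined for fewer than two pairs
  sigma <- stats::cov(cbind(o1[complete], o2[complete]))
  det(sigma) > determinant_threshold
}

# A group with only two pairs: the outcomes are perfectly collinear,
# so the covariance matrix is singular.
tiny_group <- is_group_usable(o1 = c(98, 119), o2 = c(103, 109))

# A larger simulated group with imperfectly correlated outcomes.
set.seed(42)
o1 <- rnorm(200, mean = 100, sd = 15)
o2 <- 0.5 * o1 + rnorm(200, mean = 50, sd = 13)
large_group <- is_group_usable(o1, o2)
```

Under these assumptions the two-pair group is flagged as unusable and the simulated group passes.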
RGroupSummary( ds, oName_S1, oName_S2, rName = "R", determinantThreshold = 1e-05 )
Argument | Description |
---|---|
ds | The base::data.frame containing the following variables: |
oName_S1 | The name of the outcome variable corresponding to the first subject in the pair. |
oName_S2 | The name of the outcome variable corresponding to the second subject in the pair. |
rName | The name of the variable specifying the pair's R value. |
determinantThreshold | The minimum value the covariance matrix's determinant (for the group) should exceed to be considered Positive Definite. |
A base::data.frame with one row per group. The base::data.frame contains the following variables:
R The group's R value. Note the name of this variable can be changed by the user, by specifying a non-default value to the rName argument.
Included Indicates if the group should be included in a multiple-group SEM.
PairCount The number of pairs in the group with complete data for R and the two outcome/manifest variables.
O1Mean The mean (of the outcome variable) among the group's first members, excluding the missing values.
O2Mean The mean (of the outcome variable) among the group's second members, excluding the missing values.
O1Variance The variance (of the outcome variable) among the group's first members.
O2Variance The variance (of the outcome variable) among the group's second members.
O1O2Covariance The covariance (of the outcome variable) across the group's first and second members.
Correlation The correlation (of the outcome variable) across the group's first and second members.
Determinant The determinant of the group's covariance matrix.
PosDefinite Indicates if the group's covariance matrix is positive definite.
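As a rough consistency check on how these columns relate, the Determinant and Correlation values can be recomputed from the two variances and the covariance. The sketch below uses the rounded values printed for the R = 0.25 group in the example output further down; the variable names are illustrative only, and the comparison against the threshold is an assumption about how PosDefinite is derived.

```r
# Values copied from the first row (R = 0.25) of the example output.
o1_variance     <- 126.9489
o2_variance     <- 150.1775
o1o2_covariance <- 41.96914

# The group's 2x2 covariance matrix.
sigma <- matrix(
  c(o1_variance,     o1o2_covariance,
    o1o2_covariance, o2_variance),
  nrow = 2
)

determinant <- det(sigma)   # equals O1Variance * O2Variance - O1O2Covariance^2
correlation <- o1o2_covariance / sqrt(o1_variance * o2_variance)

# Assumed rule: the group counts as positive definite when the determinant
# exceeds the threshold (1e-05 is the documented default).
pos_definite <- determinant > 1e-05
```

Up to rounding of the printed inputs, `determinant` and `correlation` agree with the Determinant (17303.459) and Correlation (0.3039577) columns of that row.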
This function isn't specific to an ACE model and groups defined by R. It could be applied to any multiple-group SEM with two manifest/outcome variables. In the future, we may generalize it beyond two manifest variables.
To get summary stats for the entire sample, create a dummy indicator variable that assigns everyone to the same group. See the second example below.
The default determinantThreshold value is nonzero, in order to forgive slight numerical inaccuracies caused by fixed-precision arithmetic.
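To illustrate why a strictly zero cutoff can be fragile, the sketch below builds two simulated outcomes where one is an exact linear function of the other. In exact arithmetic the determinant of their covariance matrix is zero, but floating-point rounding may leave a tiny nonzero value of either sign. The data and variable names are made up for illustration.

```r
set.seed(1)
x <- rnorm(50)
y <- 3 * x + 0.1                 # y is an exact linear function of x

sigma <- stats::cov(cbind(x, y))
determinant <- det(sigma)        # mathematically zero; may differ slightly in floating point

passes_zero_cutoff    <- determinant > 0      # outcome depends on rounding error
passes_default_cutoff <- determinant > 1e-05  # the documented default threshold

print(determinant)
print(passes_default_cutoff)
```

With a cutoff of exactly zero, whether this degenerate group passes depends on rounding noise. The small positive default threshold rejects it either way.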
Please see Neale & Maes for more information about SEM with multiple groups.
Will Beasley and David Bard
library(NlsyLinks) #Load the package into the current R session.
dsLinks <- Links79PairExpanded # Load the dataset from the NlsyLinks package.
dsLinks <- dsLinks[dsLinks$RelationshipPath=='Gen2Siblings', ]
oName_S1 <- "MathStandardized_S1" # Stands for Outcome1
oName_S2 <- "MathStandardized_S2" # Stands for Outcome2
dsGroupSummary <- RGroupSummary(dsLinks, oName_S1, oName_S2)
dsGroupSummary
#>       R Included PairCount    O1Mean    O2Mean O1Variance O2Variance
#> 1 0.250     TRUE      2689  95.10450  95.97936   126.9489   150.1775
#> 2 0.375     TRUE       137  93.63139  93.36861   160.0120   136.6628
#> 3 0.500     TRUE      5491  99.89374 100.02868   168.7326   172.7293
#> 4 0.750    FALSE         2 108.50000 106.00000   220.5000    18.0000
#> 5 1.000     TRUE        21  98.21429  96.02381   289.4393   215.2369
#>   O1O2Covariance Correlation Determinant PosDefinite
#> 1       41.96914   0.3039577   17303.459        TRUE
#> 2       50.39790   0.3408090   19327.735        TRUE
#> 3       90.04116   0.5274225   21037.642        TRUE
#> 4       63.00000   1.0000000       0.000       FALSE
#> 5      229.10714   0.9179130    9807.933        TRUE
#Should return:
#      R Included PairCount   O1Mean   O2Mean O1Variance O2Variance O1O2Covariance Correlation
#1 0.250     TRUE      2718  94.6439  95.5990    169.650    207.842        41.0783    0.218761
#2 0.375     TRUE       139  92.6043  93.1655    172.531    187.081        40.4790    0.225311
#3 0.500     TRUE      5511  99.8940 100.1789    230.504    232.971       107.3707    0.463336
#4 0.750    FALSE         2 108.5000 106.0000    220.500     18.000        63.0000    1.000000
#5 1.000     TRUE        22  98.6364  95.5455    319.195    343.117       277.5887    0.838789
#  Determinant PosDefinite
#1     33573.0        TRUE
#2     30638.7        TRUE
#3     42172.2        TRUE
#4         0.0       FALSE
#5     32465.6        TRUE

#To get summary stats for the whole sample, create one large inclusive group.
dsLinks$Dummy <- 1
(dsSampleSummary <- RGroupSummary(dsLinks, oName_S1, oName_S2, rName="Dummy"))
#>   Dummy Included PairCount   O1Mean   O2Mean O1Variance O2Variance
#> 1     1     TRUE      8340 98.24454 98.60504   160.6817   168.9103
#>   O1O2Covariance Correlation Determinant PosDefinite
#> 1       78.80601   0.4783524    20930.41        TRUE
#Should return:
#  Dummy Included PairCount   O1Mean   O2Mean O1Variance O2Variance O1O2Covariance
#1     1     TRUE      8392 98.07162 98.56864    216.466   229.2988       90.90266
#  Correlation Determinant PosDefinite
#1   0.4080195     41372.1        TRUE

###
### ReadCsvNlsy79
###
if (FALSE) {
  filePathGen2 <- "~/Nlsy/Datasets/gen2-birth.csv"
  ds <- ReadCsvNlsy79Gen2(filePath=filePathGen2)
}