S-GIMME Tutorial
October 13, 2020
S-GIMME
Subgrouping GIMME was designed to enable researchers to identify meaningful clusters of individuals based only on their time-series data, with no a priori information regarding condition, behavior, or individual characteristics. Specifically, S-GIMME conducts the community-detection algorithm Walktrap (Pons & Latapy, 2006) on temporal features available during GIMME model building. After arriving at the group-level effects, S-GIMME identifies effects that may be specific to each subgroup. Finally, as with the original GIMME algorithm, S-GIMME conducts individual-level searches. All weights are estimated at the individual level—even for those temporal relations found to exist at the group or subgroup levels. Importantly, the researcher does not need to decide the number of subgroups. The final models contain reliable group-, subgroup-, and individual-level patterns that enable generalizable inferences, subgroups of individuals with shared model features, and individual-level patterns and estimates. This document is a brief tutorial on using S-GIMME.
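The community-detection step can be illustrated in isolation. The sketch below is not gimme's internal code: it assumes the igraph package is installed and uses a small, made-up similarity matrix (the names sim, g, wt, and toy_membership are hypothetical) in which each entry counts the model features two individuals share. Walktrap is then run on the matrix treated as a weighted graph, and the number of subgroups is left for the algorithm to determine.

```r
# Illustrative only: a toy similarity matrix, not output from gimme().
library(igraph)

# Six hypothetical individuals; entry [i, j] counts features shared by i and j.
ids <- paste0("id", 1:6)
sim <- matrix(0, nrow = 6, ncol = 6, dimnames = list(ids, ids))
sim[1:3, 1:3] <- 8   # individuals 1-3 share many features
sim[4:6, 4:6] <- 7   # individuals 4-6 share many features
sim[1:3, 4:6] <- 1   # few features are shared across the two blocks
sim[4:6, 1:3] <- 1
diag(sim) <- 0       # ignore self-similarity

# Treat the matrix as a weighted, undirected graph and run Walktrap.
g <- graph_from_adjacency_matrix(sim, mode = "undirected",
                                 weighted = TRUE, diag = FALSE)
wt <- cluster_walktrap(g)

# One subgroup label per individual; the number of subgroups was never supplied.
toy_membership <- as.integer(membership(wt))
print(toy_membership)
```

In this toy case the two blocks are well separated, so the labels should split the first three individuals from the last three.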
To run S-GIMME, call the gimme (or, equivalently, gimmeSEM) function with the argument subgroup = TRUE. Other subgrouping options include sub_feature and sub_method.
library(gimme)

gimme_output <- gimme(           # can use "gimme" or "gimmeSEM"
  data = '',                     # source directory or list where your data are
                                 # (if source folder indicated for "data" argument)
  out = '',                      # output directory where you'd like your output to go
  sep = ",",                     # how data are separated: "" for space, "," for comma, "\t" for tab-delimited
  header = FALSE,                # TRUE or FALSE, is there a header
  plot = TRUE,                   # TRUE (default) or FALSE, generate plots
  subgroup = TRUE,               # must be TRUE to perform subgrouping
  sub_feature = "lag & contemp", # features used to subgroup individuals
  sub_method = "Walktrap",       # community-detection method used for subgrouping
  groupcutoff = .75,             # the proportion that is considered the majority at the group level
  subcutoff = .75                # the proportion that is considered the majority at the subgroup level
)
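The groupcutoff and subcutoff arguments can be read as majority thresholds. The sketch below is a simplified illustration of that idea and not gimme's actual search procedure: it assumes a hypothetical logical matrix sig that records, for each individual (rows) and each candidate path (columns), whether the path would be retained for that individual. All object names and path labels here are invented for the example.

```r
# Hypothetical indicator matrix: rows are individuals, columns are candidate paths.
sig <- matrix(
  c(TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  FALSE,
    TRUE,  FALSE, FALSE,
    TRUE,  TRUE,  TRUE),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("id", 1:4), c("V1 -> V2", "V2 -> V3", "V3 -> V1"))
)

groupcutoff <- 0.75   # same value as in the gimme() call above

# Proportion of individuals for whom each path is flagged.
prop_sig <- colMeans(sig)

# Paths that reach the majority threshold are candidates for the group level.
group_candidates <- names(prop_sig)[prop_sig >= groupcutoff]
print(prop_sig)
print(group_candidates)
```

The same logic would apply within a subgroup using subcutoff, with the proportion taken over that subgroup's members only.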
Output
If subgroup = TRUE, a subgroup output directory is created with the following data:
- subgroupkPathCounts: Contains counts of relations among lagged and contemporaneous variables for the kth subgroup.
- subgroupkPlot (if plot = TRUE): Contains a plot of the group-, subgroup-, and individual-level paths for the kth subgroup. Black represents group-level paths, grey represents individual-level paths, and green represents subgroup-level paths.
For example, the following two plots were produced in the output directory using simData, a simulated time-series data set. simData contains 25 individuals, each with 200 time points on 10 variables (brain regions of interest). These data are included in the gimme package.
This is the subgroupkPlot.
This is the summaryPathsPlot (see the Output tutorial for further explanation).
Note: if a subgroup of size n = 1 is discovered, subgroup-level output is not produced.
Evaluating Subgroups
Subgroups can be evaluated using perturbR.
For example:
library(perturbR)

perturbRout <- perturbR(
  sym.matrix = gimme_output$sim_matrix, # from the variable gimme_output generated above
  plot = TRUE,                          # TRUE (default) or FALSE, generate plots
  resolution = 0.01,                    # the percentage of edges to iteratively alter; one percent is the default, increase to run faster
  reps = 100,                           # the number of repetitions for each level of perturbation; decrease to run faster
  errbars = TRUE                        # logical, defaults to FALSE; adds error bars of one standard deviation above and below the mean for each point
)
Two plots are generated: “Comparison of original result against perturbed graphs: ARI” (adjusted Rand index) and “Comparison of original result against perturbed graphs: VI” (variation of information). The output used to produce the plots was generated using simData.
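To build intuition for the two indices in these plots, the sketch below compares hand-made membership vectors using igraph's compare function. It is only an illustration of how ARI and VI behave, assuming igraph is installed; the vectors are hypothetical and are not produced by perturbR or gimme.

```r
# Illustrative only: hypothetical subgroup labels, not perturbR output.
library(igraph)

original  <- c(1, 1, 1, 2, 2, 2)  # a two-subgroup solution
relabeled <- c(2, 2, 2, 1, 1, 1)  # the same partition with the labels swapped
perturbed <- c(1, 1, 2, 2, 2, 2)  # one individual moved to the other subgroup

# Identical partitions: ARI is at its maximum and VI at its minimum.
ari_same <- igraph::compare(original, relabeled, method = "adjusted.rand")
vi_same  <- igraph::compare(original, relabeled, method = "vi")

# A changed partition: ARI drops and VI rises.
ari_moved <- igraph::compare(original, perturbed, method = "adjusted.rand")
vi_moved  <- igraph::compare(original, perturbed, method = "vi")

print(c(ari_same = ari_same, ari_moved = ari_moved))
print(c(vi_same = vi_same, vi_moved = vi_moved))
```

Higher ARI and lower VI both indicate closer agreement with the original solution, which is why the two plots move in opposite directions as perturbation increases.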
Additionally, the subgroups can be evaluated in a correlogram using corrplot.
For example:
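library(corrplot)

a <- cor(gimme_output$sim_matrix)  # correlations between individuals' similarity profiles
corrplot(a, method = "color")      # colour-coded correlogram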