<h1>Calculating Log Likelihood Ratios (LLRs) using the jeva module</h1>
<p>2023-02-22</p>
<h3 id="tldr">tl;dr</h3>
<p>Ever wanted to try doing an evidential analysis? You may have found it difficult to find a statistical platform to do it. Now there is the jamovi module <strong>jeva</strong> which can provide log likelihood ratios for a range of common statistical tests.</p>
<!--more-->
<p>Imagine for a moment that we wish to carry out a statistical test on our sample of data. We do not want to know whether the procedure we routinely use gives us the correct answer with a specified error rate (such as the Type I error) – the frequentist approach. Nor do we want to concern ourselves with possible a priori probabilities of hypotheses being true – the Bayesian approach. We need to know whether a statistic from this particular set of data is consistent with one or more hypothetical values. Also, let’s say that we weren’t happy with how much data we had collected (a familiar problem?), and just added more when convenient. Welcome to the likelihood (or evidential) approach!</p>
<p>In my view, using just LLRs is actually the simplest and most direct procedure for making statistical inferences. I will not rehearse the arguments in support of the likelihood approach here, but merely point the reader to better sources (<a href="#edwards1992">Edwards 1992</a>, <a href="#royall1997">Royall 1997</a>, <a href="#dennis2019">Dennis, Ponciano et al. 2019</a>).</p>
<p>For the given data, the likelihood approach calculates the natural logarithm of the likelihood ratio for the two models being compared. So, it should be possible to do likelihood analyses in any of the major statistical packages, right? Well, it is possible to obtain the log likelihood and deviance in any procedure that uses maximum likelihood. So yes, in <strong>SPSS</strong> when you run a logistic regression you will find <em>-2 Log likelihood</em> in the <em>Model Summary</em>. The same statistic is given by <strong>SAS</strong> in the <em>Model Fit Statistics</em>. In <strong>Minitab</strong> the statistic is labelled <em>deviance</em>. Finally, <strong>Stata</strong> gives the <em>log likelihood</em>. (I am grateful to Professor Sander Greenland for pointing out the availability of these statistics in common statistical packages.) In <strong>R</strong>, deviance statistics are given by <em>glm</em>, and <em>logLik</em> can be used to extract the log likelihood. In <strong>jamovi</strong>, the deviance and AIC are available, for example in binomial logistic regression. So although log likelihoods are available, we need to know which ones to use and how to combine them into a log likelihood ratio.</p>
<h3 id="calculating-the-llr">Calculating the LLR</h3>
<p>Bear with me, or skip this section if you have never coded.</p>
<p>Now, I want to calculate the LLR for some data that consists of measurement data in 2 independent groups. I want to know how much more likely is the model with fitted means compared to the null model (one grand mean only). I will assume equal variances in the 2 groups. OK, this is just an independent samples <em>t</em> test. Working in <strong>R</strong>, we first create the data (representing increased hours of sleep following a drug):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mysample <- c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0)
treat <- rep(1:0,each=5)
</code></pre></div></div>
<p>We then do separate analyses for the means model (m1) and the null model (m2):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>m1 <- glm(mysample ~ treat)  # means model: separate group means
m2 <- glm(mysample ~ 1)      # null model: grand mean only
</code></pre></div></div>
<p>The easiest way to get the log likelihoods is using the <em>logLik</em> function (note the capital L in the middle). Remembering that the log of a ratio is the log of the numerator minus the log of the denominator:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>logLik(m1) - logLik(m2)
</code></pre></div></div>
<p>This prints:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>'log Lik.' 3.724533 (df=3)
</code></pre></div></div>
<p>This is the answer we are looking for, and represents strong evidence against the null hypothesis, see Table 1 below. The output also tells us that the fitted model (m1) has 3 parameters, which are: the variance and the two group means. The null model only has the variance and the grand mean.</p>
<p>There is a widely held misconception that the deviance for a normal model is −2 × log likelihood (−2LL), e.g. <a href="#goodman1989">Goodman 1989</a>. It is not: for a normal model the deviance is simply the residual sum of squares (SS). Only for non-normal models (such as Poisson and binomial) does the deviance equal −2LL. However, the unexplained SS (the deviance in normal data) can still provide likelihoods (see <a href="#glover2004">Glover and Dixon 2004</a> for details). We can use the following formula, based on the residual SS (RSS) and total SS (TSS), to calculate the LLR for a fitted model versus the null:</p>
<script type="math/tex; mode=display">LLR=-\frac{N}{2} \log \left(\frac{RSS}{TSS}\right)=-\frac{N}{2} \left(\log(RSS)-\log(TSS)\right)</script>
<p>Where <em>N</em> is the total sample size. In <strong>R</strong> we do this either using the SS given in an ANOVA table or from the deviances. We will use the latter:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-length(m1$y)/2*(log(m1$deviance) - log(m2$deviance))
</code></pre></div></div>
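<p>The SS route mentioned above can be checked in the same session (a self-contained sketch using <em>lm</em> and <em>anova</em>; the data lines simply repeat those above):</p>

```r
# LLR from an ANOVA table's sums of squares: -N/2 * (log(RSS) - log(TSS))
mysample <- c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0)
treat <- rep(1:0, each = 5)

a <- anova(lm(mysample ~ treat))   # ANOVA table with rows: treat, Residuals
RSS <- a["Residuals", "Sum Sq"]    # unexplained SS of the fitted model
TSS <- sum(a[, "Sum Sq"])          # total SS (the null model's deviance)

-length(mysample) / 2 * (log(RSS) - log(TSS))   # approx. 3.7245, as before
```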
<p>This gives us the same result as earlier. Another way to do the analysis is to use the Akaike Information Criterion (AIC), which is based on the log likelihoods but takes into account the number of parameters in the two models:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(m2$aic - m1$aic)/2
</code></pre></div></div>
<p>This gives us 2.724533, exactly 1 less than the previous answer. The reason is that the LLR is penalized by 1 for the single extra parameter in the fitted model (the second group mean) over the null model. Some recommend this or a similar penalty depending on the sample size (<a href="#glover2018">Glover 2018</a>), although if we compare the <em>p</em> value produced by the corrected likelihood ratio test with that produced by the regular <em>t</em> test, then such a correction seems harsh.</p>
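<p>This relationship holds in general. Since AIC = 2<em>k</em> − 2LL for a model with <em>k</em> parameters, halving the difference in AIC gives the LLR minus the difference in the number of parameters (here 3 − 2 = 1):</p>

<script type="math/tex; mode=display">\frac{AIC_{null}-AIC_{fitted}}{2}=LLR-(k_{fitted}-k_{null})</script>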
<p>I tried asking ChatGPT: “how to calculate a log likelihood ratio for independent samples t test using R?”. The answer was fine, fitting the null and means models, but it gave this final line:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LLR <- 2 * (logLik(model1) - logLik(model0))
</code></pre></div></div>
<p>The use of the <em>logLik</em> function is good, but of course the multiplier of 2 is wrong. This is, I think, an attempt to relate the LLR to the deviance which, as we noted earlier, is identified with −2LL only in certain circumstances.</p>
<p>I think the reader gets the idea. There are issues around using the right functions and accounting for the number of parameters. Doing a likelihood analysis of a simple independent samples <em>t</em> test is not so straightforward. If we wanted to use Welch’s test on the data, or to compare a model with a null hypothesis value different from 0, calculating the LLR would be even more difficult, requiring quite a few more lines of code.</p>
<h3 id="using-jeva">Using <em>jeva</em></h3>
<p>When I published my book a few years ago (<a href="#cahusac2020">Cahusac 2020</a>) I was surprised how difficult it was to find software that would calculate LLRs for common statistical tests. This led me to produce an R package called likelihoodR (<a href="#cahusac2022a">Cahusac 2022</a>). Since then, I discovered that the best platform was <strong>jamovi</strong>, where users can produce modules for custom analyses. So I produced a module called <em>jeva</em> (<strong>j</strong>amovi <strong>ev</strong>idential <strong>a</strong>nalyses). Each analysis has an option to show explanatory text, to help explain what the findings of a particular analysis mean in terms of the evidential framework used.</p>
<h3 id="independent-samples-t-test">Independent Samples <em>t</em> test</h3>
<p>The <em>jeva</em> analysis of an independent samples <em>t</em> test is given in Figure 1. This is for the same data given earlier (increased hours of sleep in males and females following a drug).</p>
<p><img src="/assets/images/jeva/screenshot.png" alt="jeva analysis of 2 independent samples. The dialog box is on the left, where we select the data and options appropriate for our analysis. The output is on the right." /></p>
<blockquote>
<p>Figure 1: <em>jeva</em> analysis of 2 independent samples. The dialog box is on the left, where we select the data and options appropriate for our analysis. The output is on the right.</p>
</blockquote>
<p>The analysis shown assumes equal variances; otherwise we could select Welch’s. In the <strong>Support</strong> table, the LLR (<em>S</em>) for <script type="math/tex">H_0</script> vs the observed mean difference is −3.725. Why negative? It depends on the order in which we specify the two models, and we have specified <script type="math/tex">H_0</script> first. This means that the null hypothesis is much less likely than the observed mean difference. To see how much less likely, we use <script type="math/tex">e^{-3.725}=0.024</script>, or the inverse: 41.5 times less likely. In the dialog box I have entered 2 for <script type="math/tex">H_a</script>, which was suggested as a clinically important increase in sleeping time for the drug. The Support table gives the LLR for this value versus the observed difference in means as <script type="math/tex">S = −0.19</script> (1.2 times different), which represents a trivial difference. The 3rd row in the table gives the LLR for <script type="math/tex">H_a</script> vs <script type="math/tex">H_0</script> as 3.535, representing strong evidence (<script type="math/tex">H_a</script> is 34.3 times more likely than the null). This is a comparison which can easily be done using the likelihood approach – no <em>p</em> value is available for it.</p>

<p>The likelihood function at the bottom shows the positions of the different hypothesis values and the observed mean difference (2.46, vertical dashed line). The horizontal red line is the likelihood interval for <em>S</em>-2 (values given in the <strong>Support interval for mean difference</strong> box above it), closely corresponding to the 95% confidence interval. The vertical black line is <script type="math/tex">H_0</script> at 0, and the vertical blue line is <script type="math/tex">H_a</script> at 2. With familiarity, it is easier to think in terms of the LLRs, known as support values <em>S</em>, rather than the exponentiated values.</p>

<p>A useful comparison table is given in the Explanatory text Additional Option, and is reproduced in Table 1 below (based on a table given by <a href="#goodman1989">Goodman 1989</a>).</p>
<p><img src="/assets/images/jeva/lr-table.png" alt="" /></p>
<blockquote>
<p>Table 1: Interpreting the support <em>S</em> obtained in an LLR analysis for one hypothesis value versus a second. The middle column shows the likelihood ratio, which is <script type="math/tex">e^S</script>. The right column gives the interpretation. Negative <em>S</em> values just mean that the second hypothesis value is more likely than the first.</p>
</blockquote>
<p>The natural log is used in all the calculations, so LLRs can simply be added together, for example when accumulating evidence about the same phenomenon. Do you want to add to your study data? That’s not allowed under the frequentist paradigm, but it is fine in the likelihood approach. It also makes meta-analysis easy: just add the <em>S</em> values together.</p>
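<p>This additivity follows directly from the definition: for two independent data sets bearing on the same pair of hypotheses, the likelihood ratios multiply, so their logs add:</p>

<script type="math/tex; mode=display">S_{total}=\log\frac{L_1(H_a)\,L_2(H_a)}{L_1(H_0)\,L_2(H_0)}=S_1+S_2</script>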
<h3 id="odds-ratio">Odds Ratio</h3>
<p>Let’s look at another analysis in <em>jeva</em>, the odds ratio (<em>OR</em>). These data are from a double-blind randomized clinical trial of folic acid supplements in pregnant women. About half received folic acid and half received placebo during pregnancy. The outcome was whether the babies born had a neural tube defect or not. Putting the data into <strong>jamovi</strong>, the first part of the output gives us the summary 2 x 2 table:</p>
<p><img src="/assets/images/jeva/contingency-table.png" alt="" width="500px" /></p>
<p>Since the intervention (folic acid) appeared to reduce the defect, the <em>OR</em> was less than 1. In fact, it was 0.283. A more complete picture of the analysis is shown in Figure 2.</p>
<p><img src="/assets/images/jeva/odds-ratio.png" alt="" /></p>
<blockquote>
<p>Figure 2: Doing an odds ratio analysis in <em>jeva</em>. On the left is the dialog box giving settings and options. On the right is the output.</p>
</blockquote>
<p>The settings in the dialog box show that the null hypothesis <script type="math/tex">H_0</script> is 1, as it normally should be, but we can specify another value if necessary. We can also select an alternative hypothesis value <script type="math/tex">H_a</script> and here we have entered 0.85. This could be a value suggested as the minimal clinical effectiveness for the intervention.</p>
<p>The first line of the <strong>Support: Odds Ratio analyses</strong> table gives us the strength of evidence for the null versus the observed <em>OR</em> as <em>S</em> = −4.39. This means that there is extremely strong evidence that the observed <em>OR</em> differs from the null value of 1. (The trial was discontinued and all women were given folic acid.) We will from now on refrain from exponentiating <em>S</em> to obtain how many times more likely, and use the <em>S</em> values alone.</p>

<p>The next line in the table gives the strength of evidence for the alternative hypothesis of 0.85 compared with the observed <em>OR</em>. At −3.28 the evidence is strong that the observed value differs from the minimal clinical effectiveness value, i.e. the observed <em>OR</em> is much better than the required minimum. The final line in the table pits the alternative hypothesis value against the null. With <em>S</em> = 1.12, this suggests that there is at best weak evidence for a difference between these two values. In other words, although the observed <em>OR</em> strongly differs from both the null and the alternative, the evidence suggests that these two values cannot be easily distinguished from each other.</p>

<p>For a comparison with the frequentist approach, the final column in the Support table gives the corresponding <em>p</em> values. These are consistent with the <em>S</em> value analysis (.003, .010 and .135 respectively). They are available through the likelihood ratio (or <em>G</em>) test, which means simply multiplying the <em>S</em> values by 2 and using the <script type="math/tex">\chi^2</script> distribution to obtain the <em>p</em> values. Despite its name, the likelihood ratio test is not part of the evidential approach, as it uses <em>p</em> values.</p>
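<p>That conversion can be sketched in a line of R: twice the support value is referred to the <script type="math/tex">\chi^2</script> distribution with 1 degree of freedom. (The <em>S</em> values below are the rounded ones from the Support table, so the resulting <em>p</em> values can differ slightly in the last digit.)</p>

```r
# p values from S values via the likelihood ratio (G) test:
# 2*S is referred to a chi-squared distribution with 1 df
S <- c(4.39, 3.28, 1.12)                       # S values from the Support table
p <- pchisq(2 * S, df = 1, lower.tail = FALSE)
round(p, 3)                                    # close to .003, .010 and .135
```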
<p>The next <strong>Support</strong> table (for marginal effects and interaction analyses) allows us to see if there are obvious differences in the marginal totals. Since similar numbers were allocated to folic acid and placebo, it is not surprising that <em>S</em> is close to 0. Analysis of the other marginal totals for presence of defect gives an extremely high <em>S</em> of 699, trivially telling us that most babies did not suffer from the defect. The 3rd row here repeats the <em>S</em> value from the first line of the previous table except that it is now positive. We are looking at the evidence for an interaction against the null model of no interaction. If we change the <script type="math/tex">H_0</script> <em>OR</em> to something other than 1 then the values given in the first Support table would change, but not those in this table. The final line in this table calculates the <em>S</em> value, assuming that the 4 cells contain 1195/4 = 298.75. It is satisfying (to me at least) that the 3 previous components sum precisely to the total <em>S</em> = 703.715.</p>
<p>The table below the <strong>Support</strong> tables gives intervals for the observed <em>OR</em>. The first line gives the support interval. We have specified the level to be 3 in the dialog box. The lower and upper limits for the <em>S</em>-3 support interval are given, and correspond graphically to what is shown in the likelihood function curve below (see Figure 3). The next line gives a 95% likelihood-based confidence interval. It is close to the regular 95% confidence interval which <strong>jamovi</strong> gives as: 0.113 to 0.706. The advantage of the likelihood-based 95% confidence interval is that it is more accurate and is parameterization-invariant (<a href="#pritikin2017">Pritikin, Rappaport et al. 2017</a>). The <em>S</em>-2 support interval is fairly close to both of these intervals: 0.101 to 0.676.</p>
<p><img src="/assets/images/jeva/likelihood-function.png" alt="" /></p>
<blockquote>
<p>Figure 3: The likelihood function for the <em>OR</em>. The obtained value is shown by the dashed vertical line. The <script type="math/tex">H_0</script> <em>OR</em> of 1 is shown as a vertical black line at the right side of the plot, and the <script type="math/tex">H_a</script> <em>OR</em> of 0.85 is shown as the vertical blue line. The horizontal red line is the support <em>S</em>-3 interval. Both the <script type="math/tex">H_0</script> and <script type="math/tex">H_a</script> values lie outside the interval, since their <em>S</em> values versus the obtained <em>OR</em> exceed the absolute value of 3 (−4.39 and −3.28 respectively).</p>
</blockquote>
<h3 id="the-llr-support-interval">The LLR Support Interval</h3>
<p>This identifies a supported range of values which are consistent with the observed statistic. In <em>jeva</em> it is denoted as <em>S</em>-X, where X can be any number between 1 and 100. The <em>S</em>-2 interval is commonly used since, as mentioned, it is numerically close to the 95% confidence interval. For the <em>S</em>-2 interval, the values within the interval have likelihood ratios in the range 0.135 to 7.39, corresponding to <script type="math/tex">e^{-2}</script> to <script type="math/tex">e^{2}</script>. Simply put, within an <em>S</em>-2 interval, no likelihoods are more than 7.39 times different from each other. Similarly, for the <em>S</em>-3 interval, likelihood ratios will range from 0.050 to 20.09, corresponding to <script type="math/tex">e^{-3}</script> to <script type="math/tex">e^{3}</script>, and no likelihoods will be more than 20.09 times different from each other.</p>
<h3 id="variance-analysis">Variance Analysis</h3>
<p>The categorical analyses all feature a variance analysis. When selected in the <em>OR</em> analysis we get this output:</p>
<p><img src="/assets/images/jeva/variance-analysis.png" alt="" width="500px" /></p>
<p>This analysis specifically addresses the issue of whether the variance of the counts in the cells varies more or less than we would expect by chance, assuming an <em>OR</em> of 1. With this particular analysis we get <em>S</em> = 2.6, which is more than moderate evidence that the variance differs from that expected. This broadly agrees with the value of <em>S</em> = 4.4 given earlier in the first Support table, although it is concerned with the variance rather than the means (i.e. the expected frequencies). If the first count in the contingency table is changed from 6 to 21, the obtained <em>OR</em> becomes very close to 1, and the Support table gives <em>S</em> = −0.001, indicating no difference from an <em>OR</em> of 1. However, the variance analysis now gives <em>S</em> = 2.9, almost strong evidence that the obtained <em>OR</em> is closer to 1 than we would expect (i.e. the variance is smaller than expected). The corresponding <script type="math/tex">\chi^2</script> was very small at 0.001, with 1 − <em>p</em> = 0.026 (statistically significant). The analysis is now answering the question of whether the data fit the model too well – are the data too good to be true? This can be used to test whether data, like Mendel’s (<a href="#edwards1986">Edwards 1986</a>), fit a model too well. Edwards (<a href="#edwards1992">1992</a>, see especially pages 188–194) argued that the <script type="math/tex">\chi^2</script> test could only be legitimately used for this purpose, and not for what the test is normally used for (as a test of the means, i.e. the expected frequencies obtained from the marginal totals in an association test).</p>
<h3 id="finally">Finally</h3>
<p>Currently, the <em>jeva</em> module has 10 common analyses: <em>t</em> tests, one-way ANOVA, polynomial regression, correlation, and 4 categorical analyses including McNemar’s paired test. In future, I aim to add factorial ANOVA, repeated measures ANOVA, logistic regression, and a sample size calculator for <em>t</em> tests (<a href="#cahusac2022b">Cahusac and Mansour 2022</a>).</p>
<p>Please do try out all the analyses. See how they compare with the other approaches. Where possible, <em>p</em> values are given to help compare with the conventional frequentist approach. I would be keen to hear any feedback, and you’ll get a £10 Amazon voucher (or regional equivalent) if you spot any errors!</p>
<h3 id="references">References</h3>
<h5 id="cahusac2020"><a href="https://www.amazon.com/Introduction-Evidence-Based-Statistics/dp/1119549809/">Cahusac, P. M. B. (2020). Evidence-Based Statistics: An Introduction to the Evidential Approach – from Likelihood Principle to Statistical Practice. New Jersey, John Wiley & Sons</a>.</h5>
<h5 id="cahusac2022a">Cahusac, P. M. B. (2022). “Log Likelihood Ratios For Common Statistical Tests Using The likelihoodR Package.” The R Journal 14(3): 203-212.</h5>
<h5 id="cahusac2022b">Cahusac, P. M. B. and S. E. Mansour (2022). “Estimating sample sizes for evidential t tests.” Research in Mathematics 9(1): 1-12.</h5>
<h5 id="dennis2019">Dennis, B., J. M. Ponciano, M. L. Taper and S. R. Lele (2019). “Errors in Statistical Inference Under Model Misspecification: Evidence, Hypothesis Testing, and AIC.” Frontiers in Ecology and Evolution 7.</h5>
<h5 id="edwards1986">Edwards, A. W. F. (1986). “Are Mendel’s Results Really Too Close?” Biological Reviews 61(4): 295-312.</h5>
<h5 id="edwards1992">Edwards, A. W. F. (1992). Likelihood. Baltimore, Johns Hopkins University Press.</h5>
<h5 id="glover2018">Glover, S. (2018). “Likelihood Ratios: A Tutorial.” MetaArXiv Preprints.</h5>
<h5 id="glover2004">Glover, S. and P. Dixon (2004). “Likelihood ratios: A simple and flexible statistic for empirical psychologists.” Psychonomic bulletin & review 11(5): 791-806.</h5>
<h5 id="goodman1989">Goodman, S. N. (1989). “Meta-analysis and evidence.” Controlled Clinical Trials 10(2): 188-204.</h5>
<h5 id="pritikin2017">Pritikin, J. N., L. M. Rappaport and M. C. Neale (2017). “Likelihood-based confidence intervals for a parameter with an upper or lower bound.” Structural equation modeling: a multidisciplinary journal 24(3): 395-401.</h5>
<h5 id="royall1997">Royall, R. M. (1997). Statistical Evidence: a Likelihood Paradigm. London, Chapman & Hall.</h5>
<p><em>Peter M.B. Cahusac</em></p>

<h1>An invitation to translate jamovi</h1>
<p>2022-05-26</p>
<h3 id="tldr">tl;dr</h3>
<p>Join the jamovi community, and contribute to <a href="https://hosted.weblate.org/projects/jamovi/">translating jamovi</a>!</p>
<!--more-->
<p>jamovi is already available in a number of languages, including:</p>
<ul>
<li>English</li>
<li>Chinese</li>
<li>Japanese</li>
<li>Spanish</li>
<li>Portuguese</li>
<li>And many others</li>
</ul>
<p>Is your language still missing? Or do you think a translation could be improved? We invite you to come join the translation effort!</p>
<h3 id="how-it-works">How it works</h3>
<ol>
<li>
<p>jamovi is translated through the weblate project <a href="https://hosted.weblate.org/projects/jamovi/">here</a>. Head on over there, create an account, indicate what languages you are able to translate, and weblate will let you create new translations and contribute to existing ones.</p>
</li>
<li>
<p>When you’ve completed a translation, we’ll push it out to <a href="https://cloud.jamovi.org">the cloud version of jamovi</a> so you can see what the translations look like in jamovi itself. If you need to tweak/adjust/fix anything that isn’t quite right, just head back to weblate.</p>
</li>
<li>
<p>Once you’re happy with it, drop us a line, and we’ll include those translations in our next desktop release of jamovi!</p>
</li>
</ol>
<p>Thanks for helping make jamovi great!</p>
<p><em>jonathon_love</em></p>

<h1>The esci module for jamovi</h1>
<p>2020-06-09</p>
<h3 id="tldr">tl;dr</h3>
<p>Today there’s a new module available in jamovi: <strong>esci</strong> (effect sizes and
confidence intervals), developed by Bob Calin-Jageman and Geoff Cumming
(<a href="https://twitter.com/TheNewStats">@TheNewStats</a> and
<a href="https://thenewstatistics.com/itns/">TheNewStatistics.com</a>).
As a newer module you will need a recent version of jamovi to install
esci (probably 1.2.19 or above). You can refresh your install of jamovi
<a href="https://www.jamovi.org/download.html">here</a>.</p>
<!--more-->
<p>esci provides an easy step into estimation statistics (aka the “new
statistics”), an approach that emphasizes effect sizes, interval estimates,
and meta-analysis. esci can provide estimates and confidence intervals for
most of the analyses you would learn in an undergraduate statistics course
<strong>and</strong> meta-analysis (which really should be part of a good undergraduate
statistics course). Most analyses can be run from raw data or from
summary data (enabling you to generate estimates from journal articles
that only reported hypothesis tests). All analyses generate nice
visualizations that emphasize effect sizes and uncertainty. esci is
for everyone, but was developed especially with students in mind–it
provides step-by-step instructions, clear feedback, and tries to prevent
rookie mistakes (like calculating a mean on a nominal variable).</p>
<h2 id="what-is-estimation-statistics">What is estimation statistics?</h2>
<p>Inferential statistics has two major traditions: testing and estimation. The
testing approach is focused on decision-making. In this approach we propose a
null hypothesis, collect data, generate a test-statistic and p-value measuring
the degree to which the null hypothesis is compatible with the data, and then
make a decision about the hypothesis. For example, we might test the null
hypothesis that a drug has exactly 0 effect on depression. We collect data from
those randomly assigned to take the drug or placebo. We run a t-test comparing
these groups and find <em>p</em> = .01. We then make a decision: because <em>p</em> < .05 we
reject the null hypothesis, deciding that an effect of exactly 0 is not
compatible with the data. Huzzah.</p>
<p>The testing approach has its uses, but note two important issues that it has
not addressed: 1) <em>How much does the drug work?</em> and 2) <em>How wrong might
we be?</em> That’s where estimation comes in. From the same data and assumptions
that underlie the testing approach we can generate an estimate and a confidence
interval. So, for example, we might find that the drug improved depression by
10% with a 95% CI of [1%, 19%]. This is some very useful information. It tells
us how well the drug worked in this one study (10% benefit). It also gives us
an expression of uncertainty about this estimate. Specifically, the CI gives
the entire range of benefits that are compatible with the data collected–
benefits around 1% are compatible and so are benefits around 19%.</p>
<p>Focusing on estimates can be really helpful:</p>
<ul>
<li>It helps us weigh practical significance rather than just statistical
significance.</li>
<li>It helps us calibrate our conclusions to the uncertainty of the study</li>
<li>It fosters meta-analytic thinking, where we combine data from multiple
studies to refine our estimates (like the poll aggregators on
fivethirtyeight.com)</li>
<li>It calibrates expectations for replications</li>
<li>It helps us think critically about optimizing procedures to maximize effect
sizes and minimize noise</li>
<li>And much more</li>
</ul>
<p>Estimates and tests are linked. A null hypothesis is rejected at the alpha =
.05 level if it is outside a 95% CI and not rejected if it is inside. To put it
a different way, a 95% CI is all the null hypotheses you would not reject at
alpha .05 (and a 99% CI all those for alpha .01, etc.). This means that if you
have an estimate, you can still conduct a test–in fact you can test any null
hypothesis just by checking for it in the CI. The converse is not true, though:
knowing that a test is statistically significant does not easily let you know
the magnitude of the effect or the uncertainty around it. So when you focus on
estimation you gain some benefits, but you don’t lose anything. That makes it
rather bizarre that some fields have come to use only testing. esci is part of
an effort to change this around, and to make estimation the typical or default
approach to inference.</p>
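<p>The duality described above is easy to demonstrate (a sketch in R; the data here are invented purely for illustration):</p>

```r
# A null value is rejected at alpha = .05 (two-sided) exactly when it
# falls outside the 95% confidence interval
x <- c(5.1, 4.8, 6.2, 5.9, 5.4, 6.1, 4.9, 5.7)   # invented sample

ci <- t.test(x, conf.level = 0.95)$conf.int       # 95% CI for the mean

null_value <- 5
p <- t.test(x, mu = null_value)$p.value
outside <- null_value < ci[1] || null_value > ci[2]

# 'outside' is TRUE precisely when p < .05
```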
<p>Want to know more about estimation? Here are some sources:</p>
<ul>
<li>Undergraduate textbook: Cumming, G., & Calin-Jageman, R. J. (2017).
Introduction to the new statistics: Estimation, open science, and beyond.
New York: Routledge. <a href="https://www.amazon.com/gp/product/1138825522">On Amazon</a>.</li>
<li>Calin-Jageman, R. J., & Cumming, G. (2019). The New Statistics for Better
Science: Ask How Much, How Uncertain, and What Else Is Known. The American
Statistician, 73(sup1), 271–280.
<a href="https://doi.org/10.1080/00031305.2018.1518266">https://doi.org/10.1080/00031305.2018.1518266</a></li>
</ul>
<h2 id="an-example-with-esci">An example with esci</h2>
<p>Let’s use esci to re-analyze data from a famous paper about the “trust drug”
oxytocin. Oxytocin is a neurohormone best known for its role in human
reproduction. But in 2005, Kosfeld et al. followed up on some interesting work
in rodents to examine if oxytocin might influence trust in humans. The
researchers randomly assigned participants to receive oxytocin (squirted up the
nose) or placebo (also squirted up the nose) before playing an investment game
that depended on trusting an anonymous partner. The average amount invested by
each participant was used as a measure of trust. Kosfeld et al. found that
oxytocin produced a statistically significant increase in trust
(<em>t</em>(56) = 1.82, <em>p</em> = .037 one-tailed)*.</p>
<p>That sounds pretty convincing, right? It must have been, as the paper was
published in Nature and has now been cited over 4,000 times. Right from the
start, citations made the effect seem established and unequivocal. But how much
did oxytocin improve trust and how wrong might this study be?</p>
<p>Let’s take a look. The original data is available in .csv format
<a href="https://osf.io/8h6ut/">here**</a>. Opening it in jamovi you can conduct a
standard t-test to confirm that the difference is statistically significant
(for a directional test). Now let’s generate the estimate and CI in esci using
“Estimate Independent Mean Difference”.</p>
<p><img src="https://blog.jamovi.org/assets/images/esci/esci1.png" alt="esci1" style="width: 700px; max-width: 100%;" /></p>
<p>In the analysis options, we’ll enter Trust as the dependent variable and
Condition as the grouping variable (placebo was coded as a 0; oxytocin as a 1).
We’ll also set the confidence level to 90% to match the stringency of a
directional test.</p>
<p><img src="https://blog.jamovi.org/assets/images/esci/esci2.png" alt="esci2" style="width: 500px; max-width: 100%;" /></p>
<p>Our output emphasizes the effect size, which in this case is the difference in
means, and reports this as both a raw difference (with a CI) and as a
standardized difference (also with a CI):</p>
<p><img src="https://blog.jamovi.org/assets/images/esci/esci3.png" alt="esci3" style="width: 500px; max-width: 100%;" /></p>
<p>esci also generates a <em>difference plot</em>. This shows the oxytocin data (all
participants and the group mean with CI) and the placebo data (all participants
and the group mean with CI). Most importantly, the graph emphasizes the
difference between them: we draw a line from the placebo group mean, treating
that as our benchmark, and then we measure the space between the groups, marking
the difference (delta) with a triangle on a right-side axis whose 0 is anchored
at the placebo group mean. It sounds a bit complicated to write out, but just
take a look.</p>
<p><img src="https://blog.jamovi.org/assets/images/esci/esci4.png" alt="esci4" style="width: 500px; max-width: 100%;" /></p>
<p>The graph shows that the difference in trust was fairly huge–a $1.41 increase
in investment in a context where a typical investment was $8-9. The change,
though, is highly uncertain, with a 90% CI that runs from $0.11 up to $2.71.
This means the data is compatible with a very large range of effect sizes–from
the vanishingly small to the dazzlingly large. In other words, this study
doesn’t really tell us much about how much oxytocin might influence trust.
Perhaps not 0, but basically any other positive effect size is on the
table, including ones (around $0.11) that would be very difficult to replicate.</p>
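<p>As a sanity check, the interval can be reconstructed from the summary statistics alone (mean difference $1.41, t(56) = 1.82). A directional test at α = .05 corresponds to a two-sided 90% interval, and that is the interval these numbers reproduce. A Python sketch, assuming scipy is available:</p>

```python
from scipy import stats

diff, t_obs, df = 1.41, 1.82, 56   # values reported in the post
se = diff / t_obs                  # standard error implied by the t statistic
crit = stats.t.ppf(0.95, df)       # two-sided 90% interval: 5% in each tail
lo, hi = diff - crit * se, diff + crit * se
print(round(lo, 2), round(hi, 2))  # ≈ 0.11 2.71
```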
<p>Looked at with these eyes, it might not surprise you much to find out that the
benefit of oxytocin in human trust has not replicated well–and that the
consensus is that oxytocin probably does not have a <em>practically significant</em>
effect on trust. Unfortunately this was not obvious to researchers wedded to
the testing approach, and so much faith was put in these results that clinical
trials were launched to try to use oxytocin as a therapy for social processing
deficits (such as with autism-spectrum disorder). None of these clinical trials
have shown much benefit, but they’ve cost a ton and produced a decent handful
of (thankfully mild) adverse reactions. If you’re curious about the way the
oxytocin story imploded at great costs and hardship, check out the article
<a href="https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1518266">here</a>.</p>
<p>This is just one example of how you can gain important insight into your data
by using estimation thinking in place of or as a supplement to testing. esci
should make it easy to get started with this approach.</p>
<p>* — In the original study the researchers didn’t actually use a t-test; they
compared median trust using a non-parametric test. This nuance doesn’t alter
the patterns in the data presented in this post.</p>
<p>** — This data was extracted by Bob and Geoff from a figure in Kosfeld et al.
(2005). The OSF page where it is posted has all the details.</p>
<h2 id="im-used-to-running-this-test-what-would-i-use-in-esci">I’m used to running this test… what would I use in esci?</h2>
<p>Glad you asked. Here’s how the most common statistical tests map on to the
estimates generated by esci:</p>
<table>
<tbody>
<tr>
<td><strong>Traditional hypothesis test</strong></td>
<td><strong>esci in jamovi command</strong></td>
</tr>
<tr>
<td>One-sample t-test</td>
<td>Estimate Mean</td>
</tr>
<tr>
<td>Independent samples t-test</td>
<td>Estimate Independent Mean Difference</td>
</tr>
<tr>
<td>Paired samples t-test</td>
<td>Estimate Paired Mean Difference</td>
</tr>
<tr>
<td>One-Way ANOVA</td>
<td>Estimate Ind. Group Contrasts</td>
</tr>
<tr>
<td>2×2 ANOVA</td>
<td>Estimate Ind. 2×2</td>
</tr>
<tr>
<td>2×2 Chi Squared</td>
<td>Estimate Proportion Difference</td>
</tr>
<tr>
<td>Correlation test</td>
<td>Estimate Correlation</td>
</tr>
<tr>
<td>Correlation test with categorical moderator</td>
<td>Estimate Correlation difference</td>
</tr>
</tbody>
</table>
<h2 id="this-module-would-be-better-if">This module would be better if…</h2>
<p>The esci module is still in alpha. Geoff and Bob have made this initial release
to help gather feedback as they continue to work on the module in conjunction
with a new edition of their statistics textbook. They welcome your feedback,
feature requests, and/or bug reports. Please especially consider esci through
the eyes of your students:</p>
<ul>
<li>What other analyses would you like to see?</li>
<li>Anything in the output that is hard to understand? That should be labelled
better? That should be added or could be removed?</li>
<li>Would it be helpful to add the option to see all assumptions for an analysis?
Should we provide more guidance on interpreting output?</li>
<li>Any options missing from analyses?</li>
</ul>
<p>The best way to provide feedback would be on the github page for this module,
which is here: <a href="https://github.com/rcalinjageman/esci">https://github.com/rcalinjageman/esci</a>.
If that’s a hassle, then by all means just email Bob directly or tweet at them
<a href="https://twitter.com/TheNewStats">@TheNewStats</a>.</p>
<h2 id="havent-i-heard-of-this-before">Haven’t I heard of this before?</h2>
<p>Yes – Geoff Cumming has been developing versions of esci for some time. The
original versions were designed as worksheets in Excel. And in addition to
analyses, the older version of esci has some great simulations and sample-size
planning tools. You can still check these out here:
<a href="https://thenewstatistics.com/itns/esci/">https://thenewstatistics.com/itns/esci/</a>.</p>
<h1>Flexplot in jamovi (2019-10-06)</h1>
<h3 id="tldr">tl;dr</h3>
<p>I was recently perusing several journals in psychology, looking for examples of bad graphics. One would think such an exercise would be quite simple. People are generally <em>really</em> bad at creating graphics.</p>
<p>But the problem was worse than I thought.</p>
<!--more-->
<p>Worse than <em>bad</em> graphics, people were not producing graphics <em>at all!</em> Instead, massive tables and test statistics littered their articles.</p>
<p>It’s odd, is it not? Humans evolved to be exceptional at visual pattern recognition, and yet we turn off that part of our brain when we do science?</p>
<p>Odd, that.</p>
<p>After grumbling for several weeks, I began to consider <em>why</em> people don’t use graphics. I have two hypotheses:</p>
<ol>
<li>
<p>People don’t know what sorts of graphics are appropriate for a given situation.</p>
</li>
<li>
<p>The software just doesn’t exist. I won’t name any names, but SP[redacted]S and S[redacted]S both display graphics that look like they were produced on an Atari. R is excellent (particularly with ggplot2), but the learning <em>curve</em> looks more like a <em>cliff</em> to the uninitiated.</p>
</li>
</ol>
<p>Well that’s where Flexplot fits in. Flexplot was designed to address both problems.</p>
<h1 id="the-guiding-philosophy-of-flexplot">The Guiding Philosophy of Flexplot</h1>
<p>I have a colleague who is an expert in human factors, and she has played no small role in my thinking about how Flexplot should work. One of the dominant philosophies in technology design is that the technology needs to “get out of the way” of the user. If the user is trying to buy a product online, the website should make it as easy as possible for the user to do so. If the user is trying to drive from Place A to Place B, the car shouldn’t put any obstacles in the way.</p>
<p>And yet, when producing graphics, many analysts have the following internal conversations:</p>
<p><em>hmmm…I want to see if my intervention worked. What plot do I use? Is that a scatterplot? Or no…wait. A qqplot? A histogram? Where’s my stats textbook?…Ah yes, a boxplot. I want a boxplot. Now how do I do that again? Is it under this menu? No. Maybe that menu? Hmmmm… You know what? Screw it. Ellen’s on in an hour. I’ll just report a table.</em></p>
<p>When creating graphics in SPSS, JMP, Prism, etc., there are <em>way</em> too many obstacles for users to produce graphics. But even if they do produce a graphic, their intellectual resources are so sapped they have nothing left to actually <em>interpret</em> the graph.</p>
<p>That’s where Flexplot comes in. Flexplot removes these obstacles so analysts can be freed to spend their resources doing what they should do: interpreting the findings.</p>
<p>Let me say that again, but in a way that’s much more tweetable</p>
<blockquote>
<p>The more resources we spend constructing/deciding on graphics, the less we have to interpret. #Flexplot automates the decision-making so researchers can spend their resources interpreting graphics. #Jamovi.</p>
</blockquote>
<p>(I think that’s within twitter’s character limit).</p>
<h1 id="how-does-flexplot-do-it">How does Flexplot do it?</h1>
<p>So how does Flexplot do it? The user only needs to specify the outcome and the predictor(s).</p>
<p>That’s it.</p>
<p>Well, they may have to decide whether to panel variables, but we’ll talk about that later.</p>
<p>So with that, let me show you how to do some basic graphics in the Flexplot module.</p>
<h1 id="univariate-distributions">Univariate Distributions</h1>
<p>Okay, I kinda lied. I said earlier the user only needs to specify the outcome and the predictor. But that’s not actually true. The user only needs to specify the outcome.</p>
<p>Shame on me.</p>
<p>In the background, Flexplot decides whether to plot a histogram (for numeric variables):</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/uni1.png" alt="uni1" style="width: 700px; max-width: 100%;" /></p>
<p>Or a barchart (for categorical variables):</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/uni2.png" alt="uni2" style="width: 700px; max-width: 100%;" /></p>
<h1 id="bivariate-relationships">Bivariate Relationships</h1>
<p>The undisputed king of graphing numeric by numeric relationships is the scatterplot, which flexplot handles with aplomb:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens1.png" alt="screens1" style="width: 700px; max-width: 100%;" /></p>
<p>Notice that it defaults to a loess line. That’s intentional. I want to highlight deviations from linearity, just to help the researcher see if the model is appropriate. But, we could always change it to a straight line (and let’s go ahead and make the dots more opaque while we’re at it):</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens2.png" alt="screens2" style="width: 700px; max-width: 100%;" /></p>
<p>Now how about categorical on numeric? There are several contenders, including boxplots, violin plots, standard error plots, etc. My plot of choice is what I call the “jittered density plot”, or JD plot. This graphic shows the raw datapoints (jittered), with the amount of jitter proportional to the local density. It’s kind of like a hybrid between a violin plot and a regular ole’ jittered dot plot. The image below overlays the median and the interquartile range:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens3.png" alt="screens3" style="width: 700px; max-width: 100%;" /></p>
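<p>Flexplot’s own implementation isn’t shown here, but the density-proportional jitter idea can be sketched in Python (simulated data; numpy and scipy assumed):</p>

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
score = rng.normal(50, 10, 200)      # hypothetical outcome for one group

dens = gaussian_kde(score)(score)    # estimated density at each data point
width = dens / dens.max()            # max horizontal spread, scaled to [0, 1]
x_jitter = rng.uniform(-1, 1, score.size) * width  # wider where data are denser
# plotting (group_position + x_jitter) on x and score on y gives a JD plot
```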
<p>I like this better than the other plots because the raw data reveal so much more than the boxes (boxplot) or density curves (violin plots). For example, have you ever plotted a boxplot for a dataset that has 5 observations? It looks identical to one where we have <em>lots</em> of observations:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens4.png" alt="screens4" style="width: 500px; max-width: 100%;" /></p>
<p>You could, of course, decide to plot means/standard deviations, or means/standard errors:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens5.png" alt="screens5" style="width: 700px; max-width: 100%;" /></p>
<h1 id="multivariate-relationships">Multivariate Relationships</h1>
<p>This is where flexplot really shines. And I mean REALLY shines. It shines so much, you have to squint to look at it.</p>
<p>(Too far?)</p>
<p>There’s a lot of decision-making that’s happening in the back end, and I’m not going to elaborate on the rules, but Flexplot will handle multivariate data with different colors/lines…</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens6.png" alt="screens6" style="width: 700px; max-width: 100%;" /></p>
<p>or the user can decide to panel by putting the second variable in the “Paneled variable” box:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens7.png" alt="screens7" style="width: 700px; max-width: 100%;" /></p>
<p>But let’s go ahead and REALLY make this complicated by including three numeric predictors:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens8.png" alt="screens8" style="width: 700px; max-width: 100%;" /></p>
<p>In the background, Flexplot is “binning” the two numeric variables and creating separate panels for each combination of bins. This is similar to William Cleveland’s “coplots”, with small modifications.</p>
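<p>The binning itself is easy to approximate outside Flexplot; here is a Python sketch with pandas (hypothetical column names, not Flexplot’s actual rules):</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "communication": rng.normal(50, 10, 300),
    "honesty": rng.normal(50, 10, 300),
})

# cut each numeric predictor into a few bins;
# each combination of bins becomes one panel
df["comm_bin"] = pd.cut(df["communication"], bins=3)
df["hon_bin"] = pd.cut(df["honesty"], bins=3)
panels = df.groupby(["comm_bin", "hon_bin"], observed=False)
print(len(panels.size()))  # 3 x 3 = 9 panels
```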
<p>But it’s hard to see what’s going on. So let’s show a regression line instead of loess (the loess lines look pretty straight anyway) and remove the standard errors from the fitted lines:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens9.png" alt="screens9" style="width: 700px; max-width: 100%;" /></p>
<p>That’s better, but it’s still hard to see what’s going on. And this, my friends, is where ghost lines come in.</p>
<h1 id="ghost-lines">Ghost lines</h1>
<p>Paneling is great. It’s an easy way to avoid overlapping datapoints, and panels can make patterns clearer. BUT, then your eye has to travel further to make comparisons. That’s where ghost lines come in. Ghost lines repeat the pattern from one panel across the others:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens10b.png" alt="screens10b" style="width: 700px; max-width: 100%;" /></p>
<p>In this case, the gray lines are the “ghost lines”. This line is repeating the pattern from the second row, third column panel across all the others. Usually when I’m looking at multivariate relationships between numeric predictors, my first strategy is to try to dismiss the possibility there’s an interaction present. Interactions present themselves as non-parallel lines. With the ghost line, that becomes much easier to see.</p>
<p>(In the image above, the lines deviate a bit from parallel, but probably not enough to worry about).</p>
<p>Let’s go ahead and look at another example where ghost lines are quite beneficial:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens10.png" alt="screens10" style="width: 700px; max-width: 100%;" /></p>
<p>Here, the line for the female panel is repeated in the male pattern. That makes it VERY easy to see that males generally score lower on satisfaction than females, and that the relationship between communication and satisfaction is relatively consistent across genders (i.e., there’s little evidence of an interaction).</p>
<p>When there’s little evidence of an interaction, that makes plotting <em>way</em> easier. That means the effects are all main effects, and we can visualize them as bivariate relationships using added variable plots.</p>
<h1 id="added-variable-plots-avps">Added Variable Plots (AVPs)</h1>
<p>I don’t see these used much, and it’s quite a shame because they are ridiculously amazing. For those unfamiliar with AVPs, here’s the basic idea. Let’s say we want to see the effect of gender on satisfaction, after controlling for communication (like an ANCOVA). You can actually produce a plot that maps onto that concept. What you do is first fit a regression model predicting satisfaction from communication, then residualize it (in other words, subtract the fit of satisfaction from the actual satisfaction scores). This <em>removes</em> the influence of communication from satisfaction (unless there’s some nonlinearity or interactions present). We can then plot gender against the residualized satisfaction scores.</p>
<p>That’s how AVPs usually work, but in my experience, people tend to freak out when the scale of the Y axis doesn’t match the scale of the data. (Residuals will be centered around zero, instead of the mean of satisfaction, which, in this case, is around 50). To make these plots more digestible, I simply add the mean back into the residuals.</p>
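<p>That recipe (regress, residualize, add the mean back) can be sketched in Python with numpy (simulated data; an illustration, not the flexplot source):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
communication = rng.normal(50, 10, n)
gender = rng.integers(0, 2, n)   # 0/1 coding, hypothetical
satisfaction = 20 + 0.5 * communication + 3 * gender + rng.normal(0, 5, n)

# regress satisfaction on communication only, and keep the residuals
X = np.column_stack([np.ones(n), communication])
beta, *_ = np.linalg.lstsq(X, satisfaction, rcond=None)
resid = satisfaction - X @ beta

# add the mean back so the y axis stays on the original scale
satisfaction_given_comm = resid + satisfaction.mean()
# plotting gender against satisfaction_given_comm gives the AVP
```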
<p>So, for this example, let’s go ahead and residualize the effect of communication. Clicking on the “Residualize predictor variable” checkbox will remove the effects of all variables <em>but</em> the last variable entered (in this case, gender):</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens11.png" alt="screens11" style="width: 700px; max-width: 100%;" /></p>
<p>That’s about as simple of a plot as you can get! Notice how the Y axis has changed from “satisfaction” to “satisfaction | communication” to indicate the DV has had the communication effect removed.</p>
<p>Now, remember that complicated multivariate relationship between numeric variables? Let’s go ahead and plot that again, but this time with AVPs:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens12.png" alt="screens12" style="width: 700px; max-width: 100%;" /></p>
<p>This is now showing the relationship between satisfaction and honesty, once we have removed the effects of interests and communication.</p>
<p>Let’s go ahead and study the main effect of interests (controlling for communication and honesty):</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens13.png" alt="screens13" style="width: 700px; max-width: 100%;" /></p>
<p>Interesting. It seems in this dataset, the variable interests is a stronger predictor of relationship satisfaction than honesty. (Couples that nerd out together, stay together).</p>
<p>By the way…these are simulated data. Please don’t use these results as an excuse to lie to your partner :)</p>
<p>I have just modeled my general strategy for plotting multivariate relationships: I put everything I can there at once with ghost lines. If the lines look all parallel, I then do added variable plots to study the main effects. All the while, I’m shifting views to try to gain a complete picture of what’s happening.</p>
<h1 id="general-linear-model">General Linear Model</h1>
<p>Within the Flexplot module, there’s also another menu option called “General Linear Model.” The idea behind this is to combine the strengths of Flexplot with statistical modeling. By default, every statistical analysis will automatically generate a plot that attempts to visualize the statistical analysis. For example:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens14.png" alt="screens14" style="width: 700px; max-width: 100%;" /></p>
<p>By the way, the motivation behind this represents the second guiding philosophy of flexplot:</p>
<blockquote>
<p>Graphics are visual representations of statistical models. As such, the visuals need to match the type of analysis conducted. #Jamovi. #Flexplot</p>
</blockquote>
<p>(Totally within the twitter character limit).</p>
<p>It would not make sense, for example, to show a boxplot (satisfaction against separated vs. not separated) when one conducts an ANCOVA because the Y axis of the boxplot has not been residualized; the AVP, however, <em>will</em> have it residualized, as shown in the right figure. BUT, the left figure shows an interaction present: the relationship between interests and satisfaction is negative for those separated and positive for those who are not separated. This shows that ANCOVA is actually not appropriate. (And it also means the right graphic should not be interpreted).</p>
<p>Within the General Linear Model section of Flexplot, with any multivariate relationship, the software will attempt to generate two graphics: one as an AVP and one showing all the data at once (often paneled). But, the graphic capabilities within GLM are a lot more limited than with Flexplot. Because of that, I recommend visualizing it in Flexplot first, then model the relationship.</p>
<p>We can also ask for diagnostic plots:</p>
<p><img src="https://blog.jamovi.org/assets/images/flexplot/screens15.png" alt="screens15" style="width: 700px; max-width: 100%;" /></p>
<p>The analysis also comes complete with effect size estimates. By the way, you will not see any p-values in the GLM. This is for two reasons: (1) I don’t find them particularly informative, and (2) I am desperately trying to rid my students of the habit of looking straight at the p-values when doing an analysis.</p>
<p>Should you want p-values, I’d recommend you change your mind.</p>
<p>Otherwise, I’d recommend the GAMLj module.</p>
<h1 id="concluding-thoughts">Concluding thoughts</h1>
<p>I may not be the best developer. I have no computer science background, I was self-taught, I use = instead of <- in R, and I prefer for loops to apply statements.</p>
<p>But I am an active developer.</p>
<p>I <em>personally</em> find Flexplot extremely useful, so it’s in <em>my</em> best interest to make sure it functions well. Not to mention that my students use it. Rarely does a week go by that a student fails to catch a bug in the software.</p>
<p>Just look at my <a href="http://www.github.com/dustinfife/flexplot" target="_blank">github account</a>; I am <em>frequently</em> pushing new versions online and adding new features.</p>
<p>That’s both a good and a bad thing for you, fair reader. I anticipate that in a few months, the menus will probably look different than what I’ve presented. BUT, that also means that if you report a bug to me, it will probably be fixed within a week.</p>
<p>So, please do report bugs to me. I will appreciate it greatly.</p>
<p>As I mentioned, I have big plans for Flexplot in the future, including:</p>
<ul>
<li>the ability to modify various plot elements (e.g., theme, point size, point shape, axis labels, colors)</li>
<li>adding additional nonlinear fits (poisson regression, exponential curves, etc.)</li>
<li>the ability to visualize mixed models, time-series data, and structural equation models</li>
</ul>
<p>I also have plans for the general linear model component, including:</p>
<ul>
<li>the ability to model interactions (I know, it’s a huge oversight at the moment, but these can be modeled in the <a href="http://www.github.com/dustinfife/flexplot" target="_blank">R flexplot package</a>)</li>
<li>the ability to perform nested and non-nested model comparisons</li>
</ul>
<p>“But Dustin,” you might ask, “how can I possibly keep up with your break-neck pace and revolutionary software modifications that will likely change the entire landscape of scientific research?”</p>
<p>I’m glad you asked, astute reader.</p>
<p>From time to time, I present modifications on my youtube channel: <a href="http://www.youtube.com/quantpsych" target="_blank">Quant Psych</a></p>
<p>You can see me there and I look forward to hearing from you!</p>
<h1>jamovi 1.0 released! (2019-05-24)</h1>
<h3 id="tldr">tl;dr</h3>
<p>Today is a huge day for jamovi! <strong>version 1.0 is now available!</strong> This represents the culmination of thousands of hours of work since our first release in 2017, and one of the most rewarding projects we’ve ever worked on. We’re also acutely conscious of the fact that we could never have made it this far without the belief, the help, bug reports, and feature requests of the broader jamovi community. We really feel quite humbled by the level of support we have received.</p>
<p>To celebrate this significant milestone, we’d like to thank some of the more prominent jamovi contributors <!--more--> – people who, often in the background, have made quite substantial contributions, and have helped make jamovi what it is today. As you’ll see, it’s quite a diverse list of contributors!</p>
<h3 id="marcello-gallucci-university-of-milano-bicocca-italy">Marcello Gallucci (University of Milano-Bicocca, Italy)</h3>
<p>Marcello is the developer of the GAMLj and jAMM modules. GAMLj is a module for general linear models, linear mixed effects models, and generalised linear models. In the past, LMEs and GZLMs have proven intimidating for many people to come to grips with. GAMLj changes that, and makes specifying these models a much simpler process; allowing people to think about <em>what</em> they’re doing, rather than <em>how</em> they’re doing it. Indeed, we’ve been encouraged to see LMEs even being taught in undergraduate programs!</p>
<p>jAMM is an alternative to the popular PROCESS macro for SPSS. jAMM brings complex mediation and moderation analyses to an accessible free and open platform. If you haven’t tried jAMM yet, I’d encourage you to take a look.</p>
<h3 id="barton-poulson-datalabcc">Barton Poulson (<a href="https://datalab.cc/tools/jamovi">datalab.cc</a>)</h3>
<p>No platform is complete without a comprehensive set of videos to walk newcomers and students through the process – so it’s been fabulous to have Barton provide a comprehensive (4.5 hours!) set of videos under a creative commons license. We’ve heard much positive feedback from people using them in their courses.</p>
<h3 id="david-foxcroft-brookes-uk">David Foxcroft (Brookes, UK)</h3>
<p>In the same vein, David Foxcroft (and Dani Navarro) have provided the <a href="https://sites.google.com/brookes.ac.uk/learning-stats-with-jamovi">learning statistics with jamovi</a> textbook. This fabulous text provides a complete introduction to the sorts of statistics used in psychology and the social sciences. Once again, we’ve heard only positive things about it.</p>
<h3 id="kyle-hamilton-university-of-california-merced-usa">Kyle Hamilton (University of California, Merced, USA)</h3>
<p>It is really satisfying to start a meta-analysis in jamovi, type the values for each of the studies into the spreadsheet, and watch as the results update as you go. For this, we have Kyle Hamilton to thank.</p>
<h3 id="seiji-shibata-sagami-womens-university-japan">Seiji Shibata (Sagami Women’s University, Japan)</h3>
<p>One of the challenges of international software is to translate it into different languages. Although we’re still embarrassed that jamovi isn’t available in non-English languages, there are some great foreign language resources available. One person who has worked tirelessly on producing these is Seiji Shibata, translating the text <em>learning statistics with jamovi</em> into Japanese, and providing a number of additional jamovi Japanese-language resources.</p>
<h3 id="bob-muenchen-r4statscom">Bob Muenchen (<a href="https://r4stats.com">r4stats.com</a>)</h3>
<p>jamovi has really benefitted from a lot of great suggestions from a lot of people, but sometimes someone comes along who goes above and beyond, providing compelling suggestions and nuanced feedback. Bob Muenchen is this man. It’s been great to be able to run design ideas by him, and draw upon his experience. More than a few of jamovi’s slick and sexy features have been inspired by Bob. Thanks Bob!</p>
<h3 id="romaric-hainez-université-de-picardie-france">Romaric Hainez (Université de Picardie, France)</h3>
<p>Romaric was probably the first person to trial jamovi across large classes of people, back in 2017. His early feedback was invaluable as we ironed out the sorts of issues that beset newly launched software products. It’s hard to overstate the value of early adopters to a fledgling project. Romaric continues to make insightful suggestions in our forums to this day.</p>
<h3 id="and-many-more-">And many more …</h3>
<p>And of course there’s many more people, more than we could ever hope to thank. Community lies right at the core of our values. Scientific software should be shaped by a diverse community of people, from diverse backgrounds, with diverse philosophies, and so it’s exciting to see the breadth of people involved with jamovi.</p>
<p>But what about you? Have you thought about how you or your institution could contribute to the jamovi community, and its mission? Head on over to our <a href="https://www.jamovi.org/contribute.html">Contribute page</a>, and see if there’s something that you can do.</p>
<h1>jamovi: multi-file import and templates (2019-03-27)</h1>
<h3 id="tldr">tl;dr</h3>
<p>In many areas, multiple data sets need to be combined before data can be analysed. An example of this is experimental data in the field of psychology, where a data file is produced for each participant. This blog post (actually a video), introduces multi-file import available in the 0.9.6 series of jamovi.</p>
<!--more-->
<p>Additionally, I introduce jamovi templates; a way to save a set of analyses <em>without</em> any data. A template is analogous to a script file; when combined with a <em>new</em> data set, the analyses update and produce results for the new data.</p>
<p><div class="video-container">
<iframe width="560" height="325" src="https://www.youtube-nocookie.com/embed/u1K47yLEMbc" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen=""></iframe>
</div></p>
<h2 id="v-functions-and-multiple-files">V functions and multiple files</h2>
<p>Most people are familiar with jamovi <code class="highlighter-rouge">V</code> or variable functions, such as <code class="highlighter-rouge">VMEAN()</code>. These functions work over a whole variable - so <code class="highlighter-rouge">VMEAN()</code> calculates the mean of all the values in a variable. However, when we import multiple files into the one data set, this may not be the appropriate behaviour. We may want to calculate a separate mean for each file we’ve imported. We can achieve this using the <code class="highlighter-rouge">group_by</code> argument. Specifying <code class="highlighter-rouge">VMEAN(..., group_by=source)</code> (where <code class="highlighter-rouge">source</code> is the name of the source column), produces a separate mean for each level of <code class="highlighter-rouge">source</code>.</p>
<p>It can also be important when using the <code class="highlighter-rouge">Z()</code> function with multiple imported files. It’s common to exclude responses that are, say, more than 3 standard deviations from the mean with a filter with a formula like <code class="highlighter-rouge">-3 < Z(score) < 3</code>. However, without a <code class="highlighter-rouge">group_by</code> argument, the z-scores will be calculated using the grand mean of all responses, from all participants, rather than just the mean for <em>that</em> participant. Here is the same formula with a <code class="highlighter-rouge">group_by</code> argument: <code class="highlighter-rouge">-3 < Z(score, group_by=source) < 3</code>.</p>
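<p>For readers who work outside jamovi, the same group-wise logic can be expressed in Python with pandas (illustrative data and column names, not jamovi’s internals):</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "source": np.repeat(["p01.csv", "p02.csv"], 100),
    "score": np.concatenate([rng.normal(300, 30, 100),    # fast participant
                             rng.normal(600, 30, 100)]),  # slow participant
})

# grand-mean z-scores: between-participant spread inflates the SD, so even
# a response 3 SDs out *within its own file* looks unremarkable here
z_grand = (df["score"] - df["score"].mean()) / df["score"].std()

# per-file z-scores, the equivalent of Z(score, group_by=source)
z_within = df.groupby("source")["score"].transform(
    lambda s: (s - s.mean()) / s.std())

keep = z_within.abs() < 3   # the -3 < Z(...) < 3 filter, per participant
```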
<h2 id="labjs">labjs</h2>
<p>This video demonstrates analysing data files from <a href="https://lab.js.org/">labjs by Felix Henninger</a>. If you haven’t used labjs, it’s a free and open framework for designing surveys and computer-presented experiments, with a builder interface for designing these tasks graphically.</p>
<p>One of the challenges of designing graphical software is what I call the ‘feature cliff’: as a user’s requirements grow, the effort needed to achieve their goals can increase suddenly and dramatically. For example, you might be designing a simple survey, with a simple survey tool, when you realise you need to provide some more complex stimuli. If this goes beyond the survey tool’s capability, you might need to transition to an entirely different framework, which, although more capable, requires <em>a lot</em> more effort than what you were using before. The beauty of labjs is that when you run into a hurdle, such as your task no longer being easily amenable to the graphical builder, you can begin including code <em>just</em> for that part of the task. You don’t have to discard your experiment and start over. You can continue to use the graphical builder for <em>most</em> of the experiment, and only code the parts which need to be coded.</p>
<p>This is especially useful when working with less technically savvy colleagues. They can design the experiment with the graphical builder, and if they require a component to be coded, they can send me their experiment and I can add <em>just</em> the bit that they can’t do themselves. They still understand most of the experiment, and can continue to customise it without me – and I don’t need to take over the development and maintenance of their experiment myself (yay!).</p>
<p>So if you haven’t taken a look at labjs, I encourage you to do so.</p>["jonathon_love"]tl;dr In many areas, multiple data sets need to be combined before data can be analysed. An example of this is experimental data in the field of psychology, where a data file is produced for each participant. This blog post (actually a video), introduces multi-file import available in the 0.9.6 series of jamovi.Introducing GAMLj: GLM, LME and GZLMs in jamovi2018-11-13T13:00:00+11:002018-11-13T13:00:00+11:00https://blog.jamovi.org/2018/11/13/introducing-gamlj<h3 id="tldr">tl;dr</h3>
<ul>
<li>GAMLj is a jamovi module for <a href="https://en.wikipedia.org/wiki/General_linear_model">general linear models</a>, <a href="https://en.wikipedia.org/wiki/Mixed_model">linear mixed-effects models</a>, and <a href="https://en.wikipedia.org/wiki/Generalized_linear_model">generalized linear models</a></li>
<li>GAMLj makes these classes of models accessible to a much broader audience</li>
<li>Linear mixed-effects models make a great alternative to repeated measures ANOVA</li>
</ul>
<p><!--more--></p>
<p>One of the goals of jamovi is to make more sophisticated analyses accessible to a broader audience. A great example of this is the GAMLj module introduced here. If you’ve never used these models before, hopefully today I can convince you that with GAMLj they are within your reach, and that there are advantages to using these instead of more traditional analyses, such as repeated measures ANOVA.</p>
<p>For more technical readers, here is a feature list:</p>
<ul>
<li>OLS Regression (GLM)</li>
<li>OLS ANOVA (GLM)</li>
<li>OLS ANCOVA (GLM)</li>
<li>Random coefficients regression (Mixed)</li>
<li>Random coefficients ANOVA-ANCOVA (Mixed)</li>
<li>Logistic regression (GZLM)</li>
<li>Logistic ANOVA-like model (GZLM)</li>
<li>Probit regression (GZLM)</li>
<li>Probit ANOVA-like model (GZLM)</li>
<li>Multinomial regression (GZLM)</li>
<li>Multinomial ANOVA-like model (GZLM)</li>
<li>Poisson regression (GZLM)</li>
<li>Poisson ANOVA-like model (GZLM)</li>
<li>Overdispersed Poisson regression (GZLM)</li>
<li>Overdispersed Poisson ANOVA-like model (GZLM)</li>
<li>Negative binomial regression (GZLM)</li>
<li>Negative binomial ANOVA-like model (GZLM)</li>
<li>Continuous and categorical independent variables</li>
<li>Omnibus tests and parameter estimates</li>
<li>Confidence intervals</li>
<li>Simple slopes analysis</li>
<li>Simple effects</li>
<li>Post-hoc tests</li>
<li>Plots for up to three-way interactions for both categorical and continuous independent variables.</li>
<li>Automatic selection of the best estimation method and degrees of freedom</li>
<li>Type III estimation</li>
</ul>
<p>If that feature-list seems a bit overwhelming, it might be easier to think “GAMLj does a lot of stuff”. But don’t be put off: GAMLj is suitable for beginners and advanced users alike.</p>
<h2 id="repeated-measures-anova">Repeated Measures ANOVA</h2>
<p>So the repeated measures ANOVA is something of a staple of the social sciences; it’s one of the most used tests. Unfortunately, it has the drawback that it cannot handle missing values – if any of the subjects in your data set have missing values, they are <em>excluded completely</em>. It’s as if they never participated in your study. This can be a real problem with the smallish sample sizes common in the social sciences. Fortunately, linear mixed models provide a simple alternative to repeated measures ANOVA which is able to make use of these participants’ data. That means more power, less time spent collecting data, and better use of taxpayers’ money. What’s not to love about LMEs?!</p>
<p>So let’s begin with the Bugs data set provided with jamovi. Actually… the bad news is that the Bugs data set in jamovi is in <strong>wide-format</strong>. It looks like this:</p>
<p><img src="https://blog.jamovi.org/assets/images/gamlj1.png" alt="BUGS" style="width: 700px; max-width: 100%;" /></p>
<p>GAMLj requires the data be in <strong>long format</strong>. It needs to look like this:</p>
<p><img src="https://blog.jamovi.org/assets/images/gamlj2.png" alt="BUGS long" style="width: 700px; max-width: 100%;" /></p>
<p>As you can see, in wide-format, each <em>subject</em> is represented as a row, with a column for each measurement of the dependent variable. In long-format, each <em>measurement</em> of the dependent variable is represented as a row, with the rows for each participant tied together by a subject id.</p>
<p>In fact, the bad news keeps getting worse, because jamovi currently doesn’t provide a facility to convert data between wide format and long format (the jamovi developers assure me this is in the works). So if your data is in wide-format, you’ll have to convert it to long format in a different piece of software before importing into jamovi. I’ve done that for you here – <a href="https://blog.jamovi.org/assets/data/bugs_long.csv">the Bugs data set in long format</a>.</p>
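<p>If you’d rather do the reshape yourself, it is straightforward in most scripting languages (tidyr’s <code class="highlighter-rouge">pivot_longer</code> in R, or <code class="highlighter-rouge">pandas.melt</code> in Python, do this job on real data). Here is a minimal plain-Python sketch; the condition column names are made up for illustration:</p>

```python
# Wide format: one row per subject, one column per measurement.
wide = [
    {"Subject": 1, "LDLF": 6.0, "HDHF": 9.5},
    {"Subject": 2, "LDLF": 5.5, "HDHF": 10.0},
]

# Long format: one row per measurement, tied together by Subject.
long = [
    {"Subject": row["Subject"], "Condition": cond, "Rating": row[cond]}
    for row in wide
    for cond in ("LDLF", "HDHF")
]
```

<p>Two wide rows with two measurements each become four long rows, which is exactly the shape GAMLj expects.</p>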
<p>Hopefully the first thing you’ll notice is the missing value in row 6 – this participant would simply be excluded in a repeated measures ANOVA.</p>
<p>The Bugs data set (Ryan, Wilde, and Crist, 2013) contains ratings from people about how much they want to ‘get rid of’ a variety of bugs, each classified as being <em>high</em> or <em>low</em> on “Disgustingness” and <em>high</em> or <em>low</em> on “Frighteningness” (for example, a maggot is disgusting but not frightening, and a wasp is frightening, but not disgusting) – see the <a href="http://faculty.kutztown.edu/rryan/RESEARCH/PUBS/Ryan,%20Wilde,%20%26%20Crist%202013%20Web%20exp%20vs%20lab.pdf">paper</a> for details (it is a little bit amusing).</p>
<p>Now let’s analyse this data set with GAMLj. If you haven’t already, install GAMLj from the jamovi library. This is available to the top right of the Analyses tab. Once installed, a new ‘Linear Models’ entry appears alongside the other analyses. From this we can select ‘Linear Mixed Models’. In this example, <code class="highlighter-rouge">Rating</code> is our dependent variable, and <code class="highlighter-rouge">Disgust</code> and <code class="highlighter-rouge">Fright</code> are our factors; we can just drag these into place. However, we need to tell GAMLj which of these observations belong together, or belong to the same subject – we do this by specifying the ‘cluster’ variable, in this case <code class="highlighter-rouge">Subject</code>.</p>
<p>The final step is specifying the random coefficients (<strong>tl;dr</strong> specify <code class="highlighter-rouge">Fright | Subject</code> and <code class="highlighter-rouge">Disgust | Subject</code> as the random coefficients). To understand these, let’s take a step back, and pretend that this is just a between subjects ANOVA. Focusing on the main effects, the equation for that would just be:</p>
<div class="equation-container">$$\hat{rating_{ij}}=a+b_{F} \times Fright_{ij} +b_{D} \times Disgust_{ij}$$</div>
<p>where each participant, <script type="math/tex">j</script>, has several scores, <script type="math/tex">i</script>. In this equation, there’s a <em>single value</em> for the coefficient representing the effect of <code class="highlighter-rouge">Fright</code>eningness (<script type="math/tex">b_F</script>) and a <em>single value</em> for the coefficient representing the effect of <code class="highlighter-rouge">Disgust</code>ingness (<script type="math/tex">b_D</script>). That is, we’re assuming frighteningness and disgustingness have <em>the same</em> effect on each participant. However, in a mixed design, we allow for variability of the effect of Frighteningness and Disgustingness <em>between subjects</em>. Rather than the effect of frighteningness and disgustingness being modelled as a fixed value for all participants (a fixed effect), we want them to be modelled as a random draw from a distribution for each participant (a random effect).</p>
<p>Our equation becomes:</p>
<div class="equation-container">$$\hat{rating_{ij}}=a+b_{Fj} \times Fright_{ij}+\bar{b}_{F} \times Fright_{ij} +b_{Dj} \times Disgust_{ij}+\bar{b}_{D} \times Disgust_{ij}$$</div>
<p>We can see that the coefficients for the effect of Fright (<script type="math/tex">b_{Fj}</script>) and Disgust (<script type="math/tex">b_{Dj}</script>) vary across participants (there is one coefficient <script type="math/tex">b_{Fj}</script> and one <script type="math/tex">b_{Dj}</script> for each participant <script type="math/tex">j</script>). This captures the correlation between the repeated measures in the design. But we are still interested in the overall effects of Fright and Disgust, so the equation also features the fixed effects of the factors, <script type="math/tex">\bar{b}_{F}</script> and <script type="math/tex">\bar{b}_{D}</script>, which can be interpreted as the average effects of the factors, averaged across the random effects.</p>
<p>In GAMLj we specify this by assigning <code class="highlighter-rouge">Fright | Subject</code> and <code class="highlighter-rouge">Disgust | Subject</code> as random effects. These are effectively saying; ‘allow <code class="highlighter-rouge">Fright</code> to vary by subject’ and ‘allow <code class="highlighter-rouge">Disgust</code> to vary by subject’. Having specified this, all the options are specified, and our analysis runs. Let’s take a look at the results.</p>
<p><img src="https://blog.jamovi.org/assets/images/gamlj3.png" alt="BUGS" style="width: 500px; max-width: 100%;" /></p>
<p>Hopefully this table looks familiar – it looks like an ANOVA table, after all. These are the fixed effects, and we can interpret them as we would the effects of a classical ANOVA. From this we can see highly significant effects of <code class="highlighter-rouge">Fright</code> and <code class="highlighter-rouge">Disgust</code>, but a non-significant interaction.</p>
<p>We can compare these results with what we’d get with a repeated measures ANOVA:</p>
<p><img src="https://blog.jamovi.org/assets/images/gamlj4.png" alt="BUGS" style="width: 700px; max-width: 100%;" /></p>
<p>There’s not a huge difference here, but we can see that for the interaction effect the <em>p</em>-value is lower for the linear mixed effects model, which is likely a result of the linear mixed effects model being able to make use of those extra participants that the RM ANOVA was unable to use.</p>
<p>Let’s plot the means for each:</p>
<p><img src="https://blog.jamovi.org/assets/images/gamlj5.png" alt="BUGS" style="width: 500px; max-width: 100%;" /></p>
<p>It seems people do want to get rid of highly frightening insects and highly disgusting insects more than others.</p>
<p>But we can do more than just visualise the marginal means of <code class="highlighter-rouge">Fright</code> and <code class="highlighter-rouge">Disgust</code> — by checking the random effects box we get a plot of the estimated effect for each subject.</p>
<p><img src="https://blog.jamovi.org/assets/images/gamlj6.png" alt="BUGS" style="width: 500px; max-width: 100%;" /></p>
<p>So there we have it, an LME equivalent to repeated measures ANOVA. Hopefully you can see how approachable these models can be, and that this class of models can become a part of your statistical toolbox. For more examples of linear mixed models (and general linear models, and generalized linear models), head on over to the <a href="https://mcfanda.github.io/gamlj_docs/">GAMLj docs</a>.</p>
<p><a href="https://sites.google.com/brookes.ac.uk/learning-stats-with-jamovi">learning statistics with jamovi</a> (<code class="highlighter-rouge">lsj</code> for short) is a basic, introductory statistics textbook that presents most of the topics typically seen in an introductory psychology course at undergraduate level. It is completely free to download, use, and adapt — released under a creative commons CC BY-SA 4.0 licence. Although it is geared towards psychology, the content and material is also relevant to other disciplines, for example health sciences and public health. Download <code class="highlighter-rouge">lsj</code> over <a href="https://sites.google.com/brookes.ac.uk/learning-stats-with-jamovi">here</a>.</p>
<!--more-->
<p style="text-align: center;">
<a href="https://drive.google.com/file/d/1D4SMMhbsXCozI5KywyOSYpzMp-ygylEf/view" target="_blank" style="margin: 0px auto;">
<img src="https://blog.jamovi.org/assets/images/lsj_cover.png" alt="lsj cover" style="width: 320px; max-width: 100%; display: inline; height: auto; margin: 0px auto;" />
</a>
</p>
<p><code class="highlighter-rouge">lsj</code> covers: study design, descriptive statistics, data manipulation, basic plots, statistical inference, the theory of hypothesis testing, chi-square tests, t-tests, correlation, regression, and ANOVA. Throughout the text demonstration analyses are shown using jamovi.</p>
<p><code class="highlighter-rouge">lsj</code> is a fork/adaptation of the excellent <a href="https://compcogscisydney.org/learning-statistics-with-r/">Learning Statistics with R (LSR)</a> by Danielle Navarro. What’s really neat about LSR, and by extension <code class="highlighter-rouge">lsj</code>, is that there are quite a few topics covered in the text that are missed out of most introductory stats textbooks and courses, but that are important for a good initial understanding of statistics. That’s why LSR was chosen as the basis for <code class="highlighter-rouge">lsj</code> — for us it covers the stuff that we wish we had found out about when we were first taught stats. For example, the disagreement between Neyman and Fisher about hypothesis testing is mentioned, and there is a detailed explanation of the different types of sums of squares (Types I, II and III) that is key for understanding unbalanced factorial ANOVA. Moreover, the Bayesian / frequentist divide is included, and there is some explanation and demonstration of the Bayesian approach to analysis, as a counter to the fact that just about all the inferential statistics in the book are presented from an orthodox frequentist perspective (which still fits with the tradition and requirements of many undergraduate psychology courses).</p>
<p>Although there is a lot in <code class="highlighter-rouge">lsj</code>, any statistics textbook is undoubtedly incomplete; there is just too much to cover. Ours is no exception; it’s a work in progress. If you spot any mistakes, or want to suggest some improvements, then please log an issue on github: <a href="https://github.com/davidfoxcroft/jbook/issues">https://github.com/davidfoxcroft/jbook/issues</a>.</p>
<p>Plans for the near future include adding material on repeated measures ANOVA, reliability analysis, and factor analysis. But of course there is scope to add even more. If you would like to contribute some updates to the book, or a chapter, then please get in touch via the <a href="https://sites.google.com/brookes.ac.uk/learning-stats-with-jamovi">lsj website</a>. We’ll keep all contributions in a publicly available repository, linked from the website, and will incorporate new material into the book when we update to the next version.</p>["david_foxcroft"]tl;dr learning statistics with jamovi (lsj for short) is a basic, introductory statistics textbook that presents most of the topics typically seen in an introductory psychology course at undergraduate level. It is completely free to download, use, and adapt — released under a creative commons CC BY-SA 4.0 licence. Although it is geared towards psychology, the content and material is also relevant to other disciplines, for example health sciences and public health. Download lsj over here.Transforming and recoding variables in jamovi2018-10-23T13:00:00+11:002018-10-23T13:00:00+11:00https://blog.jamovi.org/2018/10/23/transforming-variables<h3 id="tldr">tl;dr</h3>
<p><a href="https://blog.jamovi.org/2017/11/28/jamovi-formulas.html">Computed variables</a> have been available in jamovi for a while now. Although great for a lot of operations (e.g., calculating sum scores, generating data, etc.), they can be a bit tedious to use when you want to recode or transform multiple variables (e.g., when reverse-scoring multiple responses in a survey data set). Today we’re introducing ‘Transformed variables’, allowing you to easily recode existing variables and apply transforms across many variables at once.</p>
<!--more-->
<p><div class="gif-player" src="https://blog.jamovi.org/assets/images/transform_overall.png" data-anim-src="https://blog.jamovi.org/assets/images/transform_overall.gif" data-static-src="https://blog.jamovi.org/assets/images/transform_overall.png" data-title="transform_overall" data-name="GIF: Transform main" style="width: 700px; max-width: 100%; padding-bottom: 68%;"></div></p>
<h2 id="creating-transformed-variables">Creating transformed variables</h2>
<p>When transforming or recoding variables in jamovi, a second ‘transformed variable’ is created for the original ‘source variable’. This way, you will always have access to the original, untransformed data if need be. To transform a variable, first select the column(s) you would like to transform. You can select a block of columns by clicking on the first column header in the block and then clicking on the last column header in the block while holding the shift key. Alternatively, you can select/deselect individual columns by clicking on the column headers while holding down the ctrl/cmd key. Once selected, there are two ways to create the transformed variables:</p>
<ul>
<li>Right-click on one of the selected variables, and click <code class="highlighter-rouge">Transform...</code>:
<img src="https://blog.jamovi.org/assets/images/transform_create1.png" alt="Transform Create 1" style="width: 400px; max-width: 100%;" /></li>
<li>Go to the <code class="highlighter-rouge">data</code> tab, and click <code class="highlighter-rouge">Transform</code>:
<img src="https://blog.jamovi.org/assets/images/transform_create2.png" alt="Transform Create 2" style="width: 400px; max-width: 100%;" /></li>
</ul>
<p>This constructs a second ‘transformed variable’ for each column that was selected. In the following example, we only had a single variable selected, so we’re only setting up the transform for one variable (called <code class="highlighter-rouge">score - log</code>), but there’s no reason we can’t do more in one go.</p>
<p><img src="https://blog.jamovi.org/assets/images/transform_create3.png" alt="Transform Create 3" style="width: 500px; max-width: 100%;" /></p>
<p>As can be seen in the figure above, each transformed variable has a ‘source variable’, representing the original untransformed variable, and a transform, representing rules to transform the source variable into the transformed variable. After a transform has been created, it’s available from the list and can be shared easily across multiple transformed variables.</p>
<p>If you don’t yet have the appropriate transform defined, you can select <code class="highlighter-rouge">Create new transform...</code> from the list.</p>
<h2 id="create-new-transformation">Create new transformation</h2>
<p>After clicking <code class="highlighter-rouge">Create new transform...</code> the transform editor slides into view:</p>
<p><img src="https://blog.jamovi.org/assets/images/transform1.png" alt="Transform 1" style="width: 600px; max-width: 100%;" /></p>
<p>Let’s take a look at each of these elements.</p>
<h4 id="1-name">1. Name</h4>
<p>The name for the transformation.</p>
<h4 id="2-description">2. Description</h4>
<p>Space for you to provide a description of the transformation so you (and others) know what it does.</p>
<h4 id="3-variable-suffix">3. Variable suffix</h4>
<p>Optional. Here, you can define the default name formatting for the transformed variable. By default, the variable suffix will be appended to the source variable name with a dash (<code class="highlighter-rouge">-</code>) in between. However, you can override this behavior by using an ellipsis (<code class="highlighter-rouge">...</code>), which will be replaced by the variable name. For instance, if you transform a variable called <code class="highlighter-rouge">Q1</code>, you could use variable suffixes to apply the following naming schemes:</p>
<ul>
<li><code class="highlighter-rouge">log</code> → <code class="highlighter-rouge">Q1 - log</code></li>
<li><code class="highlighter-rouge">..._log</code> → <code class="highlighter-rouge">Q1_log</code></li>
<li><code class="highlighter-rouge">log(...)</code> → <code class="highlighter-rouge">log(Q1)</code></li>
</ul>
<p>If left empty, the transformation name is used as the variable suffix.</p>
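<p>The naming rule above is simple enough to express in a few lines. Here is a plain-Python sketch of it (the function name is made up; this is not jamovi’s actual implementation):</p>

```python
def transformed_name(source, suffix):
    # An ellipsis in the suffix is replaced by the source variable's
    # name; otherwise the suffix is appended after ' - '.
    if "..." in suffix:
        return suffix.replace("...", source)
    return f"{source} - {suffix}"
```

<p>Applied to a source variable <code class="highlighter-rouge">Q1</code>, this reproduces the three naming schemes listed above.</p>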
<h4 id="4-transformation">4. Transformation</h4>
<p>This section contains the rules and formulas for the transformation. You can use all the same functions that are available in computed variables, and to refer to the values in the source column (so you can transform them), you can use the special <code class="highlighter-rouge">$source</code> keyword. If you want to recode a variable into multiple groups, it’s easiest to use multiple conditions. To add additional conditions (i.e., if-statements), you click on the ‘Add recode condition’ button:</p>
<p><div class="gif-player" src="https://blog.jamovi.org/assets/images/transform2.png" data-anim-src="https://blog.jamovi.org/assets/images/transform2.gif" data-static-src="https://blog.jamovi.org/assets/images/transform2.png" data-title="transform2" data-name="GIF: Transform 2" style="width: 600px; max-width: 100%; padding-bottom: 26%;"></div></p>
<h4 id="5-used-by">5. Used by</h4>
<p>Indicates how many variables are using this particular transformation. If you click on the number it will list these variables.</p>
<h4 id="6-measure-type">6. Measure type</h4>
<p>By default the measure type is set to <code class="highlighter-rouge">Auto</code>, which will infer the measure type automatically from the transformation. However, if <code class="highlighter-rouge">Auto</code> doesn’t infer the measure type correctly, you can override it over here.</p>
<h2 id="example-1-reverse-scoring-of-items">Example 1: Reverse scoring of items</h2>
<p>Survey data often contains one or more items whose values need to be reversed before analyzing them. For example, we might be measuring extraversion with the questions “I like to go to parties”, “I love being around people”, and “I prefer to keep to myself”. Clearly a person responding 6 (strongly agree) to this last question shouldn’t be considered an extravert, and so 6 should be treated as 1, 5 as 2, 1 as 6, etc. To reverse score these items, we can just use the following simple transform:</p>
<p><img src="https://blog.jamovi.org/assets/images/transform_ex1.png" alt="Transform 1" style="width: 600px; max-width: 100%;" /></p>
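<p>Outside jamovi, the same recode is one line of arithmetic: for a scale running from <code class="highlighter-rouge">lo</code> to <code class="highlighter-rouge">hi</code>, the reverse-scored value is <code class="highlighter-rouge">lo + hi - response</code>. A plain-Python sketch (using the 1-6 scale from the example above):</p>

```python
def reverse_score(response, lo=1, hi=6):
    # On a 1-6 scale this is 7 - response: 6 becomes 1, 5 becomes 2, etc.
    return lo + hi - response
```
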
<p>You can explore this transform by downloading and opening the file <a href="https://blog.jamovi.org/assets/data/transform_ex1.omv">transform_ex1.omv</a> in jamovi.</p>
<h2 id="example-2-recoding-continuous-variables-into-categories">Example 2: Recoding continuous variables into categories</h2>
<p>In a lot of data sets people want to recode their continuous scores into categories. For example, we may want to classify people, based on their 0-100% test scores into one of three groups, <code class="highlighter-rouge">Pass</code>, <code class="highlighter-rouge">Resit</code> and <code class="highlighter-rouge">Fail</code>.</p>
<p><img src="https://blog.jamovi.org/assets/images/transform_ex2.png" alt="Transform 2" style="width: 600px; max-width: 100%;" /></p>
<p>Note that the conditions are executed in order, and that only the first rule that matches the case is applied to that case. So this transformation basically says that if the source variable has a value below 50, the value will be <code class="highlighter-rouge">Fail</code>, if the source variable has a value between 50 and 60, the value will be <code class="highlighter-rouge">Resit</code>, and if the source variable has a value above 60, the value will be <code class="highlighter-rouge">Pass</code>. If you’d like an example data set to play around with, you can use <a href="https://blog.jamovi.org/assets/data/transform_ex2.omv">transform_ex2.omv</a>.</p>
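<p>The first-matching-rule behaviour is the same as an ordered if/elif chain in a programming language. A plain-Python sketch of the example above (thresholds as stated: below 50 is <code class="highlighter-rouge">Fail</code>, 50 to 60 is <code class="highlighter-rouge">Resit</code>, above 60 is <code class="highlighter-rouge">Pass</code>):</p>

```python
def classify(score):
    # Conditions are checked in order; the first match wins, so the
    # later rules only see scores the earlier rules didn't catch.
    if score < 50:
        return "Fail"
    if score <= 60:
        return "Resit"
    return "Pass"
```
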
<h2 id="example-3-replacing-missing-values">Example 3: Replacing missing values</h2>
<p>Now, let’s say your data set has a lot of missing values, and removing the participants with missing values would result in a severe loss of data. There are a number of ways to deal with missing data, of which <a href="https://en.wikipedia.org/wiki/Imputation_(statistics)">imputation</a> is quite common. One pretty straightforward imputation method replaces the missing values with the variable mean (i.e., <a href="https://en.wikipedia.org/wiki/Imputation_(statistics)#Mean_substitution">mean substitution</a>). Although there are a bunch of problems associated with mean substitution and you should probably never do it, it does make for a neat demonstration :P</p>
<p><img src="https://blog.jamovi.org/assets/images/transform_ex3.png" alt="Transform 3" style="width: 600px; max-width: 100%;" /></p>
<p>Note that jamovi has borrowed <code class="highlighter-rouge">NA</code> from R to denote missing values. Don’t have a good data set handy? You can try it out yourself with the <a href="https://blog.jamovi.org/assets/data/transform_ex3.omv">transform_ex3.omv</a> data set.</p>
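<p>Mean substitution itself is a tiny operation; here is a plain-Python sketch, with <code class="highlighter-rouge">None</code> standing in for jamovi’s <code class="highlighter-rouge">NA</code>:</p>

```python
from statistics import mean

def mean_substitute(values):
    # Replace each missing value with the mean of the observed values.
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]
```
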
<h2 id="conclusion">Conclusion</h2>
<p>Transformed variables are a great tool for transforming and recoding data, and solve a lot of different data manipulation problems. For us, the jamovi developers, transformed variables represent a really significant milestone: jamovi is now able to service the majority of social scientists’ data-wrangling needs. jamovi has become far more than an educational tool, and can increasingly hold its own alongside the giants in the field (SPSS et al.).</p>
<p>Transformed variables are available from <a href="https://www.jamovi.org/download.html">jamovi 0.9.5 upwards</a>. We hope you will enjoy it :)</p>["ravi_selker"]tl;dr Computed variables have been available in jamovi for a while now. Although great for a lot of operations (e.g., calculating sum scores, generating data, etc.), they can be a bit tedious to use when you want to recode or transform multiple variables (e.g., when reverse-scoring multiple responses in a survey data set). Today we’re introducing ‘Transformed variables’, allowing you to easily recode existing variables and apply transforms across many variables at once.jamovi module development workshop – 31st October 20182018-09-24T12:00:00+10:002018-09-24T12:00:00+10:00https://blog.jamovi.org/2018/09/24/module-development-workshop<h3 id="tldr">tl;dr</h3>
<ul>
<li>Come join us for our first jamovi module development workshop at <a href="https://www.mq.edu.au/">Macquarie University, Sydney Australia</a>. This workshop will take place on October 31st, 2018 and is free of charge. Please <a href="https://www.ccd.edu.au/events/conferences/2018/jamovi/index.html">register over here</a> if you want to participate.</li>
</ul>
<!--more-->
<p>One of the core values of jamovi is to decentralise statistical methods as much as possible, and to empower anyone (irrespective of statistical philosophy) to publish graphical, accessible analyses for anyone to use. For this reason, <a href="https://www.jamovi.org/library.html">the jamovi library</a> is probably our favourite accomplishment; a community driven collection of graphical analyses for the masses.</p>
<p>To develop analyses, a collection of tutorials are available from our ‘developer hub’ at <a href="https://dev.jamovi.org">dev.jamovi.org</a>. However, some people prefer to
learn ‘in person’, with experts available to field any questions that arise. For this reason, we’re running the first jamovi module development workshop on the 31st of October, 2018, at <a href="https://www.mq.edu.au/">Macquarie University, Sydney Australia</a>.</p>
<p>The workshop is free of charge. Please <a href="https://www.ccd.edu.au/events/conferences/2018/jamovi/index.html">register over here</a> if you want to attend the workshop.</p>
<h3 id="program">Program</h3>
<ul>
<li><strong>11:00 - 11:15</strong> Welcome/Registration</li>
<li><strong>11:15 - 13:00</strong> Session 1:
<ul>
<li>Overview</li>
<li>What makes a good analysis?</li>
<li>‘Goal centric’ user interface design</li>
<li>Getting started with <code class="highlighter-rouge">jmvtools</code></li>
<li>Time for self-directed projects</li>
</ul>
</li>
<li><strong>13:00 - 14:00</strong> Lunch (not included)</li>
<li><strong>14:00 - 17:00</strong> Session 2:
<ul>
<li>Implementing plots</li>
<li>Implementing user interfaces</li>
<li>Time for self-directed projects</li>
</ul>
</li>
</ul>
<h2 id="details">Details</h2>
<p>This workshop works through the process of developing a module from scratch, and touches on analysis design, user interface design, and effective use of ‘state’. The workshop will have some structured talks, but most of the time will be available for people to work on their own module at their own pace, with the workshop instructors available to field any questions that arise.</p>
<p>People are encouraged to come up with an idea for a jamovi module that they would like to implement, that they can develop during the workshop (even if
they don’t intend on publishing it). Modules can extend jamovi in a number of ways - providing analyses, plots, test selection, or you could even make a little game.</p>
<p>Although not crucial, people will get the most out of this workshop if they already have an understanding of developing R packages. If this isn’t an area you have experience in, we’d encourage you to work through some online resources ahead of the workshop, such as <a href="http://r-pkgs.had.co.nz/">r-pkgs.had.co.nz/</a>.</p>
<p>We encourage people who are very keen to work through the jamovi tutorials at <a href="https://dev.jamovi.org">dev.jamovi.org</a> before the workshop.</p>
<h2 id="on-site-workshops">On-site workshops</h2>
<p>Interested in a module development workshop at your institution? Contact us at <code class="highlighter-rouge">contact@jamovi.org</code> for any questions/requests. We have instructors based in Australia and the Netherlands.</p>["jonathon_love"]tl;dr Come join us for our first jamovi module development workshop at Macquarie University, Sydney Australia. This workshop will take place on October 31st, 2018 and is free of charge. Please register over here if you want to participate.