Once you have generated the target bar charts, you leave the Analysis wizard and return to the regular qbase+ interface. Suppose that you want to perform a statistical test to check whether the difference in expression that you see in the target chart is significant. At some point, qbase+ will ask you whether your data come from a normal distribution. If you don’t know, you can select I don’t know and qbase+ will assume the data do not come from a normal distribution and perform a stringent non-parametric test. However, when you have 7 or more replicates per group, you can check whether the data are normally distributed using a statistical test. If they are, qbase+ will perform a regular t-test. The upside is that the t-test is less stringent (it has more statistical power) than the non-parametric tests and will find more differentially expressed (DE) genes. However, you may only perform it on normally distributed data. If you perform the t-test on data that are not normally distributed, you risk generating false positives, i.e. qbase+ will say that genes are DE while in fact they are not. Performing a non-parametric test on normally distributed data risks generating false negatives, i.e. you will miss DE genes.
Checking if the data is normally distributed can be easily done in GraphPad Prism. To this end you have to export the data.
To export the results, click the upward pointing arrow in the qbase+ toolbar. You want to export the normalized data, so select Export Result Table (CNRQ). You will be given the choice to export the results only (CNRQs) or to include the errors (standard error of the mean) as well. We don’t need the errors in Prism so we do not select this option. The scale of the Result table can be linear or logarithmic (base 10). Without user intervention, qbase+ will automatically log10 transform the CNRQs prior to doing statistics, so we need to check in Prism whether the log transformed data are normally distributed. Additionally, you need to tell qbase+ where to store the file containing the exported data. Click the Browse button for this.
Exporting will generate an Excel file in the location that you specified. However, the file contains the results for all samples and we need to check the two groups (treated and untreated) separately. The sample properties show that the even samples belong to the treated group and the odd samples to the untreated group. This means we have to generate two files, one for the untreated samples and one for the treated samples.
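As a rough sketch of that split, the snippet below separates the values of one target into an untreated (odd sample numbers) and a treated (even sample numbers) group and log10 transforms them. The function name, the dictionary layout and the CNRQ values are made up for illustration, and the sketch assumes the table was exported on the linear scale; if you exported on the logarithmic scale, skip the transformation.

```python
import math

def split_groups(cnrq_by_sample):
    """Split linear-scale CNRQs into untreated (odd) and treated (even) groups.

    cnrq_by_sample maps a sample number to its CNRQ on the linear scale
    (hypothetical layout, not the actual qbase+ export format).
    Returns two lists of log10 transformed values, ordered by sample number.
    """
    untreated, treated = [], []
    for sample_number in sorted(cnrq_by_sample):
        value = math.log10(cnrq_by_sample[sample_number])
        if sample_number % 2 == 0:
            treated.append(value)    # even samples: treated group
        else:
            untreated.append(value)  # odd samples: untreated group
    return untreated, treated

# Made-up CNRQs for one target, keyed by sample number (illustration only).
example = {1: 1.0, 2: 10.0, 3: 0.1, 4: 100.0}
untreated, treated = split_groups(example)
print("untreated:", untreated)
print("treated:  ", treated)
```

Each of the two lists corresponds to one column of one of the two files that are loaded into Prism in the next steps.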
Now we can open these files in Prism to check if the data is normally distributed.
Click File in the top menu
Select New Project File
Select Enter replicate values, stacked into columns (this is normally the default selection) since the replicates (measurements for the same gene) are stacked in the columns.
Prism has now created a table to hold the data of the untreated samples but at this point the table is still empty. To load the data:
Click File in the top menu
In the Source tab select Insert data only
In the Filter tab specify the rows you want to import (the last rows are those of the standard and the water samples; you don’t want to include them)
As the file is opened in Prism you see that the first column containing the sample names is treated as a data column. Right click the header of the first column and select
Click the Analyze button in the top menu
Select the Column statistics analysis in the Column analyses section of the left menu
Deselect Flexible. It’s a bad reference gene, so you will not include it in the qbase+ analysis and there’s no point in checking its normality (it is probably not normally distributed). Likewise, you could also deselect the other two reference genes, since you will do the DE test on the target genes and not on the reference genes.
In the Descriptive statistics and the Confidence intervals section deselect everything except Mean, SD, SEM. These statistics are not what we are interested in: we want to know whether the data come from a normal distribution. The only reason we select Mean, SD, SEM is that Prism throws an error if we make no selection here.
In the Test if the values come from a Gaussian distribution section select the D’Agostino-Pearson omnibus test to test whether the data are drawn from a normal distribution. Although Prism offers three tests for this, the D’Agostino-Pearson test is the safest option.
Prism now generates a table to hold the results of the statistical analysis. As you can see, the data for Palm are not normally distributed.
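To make clearer what the D’Agostino-Pearson omnibus test measures, here is a minimal sketch in plain Python. It combines a z-score for skewness and a z-score for kurtosis into one statistic K2, which is compared to a chi-square distribution with 2 degrees of freedom. The formulas follow the commonly published approximations (the same test is available as scipy.stats.normaltest in SciPy); Prism’s own implementation may differ in detail, so do not expect identical numbers. The function name, the minimum of 8 values enforced here and the example data are assumptions made for illustration.

```python
import math

def dagostino_pearson(values):
    """Return (K2, p) of a D'Agostino-Pearson omnibus normality test (sketch)."""
    n = len(values)
    if n < 8:
        raise ValueError("this sketch needs at least 8 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5   # sample skewness
    kurt = m4 / m2 ** 2     # sample kurtosis (3 for a normal distribution)

    # z-score for skewness (D'Agostino's transformation).
    y = skew * math.sqrt((n + 1) * (n + 3) / (6.0 * (n - 2)))
    beta2 = (3.0 * (n * n + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1.0 + math.sqrt(2.0 * (beta2 - 1.0))
    delta = 1.0 / math.sqrt(0.5 * math.log(w2))
    alpha = math.sqrt(2.0 / (w2 - 1.0))
    z_skew = delta * math.asinh(y / alpha)

    # z-score for kurtosis (Anscombe-Glynn transformation).
    expected = 3.0 * (n - 1) / (n + 1)
    variance = (24.0 * n * (n - 2) * (n - 3)
                / ((n + 1) ** 2 * (n + 3) * (n + 5)))
    x = (kurt - expected) / math.sqrt(variance)
    sqrt_beta1 = (6.0 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
                  * math.sqrt(6.0 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6.0 + 8.0 / sqrt_beta1 * (2.0 / sqrt_beta1
                                  + math.sqrt(1.0 + 4.0 / sqrt_beta1 ** 2))
    term1 = 1.0 - 2.0 / (9.0 * a)
    denom = 1.0 + x * math.sqrt(2.0 / (a - 4.0))
    term2 = math.copysign(abs((1.0 - 2.0 / a) / denom) ** (1.0 / 3.0), denom)
    z_kurt = (term1 - term2) / math.sqrt(2.0 / (9.0 * a))

    k2 = z_skew ** 2 + z_kurt ** 2
    p_value = math.exp(-k2 / 2.0)  # chi-square survival function for 2 df
    return k2, p_value

# Made-up log10 values for two hypothetical targets (8 replicates each).
roughly_symmetric = [-1.4, -0.8, -0.45, -0.15, 0.15, 0.45, 0.8, 1.4]
one_outlier = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 9.0]
for name, data in [("symmetric", roughly_symmetric), ("outlier", one_outlier)]:
    k2, p = dagostino_pearson(data)
    print(name, "K2 = %.3f" % k2, "p = %.4f" % p)
```

A p-value below 0.05 is read as evidence that the values do not come from a normal distribution, which is how the Prism result for Palm is interpreted above.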
Since we found that there’s one group of data that does not follow a normal distribution, it’s no longer necessary to check if the treated data are normally distributed but you can do it if you want to. We will now proceed with the statistical analysis in qbase+. Statistical analyses can be performed via the Statistics wizard.
You can open it in the Project Explorer (window at the left):
Expand Project1 if it’s not yet expanded
Expand the Experiments folder in the project if it’s not yet expanded
Expand the GeneExpression experiment if it’s not yet expanded
Expand the Analysis section if it’s not yet expanded
This opens the Statistics wizard that allows you to perform various kinds of statistical analyses.
Goal page: Select Mean comparison, since you want to compare the mean expression of each gene in the treated samples with its mean expression in the untreated samples. Click Next.
Groups page: Specify how to define the two groups of samples that you want to compare. Select Treatment as the grouping variable to compare treated and untreated samples. Click Next.
Targets page: Specify for which targets of interest you want to do the test. Deselect Flexible since you do not want to include it in the analysis. It’s just a bad reference gene. Click Next.
Settings page: Here you describe the characteristics of your data set so that qbase+ can choose the appropriate test for your data. The first thing you need to tell qbase+ is whether the data were drawn from a normal or a non-normal distribution. Since we have 8 biological replicates per group, we could check this in Prism as described above. For our data set we can use the default settings. Click Next.
In the results table you can see that the p-value for Palm is below 0.05, so Palm is differentially expressed.
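The text above does not say which non-parametric test qbase+ runs. As an illustration of how such a test compares two groups without assuming normality, the sketch below computes an exact two-sided p-value for a rank-sum (Mann-Whitney type) comparison by enumerating every possible assignment of the pooled ranks to the first group. The choice of test, the function names and the data are assumptions for illustration, and any multiple testing correction that qbase+ may apply is not included.

```python
from itertools import combinations

def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def exact_rank_sum_test(group_a, group_b):
    """Two-sided exact p-value for the rank sum of group_a (sketch)."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    n = len(ranks)
    expected = n_a * (n + 1) / 2.0  # rank sum expected if groups do not differ
    observed = abs(sum(ranks[:n_a]) - expected)
    extreme = total = 0
    # Enumerate every way to pick n_a of the pooled ranks for group_a.
    for subset in combinations(ranks, n_a):
        total += 1
        if abs(sum(subset) - expected) >= observed - 1e-9:
            extreme += 1
    return extreme / total

# Made-up log10 CNRQs for one target, 8 replicates per group.
untreated = [-0.31, -0.22, -0.18, -0.05, 0.02, 0.07, 0.11, 0.19]
treated = [0.25, 0.33, 0.41, 0.48, 0.52, 0.60, 0.74, 0.95]
print("p =", exact_rank_sum_test(untreated, treated))
```

With 8 replicates per group there are 12870 possible assignments, so the full enumeration is fast; for much larger groups a normal approximation would be the usual choice.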