What sort of statistical training is needed for basic scientists? (March
29, 2008).
Someone wrote to a mailing list sponsored by the American Statistical
Association asking about what resources to use in a statistics class aimed at
basic scientists (as opposed to public health students and clinical
scientists). I offered a few general recommendations.
Robin Penslar's book on Research Ethics should be incorporated somewhere
into any researcher's curriculum. I would think that a class on Statistics is
as good a place as any, as many problems with ethics involve data fudging,
violation of statistical protocols, and carelessness with private data. These
are issues that Statisticians can and should speak about.
There is also a wealth of data available on the Internet these days that
students can review to identify projects. I would encourage the use of a
large data set, such as one from genomics. There is a fascinating data set looking at
microarray expression levels of 19 different human tissue types in 30
different individuals. The basic setup is described at
and the article has a link to supplemental research data that includes the
full data set in a text file that is surprisingly easy to manipulate.
There are other interesting data sources like this. It might be worthwhile
to ask students to provide a simple analysis of a subset of a very
large data set like this one. The days of having to live with small toy data
sets are over.
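As a rough illustration of what "a simple analysis of a subset" could look like, here is a minimal Python sketch that streams a tab-delimited text file and averages the expression values for one tissue over a limited number of genes. The file layout is an assumption made for the example (first column a gene ID, every other column labeled "tissue_individual"), not the actual format of the supplemental data, and the function name tissue_means and the demo values are hypothetical.

```python
import csv
import io
from statistics import mean


def tissue_means(handle, tissue, max_genes=1000):
    """Return {gene_id: mean expression} for one tissue.

    Assumed (hypothetical) layout: tab-delimited text, first column is a
    gene ID, every other column header looks like "<tissue>_<individual>".
    Rows are read one at a time, so the whole file never sits in memory.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Keep only the columns that belong to the requested tissue.
    cols = [i for i, name in enumerate(header)
            if i > 0 and name.split("_")[0] == tissue]
    result = {}
    for row in reader:
        if len(result) >= max_genes:
            break  # stop early: we only want a subset of the genes
        result[row[0]] = mean(float(row[i]) for i in cols)
    return result


# Tiny made-up table in the assumed layout (not real data).
demo = io.StringIO(
    "gene\tliver_1\tliver_2\tkidney_1\n"
    "G1\t2.0\t4.0\t9.0\n"
    "G2\t1.0\t3.0\t5.0\n"
)
demo_means = tissue_means(demo, "liver")
```

With a real file you would pass an open file handle instead of the in-memory demo table. Capping the number of genes keeps a first classroom exercise manageable.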
Anyone working in a laboratory should be familiar with the basic tools of
quality control including control charts, fishbone diagrams, and Pareto
charts. If you are really ambitious, you might consider screening designs as
well.
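To make the control chart idea concrete, here is a minimal Python sketch of an individuals (X) chart, one common type of control chart. It estimates the process standard deviation from the average moving range divided by 1.128 (the usual constant for ranges of two consecutive points) and sets limits at three standard deviations from the mean. The function names and the measurements are made up for illustration.

```python
from statistics import mean


def individuals_chart_limits(baseline):
    """Return (lower, center, upper) limits for an individuals control chart."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline measurements")
    center = mean(baseline)
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    # 1.128 is the d2 constant for ranges based on two observations.
    sigma_hat = mean(moving_ranges) / 1.128
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


def out_of_control(new_values, baseline):
    """Return the positions of new values that fall outside the limits."""
    lower, _, upper = individuals_chart_limits(baseline)
    return [i for i, v in enumerate(new_values) if v < lower or v > upper]


# Made-up lab measurements, for illustration only.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
flagged = out_of_control([10.1, 11.0, 9.9], baseline)
```

In this made-up example the limits come out at roughly 9.36 and 10.64, so only the second new measurement (11.0) is flagged.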
As a general rule, a class for basic scientists should place more emphasis on randomized
designs, especially block designs and multifactorial designs. It should place
less emphasis on Epidemiology topics, such as case-control designs and risk
adjustment models. Of course, you can't totally ignore Epi.
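For the randomized designs mentioned above, a randomized complete block design is easy to sketch in Python: every treatment appears once in each block, and the order of treatments is shuffled independently within each block. The function name, the treatment labels, and the block labels below are hypothetical and chosen only for illustration.

```python
import random


def randomized_block_design(treatments, blocks, seed=None):
    """Return {block: treatments in random order} for a complete block design.

    Each treatment appears exactly once per block. The order within each
    block is shuffled independently. Passing a seed makes the allocation
    reproducible, which helps when the design has to be documented.
    """
    rng = random.Random(seed)
    design = {}
    for block in blocks:
        order = list(treatments)  # copy so the caller's list is untouched
        rng.shuffle(order)
        design[block] = order
    return design


# Hypothetical example: three treatments, blocked by day of the experiment.
design = randomized_block_design(
    ["control", "low dose", "high dose"],
    ["day 1", "day 2", "day 3", "day 4"],
    seed=2008,
)
```

Blocking on something like day or batch keeps that source of variation from being confused with the treatment effect.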
2008-07-14. Category: Teaching resources