Stats: Watch out for ambiguous
data (February 14, 2007). Someone brought me a data set with some
interesting values. It serves as a good example about why you need to
carefully review simple descriptive statistics before you plunge into a
complex analysis.
Stats: Auditing for data entry
errors (June 20, 2006). There was an interesting query on the MedStats
list about the appropriate sample size for an audit. This person had entered
1,500 records and wanted to check a sample of those records for data entry
errors. There was not enough time to perform double entry or to check 100% of
the records. So how many records should be checked?
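The summary above does not state the answer. One common way to frame the question, offered only as a sketch and not as the recommendation from the MedStats discussion, is to ask how many records must be checked so that an error rate at or above some tolerable threshold would almost surely produce at least one detected error. The 1% threshold and 95% confidence level are assumptions, and the calculation treats the sample as binomial, ignoring the finite population of 1,500 records.

```python
import math

def audit_sample_size(max_error_rate, confidence=0.95):
    """Smallest n such that, if the true error rate were max_error_rate,
    the chance of finding zero errors in n records is below 1 - confidence."""
    # P(no errors in n records) = (1 - max_error_rate) ** n
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_error_rate))

# Hypothetical tolerance: an error rate of 1% or more should not go unnoticed.
n = audit_sample_size(max_error_rate=0.01)
```

If the audited sample turns up no errors, that supports the claim that the error rate is below the chosen threshold. If it does turn up errors, a wider check is probably needed.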
Stats: Another regular
expression tip (May 23, 2006). I had a large text file and I had to find
the first example of a line that did NOT begin with the letter A. That's
easier said than done, but you can use some special symbols in regular
expressions to do this.
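The pattern used in the original entry is not shown here, so the following is only a minimal sketch of one way to do it in Python. The sample text is hypothetical. The caret anchors the match at the start of a line, and the negated character class `[^A\n]` matches any first character other than "A" (the newline is excluded so that an empty line cannot match by swallowing its own line break).

```python
import re

# Hypothetical sample text; the original file is not reproduced in the entry.
text = "Apple\nAvocado\nBanana\nApricot\ncherry\n"

# With re.MULTILINE, ^ matches at the start of every line.
pattern = re.compile(r"^[^A\n].*$", re.MULTILINE)

match = pattern.search(text)
first_line = match.group(0) if match else None
# Count the line breaks before the match to get a 1-based line number.
line_number = text.count("\n", 0, match.start()) + 1 if match else None
```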
Stats: Lost files (May 23, 2006).
I work on these web pages from my desktop computer and two different laptop
computers. I also have an Administrative Assistant who will sometimes update
my web pages from her computer. In the middle of all of this, I ended up
copying an old file on top of a new file and lost several weblog entries.
With a bit of effort, I did find them in a backup zip file that I had made
last week.
Stats: Using regular
expressions to insert line breaks (May 18, 2006). I had to change a file
written in XML format. The file was pretty easy to manipulate except that it
had no line breaks in it. It was a single line of text with a length of
46,592 characters! That meant that I needed to be constantly scrolling left
and right. I thought to myself that it would be a whole lot easier to
manipulate this file if there were some line breaks. XML doesn't care if you
put in a few line breaks or if you use indenting or a variety of other things
that might make the file easier to read. You can insert line breaks fairly
easily using regular expressions, if you know what you are doing.
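As a rough illustration, and not necessarily the approach taken in the original entry, the sketch below inserts a line break wherever one tag ends and the next begins. The XML string is a hypothetical stand-in for the real file. This assumes no tag boundaries occur inside content where whitespace matters (for example, CDATA sections).

```python
import re

# Hypothetical single-line XML; the real file was 46,592 characters long.
xml = '<root><item id="1">a</item><item id="2">b</item></root>'

# Put a newline between every adjacent pair of tags.
with_breaks = re.sub(r"><", ">\n<", xml)

lines = with_breaks.split("\n")
```

Removing the inserted newlines gives back the original string, so the change is easy to undo.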
Stats: More lessons learned the
hard way (January 31, 2006). The more I do, the more I realize how little
I have thought about how to properly conduct a statistical analysis. One
lesson I thought I had learned was that it costs next to nothing to store
information electronically, but it can often save you a lot of time. But
recently, I have relearned the value of this lesson.
Stats: Hard learned lessons
(November 25, 2005). It's been a busy month, as noted below, and in a
rush to complete all my projects, I ended up doing some things that may have
caused a few problems (nothing permanent, of course, but they did end up delaying
further some projects that were already behind schedule). I alluded to a bit
of this in my weblog entry, Non-destructive data editing (November 2, 2005),
but I have a few more lessons worth mentioning.
Stats: Non-destructive
data editing (November 2, 2005). I recently worked on a project looking
at patients having two different types of operations, with and without collar
sutures. The data set that the researchers sent to me had some
inconsistencies, though.
Stats: Another disaster
averted (August 16, 2005). When you are importing a file from one system
to another, lots of little things can trip you up. Here's an example, and it
shows a very subtle problem.
Stats: Moving R objects (July
28, 2005). I regularly work from home on my laptop, and when I need to
re-run some analyses in R, I usually just re-create the original data sets.
But there are several ways you can transfer objects from one R system to
another.
Stats: Merging in R (July 26, 2005).
Dear Professor Mean, I get a strange error message when I try to merge
two files in SPSS. What is going on? -- Computing Cheryl
Stats: More on regular
expressions (July 21, 2005). As I work more and more with microarrays,
I increasingly realize that a knowledge of regular expressions will help.
For example, I had a comma separated file (.CSV) and it had an extra comma at
the end of every line. I wanted to remove those commas, but not any of the
others.
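A minimal sketch of the idea in Python, using hypothetical sample data: the dollar sign anchors the match at the end of a line, so only a comma immediately before a line break (or the end of the file) is removed. Files with Windows-style line endings would need the carriage return handled as well, which this sketch does not attempt.

```python
import re

# Hypothetical CSV content with an extra comma at the end of every line.
csv_text = "gene,sample1,sample2,\nabc,1.5,2.5,\nxyz,3.0,4.0,\n"

# With re.MULTILINE, $ matches at the end of each line, not just the file.
cleaned = re.sub(r",$", "", csv_text, flags=re.MULTILINE)
```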
Stats: Dumping data from R to a
text file (June 27, 2005). In the prenatal liver study, I needed to give
some of the normalized gene expression levels to a researcher in a form he
could use. The data he needed was in a data frame with 94 rows and 16 columns
(folate.signal). But unfortunately, the names of the rows (gene.symbol) and
columns (liver.names) were stored in separate objects. Here's one way to
match the values back up.
Stats: Importing value labels from
Access into SPSS (May 24, 2005). Someone asked about importing data from
Access into SPSS. The Access file has value labels (e.g., 1=Male, 2=Female,
3=Missing), and this person wanted to know if there was any way to get this
information into SPSS.
Stats: A disaster averted (May
16, 2005). I'm working on a microarray experiment of prenatal liver
samples. When I was trying to normalize the data, I noticed that three of the
arrays had rather unusual properties.
Stats: String manipulations in R (May
10, 2005). As part of my efforts to analyze microarray data, I am finding
that I need to do simple string manipulations in R. Here is a list of
functions that might help.
Stats: Digitizing a graph
(March 15, 2005). Someone brought me a graph with a trend line relating
body surface area (BSA) to various cardiac measurements. This graph showed
both the trend line and limits at +/-2 standard deviations and +/-3 standard
deviations. She asked if I could write a program based on that graph that
would allow her to input a patient's BSA and cardiac measures and get a
Z-score in return.
Stats: Merging files in SPSS (January 15,
2004). Dear Professor Mean, I get a strange error message when I try
to merge two files in SPSS. What is going on? -- Computing Cheryl
Stats: Coding race/ethnicity
(February 3, 2003). If you have to collect data on the race and/or
ethnicity of your research subjects, you should be aware of the official U.S.
government definitions that all federal agencies have to follow. You don't
necessarily have to follow these guidelines, but they do offer up a way to
code your data that is reasonably standardized.
Stats: Longitudinal data (July 26, 2002).
Dear Professor Mean, I have longitudinal
data on the growth pattern of patients given growth hormone. How should I
store the data? --Jittery Jerry
Stats: Loading ODBC drivers from the Microsoft
Data Access Pack (January 24, 2001). Here are excerpts from some emails
posted to the SPSSX-L listserver on September 10-11, 2000. These emails
describe how to load special drivers for ODBC, especially the driver for
Access 97.
Stats: Exporting SPSS graphs and tables
(January 28, 2000). Dear Professor Mean, I need to export the output
from SPSS and use some of it in my word processing file. What is the best way
to do this? -- Manic Marsha
Stats: Spreadsheet or database (January 28,
2000). Dear Professor Mean, I am not sure
whether I should use a database or a spreadsheet to enter my data.
Stats: General guide to data entry (September
3, 1999). Dear Professor Mean, I'm about to start typing in my
research data. Do you have any general guidelines for data entry?
Stats: Importing spreadsheet data into SPSS
(August 20, 1999). Dear Professor Mean,
I need to import data in an Excel spreadsheet, but I can't get SPSS to
read this data properly. Can you help? -- Stumped Stan
Stats: Date calculations in SPSS (August 18,
1999). Dear Professor Mean, I am trying to use dates in SPSS for
certain calculations. For example, I want to use a compute statement in SPSS
to create a new variable called duration of injury (durinj). I know that I
must subtract the date of injury from the date of interview. However, when I
do this, I get a number in the millions. What am I doing wrong? -- Stumped
Sharon
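The summary stops before the answer. SPSS stores date values as a count of seconds, so subtracting two dates gives a difference in seconds, which is the likely source of a number in the millions. The sketch below reproduces the arithmetic in Python rather than SPSS syntax. The two dates are hypothetical, and `durinj_days` is an illustrative name modeled on the variable in the question.

```python
from datetime import date

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Hypothetical dates for one subject.
injury = date(1999, 1, 15)
interview = date(1999, 3, 1)

# The raw difference in seconds is what a plain subtraction of
# seconds-based date values would return.
seconds_between = (interview - injury).total_seconds()

# Dividing by the number of seconds in a day converts it to days.
durinj_days = seconds_between / SECONDS_PER_DAY
```

Here a gap of only 45 days already comes out as 3,888,000 seconds.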
Stats: Documenting your SPSS data sets
(August 18, 1999). Dear Professor Mean, I need to add some
documentation for SPSS data sets that I am creating. I know you covered this
in your "Gentle Introduction to SPSS" class, but I've already forgotten
everything. Can you review this for me? -- Baffled Bill
Stats: Importing database files into SPSS
(August 18, 1999). Dear Professor Mean, How do I import database
files into SPSS? I don't want to re-type everything, because there are 70,000
records. The data are stored in a Microsoft Access file. -- Vexed Vidya
Stats: Inputting a two-by-two table into SPSS
(August 18, 1999). Dear Professor Mean, I have data in a two by two
table. When I try to enter this data into SPSS, I can't get it to compute
risk ratios and confidence intervals. What am I doing wrong? -- Jinxed Jason
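For reference, the quantities being asked about can be computed directly from the four cell counts. The sketch below is in Python rather than SPSS and uses hypothetical counts. It applies the standard large-sample formula for the standard error of the log risk ratio and does not show how SPSS itself should be set up.

```python
import math

def risk_ratio(a, b, c, d, z=1.96):
    """Risk ratio and approximate 95% confidence interval for a 2x2 table.

    a, b: exposed group with and without the event
    c, d: unexposed group with and without the event
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR), large-sample approximation.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical table: 20 of 100 exposed and 10 of 100 unexposed had the event.
rr, lower, upper = risk_ratio(20, 80, 10, 90)
```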
Stats: Modifying SPSS data (August 18, 1999).
Dear Professor Mean, Before I start my data analysis, I need to modify
some of the data in my SPSS data set. I don't want to re-type every number by
hand. Is there a faster way to do this? -- Impatient Pam
This webpage was written on 2007-06-20 and was last modified on
2008-07-08.