How to Prepare for a PhD in Biostatistics

How would you advise an undergraduate interested in a PhD to prepare for studying biostats?

More math. You can't be too rich or know too much math. In terms of courses, take more mathematics, take as much as you can. When you get into our biostatistics graduate program, we teach you statistics, so taking more statistics now won't help you in the long run, plus we may have to un-teach something if you learn statistics badly.

In terms of what math courses to take, try to take real analysis. Advanced linear algebra is very helpful. Every part of math is useful somewhere in statistics, though connections may be obscure, or more likely, just not part of some current fad. Numerical analysis and combinatorics are also helpful, and everything else is helpful too. First, though, come real analysis and advanced linear algebra.

What might I read outside of my courses?

Start picking up books and articles that relate to statistics, math, science, and public health and read them. Lately there have been a number of excellent popular science books that relate to science, statistics and statistical thinking. Anything and everything by Malcolm Gladwell I highly recommend. Other books that come to mind are things like Stephen Jay Gould's essays; The Signal and the Noise by Nate Silver; The Theory That Would Not Die by McGrayne; any of the books by Jared Diamond. A very important book for most anyone: Thinking, Fast and Slow by Kahneman. The Black Swan by Nassim Taleb; How to Lie With Statistics by Darrell Huff.

For someone not yet expert in statistics, books on statistical graphics are directly statistical and will be much more accessible than a technical book. Books on statistical graphics will directly make you a better statistician, now. These teach you both to look at data and how to look at data. There's a set of 4 books by Edward Tufte. Get all four; I recommend hardcover over paperback, and I definitely wouldn't get the e-books. Read all four! These are not always practical books, but they inspire us to do our best and to be creative in our statistical and graphical analyses.

Read Visualizing Data by William Cleveland (A+ wonderful book). Additional graphics books include Graphical Methods for Data Analysis by Chambers, Cleveland, Kleiner and Tukey, if you can find a copy. A more statistical book that will help instill the proper attitude about data is Exploratory Data Analysis by Tukey.

Read anything else you can find to read. Read widely and diversely. As you get stronger in math and statistics, change the level of the books. Start exploring the literature. Dive into one area and read as much as you can. Then find another area and check it out.

Can I depend on the department to teach me everything I need to be a good statistician?

Of course not. Active learning is paramount. No graduate department will teach you everything. All departments teach a core set of material, and it is up to students to supplement that core with additional material. How you supplement that core determines what kind of statistician you can be and how far you can go. Some people might supplement their core material with an in-depth study of non-parametrics; others with Bayesian methods, statistical computing, spatial data analysis or clinical trials. I supplemented my graduate education with statistical graphics, Bayesian methods, statistical computing, regression methods, hierarchical models, semi-parametric modeling, foundations and longitudinal data analysis. The semi-parametric modeling, graphics and computing mostly came from books. The longitudinal data analysis came from a mix of books and journal articles. Bayesian methods and hierarchical models I learned mostly from journal articles. Foundations came from talking to people and listening to seminars, as well as from journal articles and books. I also tried to learn additional mathematical statistics using various texts, but wasn't very successful; similarly with optimal design.

How you supplement your education depends on your interests and may help you refine your interests. I found I wasn't that interested in time series, survey sampling, stochastic processes or optimal design. If you're interested in working with a particular professor, you're going to need to supplement with books in her/his area, and you're going to need to read that professor's research papers to see what you're going to be getting into.

What programming language(s) should I learn?

R is growing fast and may take over, sort of like kudzu, so it is well worth your time to become expert in it. Definitely learn and use RStudio. Some folks make a living just off their R expertise. A lot more make a living off their SAS expertise. But I bet the R people are having a lot more fun. The rest of this is what I garner from others, not from direct knowledge. If you want less of a statistics specialty language and to be closer to the computer end of things, C++ or Java are extremely popular (you should figure out why). Python seems to be coming on very strong. So maybe R and Python? Depends on what you like. Learn something about algorithms and something about modern computer programming interfaces. And a little HTML.

Go learn LaTeX now. Become at least a partial expert. Knowing LaTeX before you come in to grad school is very helpful.

Guns and Rare Events

Seems that people with severe mental illness are able to easily purchase guns in this country. Seems like people (whether actually mentally ill or not) who are upset and angry and want to hurt someone also can purchase a gun easily.

Often we hear that gun advocates think that more guns is the solution. This seems to revolve around a world view where most all of us have our hands on our guns, all of us with guns are ready to take appropriate defensive action, all of us with guns can read a scene perfectly and all of us will come to a correct decision at all times and every time about how to react.

Hypothesize many, many people walking and driving around every day carrying their gun with them. Suppose (just for suppose's sake) that half the people wandering around have a gun within reach. Think about what is required here. No one with a gun should be angry. No one carrying a gun should have hormones raging out of control. No one carrying a gun has mental illness. No one carrying a gun had too little sleep last night. No one carrying a gun has depression, schizophrenia, mania or delusions. No one carrying a gun is temporarily irrational. Nothing bad happens to anyone carrying a gun that gets them mentally out of joint: a bump in the crowd, loud truck sounds, a boss with an unreasonable demand just before quitting time, missing the green light, a mentally ill homeless person smelling bad standing on the street corner asking for handouts.

Have you ever done something stupid that you later regretted, even though you didn't suffer consequences? Would having access to a gun have made that situation better? Think of all the stupid things everyone else has ever done. Ever seen someone in a fender bender get out of their car and start screaming at the other person, no matter who was at fault?

Those of us in academia worry a lot about our students, both generally and specifically. Every now and again, someone with access to a gun or guns comes to campus or to the area near campus with intent, sometimes to shoot a specific person such as an adviser, colleague, competitor or department chair. Sometimes they come to campus or the area near campus with the intent to kill just any random person they can find.

With a lot of people wandering around with guns, I now have to worry about the sanity and training and general level-headedness of each and every one of them. And we can't tolerate a situation where any one of these gun-toting people anywhere in the country is temporarily off-balance, temporarily ill-tempered, temporarily irritated, or temporarily over-stressed.

In academia, we do put people under stress. There's paperwork and bureaucracy, bad food and traffic. We give tests. Sometimes hard ones. We have deadlines. We demand a lot from our students. We teach, and we expect them to learn. We deliver bad news: you didn't get the grade you hoped for, or the grade you thought you deserved or thought you needed. You didn't get admitted to the school, the major, the program, graduate school. All this puts stress on an individual. Some people cope fine, some learn coping mechanisms, some cope sufficiently, some grow as a person and rise to the occasion. Little of this is done without stress.

What about the stresses of interpersonal relationships? Students at college are meeting many new people, from many different walks of life and this can be a fresh challenge. Students are away from their comfort zone for the first, or if in grad school, second time; many students travel and some travel great distances to attend college. One develops boy/girl friends, close friends and people you can't stand, though they share a bathroom with you. Sometimes friendships fall apart, sometimes they go lax. A support group can grow and it can shrink suddenly at holiday time or at the end of the school year. People get isolated, sometimes for long periods of calendar time, some for shorter periods. All of us are isolated for hours in a given day.

Some students press themselves very hard. Some students expect more from themselves than they are able to deliver. Some have expectations out of sync with their abilities and interests. And just about everyone learns how to adjust to reality when reality bites back. People naturally get stressed in life and in academia. And this pushes some folks a bit far, and for a while they can break, until they mend and come back, hopefully stronger, ready to try new things. Or they change their goals and their objectives, and move to a different sphere in their lives.

And people less capable of handling limits, instructions, social interactions, deadlines are more likely to be stressed by these situations.

And into this vital, vibrant personal and academic stew, full of growth and challenge, despair and success, you want to mix lethal weapons too?

I suspect that people who want freely available lethal gun weapons are also those who seem to think that illness stops at the neck, that mental illness doesn't exist, or doesn't happen. Mental illness is something that can strike most anybody. It may not strike you. But you'll want to be insured for it, just in case. Someone can be healthy, and mentally well balanced one day and go buy a gun. And pass any check on mental illness. And that person, at that time, I'm not worried about them going out and shooting someone. But that person can live for another 50 years with that gun. And in all that time, you want to guarantee me that they will never be stressed? Never depressed or despondent? Never make a stupid decision? Never be trod upon by life? By life, by a boss, by a spouse, by bureaucracy, by a neighbor, by politics, by money worries? By kids or parents? Never get in a bad way? Never suicidal, never angry, never unhappy, never mad, never irate? Never mentally ill? Never get Alzheimer's?

You want to assure me that this healthy, mentally stable person will never mis-use that gun for the rest of their life?

And their spouse, their kids, their parents, and their kids' friends will never mis-use that gun for the rest of their lives as well? And everyone that might accidentally or purposefully gain access to that gun?

What training is required to properly use a gun? Cleaning, handling, safe storage, target practice, accuracy, reading a scene, what else? Do you want to assure me that everyone who might ever come into contact with that gun will have the proper training to use that gun? That they remember their training? That they own a gun safe? That their weapons are stored in that gun safe? That the combo to the safe isn't written on a piece of paper in the kitchen drawer with the spare car keys?

Now multiply this by the estimated 100,000,000 people in the United States who own a gun. I don't think people understand what happens when you multiply a small problem, a small probability of a bad outcome by 100 million.

You want to guarantee me that not one of those 100,000,000 people is having such a bad day that they will mis-use that gun? You want to guarantee me that all 100,000,000 people know how to use a gun properly, that they'll safely store it away from their kids and any other kids? You want to guarantee me that none of those 100,000,000 people will have their house burgled and the gun stolen, where it disappears into the underground economy? Not one person in those 100,000,000 will get furious at someone or something in the next week?

You want to guarantee me that none of those 100,000,000 people will commit suicide, tomorrow, this year, sometime in the next fifty years? On any given day, 58 of those 100,000,000 nice US citizens commit suicide using a gun. About 30 people use a gun to kill someone else each day. In the US.

So, can you guarantee me that out of those 100 million people, all 100 million of them are going to behave properly? Right now, about 88 of them per day on average are killing themselves or someone else. But that adds up to quite a lot of dead Americans by December 31st.
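For anyone who wants to see the multiplication spelled out, here is a rough sketch in Python. The 100,000,000 owners, 58 gun suicides per day and 30 gun homicides per day are the figures quoted above. The one-in-a-million daily probability, the function names and the assumption that people act independently are mine, purely for illustration, not estimates of anything.

```python
import math

# Figures quoted in the text.
OWNERS = 100_000_000
SUICIDES_PER_DAY = 58
HOMICIDES_PER_DAY = 30
DAYS_PER_YEAR = 365


def expected_events(n_people: int, p_per_person: float) -> float:
    """Expected number of bad outcomes if each of n people has probability p."""
    return n_people * p_per_person


def prob_at_least_one(n_people: int, p_per_person: float) -> float:
    """P(at least one bad outcome) = 1 - (1 - p)^n, assuming independence.

    Computed with log1p/expm1 so that very small p does not lose precision.
    """
    return -math.expm1(n_people * math.log1p(-p_per_person))


deaths_per_day = SUICIDES_PER_DAY + HOMICIDES_PER_DAY
deaths_per_year = deaths_per_day * DAYS_PER_YEAR

# Crude per-owner daily rate: treats all 88 deaths as coming from the
# 100 million owners, as the text loosely does.
implied_daily_rate = deaths_per_day / OWNERS

# Hypothetical: a one-in-a-million chance per person per day (made-up number).
hypothetical_p = 1e-6

print("deaths per day:", deaths_per_day)
print("deaths per year:", deaths_per_year)
print("implied daily rate per owner:", implied_daily_rate)
print("expected events per day at p = 1e-6:", expected_events(OWNERS, hypothetical_p))
print("P(at least one event) at p = 1e-6:", prob_at_least_one(OWNERS, hypothetical_p))
```

The point of the sketch is only that a probability too small to notice for any one person stops being small once it is multiplied by 100 million and then by 365.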

Have a nice day. And please don't shoot me.

How To Be A Kick-A## Teacher

25 helpful pieces of advice.

  1. Comportment:
    1. Walk like you're walking away from an explosion in a Hollywood movie.
    2. Tuck your chin in, tilt your head down and look at people from out of the top of your eyes.
    3. Squint.
  2. Lecturing:
    1. Show up late, then run over time because there is so much to cover.
    2. Bring to your lectures more hardware, dry erase markers, computers, tablets, cell phones and piles of paper and printed references than anyone else.
    3. Consider lecturing in Esperanto so that your international students can learn as much as your native students.
    4. Liberally sprinkle your lectures with Fisher quotes in Latin and French.
  3. Questions:
    1. Take 5 minutes to respond to every question.
    2. Always ask if there are any questions, but don't wait for people to raise their hands.
    3. You have several options for answering questions:
      1. Give a lot of fake life philosophy type advice, but don't actually answer the question.
      2. Point out that you'll get to that answer later in the lecture. It doesn't matter if you actually do get to it later.
      3. Explain how semantic deconstruction enables a pithier answer. Never give the pithier answer.
  4. Give homeworks that are impossible to answer correctly, but don't grade them. Give everyone 100% but only after the quarter ends.
  5. Do not follow the syllabus.
    1. The syllabus should cover everything from Moby Dick to Playfair to Rise and Fall of the Roman Empire to the complete first edition of F. N. David's first book to John Graunt's tables of mortalities and a summary of Biometrika, but only through 1900.
    2. Do not update your syllabus. The syllabus your mentor, who retired the year after you joined the faculty, first wrote during World War II should suffice. He worked hard enough on it after all.
    3. If there are any pre-med students in the class, make sure you cover all the material on the MCAT.
    4. If there are any pre-law students you need not cover the material on the LSAT as there are currently enough seats in first year law classes for all applicants in the United States.
  6. Explain at length that the class really should satisfy distribution requirements in a different area.
  7. Direct everyone to read your blog where you post:
    1. Frequent links to XKCD and phdcomics.
    2. On semantic deconstruction and quantum mechanics.
  8. Teach your students how to do a better job of simulating real data.
  9. Explain the grading policy in detail, but don't follow it.
  10. Orthogonal decomposition theorem. Homeworks, lectures and tests should form an orthogonal basis in knowledge space.
  11. Jokes. Humor always makes difficult material more interesting. If you don't know any statistics jokes, here are some that you can use, preferably without attribution.
    1. Q: How many statisticians does it take to screw in a lightbulb? A: It depends on your model.
    2. Q: What's purple and commutes? A: An Abelian Grape. (Yeah, well).
    3. An applied statistician, an econometrician, a theoretical statistician and a psychometrician walk into the faculty bar on campus and don't talk to each other. Bose-Einstein statistics at work.
    4. Q: Why don't theoretical statisticians play hide and seek? A: Because no one will look for them.
    5. Q: How many econometricians does it take to make chocolate chip cookies? A: Ten. One to stir the batter and nine to peel the M&Ms.
    6. A statistician came home and found his house burned to the ground. When he asked what happened, the police told him "Well, apparently the chair of the math department came to your house, and ...".

      The statistician's eyes lit up and he interrupted excitedly, "The chair? Of the math department? Came to my house?"
    7. Q: Why do statistics departments put questions on asymptotics on comprehensive exams? A: Otherwise there would be no use for asymptotics at all.
    8. Theoretical statisticians do it better, but only asymptotically. And in the long run, ... well, you know. But don't say anything, it makes them happier.
  12. Prove everything:
    1. Proofs are clearer when you use measure theory, even in an introductory class for biologists.
    2. And martingales allow for more efficient proofs. After all, life is a martingale. Biologists need to learn to appreciate that.
    3. Entropy means it really doesn't matter. Physicists should appreciate that.
    4. Note that everything in class has been proved previously in Doob (1953) or in one of Herman Rubin's Annals of Mathematical Statistics papers from the 1950s.
    5. Or was it the 1945 paper?
  13. Develop the theory of minimal length confidence intervals for the mean of a normal when n=1 and the mean and variance are both unknown.
  14. When students are confused, coding theory (Daleks and Ood 1985) provides an alternative way to derive most statistical models. Ontological science should be relegated to discussion sessions where the TA can handle the presentation.
  15. Refer every questioner to Fisher's original publications on the subject for more information.
  16. No software need be installed the first two weeks of computer lab. Never use the same software package two weeks in a row.
  17. Use data sets from your own collaborative papers as examples, with results given in class that exactly contradict what was published in the literature.
  18. Due dates can be given in Julian date (in ISO-8601 format of course) during the first half of the semester and in Aztec calendar form for the second half. Dates outside the semester and during finals week should use Ptolemaic and Carbon dates.
  19. Office hours: check the course schedule to maximize the number of conflicts of office hours with other courses.
  20. Cancel class on leap days, during full and new moons, the 13th of every month, for faculty meetings, special seminars and Tuesday and Wednesday of Thanksgiving, Veterans Day and Memorial Day weeks. Go to at least two international conferences each quarter.
  21. Announce every Friday that class is canceled for Saturday and Sunday.
  22. Take attendance, but never get past the M's.
  23. Always misspell your email address. If possible, delete your email address (there is so much spam after all) and acquire a new one after posting your syllabus to the web site.
  24. Cover ethics, but don't cite your sources.
  25. Per School of Medicine policy, you are required to hold several lectures each semester in rooms other than the scheduled room. Lecture days and room numbers are subject to change even after the lecture has started.
  26. Play classical music during lectures and industrial grunge during midquarters. For a special treat (and personal favorite) arrange for a live performance of Karlheinz Stockhausen's Helicopter String Quartet during your final.
  27. Love your students. Unless that's against school policy, in which case announce that your love is strictly Platonic. Unless that might be misinterpreted, in which case explain the 7 different forms of love, (Eros also known as sorE backwards, Philia, Ludic, Aghast, ACDC, Programa, Flotus, Read-Write, FIFO and GIGO) and after you get them all straight, the first class should be over and you still won't have covered the syllabus and course information sheets. Bring lots of paper towels when you cover GIGO. (For advanced classes: discuss FIFI and FISTR (First In, Screw The Rest) instead of FIFO.)
  28. Finally, after you've mastered all that: respect your students, love the material, and enjoy yourself.

Next week: Continued third order directed non-Riemannian fractional fast Fourier stochastic differential particle separating longitudinal Latin hyper-active swarm graphs in British bus queueing theory, and practice. Lecturer: Thorin "Missing Totally at Random" Oakenshield.
