Dallas R Users Group Baseball Data Dive

This past Saturday I led a data dive workshop for the Dallas R Users Group using Lahman’s baseball statistics. After providing a brief introduction to the Lahman R package and showing how to load the data and make some basic plots, I had the ~20 people in attendance begin working on the following questions:

Visualization:
Visualize how the game of baseball has changed over the years.
Visualize a meaningful statistic on the US map.

Prediction:
Is winning the World Series becoming less predictable?
Your friend Peter Daisy likes to bet on baseball games. He asserts that the best predictor of Division Winners is ERA. Is he right? If not, what is the single best predictor of Division Winners?

Scenarios:
The consultant. Nolan Ryan and Ron Washington just called and asked for your expert advice. They plan to focus on improving three statistics next season. Which three should they be, and why?
The agent. You found an athlete who wants to apply his talents to the game of baseball. He is right-handed, 5 feet 8 inches tall, and weighs 165 lbs. Which position makes the most sense for him to start learning and why?
The general manager. MLB has allowed you and Mark Cuban to form an American League expansion team. Mark wants you to choose the three starting outfielders. You can have any current player you want, but Mark says you can’t spend more than $15 million combined. He expects you to balance offensive and defensive performance with these players. Which players do you pick and why?
The parent. Your son is a pitcher and wants to play baseball at the best college for getting into the big leagues. Which college should he attend and why?

The idea wasn’t to complete all of the questions, but to choose one or two of interest. Most of the participants were new to R and focused on visualizing how baseball has changed over the years. Some of the more experienced R users took on the agent and general manager questions. Since the questions were somewhat open-ended, it was fun to see the different approaches and R packages people used.

Feel free to reply with your answers to any of these questions!

When feedback is worthless

I just finished teaching another semester of Anatomy and Physiology, which means students will soon be evaluating my course. Well, about 20% of my students will be evaluating my course. Why so little feedback?

About a year ago the University decided to switch from in-class paper evaluations to an online system. Online is always better, right? Wrong.

While going to an online system may decrease the work of a few administrators, it renders the whole evaluation worthless because so few students choose to respond to the optional online survey. The students who do respond likely either really liked or hated you. Talk about biased results! I wonder how much money is spent on this now useless form of feedback.
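To make the bias concrete, here is a minimal sketch in Python. Every number in it is made up for illustration: a hypothetical class of 100 students with true ratings from 1 (hated it) to 5 (loved it), and assumed response rates in which students with strong opinions, especially unhappy ones, are far more likely to fill out the optional survey. None of this comes from my actual evaluations.

```python
# Hypothetical class of 100 students: true rating -> number of students.
class_counts = {1: 10, 2: 20, 3: 40, 4: 20, 5: 10}

# Assumed chance (in percent) that a student with a given rating responds.
# Strong opinions respond more often; these values are illustrative only.
response_pct = {1: 70, 2: 10, 3: 5, 4: 10, 5: 50}


def expected_respondents(counts, pct):
    """Expected number of respondents at each rating."""
    return {rating: counts[rating] * pct[rating] / 100 for rating in counts}


def mean_rating(counts):
    """Average rating, weighted by how many students gave each rating."""
    total = sum(counts.values())
    return sum(rating * n for rating, n in counts.items()) / total


def extreme_share(counts):
    """Fraction of students who gave the lowest or highest rating."""
    total = sum(counts.values())
    return (counts[1] + counts[5]) / total


respondents = expected_respondents(class_counts, response_pct)
response_rate = sum(respondents.values()) / sum(class_counts.values())

print(f"Response rate:               {response_rate:.0%}")
print(f"True mean rating:            {mean_rating(class_counts):.2f}")
print(f"Mean rating from survey:     {mean_rating(respondents):.2f}")
print(f"Extreme ratings in class:    {extreme_share(class_counts):.0%}")
print(f"Extreme ratings in survey:   {extreme_share(respondents):.0%}")
```

Under these assumptions the response rate comes out a little under 20%, the surveyed average drifts below the true class average, and ratings of 1 or 5 make up a much larger share of the survey than of the class. Different assumed response rates would shift the numbers, but any pattern in which responding depends on the rating itself distorts the result.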

Take home message: don’t waste resources collecting data that is worthless upon arrival.

Your DNA is as elastic as nylon

In a paper published last year, scientists measured DNA’s elasticity and found it to be 83 newtons per meter, or about the same as nylon. Although DNA is not pulling a Suburban like Travis Ortmayer in the above video (a hemp rope would have been much less elastic, see video), it does need to unzip and coil extensively. So if you plan on using string or rope to teach about DNA replication or histones, use nylon!