Thoughts on Grad School

For some background: in college, I had no idea that I would not want to apply for postdocs after my PhD. My career plan then was to go through grad school, be successful, and someday end up a professor. During my master's degree I realized that this might not be the life I wanted. My advisers at the time (who were married to each other) would routinely be at the institute until 8 or 9 in the evening and would come in early in the morning. I often wondered if they discussed much other than physics at home. During my PhD, I also observed my advisers working long hours and working on weekends and holidays. Once my adviser even told me that anyone I (a physicist) dated should expect me to be unavailable on weekends and holidays when there was important physics to do. Still, I was in too deep at this point, so I determined the minimal amount of work I needed to get a PhD and completed that. Now that I have a job as a data scientist, the benefits of my PhD seem to be the people I met and the connections I made (which did help me get the job), but the actual knowledge I gained during the degree has been mostly useless. Looking back, I'm reminded of some of the major issues with UC Berkeley and academia and have outlined them here.

Academia takes advantage of people
For some reason, during grad school you are expected to volunteer your time, with no pay or credit. This was especially apparent during the summer, when my contract said I was supposed to work 19 hours/week but my adviser expected me to come in 40+ hours/week. What also shocked me was that when I told fellow grad students about this issue, they were not even aware they were only supposed to be working half time. Further, if an adviser does not have funding and the student teaches during the summer to cover costs, the student has no obligation to do research during that summer, but this isn't communicated to the student. These facts are rarely spelled out. The sad thing is, this doesn't end with grad school. I know postdocs (at my institution and others) who have told me that their contracts say they should work around 40 hours/week, but they are routinely expected to work 50-60 hours/week.

My last semester in grad school I was not enrolled in classes, not getting paid for research and was just working on my thesis. Hence, there was no obligation for me to do anything that was not for my benefit. Still, my advisers tried to guilt me into coming into the office often (with threats of not signing my thesis) and to continue doing work. It still bothers me I paid the university (for full disclosure, I made plenty of money during a summer internship to cover these costs) for the "opportunity" to work for the research group for that semester.

When research is done for course credit, the lines are blurred a bit. The time spent on the "course" is not necessarily fixed. I would think that if it is a "course," then research obligations related to the course should start when the semester starts and end when the semester ends. Certainly the "course" should not require a student to attend a meeting on a university holiday or weekend (which happened to me during grad school). Most universities have standards and expect professors teaching courses to be available for their students. Some graduate students I know talk to their advisers a couple of times a semester, which I would think hardly respects these standards. This has also led me to question what can and cannot be asked of a student in a course. For example, if there were a "course" in T-shirt making that made students work in sweatshop-like conditions to get a grade in a class, would this be legal? While not a tangible product, research for credit is a somewhat similar scenario where the students are producing papers that will ultimately benefit the professor's fame (with a small chance of benefiting the student).

Another issue is that, as budgets get tighter, expenses get passed off to students. During my time at Berkeley, keeping pens and paper in a storeroom was deemed too expensive. The department suggested each research group make its own purchases of these items through the purchasing website. Not only is this a waste of graduate students' time, the website was so terrible that often it was easier just to buy items and not get reimbursed for them. Most research is done on students' personal laptops, and while a laptop is a necessity to continue work, rarely is there support from the university to make this purchase or pay for maintenance when it is used for research work. There is no IT staff, so again it falls on students and postdocs to waste their time dealing with network issues and computer outages rather than focusing on the work that is actually interesting.

Academia tries to ignore that most of its graduate students will not go into academia
I apparently have a roughly 50% chance of "making it" as a professor, which is mostly because I attended a good institution and published in a journal with a high impact factor. PhD exit surveys have found similar rates for the fraction of students that stay in academia. Yet, the general expectation in academia is that all of the students will go on to do a research-focused career (I know some professors who look down upon a teaching-focused career as well even though this is still technically academia).

Even after making it extremely clear to my adviser that I had no intentions of pursuing a postdoc, he told me that I should think about applying. He went so far as to say data science (my chosen profession) was a fad and would probably die out in a few years. Once during his class he seemed quite proud of the fact that between industry and academic jobs, most of his students had wound up staying in physics. While my adviser wasn't too unhappy about me taking courses unrelated to research, many advisers will strongly encourage their students to focus on research and not take classes. This is terrible advice, considering useful skills in computer science and math are often crucial to get jobs outside of academia.

The physics curriculum, in general, is flawed. At no point in a typical undergraduate/graduate curriculum are there courses on asymptotic analysis, algorithms, numerical methods, or rigorous statistics, which are all useful both inside and outside of physics. Often these are assumed known or trivial, yet this gives physicists a poor foundation and can lead to problems when working on relevant problems. I took classes on all of these topics, though they were optional, and they are proving to be more useful to me than most of the physics courses I took or even research I conducted as a graduate student.

This is a problem at the institutional level as well. For physics graduate students at UC Berkeley, the qualifying exam is set up to test students on topics of the student's choosing. I chose numerical methods and statistics as my main topic, but my committee asked me no questions on numerical methods and statistics (to be fair, one member tried but he didn't know what to ask, but then again, he could have prepared something to ask since the topics are announced months before the exam). Instead my committee asked me general plasma physics and quantum mechanics questions, which were not topics I had chosen. I failed to answer those questions (and have even less ability to answer them now), but somehow still passed the exam. This convinced me that the exam was nothing more than a formality, yet no one was willing to make changes to make the exam more useful. An easy change would be to frame the oral exam as interview practice, as most graduate students have no experience with interviews.

There are many student-led efforts to try to make transitioning into a non-academic job easier at UC Berkeley. But the issue is, for the most part they are student-led and have little faculty support. I was very involved in these groups and it was quite clear that my adviser did not want me to be involved. Considering that involvement is why I have a job right now, I would say I made the right choice.

Your adviser has a lot of power over you
As I alluded to before, even though I was receiving no money or course credit from my research group in my last semester, my advisers threatened to not sign my thesis unless I completed various research tasks (some unrelated to the actual thesis). I've talked to others who have had similar experiences, and I hate to say I'm confident that this extends beyond research tasks in some research groups. This could be solved by making the thesis review process anonymous and having the advisers take no part in it, but there seem to be no efforts to make this happen.

Another fault is that advisers can get rid of students on a whim. I've known people who have sunk three years into a research group only to be told that they cannot continue. Then, the student has to make a decision to start from zero and spend a ridiculous amount of their life in grad school or leave without a degree making the three years in the research group irrelevant. Because of this power, professors can make their students work long hours and come in on weekends and holidays. If the student does not oblige, the time the student has already put into the research group is just wasted time.

It doesn't help that research advisers are usually also respected members of their research area. That is, if a student decides to stay in academia (particularly in the same research area as grad school), the word of their adviser could make or break their career. This again gives opportunity for advisers to request favors from students.

Further, the university has every motive to protect professors, especially those with tenure, but not graduate students. This became embarrassingly clear, for example, in how the university handled Geoff Marcy before BuzzFeed got a hold of the news. If professors can ignore rules set by the university (and sometimes laws) with little repercussion, graduate students can have little faith that any new rules the university institutes to address the problems mentioned here will be followed. I don't want to belittle the (mostly student-led) efforts to make sexual harassment less of a problem at UC Berkeley, but ultimately the only change I can pinpoint is that there is now more sexual harassment training, which has been shown by research at UC Berkeley to lead to more sexual harassment incidents.

Ultimately, graduate school and a career in academia certainly work for some people. Some people are passionate about science and love working on their problems, even if that means making a few sacrifices. Progression of science is a noble, necessary goal, and I am glad there are people out there to make it happen. My hope is that many of the problems mentioned here can be rectified, so that the experience is better for those people and so that the people who realize academia isn't for them can succeed on a different path.

Election Thoughts

This election was anomalous in many ways. The approval ratings of both candidates were historically low. Perhaps related, third party candidates were garnering much more support than usual. The nationwide polling of Gary Johnson was close to 5% and Evan McMullin was polling close to 30% in Utah. There's never really been a candidate without a political history who has gotten the presidential nomination of a major party and there's never been a female candidate who has gotten the presidential nomination of a major party.

These anomalies certainly make statistical predictions more difficult. We'd expect that a candidate might perform similarly to past candidates with similar approval, similar ideologies, or similar polling trends, but there were no similar candidates. We have to assume that the trends that carried over in past, very different elections apply to this one, and presumably this is why so many of the election prediction models were misguided.

I have a few thoughts I wanted to write out. I am in the process of collecting more data to do a more complete analysis.

Did Gary Johnson ruin the election?
No. In fact, evidence points to Johnson helping Clinton, not hurting her. Looking at the predictions and results in many of the key states (e.g. Pennsylvania, Michigan, Florida, New Hampshire; Wisconsin was a notable exception), Clinton underperformed slightly compared to the expectation, but the far greater effect was that Trump overperformed and Johnson underperformed compared to expectation. This is a pretty good indicator that those who said they'd vote for Johnson ultimately ended up voting for Trump. There seems to be some notion that people were embarrassed to admit they'd vote for Trump in polls. This might be true (but also see this), but the fact that third party candidates underperform relative to polling is a known effect. However, the magnitude was certainly hard to predict because a third party candidate has not polled so well in recent elections. It doesn't really make sense to assume Johnson voters would vote for Clinton either. When Johnson ran for governor of New Mexico as a Republican, he was the Libertarian outsider, much like Trump was an outsider getting the Republican nomination for president. Certainly Johnson's views are closer to the conservative agenda than the liberal one.

Turnout affected by election predictions
There has been reporting on how Clinton has gotten the third highest vote total ever of any presidential candidate (after Obama 2008 and Obama 2012). This is a weird metric to judge her on considering turnout decreased compared to 2012 and Clinton got a much smaller percentage of the vote than Obama did in 2012. Ultimately, the statement is just saying that the voting pool has increased, not making any deep statement about how successful Clinton is. In particular, let's focus on the 48% of the vote that Clinton got. I have to imagine that if there is a candidate with a low approval rating and there are claims she has a 98% chance of winning the election, a lot of people just aren't going to be excited to go vote for her. I could see this manifesting as low turnout and increased third party support. Stein did do about three times better than she did in 2012 (as did Johnson). In addition, people who really dislike the candidate (and there are a lot of them, since the candidate has a low approval rating) are encouraged to show up to the election. I don't see obvious evidence of this, but I have to imagine there was incentive to go vote against Clinton. This could explain the slight underperformance relative to polls in the aforementioned states as well as the large Clinton underperformance in Wisconsin. There's been talk of fake news affecting the election results, but I think the real news predicting near certain election of Clinton had just as much to do with it.

Would Clinton have won if the election were decided by popular vote?
This is a very difficult question to answer. The presidential candidates campaign assuming the electoral college system, so clearly the election would be different if it were decided by popular vote. Certainly, a popular vote seems quite efficient for Democrats. Democratic candidates can campaign in large cities and encourage turnout there, whereas Republican candidates would have to spread themselves thinner to reach their voter base. One thing I haven't seen discussed very much is that this would probably decrease the number of third party voters. In a winner-takes-all electoral college system, any vote that gives the leading candidate a larger lead is wasted. So, in states like California, where Clinton was projected to have a 23 point advantage over Trump, a rational voter should feel free to vote for a third party since this has no effect on the outcome. In a national popular vote, there are no wasted votes and a rational voter should vote for the candidate that they would actually like to see be president (of course people don't always act rationally). As argued before, Johnson's voters seem to generally prefer Trump over Clinton, so the number of these people that would change their vote under a popular vote election is definitely a relevant factor in deciding whether a national popular vote election would actually have favored Clinton. Stein's voters would generally prefer Clinton over Trump, but there were fewer of these voters to affect the results.

Electoral college reform needs to happen
Yes, but if it didn't happen after the 2000 election, I think it's unlikely to happen now. The most likely proposal that I have been able to come up with (with the disclaimer that I have very little political know-how and am strictly thinking of this as a mathematical problem) is to increase the number of House members. This is only a change to federal law, and thus would not be as hard to change as the whole electoral college system, which would take a constitutional amendment. If states had proportional appointment of electors, then as the number of House members increases, the electoral college system approaches a national popular vote election. This is complicated by the winner-takes-all elector system most states have. For example, the total population of the states (and district) Clinton won seems to be 43.7% of the total U.S. population, so even though she won the popular vote, with winner-takes-all systems in place, it is difficult to imagine a simple change to the electoral college system that is closer to a popular vote.
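
A rough way to see the limiting behavior claimed above, as a sketch under simplifying assumptions that are mine rather than part of any actual proposal: suppose there are S states, state i holds a fraction p_i of the population, House seats are apportioned exactly proportionally (ignoring rounding, the one-seat minimum, and D.C.), and every state awards its electors in proportion to its vote. With N House seats, state i has e_i = N p_i + 2 electors, so its share of all electors is

\frac{e_i}{\sum_j e_j} = \frac{N p_i + 2}{N + 2S} \to p_i \quad \text{as } N \to \infty.

If v_i is a candidate's vote share in state i, the candidate's share of electors is then

\sum_i \frac{N p_i + 2}{N + 2S} v_i \to \sum_i p_i v_i,

which matches the national popular vote share only if turnout is proportional to population. In this picture, the two Senate-based electors per state are what tilt the weighting toward small states, and their weight shrinks as N grows.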

Hypothesis Testing in a Jury

I recently served on a jury and was quite surprised at how unobjective some of the other jurors were being when thinking about the case. For our case, it turned out not to matter because the decision was obvious, but it got me thinking about a formal reasoning behind "beyond a reasonable doubt." This reasoning will involve more statistics than physics, but considering I've been thinking about Bayesian analyses recently in my research, it's quite appropriate.

At the most basic level, a jury decision is a hypothesis test. I wish to distinguish between the hypotheses of not guilty (call it \mathcal{H}_0, since the defendant is innocent until proven guilty) and guilty (call it \mathcal{H}_1). In Bayesian statistics, the way to compare two hypotheses is by computing the ratio of posterior probabilities:

F = \frac{p(\mathcal{H}_1|D)}{p(\mathcal{H}_0|D)}.

Here, p(\mathcal{H}|D) is the probability of the hypothesis \mathcal{H} given the available data (the evidence). This probably does not seem obvious to compute, and I'll discuss later how one might determine values for these. If F=2, then the hypothesis \mathcal{H}_1 is twice as likely as the hypothesis \mathcal{H}_0. Thus, if F \gg 1, then the evidence for \mathcal{H}_1 is overwhelming. What this means in terms of "beyond a reasonable doubt" is debatable, but it is generally accepted that if F \gtrsim 100, there is strong evidence for \mathcal{H}_1 over \mathcal{H}_0 [1]. Similarly, if F\ll 1, then the evidence for \mathcal{H}_0 is overwhelming. Thus, the question of determining guilt or non-guilt is equivalent to calculating F.

p(\mathcal{H}|D) can be rewritten using Bayes' theorem as

p(\mathcal{H}|D) = \frac{p(D|\mathcal{H})p(\mathcal{H})}{p(D)}.

Here, p(D|\mathcal{H}) is the probability of the evidence given the assumption of guilt (or non-guilt), which is more tractable than p(\mathcal{H}|D) itself. Note that the prosecutor's fallacy can be thought of as confusing p(D|\mathcal{H}) with p(\mathcal{H}|D). p(\mathcal{H}) is the prior, which takes into account how much one believes \mathcal{H} relative to other hypotheses. p(D) is a normalization factor that ensures the posterior probabilities of the competing hypotheses sum to 1. In the relation for F, this cancels out, so there is no need to worry about this term. With this replacement, the ratio of posterior probabilities becomes

F = \frac{p(D|\mathcal{H}_1)}{p(D|\mathcal{H}_0)}\frac{p(\mathcal{H}_1)}{p(\mathcal{H}_0)}.

The first ratio is called the Bayes factor. The second ratio quantifies the ratio of prior beliefs of the hypotheses. This second ratio is, given no other information, the odds that the defendant is guilty. Suppose I accept that there was a crime committed, but that the identity of the criminal is in question. If only one person committed the crime, this would then be approximately the inverse of the number of people who could have committed the crime.

Now, I will consider the calculation of the Bayes factor for a real trial. Consider R v. Adams, which set a precedent for banning explicit Bayesian reasoning in British Courts in the context of DNA evidence. It was estimated during the trial that there were roughly 200,000 men in the age range 20-60 who could have committed a crime. Note that some extra assumptions on age and gender seem to be made here, so this does not seem applicable to the ratio of prior beliefs. However, if I lifted these restrictions, the Bayes factor for the victim's misidentification of the defendant would change accordingly, so this is not a concern.

First, consider the DNA evidence that was the only piece of evidence incriminating the defendant. Call this evidence D_0. p(D_0|\mathcal{H}_1) is the probability of a positive DNA match under the assumption that the defendant is guilty. This is presumably extremely close to 1, or DNA evidence would not be considered good evidence in trials. p(D_0|\mathcal{H}_0) is the probability of a positive match if the defendant is not guilty. Taking into account the population of the U.K., this was estimated in the trial to be between 1 in 2 million and 1 in 200 million (though possibly as low as 1 in 200 since the defendant had a half-brother) [2]. Thus, the Bayes factor considering only the DNA evidence is between 2 million and 200 million. With the 1 in 200,000 prior probability, the posterior probability ratio F is between 10 and 1000. Only the higher end of this range is overwhelming evidence, and in the case of conflicting evidence, the jury is supposed to give the benefit of the doubt to the defendant, so it seems a "not guilty" verdict would have been appropriate.
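
The arithmetic for the DNA evidence alone can be checked in a few lines. This is a minimal sketch using only the numbers quoted above; the function name and the choice to set p(D_0|\mathcal{H}_1) to exactly 1 are my own simplifications for illustration.

```python
# Posterior odds F = (Bayes factor) * (prior odds), DNA evidence only.
PRIOR_ODDS = 1 / 200_000  # one culprit among roughly 200,000 possible men


def dna_posterior_odds(p_match_if_innocent, p_match_if_guilty=1.0):
    """Return F for the DNA match; p_match_if_guilty = 1 is an assumption."""
    bayes_factor = p_match_if_guilty / p_match_if_innocent
    return bayes_factor * PRIOR_ODDS


# The two random-match probabilities quoted in the trial.
for p in (1 / 2_000_000, 1 / 200_000_000):
    print(f"random-match probability {p:.0e}: F = {dna_posterior_odds(p):.0f}")
```

Up to floating-point rounding, the two cases should reproduce the F of roughly 10 and roughly 1000 quoted above.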

Further, this ignores all of the other evidence that helped to prove the defendant's innocence. This included the victim failing to identify the defendant as the attacker and the defendant having an alibi for the night in question. Let us call these two pieces of evidence D_1 and D_2. Unlike the DNA evidence, the witnesses do not explicitly mention what the relevant probabilities are for this evidence, so it is up to the jurors to make reasonable estimates for these quantities. p(D_1|\mathcal{H}_1) is the probability the victim fails to identify the defendant as the attacker given the defendant's guilt. Set this to be around 10%, though police departments may actually have statistics on this rate. On the other hand, p(D_1|\mathcal{H}_0), the probability the victim fails to identify the defendant as the attacker given the defendant is not guilty is high, say around 90%. Thus, the Bayes factor considering the victim's failed identification of the defendant is about 1 in 10. Note that even if these numbers change by 10% this factor doesn't change in order of magnitude, so as long as a reasonable estimate is made for this factor, it doesn't really matter what the actual value is. The alibi is less convincing. Though the defendant's girlfriend testified, the defendant and the girlfriend could have confirmed their story with each other. Thus, I estimate the Bayes factor for the alibi p(D_2|\mathcal{H}_1)/p(D_2|\mathcal{H}_0) to be about 1 in 2.

Since all these pieces of evidence are independent, p(D_0,D_1,D_2|\mathcal{H})=p(D_0|\mathcal{H})p(D_1|\mathcal{H})p(D_2|\mathcal{H}). Thus, the Bayes factor for all evidence is between 100,000 and 10,000,000. Now, multiplying by the prior probability, this gives a posterior probability ratio, F, between 0.5 and 50. With the new evidence taken into account, there is no longer strong evidence that the defendant is guilty even in the best case scenario for the prosecution. Convicting someone with a posterior probability ratio of 50 would falsely convict people 2% of the time, which seems like an unacceptable rate if one is taking the notion of innocent until proven guilty seriously. Note that as long as the order of magnitude of each of the Bayes factor estimates doesn't change, the final result will also not change by more than 1 order of magnitude, so the outcome is fairly robust.
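
To make the combination step concrete, here is a small sketch that multiplies the three Bayes factors and converts the resulting odds into a probability of innocence via p(\mathcal{H}_0|D) = 1/(1+F). The helper names are hypothetical, the evidence is treated as independent as stated above, and the identification factor is rounded to 1 in 10 as in the text.

```python
from math import prod

PRIOR_ODDS = 1 / 200_000  # prior odds of guilt used in the text


def posterior_odds(bayes_factors, prior_odds=PRIOR_ODDS):
    """Multiply independent Bayes factors together, then by the prior odds."""
    return prod(bayes_factors) * prior_odds


def prob_not_guilty(odds):
    """Convert posterior odds F of guilt into p(H_0 | D) = 1 / (1 + F)."""
    return 1 / (1 + odds)


IDENTIFICATION = 1 / 10  # failed identification, rounded from 10% / 90%
ALIBI = 1 / 2            # alibi, as estimated in the text

for dna in (2e6, 2e8):   # low and high DNA Bayes factors
    F = posterior_odds([dna, IDENTIFICATION, ALIBI])
    print(f"DNA factor {dna:.0e}: F = {F:.1f}, "
          f"p(not guilty) = {prob_not_guilty(F):.1%}")
```

For F = 50 the probability of innocence is 1/51, which is where the roughly 2% false conviction rate comes from.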

While this line of logic was presented to the jurors during the trial, the jurors still found the defendant guilty. The judge objected to coming out with a definite number for the odds of guilt when the assumptions going into it are uncertain, though as argued before, for any reasonable choice, these numbers cannot change too much [2]. It seems that without formal training in statistics, it was difficult to accept these rules as "objective," even though this is a provably well-defined mathematical way to arrive at a decision. If common sense and the rules of logic and probability are really what jurors are considering to reach their decision, this has to be the outcome that they reach. [3] argues that to not believe the outcome of a Bayesian argument like this would be akin to not believing the result of a long division calculation done on a calculator.

The most common objection to Bayesian reasoning (though apparently not the one the judges in R v. Adams had) is that the choice of prior can be somewhat arbitrary. In the example above, in the estimation of people that could have committed the crime, I could take people who were on the same block, the same neighborhood, the same city, or maybe even the same state. Each of these would certainly give different answers, so care must be taken to choose appropriate values for the prior. This doesn't make the method wrong or unobjective. It is just that the method cannot start with absolutely no assumptions. Given a basic assumption, though, it provides a systematic way to see how the basic assumption changes when all the evidence is considered.
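
As a rough illustration of how much the prior matters, the sketch below recomputes the range of F for a few pool sizes while holding the combined Bayes factor fixed at the 100,000 to 10,000,000 range found earlier. Holding the Bayes factors fixed is itself an assumption (as noted earlier, changing the pool can change them), and only the 200,000 figure comes from the trial; the other pool sizes are hypothetical.

```python
# Combined Bayes factor range from the text (DNA x identification x alibi).
BAYES_FACTOR_LOW, BAYES_FACTOR_HIGH = 1e5, 1e7


def odds_range(pool_size):
    """Posterior odds range when exactly one of pool_size people is guilty."""
    prior_odds = 1 / pool_size
    return BAYES_FACTOR_LOW * prior_odds, BAYES_FACTOR_HIGH * prior_odds


# Only 200,000 comes from the trial; the other pool sizes are hypothetical.
for pool_size in (2_000, 200_000, 20_000_000):
    low, high = odds_range(pool_size)
    print(f"pool of {pool_size:>10,}: F between {low:g} and {high:g}")
```

Since F scales inversely with the pool size, a hundredfold change in the pool shifts the whole range by two orders of magnitude, which is why the choice of pool deserves care.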

This problem seems to stem from a misunderstanding of basic statistics by the jury, attorneys, and judges. Another example of this is the U.K. Judge Edwards-Stuart who claimed that putting a probability on an event that has already happened is "pseudo-mathematics" [4]. This just shows the judge's ignorance, as this is precisely the type of problem Bayesian inference can explain. It is a shame that in the U.K. Bayesian reasoning is actively discouraged due to R v. Adams, as this is the only rigorous way to deal with these types of problems. I wasn't able to find any specific examples in the U.S., but I assume the "fear" of Bayesian statistics in courts here is similar to the case in the U.K.

1. Jeffreys, H. 1998. The Theory of Probability.
2. Lynch, M. and McNally, M., 2003. “Science,” “common sense,” and DNA evidence: a legal controversy about the public understanding of science. Public Understanding of Science, 12-1, 83-103.
3. Fenton, N. and Neil, M., 2011. Avoiding Probabilistic Reasoning Fallacies in Legal Practice using Bayesian Networks. Austl. J. Leg. Phil., 36, 114-151.
4. Nulty and Others v. Milton Keynes Borough Council, 2013. EWCA Civ 15.