My typical modus operandi is to post a roundup of links to interesting reads every week or two. I had to take a break from updating the blog over the last few months with my dissertation in beast mode. Now that the stress from that is gone (woo hoo!), I'm going to try to get back in the habit. I may still be digging up older reads for a couple of weeks.

And now for the links:

This post about academic CS and software

It makes sense that people expect academic research results to work in companies right away. Research that makes tangible, measurable contributions is often what ends up being most popular with funding sources (including industry), media outlets, and other academics reviewing papers, faculty applications, and promotion cases. As a result, academic researchers are increasingly under pressure to do research that can be described as "realistic" and "practical," to explicitly make connections between academic work and the real, practical work that goes on in industry.

I'm not sure Jean fully appreciates the argument here. I don't think people are clamoring for academic CS to produce industry-standard, deployment-ready code. They're arguing that the code and data researchers use as proof of concept for their claims should be made freely available. If it can't be openly evaluated and scrutinized, it's not science; it's anecdote.

This perfect counterpoint to the previous post

This week Andy Ko made available his Citrus/Barista structured editor from the 2000s. I downloaded and ran it: the binary did run, but it spat out repeated exceptions and I wasn’t sure if that was impairing the functioning of the software. Thankfully, Ko didn’t just publish a binary: he published the source code. For this, I salute him. I went to modify the source code and it turned out not to compile with a modern Java compiler. After some tweaks I got it compiling, and then fixed the exception. Because the code was on github, one accepted pull request later and the software in his repository will now compile and run on a modern machine. This — this is how software research should be.

This post about science fairs and privilege

The experience turned out well, but it also left me queasy. The only reason Veronica was able to carry out her experiment was that I had the flexibility to spend hours struggling through paperwork, and because I had a social network of scientists I’ve developed as a science writer. This was an exercise in privilege.

If Veronica had been the daughter of a single parent with a couple jobs and no connections to the world of science — if she had been like a lot of American kids, in other words — her idea would have gone up in smoke. She might not have even bothered thinking about the science fair at all.

Carl hits on the fact that science fairs are much better at identifying the students with the best social supports than they are at identifying the brightest or most curious students. There are a lot of directions we could take this social commentary. Personally, I find it very encouraging that supporting my children with my time and social connections can have a huge positive impact on their success. On the flip side, I feel a huge responsibility toward the young people around me who are not so fortunate. How can I best fulfill that responsibility and support these budding young minds?

This pitch for a short R course developed by a high school student

Her interest in R grew out of a project where she and her fellow students and teachers went to the Canadian sub-Arctic to collect data on permafrost depth and polar bears. When analyzing the data she learned R (with the help of a teacher) to be able to do the analyses, some of which she did on her laptop while out in the field. Later she worked on developing a course that she felt was friendly and approachable enough for her fellow high-schoolers to benefit.

Interestingly, the previous article was fresh in my mind when I read this one. It is indeed remarkable what young people can do given support and access!

This little demo about error correction and RNA-Seq quantification

Does read error correction affect gene expression estimation? tl;dr: no.
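
To make that tl;dr concrete, the comparison boils down to quantifying the same sample with and without error correction and checking whether the per-gene estimates move. Here's a minimal sketch of that check (the file names, column names, and use of pandas/SciPy are my assumptions, not details from the demo):

```python
# Compare per-gene expression estimates from raw vs. error-corrected reads.
# Assumes two tab-separated abundance tables with 'gene_id' and 'tpm' columns;
# these names are hypothetical -- the linked demo's actual setup may differ.
import pandas as pd
from scipy.stats import spearmanr

raw = pd.read_csv("quant_raw_reads.tsv", sep="\t", index_col="gene_id")
corrected = pd.read_csv("quant_corrected_reads.tsv", sep="\t", index_col="gene_id")

# Align on the genes present in both tables and correlate the TPM vectors.
shared = raw.index.intersection(corrected.index)
rho, pval = spearmanr(raw.loc[shared, "tpm"], corrected.loc[shared, "tpm"])
print(f"Spearman rho = {rho:.4f} (p = {pval:.2g}) across {len(shared)} genes")
```

A correlation near 1 between the two sets of estimates is what "no effect" looks like in practice.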

This post about group chat

I see a lot of academic research groups hopping on the Slack/Gitter train. This post makes some good points about what group chat is/isn't good for.

That said, I still think group chat is an important tool in the communications toolbox. I just don’t think it’s the go-to tool. I think it’s the exception tool. It’s far more useful for special cases than general cases. When used appropriately, sparingly, and in the right context at the right time, it’s great. You just really have to contain it, know when not to use it, and watch behavior and mood otherwise it can take over and mess up a really good thing.

This post about paying (or not paying) to support science journalism

I'll admit that I use an ad blocker plugin for my web browser. But I am more than happy to disable it for news sites that prove (or promise) that they will not overwhelm me with megabytes of distracting interactive advertising with each page load. And the author of this post has it right regarding the economics of science journalism: we can't have it both ways. We don't get to complain about the poor quality of reporting while at the same time refusing to pay for quality science writing!

This post on the economics of college education

It is disingenuous to call a large increase in public spending a “cut,” as some university administrators do, because a huge programmatic expansion features somewhat lower per capita subsidies. Suppose that since 1990 the government had doubled the number of military bases, while spending slightly less per base. A claim that funding for military bases was down, even though in fact such funding had nearly doubled, would properly be met with derision.

Interestingly, increased spending has not been going into the pockets of the typical professor. Salaries of full-time faculty members are, on average, barely higher than they were in 1970. Moreover, while 45 years ago 78 percent of college and university professors were full time, today half of postsecondary faculty members are lower-paid part-time employees, meaning that the average salaries of the people who do the teaching in American higher education are actually quite a bit lower than they were in 1970.

By contrast, a major factor driving increasing costs is the constant expansion of university administration. According to the Department of Education data, administrative positions at colleges and universities grew by 60 percent between 1993 and 2009, which Bloomberg reported was 10 times the rate of growth of tenured faculty positions.

And this (I'm risking posting half the article here!).

The rapid increase in college enrollment can be defended by intellectually respectable arguments. Even the explosion in administrative personnel is, at least in theory, defensible. On the other hand, there are no valid arguments to support the recent trend toward seven-figure salaries for high-ranking university administrators, unless one considers evidence-free assertions about “the market” to be intellectually rigorous.

What cannot be defended, however, is the claim that tuition has risen because public funding for higher education has been cut. Despite its ubiquity, this claim flies directly in the face of the facts.
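
The military base analogy is just the arithmetic of totals versus per-unit figures: double the headcount, trim the per-unit subsidy a little, and total spending still nearly doubles. A toy calculation with made-up numbers (mine, not the article's):

```python
# Illustrative numbers only -- not taken from the article.
bases_1990, per_base_1990 = 100, 10.0   # arbitrary units of spending per base
bases_now, per_base_now = 200, 9.0      # twice the bases, ~10% less per base

total_1990 = bases_1990 * per_base_1990   # 1000.0
total_now = bases_now * per_base_now      # 1800.0

print(f"Per-base spending change: {per_base_now / per_base_1990 - 1:+.0%}")  # -10%
print(f"Total spending change:    {total_now / total_1990 - 1:+.0%}")        # +80%
```

Calling that 80% increase a "cut" because the per-base figure dipped is exactly the move the quoted passage is criticizing.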

This post on how reproducibility is and should be defined and discussed

Part of the solution may lie in a separation of concerns. We need much greater clarity in the distinction between “described well enough to re-do” and “seems to be consistent in repeated tests”. We need to maintain those distinctions for each of the different levels above: replication, reproduction and generalisation. All of these in turn are separate to the question of whether a claim is true, or the universe (or our senses) reliable, or the social context and power relations that guided the process of claim making and experiment. It is at the intersections between these different issues that things get interesting: does the design of an experiment answer a general question, or is it much more specific? Is the failure to get the same result, or our expectation of how often we “should” get the same result, a question of experimental design, what is true, or the way our experience guides us to miss, or focus on, specific issues? In our search for reliable, and clearly communicated, results, have we actually tested a different question to the one that really matters?
