Dodging pitfalls when transitioning from academia to industry

February 20, 2021

In 2016 I finished a PhD in neuroscience and made the move to tech. Since then, I’ve worked as a data scientist and product manager in several companies. I’ve also spent a lot of time screening and interviewing candidates for data science roles.

I frequently receive questions about how to execute this transition successfully. This post is an attempt to summarize my advice. It draws upon help I’ve received from lots of other people who have also made this transition. In particular, I credit the Faculty fellowship with jumpstarting my career in industry, and I highly recommend checking it out if you’re in the UK.

Good reasons to transition

There are plenty of reasons to transition, but I think these two are particularly good ones.

You like working with other people who are very different to you

Labs vary, but academic science is typically a relatively solitary experience.

Even when it’s collaborative, you tend to be teaming up with people who have relatively similar outlooks and skill sets to yours. Sure, maybe you know slightly more about the anterior cingulate cortex and they are more of a basal ganglia guy, but you are basically identical in the grand scheme of things.

If you like working with people with completely different perspectives, industry might be a good fit.

Successful data scientists in industry interact with lots of people who are radically different from themselves, and the best ones relish that. If you like the idea of spending time with people whose skills and outlooks are completely different to yours (designers, sales people, software architects), then you might enjoy working in industry.

Conversely, if you find that people without PhDs are boring and exasperating, it might be better to stay in a world where everybody has one.

You have an attention span of < 1 year

I had a very productive PhD (lots of papers, lots of citations), but I didn’t do much science that I was really, really proud of. This is because my attention span was somewhat shorter than the time it takes to do deeply original, thorough neuroscience research.

Some people can sustain interest in projects over many years. If you’re not one of them, you might be better suited to an industrial or startup environment.

I think people have different intellectual wavelengths. I’m good at executing projects that take between 1 and 6 months. The scientists I really admire are good at executing projects that take between 1 and 6 years.

The ability to move fast and break things is highly prized in industry, but will cause arguments with your lab manager in academia. If you’d describe yourself as having a ‘bias for action’, perhaps coupled with a ‘shortish attention span’, you might be better suited to a role in industry.

What I worry about when hiring academics

People with PhDs are usually pretty smart, so I don’t think your focus should be on proving that to interviewers. Instead I think it’s more important to avoid some common issues, discussed below.

It shouldn’t be too hard to figure out how to work on these skills once you’ve acknowledged the need. You do have a PhD, after all.

As you go through, try and think through how you could prove to me that I shouldn’t worry about these things.

Full disclosure: I suffered from many of these flaws, and probably still do.

1. Impenetrable communication

Working in companies is more collaborative than academia, and it’s hard to collaborate if you can’t communicate.

Many academic disciplines do not reward clear communication. On the contrary, academics are frequently incentivized to indulge in verbose and obfuscatory communication due to the positive inferences that others draw from the utilization of a sophisticated lexis with convoluted syntax (see what I did there).

It’s almost impossible to be hired into a data science role if nobody can understand what you’re saying.

2. Interminable projects, often abandoned

As mentioned above, academic wavelengths are often a lot longer than those in industry. Academic projects can fail after many years of investment, and that’s fine. Furthermore, academics sometimes deride the idea that it might be worth trying to plan projects, estimate timings, or quantify uncertainty (“it’s science, dumb dumb, it’s inherently uncertain”).

Figure: a completely unsubstantiated model of project incompleteness. In my experience, the longer something goes without being finished after a brief startup window, the more likely it is to drag on forever.

It can be hard to say whether a project was executed swiftly or not when evaluating candidates from a different field. But you can still learn something by whether somebody finished stuff.

Even failures in science should result in some useful output (which typically involves communication, see above). If somebody did a bunch of experiments that didn’t work, but they pulled the plug, wrote something about it, and moved on, that’s great.

If they took 4 years to realize that something wasn’t going to work, I worry that in the cut-and-thrust of industry they’re going to struggle to allocate their time effectively and generate outputs that matter to the rest of the organization.

3. A tendency to find complex solutions to simple problems

Squiggles can be fun, but if you’re the person who’s always finding squiggly solutions to straight line problems, you’ll end up exasperating your colleagues and wasting their time.

One of the first major papers of my PhD relied upon a “Hierarchical Bayesian Model of Learning Under Uncertainty”. Nowadays, I tend to avoid the words “Bayesian” and “hierarchical”, and I really like the words “linear” and “regression”.

You can publish a paper by solving an easy problem with complex methods. You cannot make money by doing this. People with fancy PhDs sometimes overlook simple solutions, perhaps out of cognitive dissonance (“I’m really smart, and if I’m working on this problem, it must be a hard problem and therefore have a complex solution”).

In industry, you and your organization pay the cost of unnecessary complexity many times over. Here are a few ways:

it takes you longer to do the work
it’s harder to verify that the work is correct
it’s more difficult to communicate the work to other people
it’s more difficult for people to work with your outputs (e.g. 10GB neural network vs 100MB random forest)
you consume more computational resources

And so on.
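To make the “linear” and “regression” preference concrete, here is a minimal sketch of a straight-line baseline in plain Python. The function names (fit_line, r_squared) and the toy data are my own assumptions for illustration, not something from a particular project. The idea is simply that a baseline like this is cheap to write, verify, and explain, so it is worth checking before reaching for anything squigglier.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Hypothetical helper for illustration: a deliberately simple baseline.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    if ss_tot == 0:
        # Constant ys: the fit is perfect only if there is no residual.
        return 1.0 if ss_res == 0 else 0.0
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy data (made up): y is exactly 2x + 1.
    xs = [1, 2, 3, 4]
    ys = [3, 5, 7, 9]
    slope, intercept = fit_line(xs, ys)
    print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

If a baseline like this already explains most of the variance, the burden of proof is on the more complex model to justify its extra cost.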

4. Motivated by curiosity or aesthetics rather than outcomes

A strong aesthetic drive and a burning curiosity are defining characteristics of a good scientist. But in an industry setting, indulging them too much will often lead to very long projects (point 2) full of unnecessarily complex solutions (point 3).

Often this manifests itself as a reluctance to do the dirty work that needs to be done to get a project over the line (“I’m afraid I’m not going to clean up 10,000 rows of data, but I will happily convert this autoencoder to a variational autoencoder instead”). This can be exasperating and jeopardize projects, but it’s also unfair on the colleagues who end up doing all the grunt work that the person in the ivory tower declines to do.

My summary of lessons learned applying data science. pic.twitter.com/5xLZTs2C9j
— Vincent D. Warmerdam (@fishnets88) October 22, 2020

Vincent has lots of other good content at koaning.io

5. Bad coding and collaboration practices

It’s no secret that academics tend to write bad code, and it’s not their fault, because it’s really hard to write good code on your own. The incentives are not (yet) well arranged for highly reproducible open science (but lots of good people are working on it).

Many students learn to code as part of their PhD, assembling a grab bag of old code snippets and wonky practices from post-docs and professors who themselves never really took the time to learn any software engineering.

This can create issues in industrial contexts. Modern software practices rely upon a good segmentation of work into small chunks which can be distributed throughout the team, completed reasonably quickly, and pieced back together.

This requires both some software engineering chops and a willingness to submit your work to scrutiny by others (often earlier than you’d like). The discipline to break down ambitious things into manageable steps can be hard to develop.

6. Intolerance to changing demands

The objectives in academia have been the same for many years: publish papers, get tenure. Although academic science has lots of uncertainties, there’s rarely ambiguity about what constitutes a good outcome. A Nature paper is a Nature paper, and it’s bloody fantastic.

Figure: the target shifts frequently in industrial settings (especially in startups).

Unfortunately, the same is not true in industry. The ultimate goal (‘make money’) is stable, but it’s too far removed from your day to day to be a faithful guiding star. Organizations’ opinions on how to achieve that goal, and thus on what you should be spending your time doing, often change.

This can be part of the fun, but if you’re used to very long-lived projects it can be incredibly stressful and demoralizing. If you’re intolerant of these changing demands, you will frequently be upset and angry at your superiors. As a technical person without much business exposure, it can be hard to understand why yesterday’s hot project is today’s compost. This can lead to cynicism and a loss of faith in the rest of the organization.

How to prove you’ll be a good fit in industry

I think that side projects are overwhelmingly the best way to prove your competence for industrial roles.

A good open source side project can demonstrate:

Communication and collaborative ability
The ability to finish things
Solid coding practices
The willingness to do stuff you don’t like doing (e.g. DevOps)
The ability to spot a problem and develop a solution

Bonus points for:

Writing about it well
Getting some users or contributors you don’t know
Getting any form of publicity

Depending on your temperament, you might not find that starting your own side project is the best fit. Contributing to existing open source software is also a big plus, although it doesn’t demonstrate quite as much initiative and “finishing instinct” as executing your own.

Teaming up with somebody with complementary skills (a designer or software engineer who needs somebody with data chops) is also fantastic evidence of the interdisciplinary collaboration you’ll need to succeed in industry.

Very concretely, I would encourage you to pursue projects which:

Are in Python or JavaScript (and not just in notebooks)
Involve deploying something, e.g. to Heroku or AWS
Rely upon git, with well-formatted commits, pull requests, and code reviews
Are well tested, with tests that run automatically
Rely upon a database
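As a rough illustration of what “well tested” and “rely upon a database” can look like at the smallest possible scale, here is a sketch using only Python’s standard library sqlite3 module. The table name, function names, and data are all hypothetical, chosen for the example. A real project would likely use a hosted database and a test runner such as pytest wired into continuous integration, which is an assumption on my part rather than a requirement.

```python
import sqlite3


def create_schema(conn):
    """Create the (hypothetical) measurements table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements ("
        "id INTEGER PRIMARY KEY, "
        "subject TEXT NOT NULL, "
        "value REAL NOT NULL)"
    )


def add_measurement(conn, subject, value):
    """Insert one row; the 'with' block commits on success, rolls back on error."""
    with conn:
        conn.execute(
            "INSERT INTO measurements (subject, value) VALUES (?, ?)",
            (subject, value),
        )


def mean_by_subject(conn):
    """Return a dict mapping each subject to the mean of its values."""
    rows = conn.execute(
        "SELECT subject, AVG(value) FROM measurements "
        "GROUP BY subject ORDER BY subject"
    )
    return dict(rows.fetchall())


def test_mean_by_subject():
    """A small automated test against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        create_schema(conn)
        add_measurement(conn, "a", 1.0)
        add_measurement(conn, "a", 3.0)
        add_measurement(conn, "b", 5.0)
        assert mean_by_subject(conn) == {"a": 2.0, "b": 5.0}
    finally:
        conn.close()


if __name__ == "__main__":
    test_mean_by_subject()
    print("ok")
```

The point is less the code itself than the habits it shows: logic lives in importable functions rather than notebook cells, and a test can be run automatically on every change.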

Wrapping up

Different people will have different struggles as they make this transition. It’s hard to be prescriptive about what steps to take to improve your chances of landing a role you love. Hopefully trying to scrutinize yourself from the perspective of an interviewer will help you understand which rough edges you need to polish to maximize your chances of success.

If you found this post thought-provoking, you might also enjoy Don’t Call Yourself a Programmer, a wonderful “career README.txt” encouraging engineers to think about their role more holistically.

If you have specific questions or would like feedback on your CV, application materials, or GitHub profile, please feel free to contact me @archydeb.
