Why should you care about this stuff as a social scientist?

I still think I've done a poor job of explaining why you, as a social scientist, should care about learning this stuff, so here is a quick article with my thoughts on why.

  • Popularity; Python (and R) are quickly becoming some of the most popular programming languages. This means that anyone in technology you are working with is likely to know at least one of them, but very unlikely to know SPSS or SAS.

python_popularity

  • Open Source; open source is becoming the way of life for everything from development to data science. Software like SPSS and SAS is quickly falling behind because the functions and algorithms it can offer will always lag open source, in most cases by 5-10 years. There are likely a few reasons for this. Some that readily come to mind: bureaucracy (a new feature has to find a place in line for the next production cycle); resources (IBM and SAS are going to have a hard time competing with the tens of thousands of people contributing to open-source software); and innovation (for the most part, programmers at these large companies aren't the ones developing XGBoost, CatBoost, CNNs, RNNs, etc.). The ones who are need a place to build that architecture as they create it, and they almost always choose open source for this.

  • Reproducibility; for both practitioners and academics, reproducible research and work is quickly becoming a necessity. Just look at the popularity of sites like arXiv and GitHub. The expectation is becoming that you show your work. Just as a study's methodology and participant population are expected to be meticulously documented in a peer-reviewed journal article, it is becoming more common to expect the same of one's analysis. Proprietary software makes this difficult, because replicating the analysis requires a user license even if you were given the syntax, whereas with open source anyone can reproduce it if the requirements are well documented.

  • Scalability; with the invasion of cloud computing, the expectation is that code can scale, sometimes across multiple computers, and that it can run as soon as it is needed. Instead of analytics reports being completed on a timeline, the expectation is that they are created automatically as new data becomes available. SPSS and SAS can't offer this.

  • Cost; licenses for SPSS and SAS may seem expensive for individuals, but that cost is nothing compared to what corporate licenses can cost. I've worked for several companies where you couldn't access SPSS because too many people were taking up all of the licenses.

    • I've worked for a Fortune 10 company that almost stopped paying for licenses because of all of the alternatives (Python, R, Alteryx, SAS, Scala, Spark, etc.). You definitely don't want to be caught unable to do your job because your company stopped paying for your software.
  • Community; I'm not sure why, but the community around open-source languages is so helpful and collaborative. They have tons of meetups, groups, even their own conferences (e.g. rstudio::conf, PyData).
    • You have a question about Python or R? You can basically just Google it, because your first couple of pages of results will almost certainly be from Stack Overflow, a community forum where people post programming questions and other users provide solutions that can be voted on (think Reddit for actual relevant questions/topics).
    • You have a question about SPSS? That's what the Syntax Guide is for, right? If you are great with your Ctrl+F skills you can probably find your answer after a few hours...errr days. After all, it's only 2,500 pages!!
  • Keeping up with technology; it's just keeping up with the Joneses at this point. Everyone outside of the social sciences is learning to program in Python or R, so it will soon become an expectation when hiring. Even my previous boss, who's a wizard at SPSS and is likely only a few years away from retirement, has spent time dabbling in Python and R (he heavily prefers R, and I think that's blasphemy!! But at least he's learning one).

And finally and most importantly, because you sound so smart and cool when you talk about it!!

python_meme

Ok, ok, this last one might be in my own head. The image below is more along the lines of the way my wife looks at me when I talk about it :)

rolling_eyes

These are the reasons I feel are most important. Feel free to add more in the comments below, and I can edit this article to include them.
