GitHub monoculture

Saturday 3 May 2014

This is more than ten years old. Be careful.

I continue to notice an unsettling trend: the rise of the GitHub monoculture. More and more, people seem to believe that GitHub is the center of the programming universe.

Don’t get me wrong, I love GitHub. It succeeded at capturing and promoting the social aspect of development better than any other site. And git, despite its flaws, is a great version control system.

And just to be clear, I am not talking about the recent turmoil about GitHub’s internal culture. That’s a problem, but not the one I’m talking about.

Someone said to me, “I couldn’t find coverage.py on GitHub.” Right, because it’s hosted on Bitbucket. When a developer thinks, “I want to find the source for package XYZ,” why do they go to the GitHub search bar instead of Google? Do people believe so firmly that GitHub is the only place for code that it has supplanted Google as the way to find things?

(Yes, Google has a monopoly on search. But searching with Google today does not lock me in to continuing to search with Google tomorrow. When a new search engine appears, I can switch with no downside whatsoever.)

Another example: I’m contributing a chapter to the 500 lines book (irony: the link is to GitHub). Here in the README, to summarize authors, we are asked to provide a GitHub username and a Twitter handle. I suggested that a homepage URL is a more powerful and flexible way for authors to invite the curious to learn more about them. This suggestion was readily adopted (in a pending pull request), but the fact that the first thing to come to mind was GitHub+Twitter is another sign of people’s mindset that these sites are the only places, not just some places.

Don’t get me started on the irony of shops whose workflow is interrupted when GitHub is down. Git is a distributed version control system, right?

Some people go so far as to say, as Brandon Weiss has, GitHub is your resume. I would hope they do not mean it literally, but instead as a shorthand for, “your public code will be more useful to potential employers than your list of previous jobs.” But reading Brandon’s post, he means it literally, going so far as to recommend that you carefully garden your public repos to be sure that only the best work is visible. So much for collaboration.

There’s even a site that will read information from GitHub and produce a GitHub resume for you; here’s mine. It’s cute, but does it really tell you about me? No.

There is power in everyone using the same tools. GitHub succeeds because it makes it simple for code to flow from developer to developer, and for people to find each other and work together. Still, other tools do some things better. Gerrit is a better code review workflow. Mercurial is easier for people to get started with.

GitHub has done a good job providing an API that makes it possible for other tools to integrate with them. But if Travis only works with GitHub, that just reinforces the monoculture. Eventually someone will have a better idea than GitHub, or even git. But the more everyone believes that GitHub is the only game in town, the higher the barrier will be to adopting the next great idea.

I love git and GitHub, but they should be a choice, not the only choice.

Comments

[gravatar]
I generally lean towards the "Free software" rather than the "open source" side of things and that's another criticism I have of github. An open source project's data (especially the non-git parts like issues, wiki and pull request discussion threads) is locked up inside a proprietary system, and I don't like that. Benjamin Mako Hill wrote a nice post about this factor in the ecosystem which I quite resonate with. http://mako.cc/copyrighteous/free-software-needs-free-tools

In general, I'm against the monoculture. Just that I have an extra reason to be against it.
[gravatar]
I'm with you.

A newer open source contributor spoke about her first year of contributions last year at Open Source Bridge and said that when she is trying to find out whether a piece of software is open source, or trying to find open source projects on a particular topic, she goes to GitHub and does a search. I think this reality is one to accommodate by, for instance, having a placeholder GitHub presence that perhaps mirrors the canonical Git repo and turning off pull requests (or syncing pull requests into Gerrit or Bitbucket requests). That is what Wikimedia does.

But it's just not a good idea to assume that all of the open source community's development is on GitHub. Wikimedia uses a Gerrit installation (and may move to a Phabricator instance soon), and of course there are a bunch of projects on BitBucket, Gitlab, Gitorious, and what have you. I agree with you - the "GitHub is the only place" assumption is harmful.
[gravatar]
And, um, some of us still use SourceForge. GET OFF MY LAWN!
[gravatar]
Completely agreed. Well put.

(As a regular user of GitHub for both personal and professional projects.)
[gravatar]
I think it has more to do with brand power than anything else. GitHub is similar to how some people perceive the iPhone - "Good brand, good enough for me and I don't care about anything else (even if there might be better things out there)".
[gravatar]
I also find this 'github is your resume' idea to be absurd.

Imagine a developer in their early 40s, who had children before github even existed. They're busy with their work, their family, and they really don't have time for side projects. Nor should they; spending all your spare time programming if you have a family and a job makes you an unbalanced individual.

Sure, such people may do the occasional small thing to learn new technologies, but you know what... when I do that, it's usually only useful to me, so I don't put it on github. There are squillions of utterly useless 'projects' on github.

The whole thing reeks of the twenty-something childless tunnel vision that badly afflicts the tech sector.

As for twitter, aside from being able to follow food trucks, I seriously don't get it.
[gravatar]
As a user I wouldn't mind if there were a single go-to destination for finding code. Of course, code discovery through a google search is an option, but it's easier from a user perspective to find everything in a single place. It does indeed add value.

But I do agree with your idea that it could imply business monopolization. As Noufal pointed out, Free Software Needs Free Tools.
[gravatar]
Even worse,

another github based cv is available: http://osrc.dfm.io/ with even more details than you ever expected.
[gravatar]
Do you also recommend people look for you on MySpace or maybe Friendster? Do you recommend web searches on AltaVista or Lycos? Should I go back to AOL as my internet service and email provider? Maybe you'd prefer we dump the internet altogether in favor of disjointed dialup BBS services?

What a silly article.
[gravatar]
Google : search :: Github : social distributed version control
Facebook : social network :: Github : social distributed version control
Amazon : ebooks :: Github : social distributed version control
WordPress : blogging :: Github : social distributed version control
Excel : spreadsheets :: Github : social distributed version control

When the Ada Initiative left their partnership with Github, I decided to migrate my projects as well. I already had an account on Bitbucket; I moved my private repos there to save seven bucks a month long ago. And I already had an account on Gitorious; I forked one of Zed Shaw's projects a long time ago and it's there. So I created a GitLab account and started doing a feature comparison.

First of all, nearly all of my projects are one-person projects and most of them are just free cloud backup. Moving those to any of the Github alternatives is trivial. I settled on GitLab because it's open source (Rails) and the GitLove project is there. I've moved most of the small projects.

A few of the larger projects were next. There were about four of them with a significant number of watchers but really only one is active. The others were functionally superseded by my main project, CompJournoStick, so I made the hard decision to abandon/archive them and break all the links they've built up over the past five years.

That leaves CompJournoStick and my Octopress blog on Github Pages at znmeb.github.io. The blog is inactive; I use it mostly for CompJournoStick release announcements. I could probably port it to a wiki on one of the other hosting sites, or go back to a self-hosted WordPress blog. I tried copying it to Bitbucket's "almost-Github-pages" scheme and there were too many gotchas. For now, I'm leaving it on Github.

I want to move CompJournoStick but:

1. Broken links - whatever search positioning, social media links, etc. I have in the journalism community go out the window.
2. Broken trust - I look like a flake. Suddenly anyone who thought about engaging me on the project has a reason to avoid it. I haven't actively sought contributors, but I'd like to at least have the option. Saying "if you want to work with me on CompJournoStick, you have to move to Gitlab" is not an option.
3. I need an account on Github anyway to contribute to all of the other Github projects I use.

New projects will not be on Github. I'll most likely construct them on GitLab (or GitLove, if it gets moving) with backups on Bitbucket. I don't know about the blog; blogging is a chore (as is maintaining a LinkedIn profile). I'll probably drop the Gitorious account; it's just an addition to my online attack surface at this point.
[gravatar]
The real irony, Github is centralizing a distributed VCS.

The problem, Github is good at what it does and it will be hard to replace it with something better. Like Github replaced Sourceforge and Sourceforge replaced all those single CVS repositories.

But yes, Github has to die.
[gravatar]
I've contributed to (in order of approximate descending volume):

1. GitHub
2. Bitbucket
3. Gerrit (OpenStack)
4. Launchpad
5. Gitorious
6. Google Code
7. Sourceforge
8. Patches via email

This is probably rough order of preference too. Pretty subjective of course. I'll Google (ahem, web search, Google is another monoculture :-)) and contribute via whatever the maintainer has. I am probably somewhat more likely to do it if it's one that I'm comfortable with, though I don't think about it consciously. I suspect that GitHub attracts the most contributors, because it's well-known and easy. There are lots of choices though.
[gravatar]
Thanks all for the comments. A few people have pointed out that I mention Google for search, and that it is an even larger monoculture. True, Google has a larger corner on the search market than GitHub has on their market. But doing a Google search today doesn't make it hard for me to use a different search engine tomorrow. GitHub is the host for repositories, and a tie-in point for tools. If another search engine appeared tomorrow that could beat Google, we'd all switch to it, no barrier to exit or entry.
[gravatar]
I observed a similar weakness back in 2010: http://kev009.com/wp/2010/10/stop-distributed-version-control-diaspora/

I was hoping for something like Diaspora (https://joindiaspora.com/) so we could use our choice of repo or self-hosting but still interact with a global community.
[gravatar]
When a developer thinks, "I want to find the source for package XYZ," why do they go to the GitHub search bar instead of Google?

But wait a minute.. Aren't there other search engines? What made you use Google as if "Googling" is search? Because they won search just like Github won version control and code hosting.
[gravatar]
I'm an old git (pun intended) who got into using VCS with CVS. Until recently I used Subversion for most of my projects quite happily, because usually I'm the only developer, even though it's open source, and that worked just fine for me.

A while ago I started trying to contribute to some projects on github, and in my experience it is far from making it easier to contribute to such projects. There are just so many ways in which a git(-hub) workflow can be organized, and very few projects take the time to explain, step by step, what you need to do to make your first pull request. They just assume, "hey, we're on github so everybody should know how this works, right?". Wrong.

Apart from that, I've seen so many internet services come and go in the last twenty years that I've grown to be _very_ wary of relying on third parties to host my stuff, be it email, calendars, or my code.
[gravatar]
I just use self-hosted git repos and gitweb. Github creeps me out-- I think of them as the Facebook of code, in the negative sense. I'm creeped by Google too, but that's less bad, I just don't have any personal Google accounts. I've worked for some companies that used gmail as their internal email, so they created corporate gmail accounts for my work email, but those were the companies' accounts rather than mine. And if I had gmail accounts at company X and company Y, those accounts were unrelated to each other. It wouldn't surprise me if Google does some sneaky crap that lets them infer that I worked at both companies, but at least I didn't have to tell them up front.

Github is much worse: there are no company accounts, you're required to create and use a personal account which the companies then whitelist for access to their private repos. The Github TOS requires your real name and limits you to a single personal account. In other words they track your private work associations, since your personal account gets connected with the repos of companies X and Y even if you never say anything public about that. This strikes me as extremely invasive: if you're a consultant and your clients use Github, then Github now knows your client list. There is no valid business justification I can see for that disclosure: I'd have no problem at all having each client buy me a separate, paid account under the clients' names, similar to the gmail accounts.

Mostly because of the above privacy issue but also because of the monoculture, I've been boycotting Github (along with Google, Facebook, etc.) to the extent that I can.
[gravatar]
Speaking of the Terms of Service, you must be 13 years old or older to use GitHub, which excludes bright kids from contributing to our communities. And I also dislike the requirement to use your wallet name. (I outlined these and other problems when Wikimedia was considering GitHub vs. Gerrit a few years back.)
[gravatar]
Scrooge, so you're telling people we should use something because everyone else is doing it? What a silly comment hur hurr hurrrr.

While we haven't contributed to github, we've used plenty of ruby gems from it, and our experiences have been poor. With the influx of new devs learning that github is where you throw the code you think will help everyone out, every dev and his mother throws their poorly-built junky code up there. We get new programmers all of the time including 20 unnecessary, buggy gems on our projects, and even the gems we actually needed are buggy and slow, so we end up having to build the functionality from scratch anyway or switch them out for another ridiculously-named gem and rebuilding the interface. It ends up actually costing us significant amounts of time supporting these crappy gems.

So you can understand why it makes me irate when employers begin considering github involvement a heavy factor in determining if a developer is worthy of a high position in their Ruby shop. The best Ruby devs I know aren't on Github, started with languages such as C and Perl, and don't have time to jump into every hip community that pops up every few years. Most of our github-fanclub devs haven't touched anything other than Ruby, don't know how to use other repositories, and keep on including gems to do work for them that we end up having to remove because the gem sucks.
[gravatar]
Indeed. I've grown a preference for Mercurial over Git (GASP!) and really like the fact that Bitbucket allows me to choose either format. Bitbucket is behind Github in some features (repo code search, please!) but is a great tool regardless. The trend towards the "Github or nothing" mentality baffles me a bit. At least bitbucket is still somewhat popular in the Python community. When I was working in Ruby land I never saw bitbucket mentioned.

Thanks for the article which eloquently addresses this trend.

Can we start a "Pro Diversity - Choose Bitbucket" campaign? ;)
[gravatar]
Ah heck, why not...let's give it a try:

https://twitter.com/RandySyringPro/status/463014829406838784

#ProDiversityChooseBitbucket
[gravatar]
To be fair, Travis CI wants to support other sites such as BitBucket, but the current code is currently *very* heavily structured around Github. It's going to require some massive refactoring to add support for another site
[gravatar]
So, why isn't coverage on GitHub?

Also, for what it's worth, I am about 100x less likely to contribute to a project that isn't on github. I think you're only hurting yourself by going against the flow.
[gravatar]
Antonio Dourado 7:36 PM on 4 May 2014
"But doing a Google search today doesn't make it hard for me to use a different search engine tomorrow."

Depends on how you use Google, really. If you don't have an account and just only use the search functionality, perhaps this would be true.

But I have an account, I use Gmail, GDrive, Chrome (until some weeks ago, at least), I have a Blogspot blog, and I use my Google account to log in to several other websites. The amount of context and targeting that goes into every search I do is certainly very high, and, although some could argue that this is a privacy issue (and I might agree with them), the fact is that I get much better search results with Google than with any other search engine that doesn't "know me" so well.

So, yes, I could change search engines tomorrow, but it wouldn't be without a price.
[gravatar]
Drone.io is similar to Travis CI. It's free for public repos but also supports Bitbucket and Google Code, which is pretty handy. It has a much cleaner interface and is getting improvements all the time, especially for deployments. Just FYI: https://drone.io/

I use it. Also, I prefer Mercurial to Git, who needs staging?
[gravatar]
Other people have already spoken about the GitHub brand, which I believe is the crux of the monoculture with which you take issue. BitBucket is technically superior in some features but it feels like the team behind BitBucket has failed to advertise its benefits enough to draw people away from GitHub. I agree with you that GitHub should not be the only choice; but it is the default for so many developers these days, and it is worth asking why, and why more developers aren't using BitBucket, SourceForge, et cetera. Maybe because getting an account up and running on GitHub feels easier? Unfortunately I have no serious answers to my own question, but GitHub did not gain its popularity without some merit (just like Google for search engines), and so it would be interesting to discuss what merits GitHub brings to the table that other alternatives lack.
[gravatar]
@Aaron Hill: You said, "To be fair, Travis CI wants to support other sites such as BitBucket, but the current code is currently *very* heavily structured around Github." I think that translates to, "Travis bought into the GitHub monoculture very early, and didn't plan for diversity."
[gravatar]
@Aaron Meurer: coverage.py is hosted on Bitbucket because when I needed to find it a home, git was not a clear choice for my home OS, which at the time was Windows. Now coverage.py stays on Bitbucket because the issues are there. BTW: there is a mirror repo on GitHub.

You say you are 100x less likely to contribute to a project if it isn't on GitHub, but you don't say why. Is it really so hard to use other sites? Isn't most of the work the actual coding, debugging, etc that happens on your own machine?
[gravatar]
I think you are completely underestimating the entry barrier cost of contribution to open source. When I talk about contributing to open source, I'm talking about something that I do completely voluntarily. Any entry barrier cost, even ones that appear small, can negatively affect the decision to do it to a degree that it doesn't happen.

With GitHub, I know the whole process of contributing code. I clone the repo, fork it, make some changes, push it up, and make a pull request. I've done this thousands of times. The entry barrier is low for a few reasons. First, I already know all the tools (git, GitHub, etc.). A project in hg is extremely hard for me to contribute to, because I do not know it. Second, these tools are actually powerful enough to make certain things possible. git's branching model and easy merging make contributing easier than submitting literal patches. GitHub makes both opening and merging a pull request a single click. In fact, GitHub makes it possible to contribute to a project without ever leaving the browser. It's designed for people who don't want to use git, but I've done it myself. I am definitely savvy enough to clone a repo, fire up emacs, fix some typo, and push up a branch, and do it all pretty quickly, but even this is way slower than just correcting the typo in GitHub and pressing "fork and pull request".

We recently moved all SymPy issues from Google Code to GitHub. This was a huge downgrade in terms of issue features. Google Code lets you do nice things like automatically apply labels, and has very powerful searching (in GitHub, I can't even figure out how to do a negative label search). But this was a downgrade for a few (basically me and maybe a couple other core devs), and an upgrade for the community (all the people who want to report issues).

If you hate drive-by contributions, or if you want to develop your code in the cathedral, then go ahead and pick whatever works the best for you, and you alone. But if you're like me and you love drive-by contributions, and you think that the bazaar is a much better place to develop code, then GitHub is really the only place that you can do it, because GitHub has the one thing that no other site has, which is momentum.
[gravatar]
Let's do a bit of honest analysis here. When people tell you, 'I couldn't find coverage.py on GitHub', what that means is, 'Gee, it would have been really convenient if coverage.py was on GitHub, because that's what I'm familiar with; now I'll have to do some searching to figure out which website has the best mirror for my needs.'

> ... I suggested that a homepage URL is a more powerful and flexible way for authors to invite the curious to learn more about them. This suggestion was readily adopted (in a pending pull request), but the fact that the first thing to mind was GitHub+Twitter is another sign of people's mindset that these sites are the only places, not just some places.

So let's see, the fact that the 500 lines book project is hosted on GitHub, and the contributors are asked for their GitHub usernames as the first point of contact, strikes you as strange? And to be fair, everyone did readily accept the suggestion to use a URL instead, but guess what, my thought is that most everyone is going to just submit their GitHub profile URL. So at best you've added a few more characters to what everyone has to type out.

> Don't get me started on the irony of shops whose workflow is interrupted when GitHub is down. Git is a distributed version control system, right?

Yes, _git_ is a DVCS, but _GitHub_ is a full-featured code hosting, issue tracking, release and deployment solution. So let's not set up a straw man here, please.

> ... But reading Brandon's post, he means it literally, going so far as to recommend that you carefully garden your public repos to be sure that only the best work is visible. So much for collaboration.

Let's face it, whenever a system gives rise to any kind of incentive whatsoever, there will always be some trying to game that system. So, some people will think that GitHub seems like a good proxy for coding mojo and then they'll take that to the logical extreme. Doesn't mean that's actually true. See e.g. GitHub's famous 'meritocracy rug' tweet.

> ... Still, other tools do some things better....

Yes, some tools are better at some tasks and others are better at others. With this kind of argument, no one ever gets anywhere :-)

> ... But if Travis only works with GitHub, that just reinforces the monoculture.

Travis is hardly the only game in town. I'm no CI expert and even I know about Jenkins and CruiseControl. The kind of monoculture that Travis creates by targeting GitHub as a code source is simply a trade-off to get simplicity in exchange for flexibility, and is standard practice in software engineering.

> ... But the more everyone believes that GitHub is the only game in town, the higher the barrier will be to adopting the next great idea.

This is the old vendor lock-in bugaboo. Let's break it down: GitHub locks you in with its pull requests, its commit- and line-level comments, issues, social features etc. That's again a trade-off that you need to choose to make. E.g., Linus Torvalds explicitly doesn't use the 'lock-in' features of GH despite mirroring some of his projects there. Others embrace all the lock-in. The fact is, there will be lock-in to some extent no matter what tool you use, GitHub or no GitHub. Do you use Bugzilla today and decide to migrate to Phabricator tomorrow? Well, you'll need to massage all your issue data out of BZ and import it somehow.

With git, there's no question of lock-in--even at the most primitive level, you can always get all your commits out as plain text patches. Generally, anyone worried about lock-in with a VCS doesn't understand what a VCS _is._
[gravatar]
@Yawar: on the 500lines book README, yes part of the reason for asking for a GitHub username is because the book is being built on GitHub. As for your prediction, "my thought is that most everyone is going to just submit their GitHub profile URL," I guess we will find out. If you have no more to say about yourself than, "here is a pile of code," then please feel free to use your GitHub profile as your homepage.
[gravatar]
There is a natural tendency for technology usage to coalesce on a de facto standard, but this is not necessarily a sign of something sinister, or even of something counterproductive. It usually means that the leader in a category is now good enough such that it is hard to come up with a significantly better alternative. Git is exactly that kind of "good enough" solution.

As for Github (as distinct from git) it is like any major city in the real world. People from rural areas migrate to cities because... other people migrate to cities, and so on. Network effects are intrinsically beneficial. This is not sinister. It's just topology and feedback.

I use bitbucket and github interchangeably (unless I want a private repository for some reason: bitbucket!). I also use codeplex occasionally, and at work (sadly) I have to use TFS. I seamlessly mirror change sets between TFS and git repositories for convenient offline working.

If you define a VCS as something that uses a working directory in the file system, and transactional commits with comment+timestamp+username, then that forms a very open protocol. You can very easily shunt changes between such systems by getting them to look at the same working directory. There is really no lock-in at all.
[gravatar]
"Also, for what it's worth, I am about 100x less likely to contribute to a project that isn't on github. I think you're only hurting yourself by going against the flow."

Who cares. You probably don't know any "real" languages anyway.
[gravatar]
What is that supposed to mean?
[gravatar]
@Ned: You're throwing context out the window here. On a project that's hosted and managed in GitHub, listing everyone's GH profiles as their primary contact info is very natural. It lets people click through and see everyone's contributions, interests, work. But it's not just limited to that: the GH profile lists your email address and website, if you choose to reveal those. So again, _for a project managed in GitHub,_ a GH profile is IMO an excellent contact method.
[gravatar]
On the idea that "GitHub is your resume", here's my response, strictly from a pragmatic job hunting point of view: Your GitHub account is not your portfolio but it's a start. Short version: "It’s a great idea, but it’s only a start. Your portfolio should be more curated than that to be effective."
[gravatar]
@Aaron Meurer: You say that not having projects on GitHub will mean far fewer contributors. Perhaps - but the work involved in meaningful contributions usually outweighs the learning curve for e.g. another site or even another DVCS. I have numerous projects on both GitHub and BitBucket, and use BitBucket more because they allow you to have free private repositories. I find so little difference in the workflows on the two sites that I would wonder about the value of potential contributions from people who find minor differences in workflow a barrier to such contribution. IMO the quality of contributions and the commitment of contributors is more important than just a larger number of potential contributors, which is all that the GitHub monoculture gives you.
[gravatar]
Vinay Sajip, in my experience, evidence does not bear out your suggestion that a good, consistent user experience for contributors lowers the average quality of contributions.
[gravatar]
Even the time investment of learning one VCS is frustrating. I want to be spending my time producing code. If only GitHub offered free private storage for non-commercial projects it would be an obvious YES from me.
[gravatar]
This is a comment a year late but I feel someone should point this out. Vinay wasn't saying that a consistent user experience lowered the quality. Vinay was saying that knowing the user experience was immaterial was an indicator of higher quality. Those statements are not equivalent.


Good article. For a community that generally consists of a population fluent in multiple programming languages (and in some cases multiple natural languages) this concept of consistent user experience seems contrived and useless. The experience only needs to fit the purpose. In some cases that's Github. In others it's Bitbucket. Sometimes it isn't even git based. Some people and projects prefer Mercurial. I find those people insane but I won't not contribute because their filing system doesn't fit my personal sensibilities. The quality of the project itself is more important.
[gravatar]
I do a lot of development using mailing lists, patchwork and git repos hosted by the project. As an example see patchwork.alpinelinux.org. I even prefer the patchwork workflow, but for publicity github works well. I get serious (non-automatic) job offers because of my profile and our company gets business via github.
[gravatar]
(Very) belatedly pounding the table here in hearty agreement.

One encounters this all the time. Heroku's tooling comes to mind.
[gravatar]
Fun fact: the Vulkan specification tracks authorship/ownership over each extension not by email address, but by GitHub username. What a short-sighted and stupid choice. Thankfully they let me use my email address when merging my extension.
