Via Wired
 
-----
 
 
 
The ancient Library of Alexandria may have been the largest collection of human knowledge in its time,
 and scholars still mourn its destruction. The risk of so devastating a 
loss diminished somewhat with the advent of the printing press and 
further still with the rise of the Internet. Yet centralized 
repositories of specialized information remain, as does the threat of a 
catastrophic loss.
 
Take GitHub, for example.
 
GitHub has in recent years become the world’s biggest collection of open source software.
 That’s made it an invaluable education and business resource. Beyond 
providing installers for countless applications, GitHub hosts the source
 code for millions of projects, meaning anyone can read the code used to
 create those applications. And because GitHub also archives past 
versions of source code, it’s possible to follow the development of a 
particular piece of software and see how it all came together. That’s 
made it an irreplaceable teaching tool.
 
The odds of GitHub meeting a fate similar to that of the Library of Alexandria are slim. Indeed, rumor has it that
 GitHub soon will see a new round of funding that will place the 
company’s value at $2 billion. That should ensure, financially at least,
 that GitHub will stay standing.
 
But GitHub’s pending emergence as Silicon Valley’s latest unicorn holds
 a certain irony. The ideals of open source software center on freedom, 
sharing, and collective benefit—the polar opposite of venture 
capitalists seeking a multibillion-dollar exit. Whatever its stated 
principles, GitHub is under immense pressure to be more than just a 
sustainable business. When profit motives and community ideals clash, 
especially in the software world, the end result isn’t always pretty.
 
Sourceforge: A Cautionary Tale
 
Sourceforge is another popular hub for open source software, one that predates GitHub by nearly a decade. Before GitHub grew so popular, it was the place to find open source code.
 
There are many reasons for GitHub’s ascendance, but Sourceforge 
hasn’t helped its own cause. In the years since career services outfit DHI Holdings acquired
 it in 2012, users have lamented the spread of third-party ads that 
masquerade as download buttons, tricking users into downloading 
malicious software. Sourceforge has tools that enable users to report 
misleading ads, but the problem has persisted. That’s part of why the 
team behind GIMP, a popular open source alternative to Adobe Photoshop, quit hosting its software on Sourceforge in 2013.
 
Instead of trying to make nice, Sourceforge stirred up more hostility earlier this month when it declared
 the GIMP project “abandoned” and began hosting “mirrors” of its 
installer files without permission. Compounding the problem, Sourceforge
 bundled installers with third-party software some have called adware or
 malware. That prompted other projects, including the popular media 
player VLC, the code editor Notepad++, and WINE, a tool for running Windows apps on Linux and OS X, to abandon ship.
 
It’s hard to say how many projects have truly fled Sourceforge 
because of the site’s tendency to “mirror” certain projects. If you 
don’t count “forks” in GitHub—copies of projects developers use to make 
their own tweaks to the code before submitting them to the main 
project—Sourceforge may still host nearly as many projects as GitHub, 
says Bill Weinberg of Black Duck Software, which tracks and analyzes 
open source software.
 
But the damage to Sourceforge’s reputation may already have been 
done. Gaurav Kuchhal, managing director of the division of DHI Holdings 
that handles Sourceforge, says the company stopped its mirroring program
 and will only bundle installers with projects whose 
originators explicitly opt in for such add-ons. But misleading 
“download” ads likely will continue to be a game of whack-a-mole as long
 as Sourceforge keeps running third-party ads. In its hunt for revenue, 
Sourceforge is looking less like an important collection of human 
knowledge and more like a plundered museum full of dangerous traps.
 
No Ads (For Now)
 
GitHub has a natural defense against ending up like this: it’s never 
been an ad-supported business. If you post your code publicly on GitHub,
 the service is free. This incentivizes code-sharing and collaboration. 
You pay only to keep your code private. GitHub also makes money offering
 tech companies private versions of GitHub, which has worked out well: 
Facebook, Google and Microsoft all do this.
 
Still, it’s hard to tell how much money the company makes from this 
model. (It’s certainly not saying.) Yes, it has some of the world’s 
largest software companies as customers. But it also hosts millions of 
open source projects free of charge, without ads to offset the costs of 
storage, bandwidth, and the services layered on top of all those repos. 
Investors will want a return eventually, through an acquisition or IPO. 
Once that happens, there’s no guarantee new owners or shareholders will 
be as keen on offering an ad-free loss leader for the company’s 
enterprise services.
 
Other freemium services that have raised large rounds of funding, 
like Box and Dropbox, face similar pressures. (Box even more so since 
going public earlier this year.) But GitHub is more than a convenient 
place to store files on the web. It’s a cornerstone of software 
development—a key repository of open-source code and a crucial body of 
knowledge. Amassing so much knowledge in one place raises the specter of
 a catastrophic crash and burn or disastrous decay at the hands of 
greedy owners loading the site with malware.
 
Yet GitHub has a defense mechanism the librarians of ancient 
Alexandria did not. Their library also was a hub. But it didn’t have 
Git.
 
Git Goodness
 
The “Git” part of GitHub is an open source technology that helps 
programmers manage changes in their code. Basically, a team will place a
 master copy of the code in a central location, and programmers make 
copies on their own computers. These programmers then periodically merge
 their changes with the master copy, the “repository” that remains the 
canonical version of the project.
 
Git’s “versioning” makes managing projects much easier when multiple 
people must make changes to the original code. But it also has an 
interesting side effect: everyone who works on a GitHub project ends up 
with a copy on their computers. It’s as if everyone who borrowed a book
 from the library could keep a copy forever, even after returning it. If
 GitHub vanished entirely, it could be rebuilt using individual users’ 
own copies of all the projects. It would take ages to accomplish, but it
 could be done.
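
To make that concrete, here is a minimal sketch of how one contributor might republish a local copy to a new host. It only assembles standard Git commands (`git remote add`, `git push --all`, `git push --tags`) and prints them unless told to run them. The function names, the remote name `rescue`, the path, and the URL are all hypothetical, chosen for illustration. It also assumes the local copy holds the branches worth saving, since `--all` pushes local branches only.

```python
import shlex
import subprocess
from typing import List


def republish_commands(repo_dir: str, new_url: str, remote: str = "rescue") -> List[List[str]]:
    """Build the git commands that push a local clone to a new host.

    The remote name "rescue" is an arbitrary, hypothetical choice.
    """
    git = ["git", "-C", repo_dir]  # -C runs git as if started in repo_dir
    return [
        git + ["remote", "add", remote, new_url],  # register the new host
        git + ["push", "--all", remote],           # push every local branch
        git + ["push", "--tags", remote],          # push every tag
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical path and URL, for illustration only (dry run: nothing executes).
    run(republish_commands("/path/to/clone", "https://git.example.org/team/project.git"))
```

Because every clone carries the commit history along with the files, a push like this would restore past versions as well as the latest code, at least for the branches that copy contains.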
 
Still, such work would be painful. In addition to the source code 
itself, GitHub is also home to countless comments, bug reports and 
feature requests, not to mention the rich history of changes. But the 
decentralized nature of Git does make it far easier to migrate projects 
to other hosts, such as GitLab, an open source alternative to GitHub that you can run on your own server.
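
A migration of that kind can be sketched in a few lines. The example below assembles two standard Git commands, `git clone --mirror` to copy every branch and tag from the old host and `git push --mirror` to send them to the new one, and prints them unless asked to run them. The function names, URLs, and working directory are hypothetical placeholders. Issues, comments, and pull request discussions are not part of the Git data, so this sketch would not carry them over.

```python
import shlex
import subprocess
from typing import List


def mirror_commands(old_url: str, new_url: str, workdir: str = "project.git") -> List[List[str]]:
    """Build the git commands that copy a whole repository between hosts."""
    return [
        # Bare copy of every ref (branches, tags) from the old host.
        ["git", "clone", "--mirror", old_url, workdir],
        # Push all of those refs to the new host.
        ["git", "-C", workdir, "push", "--mirror", new_url],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only (dry run: nothing executes).
    run(mirror_commands(
        "https://github.com/example/project.git",
        "https://gitlab.example.org/example/project.git",
    ))
```

Note that `git push --mirror` makes the destination match the source exactly, so it is best pointed at an empty repository on the new host.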
 
In short, if GitHub as we know it went away, or under future 
financial pressures became an inferior version of itself, the world’s 
code would survive. Libraries didn’t end with Alexandria. The question is
 ultimately whether GitHub will find ways to stay true to its ideals 
while generating returns—or wind up the stuff of legend.