Open source success has everything to do with innovation, not vendor lock-in concerns


Commentary: A new survey suggests many get it wrong when they assume companies choose open source to avoid lock-in.


Every enterprise uses open source, but the reasons for doing so often vary depending on one’s role within the enterprise. Anaconda, a popular data science platform with over 20 million users, surveyed its users to better understand the current state of data science adoption, including open source’s role therein. Among other findings, developers value open source so they can get work done right now, while their colleagues may value the price tag or utility.

But exactly no group puts “avoiding vendor lock-in” as their first (or even fourth) consideration for using open source. Open source can help companies achieve multi-cloud strategies, but by itself open source doesn’t magically make any workload portable. That’s simply not how open source (or enterprise) software works.

The good news is that no one seems to be waiting around on the “avoiding lock-in” argument.

Vendor lock-in: Who is talking about it?

As noted in the report, the survey respondents were asked to assign a proportional value to each of five commonly cited benefits of open source software. Of the five, “most suitable tool for my needs” and “speed of innovation” claimed the most points, with “avoiding vendor lock-in” landing in last place (Figure A).

Figure A

Image: Anaconda

If you’ve been paying attention to open source over the years, these numbers won’t be surprising. The closer the respondent is to the code itself, the more they care about the speed of innovation that open source enables, and the less they fret about lock-in. “Lock-in” is something vendors talk about–customers don’t seem to obsess over it in the same way. 

Don’t believe me? Over the past few decades while open source has been booming, we’ve seen proprietary databases, ERP systems, etc. boom right alongside it. Indeed, over the 20 years I’ve worked for open source companies, I have almost never had a customer “vote” against lock-in with their wallets.

This is not to say that companies aren’t buying into open source in a big way–they are. It’s just that “no lock-in” is the puniest of reasons for doing so.

Innovating with open source

Instead, organizations have long chosen open source to save money while boosting innovation, with the latter reason by far the more compelling. You’d struggle to find companies using TensorFlow to help with their machine learning aspirations because “it’s free”–they use it because it’s a great way to do things like fraud detection, as PayPal has found. Others like Twitter turn to Redis not because it’s free, but because it helps the company achieve dramatic scale. 

And so on. 

Developers, closest to the code, figured this out long ago–that’s why they picked “speed of innovation” at roughly twice the rate of any other open source benefit. I recently discussed whether open source drives business innovation with Weaveworks CTO Cornelia Davis: “No one cares about lock-in if the software isn’t very good. The first order of priority is that most want super innovative software.” That’s what open source increasingly delivers. 

Disclosure: I work at AWS, but this article reflects my views, not those of my employer.

How Facebook’s open source factory gave rise to Presto


Commentary: When Facebook solves technical problems, it defaults to open source solutions like Presto.

Facebook has been a bit of a punching bag lately, and for good reason. But for all its problems, Facebook continues to be one of the preeminent open source software factories on Earth. From React to Apache Cassandra to PyTorch, Facebook has open sourced some of the world’s most popular software, which, in turn, has given rise to companies built up to commercialize those projects.

One such company is Starburst, started by Facebook veterans to commercialize Presto, an open source distributed SQL query engine for running interactive analytic queries against data sources of any size. Starburst just raised $42 million to further accelerate Presto development and commercialization. In an interview, Starburst co-founder and CTO Martin Traverso talked through how Facebook’s engineering culture gave life to Presto, and the open source ethos that powers it.

A culture of creation

Let’s rewind to 2012, when Facebook’s infrastructure team was still knee-deep in Apache Hive, a data warehouse project the company had created and open sourced back in 2010. Facebook had a massive 300 petabyte Hive data warehouse, which sounds great, and it was. But it was also incredibly slow. As Traverso related, a Facebook data scientist once quipped, “It’s a good day when I can run six Hive queries.” Hive, for all its merits, was a big productivity loss. 

There was talk throughout the Facebook data infrastructure team about building something better, but it was Traverso, along with Dain Sundstrom, David Phillips, and Eric Hwang, who got the nod to build it. Phillips, in particular, had used data warehouse engines and had both the incentive and the passion to do something about Hive, Traverso said.

If the foursome had waited, perhaps they could have used Apache Drill (the first design meeting was in late 2012). But that’s not how Facebook engineering works. There were no obvious alternatives, and they had a need. “We had to do it by ourselves,” Traverso said. And so they did: In 2013, they released Presto.

A culture of open source

This doesn’t explain why they open sourced it. It helped that Sundstrom had been involved in Apache Geronimo, but even that doesn’t really adequately cover the rationale for opening it up. As Traverso related, the founders weren’t simply hoping to solve an immediate Facebook need–they wanted to build something that would endure and be broadly applicable:

We like open source. We believe in open source. We believe that the best software is written by passionate developers working in open source communities. We wanted to build something that would be usable for Facebook, but also something that could be used by everyone else in the world. Also, by making it available to other people, we can make it better because we can get other people involved that have other needs and thereby build something that is more broadly applicable than just a single company and single use case.

And so they have. Today there is a diverse and growing body of contributors, sparked early on by considerable involvement from Teradata, as well as Netflix, LinkedIn, and others. Teradata had roughly 20 people working on Presto at one point, with perhaps half of those working on the Presto core. Over time some of those people, including Justin Borgman, who ran Teradata’s Apache Hadoop-related products, left to work on Presto full-time under the auspices of Starburst, which was founded in 2017.

According to Traverso, the Presto team has worked hard to make it easy to contribute to the project. From a technical point of view, Traverso said, they’ve tried to make the code accessible and easy to understand. “It’s fairly uniform so as to make it easy to see what’s going on in the code. There are some projects where you jump in and it’s a big spaghetti plate, and it’s kind of hard to follow all the threads and make sense of it.” Presto, by contrast, is more structured around the abstractions in the code, making it easier for someone to evaluate how and where they can make a meaningful contribution.

In addition, the Presto founders understand that users will likely give up if they can’t do something useful with the project within the first five minutes. Presto makes it simple to go from download to running the query engine in minutes. 

Finally, there’s the community. The Presto Slack channel is currently 2,200 strong, with as many as 500 active at any given time. “It’s one of the most active open source projects I’ve seen,” noted Traverso. These people are happy to help new users get started with the project, or work with would-be contributors to facilitate their contributions. 

Though Presto was originally used to query data in HDFS (Hadoop), Traverso and the other founders needed it to be able to query not only Facebook’s customized HDFS, but also the “off-the-shelf” open source HDFS. So they created an abstraction over the storage layer, then made it pluggable. Because there’s a very clean interface between the engine and the storage layer, it has allowed the Presto community to build connectors for a wide array of data sources, including Cassandra, MongoDB, Elasticsearch, and over 30 more. 
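To make the idea of a pluggable storage layer concrete, here is a minimal Python sketch of an engine that talks to data sources only through a small interface. It is an illustration of the general pattern, not Presto's actual connector API (Presto itself is written in Java); the names `Connector`, `InMemoryConnector`, `Engine`, and `count_rows` are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List, Tuple

Row = Tuple[object, ...]


class Connector(ABC):
    """Hypothetical storage-layer interface (not Presto's real API)."""

    @abstractmethod
    def list_tables(self) -> List[str]:
        """Return the names of the tables this source exposes."""

    @abstractmethod
    def scan(self, table: str) -> Iterator[Row]:
        """Yield every row of the named table."""


class InMemoryConnector(Connector):
    """Toy data source backed by a dict; stands in for HDFS, Cassandra, etc."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def list_tables(self) -> List[str]:
        return sorted(self._tables)

    def scan(self, table: str) -> Iterator[Row]:
        return iter(self._tables[table])


class Engine:
    """Query engine that only knows about the Connector interface."""

    def __init__(self) -> None:
        self._catalogs: Dict[str, Connector] = {}

    def register(self, catalog: str, connector: Connector) -> None:
        # Adding a new data source never requires changing the engine.
        self._catalogs[catalog] = connector

    def count_rows(self, catalog: str, table: str) -> int:
        return sum(1 for _ in self._catalogs[catalog].scan(table))


engine = Engine()
engine.register("memory", InMemoryConnector({"users": [(1, "ada"), (2, "lin")]}))
print(engine.count_rows("memory", "users"))
```

Because the engine depends only on the interface, supporting another data source in this sketch means writing one more `Connector` subclass and registering it, which is the property the paragraph above credits for the breadth of community-built connectors.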

“The more people get involved, the better the software gets,” said Traverso.

It’s worth remembering that Facebook has made it the default for engineers like Traverso to build and open source software precisely to gather communities around these projects. They may be born at Facebook, but because of Facebook’s embrace of open source, they don’t die there. 

Disclosure: I work for AWS, but the views expressed here are mine and don’t represent those of my employer.

How open source “selfishness” can lead to burnout


Commentary: Open source isn’t really about kumbaya, but does that necessarily mean it needs to stress out project leads?


There’s how open source is “supposed to” work, and how it actually works. The “supposed to” involves “rainbows and butterflies, with everybody working together in harmony,” as OBS (Open Broadcaster Software) project founder and maintainer, Hugh “Jim” Bailey, said when I interviewed him. But the reality is, “People only contribute stuff that’s useful for them, almost exclusively,” as he went on to relate. Not open source for the good of all–open source for the good of one.

Sure, there’s some Adam Smith “invisible hand” in play here, with everyone looking out for their own self-interest and thereby improving code for all. But making this philosophical principle actually play out in practice requires a fair amount of work from a project maintainer.

Selfishness is a feature, not a bug

As Linux kernel maintainer Greg Kroah-Hartman has said, “Everybody contributes to Linux in a very selfish manner because [they] want to solve a problem for [them].” This isn’t a problem, he went on, but rather a Very Good Thing because “it turns out everyone has the same problems.” 

In general, he’s correct. But not always. 

For example, it’s awesome that Apple announced at its Worldwide Developers Conference (WWDC) that it would be contributing code to a slew of open source projects like Redis and nginx and Blender. But those contributions aren’t meant to make Redis, for example, generally better; they simply add support for Apple’s new ARM-based chips. Many will benefit from this, but it’s not an altruistic contribution.

The same is true of contributions to GDAL, an omnipresent open source geographic information system (GIS) that is found in Google Earth, Uber’s mapping technology, and more. As project lead Even Rouault said when I interviewed him, organizations tend to contribute specific drivers for the format or remote service that ties into their own products (or country). Such contributions help to make the project incrementally more useful for a wider group of people, but they don’t directly sustain the core upstream project.

Bailey concurred:

I expected open source to be like rainbows and butterflies, with everybody working together in harmony, like, ‘Oh, this is open source. I’ve got this great code. Here you go.’ But it’s not like that. People only contribute stuff that’s useful for them, almost exclusively. They usually don’t contribute code that is useful to everybody, though sometimes they do. Sometimes people are trying to improve the project, but most of the time, maybe 80% of the time, whenever you get a pull request for something, a request to merge code, it’s almost always [for their narrow self-interest]. 

It turns out that this can be a major burden for the maintainer.

Burning out on others’ open source contributions

As Google Cloud engineer Tim Hockin colorfully described it, “I call this ‘pooping in someone else’s yard’. Show up, drop off some … stuff … and disappear, leaving them to clean up the mess when they inevitably step in it. Fairly common in OSS.” That “clean up” has led to “many times where I’ve been burned out and I just need to take a week or two off. It’s been happening too much.” 

Whence the burnout?

Well, such self-interested, sporadic code contributions often aren’t particularly high-quality or tuned to the project, said Bailey: “It can be very difficult to review people’s code, because you want everything to be consistent in your project. There’s a lot of bad code that people try to contribute.” 

Of course there’s also good code (particularly from regular contributors), but what does Bailey do to improve incoming code? “I try to communicate with them first. I try to understand what they’re trying to do. If it’s something that can’t be reconciled, then I just have to tell them, ‘I’m sorry, you wrote this for yourself, and this just isn’t going to benefit most users.'” However, if the code is bad, but the idea/feature is good, he’ll try to work with them (and sometimes will just fix the code himself). But it’s always time-intensive. “It could burn you out really, really easily,” he said. 

So, yes, open source is about self-interest, and the contributions people are making to Kubernetes and Linux and Envoy are always reflective of the personal (and corporate) interests of the developers involved. Sometimes, to Kroah-Hartman’s point, this can result in the good of the project. But just as often it can lead to burnout among project maintainers. 

Disclosure: I work for AWS, but the views here are mine and don’t necessarily reflect those of AWS.

How open source could help empower social change


Social change is on the minds of everyone across the globe. How can open source help make this a reality? Jack Wallen offers some suggestions.

The world is in a bit of upheaval at the moment; there’s a pandemic and there’s racial and social strife running rampant through the streets of every city. Although you might think the tech sector would be the last place to look for a means to a changed end, it’s time to rethink that take on technology and those behind it.

As we’ve seen with so many other endeavors, tech can help–especially open source.

Open source didn’t originally set out to become a movement beyond code. Eventually, however, it spilled out into various other avenues until it could be found just about everywhere. Now, open source has a chance to show that it can not only be a catalyst for change in the software and hardware industry, but a means for social change.


How?

That’s a good question.

There are, however, answers. Let’s dig in.

Software solutions

Let’s start out with the obvious: Software. Because open source tends to be both open and free, those solutions are perfectly suited for organizations geared for change. But we’re not necessarily talking about a group cobbling together a solution from the usual suspects: Apache, MySQL, WordPress, etc. There are open source projects created specifically to help empower social change.

Some of those projects include:

  • Givesource: An open source fundraising platform for nonprofits. This project was created by marketing and software company Firespring and includes features like ease of use, responsive design, PaymentSpring integration, scalability, a quick setup template, online/offline donations, matching fund support, and donor data reporting.

  • ClientComm: An open source platform that empowers simplified communication between case officers and their clients. This tool gives case officers a powerful platform from which they can track clients and send case workers texts for any situation that might arise.

  • Mifos: An open source platform that banking institutions can use to offer low- or no-cost digital banking solutions to the poor.

  • alex: An open source tool that can detect gender favoriting, polarizing, race-related, religion inconsiderate, and other unequal phrasing in text.

  • tasking manager: A tool to help teams coordinate mapping on OpenStreetMap.

  • A community for mental health experiences wherein people can share their personal stories with allies.

  • refugerestrooms: A tool that helps provide safe restroom access to transgender, intersex, and gender nonconforming individuals.

  • Terrastories: A geostorytelling tool to enable local communities to locate and map oral storytelling traditions for places of significance. 

  • Clear My Record: A platform that can enable citizens to more easily clear their records so that they may remove barriers to jobs, housing, and educational opportunities.

  • pandemic-ebt-mn: A tool to support Pandemic EBT (P-EBT) applications in Minnesota.

  • pandemic-ebt-ca: A tool to support Pandemic EBT (P-EBT) applications in California.

  • B.E.A.R.: An app that provides a desktop GUI that reads California Department of Justice .dat files that contain criminal histories and identifies convictions that are eligible for relief under CA Proposition 64. 

  • Project Callisto: A platform to detect repeat perpetrators of professional sexual coercion and sexual assault.

  • Open Food Network: A platform that enables new, ethical supply chains for food.

Of course, one must also include the regular fare in this list, because without the likes of Apache, NGINX, MySQL, Rails, Rust, Nextcloud, and so many more, social change through open source wouldn’t be possible.

But what else can open source do to help drive change?

Improve terminology

If there’s one thing open source can do that could have an immediate and lasting effect, it would be to start changing some of the terminology used. A perfect example is within clustering technology. Once upon a time, it was common to use the master/slave nomenclature. That is simply not acceptable and many projects were already ahead of this game and switched to master/node. 

However, it’s time to drop the master tag as well. Instead of master, I’ll toss out some options:

  • Main

  • Head

  • Conductor

  • Director

  • Lead

  • Manager

  • Chief

  • Prime

  • Major

The point is, words matter; terminology like this is long past due for change. In that same vein, projects should also go through code and documentation to remove verbiage that might be hurtful or insensitive to specific groups.
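As a starting point for that kind of review, a project could scan its text for flagged terms automatically. The following Python sketch is one hedged way to do it; the term-to-suggestion mapping and the function name `find_flagged_terms` are illustrative assumptions, and each project would choose its own word list and replacements.

```python
import re
from typing import Dict, List, Tuple

# Illustrative mapping only; each project should pick its own replacements.
SUGGESTIONS: Dict[str, str] = {
    "master": "main",
    "slave": "node",
}

# Match whole words only, regardless of case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, SUGGESTIONS)) + r")\b",
    re.IGNORECASE,
)


def find_flagged_terms(text: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, matched_term, suggestion) for each hit."""
    hits: List[Tuple[int, str, str]] = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            term = match.group(1).lower()
            hits.append((line_number, term, SUGGESTIONS[term]))
    return hits


sample = "Promote the slave when the Master fails.\nNo issues on this line."
for line_number, term, suggestion in find_flagged_terms(sample):
    print(f"line {line_number}: '{term}' -> consider '{suggestion}'")
```

A scan like this only surfaces candidates. Deciding whether and how to reword each hit is still a job for the project's maintainers.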

SEE: GitHub to replace “master” with alternative term to avoid slavery references (ZDNet)

Embrace diversity

This is one area of change that open source has already taken charge of. I know many open source developers that come together as an entire rainbow of culture–it’s quite a beautiful thing to experience. 

However, it could go much further. A 2017 GitHub open source survey found that:

  • Three percent of respondents identified as female

  • One percent identified as non-binary

  • Ninety-five percent of respondents identified as male

  • Sixteen percent of respondents identified as minority ethnic or national group within their home country

  • Seven percent of the survey respondents identified as lesbian, gay, bisexual, or asexual

It is also reported that:

The good thing about open source is that, by its very nature, anyone can check out code, fork it, and create something of their own. So any programmer, regardless of color, race, religion, sexual identity, sexual preference, or gender, can start a project. If you’ve got the skills, open source has the code. While you’re at it, create a project focused on change.

Be bold. Code change into the world.

See something, say something

Finally, if you’re a part of the open source community, consider yourself as a means to a better end. If you see behavior that is counter to progress and positive social change, call it out. We’re well past the time for silence. And, at the moment, the court of public opinion has a very loud and large voice that holds powerful sway over companies.

However, if you take it upon yourself to call out unacceptable behavior, consider communicating with the perpetrator first. It could be a situation where the person has no idea they are perpetuating behaviors that have no place in an enlightened society. Educate them. If they reject your offer to help, then reach out to those in charge of the project they are working on. If that bears no results, continue escalating until change for the better happens.

Open source can do wonders for society, be it with software, a simple change in terminology, diversity in numbers, or policing unacceptable behaviors. By design, this community is open, and it’s time to be held to a higher standard. 

Be the change society needs.

Open source can help with that.
