FLOSS Foundations

October 06, 2015

Dries Buytaert

The coming era of data and software transparency

Algorithms are shaping what we see and think -- even what our futures hold. The order of Google's search results, the people Twitter recommends we follow, or the way Facebook filters our newsfeed can shape our perception of the world and drive our actions. But think about it: we have very little insight into how these algorithms work or what data they use. Given that algorithms guide much of our lives, how do we know that they don't have a bias, withhold information, or have bugs with negative consequences for individuals or society? This is a problem that we aren't talking about enough, and that we have to address in the next decade.

Open Sourcing software quality

In the past several weeks, Volkswagen's emissions crisis has raised new concerns around "cheating algorithms" and the overall need to validate the trustworthiness of companies. One of the many suggestions to solve this problem was to open-source the software around emissions and automobile safety testing (Dave Bollier's post about the dangers of proprietary software is particularly good). While open-sourcing alone will not fix software's accountability problems, it's certainly a good start.

As self-driving cars emerge, checks and balances on software quality will become even more important. Companies like Google and Tesla are the benchmarks of this next wave of automotive innovation, but all it will take is one safety incident to intensify scrutiny of software-driven versus human-driven cars. The idea of "autonomous things" has ignited a huge discussion around regulating artificially intelligent algorithms. Elon Musk went as far as calling artificial intelligence our biggest existential threat, and donated millions to make artificial intelligence safer.

While making important algorithms available as Open Source does not guarantee security, the added scrutiny can only make the software more secure, not less. As Eric S. Raymond famously stated, "given enough eyeballs, all bugs are shallow". When more people look at code, mistakes are corrected faster, and software gets stronger and more secure.

Less "Secret Sauce" please

Automobiles aside, there is possibly a larger scale, hidden controversy brewing on the web. Proprietary algorithms and data are big revenue generators for companies like Facebook and Google, whose services are used by billions of internet users around the world. With that type of reach, there is big potential for manipulation -- whether intentional or not.

There are many examples of why. Recently Politico reported on Google's ability to influence presidential elections. Google can build bias into the results returned by its search engine simply by tweaking its algorithm. As a result, certain candidates can appear more prominently than others in search results. Research has shown that Google can shift voting preferences by 20 percent or more (up to 80 percent in certain groups), and potentially flip the margins of close elections worldwide. The scary part is that none of these voters know it is happening.

And when Facebook's 2014 "emotional contagion" mood manipulation study was exposed, people were outraged at the thought of being tested at the mercy of a secret algorithm. Researchers manipulated the news feeds of 689,003 users to see if more negative-appearing news led to an increase in negative posts (it did). Although the experiment was found to comply with Facebook's terms of service, there was a tremendous outcry around the ethics of manipulating people's moods with an algorithm.

In theory, providing greater transparency into algorithms using an Open Source approach could avoid a crisis. However, in practice, it's not very likely this shift will happen, since these companies profit from the use of these algorithms. A middle ground might be allowing regulatory organizations to periodically check the effects of these algorithms to determine whether they're causing society harm. It's not crazy to imagine that governments will require organizations to give others access to key parts of their data and algorithms.

Ethical early days

The explosion of software and data can either have horribly negative effects, or transformative positive effects. The key to the ethical use of algorithms is providing consumers, academics, governments and other organizations access to data and source code so they can study how and why their data is used, and why it matters. This could mean that despite the huge success and impact of Open Source and Open Data, we're still in the early days. There are few things about which I'm more convinced.

by Dries at October 06, 2015 05:04 PM

Louis Suárez-Potts


There’s a growing demand in the tech industry for free, open-source software that can do what VMware does. With the rise of smartphones and cloud computing, data centers are growing ever-larger to meet swelling demand. That means licensing fees like the kind VMware demands can become a headache for enormous companies like Apple, which run massive data centers to do things like sell apps and content and push software updates.

Source: Apple dumps VMware ESXi for KVM – Business Insider

Filed under: critique

by oulipax at October 06, 2015 01:22 AM

October 05, 2015

Louis Suárez-Potts


There are numerous concerns about the TPP. To begin with, the content has not officially been revealed; only leaked portions have been made public, and it is not clear that the overall document itself has been finalized. KEI's commentary, by its director James Love, is short and worth reading.

With regard to copyright, Love writes:

“We are at a disadvantage to comment on the agreement, precisely because of that secrecy. We don’t know if the TPP will mandate a copyright term of life plus 70 years, change the global rules on copyright exceptions, block legislation to limit remedies for the infringement of orphan copyrighted works, require lower standards for granting patents, mandate patents on new uses of old drugs, require patent term extensions, block current U.S. incentives to induce greater transparency of the patent landscape for biologic drugs, mandate remedies for the infringement of patents on surgical methods, block the adoption of useful exceptions to test data, allow drug companies and publishers to challenge exceptions to rights under the investor state dispute settlement (ISDS) provisions in the agreement, or a hundred other issues of consequence.”

The points Love mentions are significant, and they are not the same as the biologic data provision that was evidently resolved through a compromise.

The problem with an accord like this goes beyond the dramatic dilution of national sovereignty, which is frankly not necessarily a bad thing, as some nation states are more repressive than others: duh. Rather, it has placed policy creation and, to some extent, implementation and resolution in the hands of the organisations best able to benefit from it. Right now, these would be multinationals with the resources to create and destroy local markets.

Source: Statement of KEI on announcement of consensus on Trans Pacific Partnership (TPP) trade agreement | Knowledge Ecology International

Filed under: critique

by oulipax at October 05, 2015 04:29 PM

October 02, 2015

Louis Suárez-Potts