What the web is for

Tim Berners-Lee and the early Web

The World Wide Web (abbreviated WWW) was opened to the public in 1993. It didn’t come from a company, corporation or government. It was the passion project of Tim Berners-Lee, a computer scientist working at CERN. Berners-Lee was motivated by a desire to better share and organize information, building on recent developments like DNS. He created a system where pages were linked to other pages via hyperlinks. Crucially, he released his specifications, including Hypertext Markup Language (HTML), into the public domain. The Web came from research and wasn’t owned by any existing powerful entity, and anyone could use it as they wished. Anyone could run their own web server and instantly join the World Wide Web, and most of the organizations that were interested in doing so at the time were doing research and wanted to publish their own content, not just read content from others.

The barrier to entry wasn’t age or status, or how much money you could shell out for a cool domain name and a good designer; it was just the technical know-how to install a computer program, plus a way to leave a computer running and connected to the internet. The Web at this time was non-commercial: people ran websites because they wanted people to see what they wrote, and they wanted to participate in building our collective knowledge, with some humor and crass jokes thrown in, of course. That’s what the web is for: sharing knowledge and information in a free and universally inclusive way. No one expected a return on their investment in the computer and internet connection; they were just happy that people were coming to see their site (some even kept track of readership with hit counters!). So how did we get from that place of scientific collaboration and wide-eyed optimism to our current land of doomscrolling-induced anxiety?

The internet as the evolution of the telephone, not just the television

A brief history

The history of electronic communications is quite fascinating, going all the way back to the spark-gap transmitter. That’s out of scope for this discussion, so for today it suffices to say that the telegraph gave rise to the telephone. The telephone remains the preferred communication device for (ostensibly) private one-to-one conversations even today, although there are now applications and websites that can serve the same purpose. However, a separate communications network was built starting in the early 1900s: radio broadcasting. Broadcasting is an important word there, because to cast broadly means allowing anyone who wants to hear you to listen in. This means that instead of a 1:1 relationship, as is typical on a telephone, broadcasters and listeners (or later, viewers) have a 1:many relationship. There is one broadcaster and a large audience. This sets up a certain type of relationship dynamic which we see play out again and again.

Money gets involved

First, the broadcaster needs something to play which people will want to hear. Music is an obvious choice. But under a certain set of rules, music cannot be performed for an audience without a license. The rights are typically owned by a publisher or producer, who bought them from the musicians. In exchange, the musicians get publicity and a paltry amount of money (musicians, when they can make money, mostly make it from live shows). So the broadcaster now has an expense, beyond just operating the broadcast equipment and getting their FCC license to broadcast: they need money to pay for these music broadcast licenses. Where can they get that money? Well, what does a broadcaster have that they could monetize? One thing only: an audience’s attention. And as the AI folks will tell you, attention is all you need to make a buck. Audiences are also consumers; they buy things. Salespeople have things they want to talk about. And hey, broadcasters have to pay the licensing fees somehow, right? So it’s a no-brainer: the radio broadcasters start selling advertisements.

Different models

So now we have two types of interaction: one-to-one telephone calls, and one-to-many broadcasts, where the public at large is an audience that listens and consumes but does not participate in the conversation. In the phone model, users pay a fixed service charge (hopefully not to a monopoly). In exchange for our fixed monthly fee, we get the ability to call whomever we please so long as we know their phone number, and anyone in the world can call us if they’re willing to pay their own subscription. While our conversations might not exactly be private, at least they don’t need to be monetized. The phone company gets their monthly fee, they charge extra in certain cases, and they just don’t care what we’re saying. They don’t need to monetize us, the public at large, in the way that a broadcaster needs to in order to make a living. For radio and television, by contrast, we get broadcast streams “free” by paying in attention rather than money. And advertisers LOVE attention.

Cable TV

At some point someone at a broadcast TV station was feeling especially greedy and decided that advertising didn’t make ENOUGH money. Sure, they were staying in business, but the audience wasn’t going anywhere, and they could offer more channels if they could give people access to other stations outside their local area. Heck, they could even make original premium content, accessible only to their subscribers. So they decided to sell a subscription, just like the phone companies did. And for a while it seemed like an ad-free subscription would be an option, until of course they got greedy again. They decided to double-dip: make consumers continue to pay a subscription, but ALSO add ads back in. Netflix pulled the same trick decades later, but this time with computers!

Internet as a Service

The internet we use today has its roots in ARPANET. Computers at the time were few and far between, and of comparable computing ability. There wasn’t the same mix of server-type computers and client-oriented computers that we have today. The computers on this network interacted very much like peers, and for the most part were trusted. I’ll skip a lot of the history, but services like email were designed in a decentralized peer-to-peer way that reflects the conditions and values of that time.

When the public internet finally came in the 1990s and anyone could join the digital revolution, we had to pay an Internet Service Provider. With the advent of the web, this is like a combination of a one-to-one telephone service and a one-to-many cable TV service. Alas, our economic system favours the group which acquires money the fastest. While ISPs could have stayed in business just charging a fixed access fee, like phone companies, there was another group who wanted to use the business model pioneered by radio and television: sell your audience; sell ads. This new group also had a trick up their sleeve: they could collect information about you, often without your immediate awareness. They could then use that information to target advertisements selectively, thereby increasing the efficiency and creepiness of their ads.

The rise of Google: how surveillance capitalism ruined everything

There was a major unsolved problem in the early web: search. While there were links and webrings, some of which still exist today, there wasn’t a good way to find information if you didn’t have a starting point. Searching a large web of ever-growing and often-changing content is a hard problem, so, like good computer scientists, the architects of the early web simply ignored it. But when the entire public started accessing the web, they needed a way to search it and find the content they wanted. There was no public community search engine; the people who built search engines were working towards a goal of profit, and, like radio broadcasters, the only monetizable asset they had was user attention. There were a few early search engines, like Yahoo and Ask Jeeves, but by far the most successful search engine was Google, and that’s where we’ll focus our attention. Because users were directly connecting to Google and typing in what they wanted to see, Google had a history of user searches and IP addresses.

This is interesting because, unlike radio broadcasters, who have to broadcast the same ads to all listeners in the area, Google could leverage the search data that users input to make ads tailored to them specifically: targeted ads. Then website owners, the people who run the places Google search results link to, could run Google ads on their sites, and if the site was large enough, that would cover their operating expenses. Of course, by running Google’s ads on your website, you’re also providing them information about who is viewing it! Targeted ads only work as a concept if user behaviour is tracked.

Some of you, particularly in the US, might think this is a violation of privacy rights like the Fourth Amendment. However, those rights generally apply to what the government can or cannot do, and while there’s ample room for debate about how the law ought to be, as it stands today under the third-party doctrine, Google can do this. If you don’t want to give your data to Google, you have to stop using their search functionality, and stop visiting sites that run Google ads. That’s a tall order, but it’s what our lawmakers have collectively decided. This business model of collecting data from users and then selling that data in the form of advertisements is what’s known as surveillance capitalism, and it’s possibly the most effective way to turn internet traffic into money. This same model, pioneered by Google, is used by Facebook and others. Companies which didn’t previously use this model, like Microsoft, also hopped on the bandwagon.

The entrenched monopolies and what a gobbled-up web looks like

When all the content on the internet is stored by a few companies like Google and Microsoft, not using their services is simply not an option for most of us. These companies have realized this and done everything they can to keep the content (which is why users want to be on the web in the first place) within their walled gardens and behind their terms of service agreements. They’ve done this by buying out any competitors and lobbying for favourable legal treatment. This is the web of today: five websites, each of which consists mostly of screenshots from the other four. Those websites make their money by collecting data from users, which most users would like to be private, and then selling that data to advertisers. Of course, if it makes them money and they won’t get in trouble, they’re happy to sell that data to anyone for the right price; governments, both foreign and domestic, love to buy this data, too! As a user, the options are bleak: either give all your data to these giant American corporations, or be removed from all participation in civil and social life.

A much smaller but much freer option

There are still things on the internet that the big companies don’t own, much to their chagrin. Possibly the best exemplar of a community-oriented web is Wikipedia, with its commitment to staying non-commercial and ad-free. The Internet Archive, which keeps records of the web as it used to be, also deserves a mention. IRC, for all its faults, is still around and there are still some community chatrooms using it today. These projects were created for public benefit, and are sustained by donations rather than monetizing user data. While this community-oriented web might not have the scale of the corporate web, it has every bit as much value and is essential for us to consider when thinking about the internet as a whole.

When the internet, chatrooms, and the web first became popular, there was a leveling of discourse; you didn’t know whether the person on the other end of the connection was a PhD researcher or just some kid in his parents’ basement. There was a saying on the early web: “On the internet, nobody knows you’re a dog.” The result of this is that everyone spoke as equals, regardless of gender, race, age, or other physical characteristics. That isn’t to say that everyone was polite (far from it), but they were the same level of rude to everyone, and people were judged based on what they said and not where they came from. This is what the internet can look like if we don’t collect data from everyone all the time. It’s a place of inclusiveness, joy, and collaboration towards genuine knowledge.

This is the true legacy of the early web, and this is the portion of the internet that carries its vision forward. The public interest internet never went away; it just didn’t grow at the same rate as the corporate web.

What bad regulations look like

Bad regulations put the entrenched ad-tech companies first. Bad regulations force more information to be collected. Bad regulations erode user rights, because they take it as a given that information must be collected at all times. Google wants you to think that what’s good for Google is good for the internet. They want you to think that their business model is the only option. That simply isn’t so. If more regulations are put in place that require data collection (things like age verification mandates), then companies who want to treat their users like customers rather than products will be unable to do so. The phone company model of paying a subscription and getting a service will have a barrier to entry so high it may as well be illegal, and companies will have no choice but to either run a data collection service themselves (which big companies like Google already do) or give their users’ data to a third-party service for validation. That third-party service will probably be operated by a company like Google, which is already capable of collecting and storing troves of user data.

In short, bad regulations remove or reduce our ability to avoid the entrenched tech monopolies, and force us to give them even more of our data without the ability to opt out.

What good regulations might look like

Good regulations put the users first, and ensure that the internet operates first and foremost as a service to users (who pay a subscription for it!). They also put community-oriented content like Wikipedia above commercial interests. Good regulations would ensure that companies like Google are limited in how much data they collect, and users have control over deleting and exporting their data. Good regulations would break up companies like Google, not because they have a “monopoly on search” but because they have far too much of our data for any single entity to adequately protect. Good regulations would make targeted ads into an unattractive business model. Good regulations would promote interoperability, even when companies may not want third-party services to connect to their systems.

The Internet is People

When it comes down to it, our global telecommunications network is made for people to connect with one another, not for companies to make a profit by selling user data. If we want it to stay that way, even in part, we need to regulate the internet as though it’s a phone network, not a radio or TV broadcast. The content on the internet was created by normal users, helping each other out and talking with their friends. That’s where the value lies, and that’s why people keep using it even in its current broken form.

Companies can’t offer a simple subscription-based service if they’re forced to collect the same data as a data-for-sale service; the economics just don’t make sense. If any companies offering a basic pay-for-a-service model are to survive, we can’t treat the internet as if data-for-profit is its fundamental nature. The same goes for community projects that run on donations: if we treat data-for-profit as the default, those projects will wither and die. I, for one, don’t want to see that happen.