Mental Models

I think we all make mental models constantly — simplifications of the world that help us understand it. And for services on the Internet, our mental models are probably very close — logically, if not in implementation — to the reality of what those services do. If not, how could we use them?

I also like to imagine how the service works. I don’t know why I do this, but it makes me feel better about the universe. For a lot of things, to a first approximation, the what and the how are sufficiently close that they are essentially the same model. And sometimes a model of how something works eludes me entirely.

For example, my model of email is that an email address is the combination of a username and a system name. My mail server looks up the destination mail server and routes my blob of text to it over IP, and that server delivers it to the appropriate user’s “mailbox,” which is a file. That is indeed how it works, more or less, with lots of elision of what I’m sure are important details.

I’ve also begun sorting my mental models of Internet companies and services into a taxonomy that has subjective meaning for me, based on how meritorious and/or interesting they are. Here’s a rough draft:

| The What | The How | Example | Dave’s judgment |
|---|---|---|---|
| obvious | as in real life | email | Very glad these exist, but nobody deserves a special pat on the back for them. I’ll add most matchmaking services, too. |
| obvious | non-obvious, but simple and/or elegant | Google Search (PageRank) | High regard. Basically, this sort of thing has been the backbone of Internet value to date. |
| not obvious / inscrutable | nobody cares | Google Buzz | Lack of popularity kills these. Not much to talk about. |
| obvious | obvious | Facebook | Society rewards these, but technically they are super-boring to me. |
| obvious | non-obvious and complex | natural language, machine translation, face recognition | Potentially very exciting, but not really very pervasive or economically important just yet. Potentially creepy and may represent the end of humanity’s reign on earth. |

 

Google search is famously straightforward. You’re searching for some “thing,” and Google is combing a large index for that “thing.” Back in the Altavista era, that “thing” was just keywords on a page. Google’s first innovation was to use a site’s own popularity (as measured by who links to it and the rankings of those linking sites) to help sort the results. I wonder how many people had a mental model of how Google worked that was different from that of Altavista — aside from the simple fact that it worked much “better.” The thing about Google’s “PageRank” was that it was quite simple, and quite brilliant, because, honestly, none of the rest of us thought of it. So kudos to them.
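To make that concrete, here’s a toy sketch of the PageRank intuition (my own illustration, nothing like Google’s actual implementation): every page repeatedly shares its score with the pages it links to, so pages with many well-ranked inbound links float to the top.

```python
# Toy PageRank sketch (illustration only). `links` maps each page to the
# pages it links to; every page should also appear as a key.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Everyone starts each round with a small baseline score...
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        # ...then each page splits its current score among the pages it links to.
        for p, outgoing in links.items():
            if not outgoing:
                continue
            for q in outgoing:
                new_rank[q] += damping * rank[p] / len(outgoing)
        rank = new_rank
    return rank

# "a" is linked to by both "b" and "c", so it ends up ranked highest.
print(pagerank({"a": ["b"], "b": ["a"], "c": ["a"]}))
```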

There have been some Internet services I’ve tried over the years that I could not quite understand. I’m not talking about how they work under the hood, but how they appear to work from my perspective. Remember Google “Buzz?” I never quite understood what that was supposed to be doing.

Facebook, in its essence, is pretty simple, too, and I think we all formed something of a working mental model for what we think it does. Here’s mine, written up as SQL code. First, the system is composed of a few tables:

A table of users, a table representing friendships, and a table of posts. The tables are populated by straightforward UI actions like “add friend” or “write post.”
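In rough SQL it might look like this (a sketch; the table and column names are just my guesses):

```sql
-- Who exists, who friends whom, and what they posted (a minimal sketch).
CREATE TABLE users       (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE friendships (user_id INTEGER, friend_id INTEGER);
CREATE TABLE posts       (post_id INTEGER PRIMARY KEY, user_id INTEGER,
                          created TIMESTAMP, body TEXT);
```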

Generating a user’s wall when they log in is as simple as:
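Something along these lines (again a sketch, with a placeholder :me parameter for the logged-in user):

```sql
-- The wall: recent posts written by my friends, newest first.
SELECT p.*
FROM   posts p
JOIN   friendships f ON f.friend_id = p.user_id
WHERE  f.user_id = :me
ORDER  BY p.created DESC
LIMIT  50;
```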

You could build an FB clone with that code alone. It is eye-rollingly boring and unclever.

Such an implementation would die when you got past a few thousand users or posts, but with a little work and modern databases that automatically shard and replicate, etc., you could probably handle a lot more. Helping FB is the fact that they make no promises about correctness: a post you make may or may not ever appear on your friend’s wall, etc.

I think the ridiculous simplicity of this is why I have never taken Facebook very seriously. Obviously it’s a gajillion-dollar idea, but technically, there’s nothing remotely creative or interesting there. Getting it all to work for a billion users making a billion posts a day is, I’m sure, a huge technical challenge, but not one requiring inspiration. (As an aside, today’s FB wall is not so simple. It uses some algorithm to rank and highlight posts. What’s the algorithm, and why and when will my friends see my post? Who the hell knows?! Does this bother anybody else but me?)

The last category is things that are reasonably obviously useful to lots of people, but how they work is pretty opaque, even if you think about it for a while. That is, things where we can form a mental model of what they do, but mere mortals do not understand how they work. Machine translation falls into that category, and maybe all the new machine learning and future AI apps do, too.

It’s perhaps “the” space to watch, but if you ask me, the obvious-what / simple-how quadrant isn’t nearly exhausted yet — as long as you can come up with an interesting “why,” that is.

next, they’ll discover fire

OK, now I’m feeling ornery. Google just announced a new chip of theirs that is tailored for machine learning. It’s called the Tensor Processing Unit, and it is designed to speed up a software package called TensorFlow.

Okay, that’s pretty cool. But then Sundar Pichai has to go ahead and say:

This is roughly equivalent to fast-forwarding technology about seven years into the future (three generations of Moore’s Law).

No, no, no, no, no.

First of all, Moore’s law is not about performance.  It is a statement of transistor density scaling, and this chip isn’t going to move that needle at all — unless Google has invented their own semiconductor technology.

Second, people have been developing special-purpose chips that solve a problem way faster than a general-purpose microprocessor can since the beginning of chip-making. It used to be that pretty much anything computationally interesting could not be done on a general-purpose processor. Graphics, audio, modems: you name it, it all used to be done in dedicated hardware. Such chips are called application-specific integrated circuits (ASICs), and, in fact, the design and manufacture of ASICs is more or less what gave Silicon Valley its name.

So, though I’m happy that Google has a cool new chip (and that they finally found an application that they believe merits making a custom chip), I wish the tech press weren’t so gullible as to print any dumb thing that a Google rep says.

Gah.

I’ll take one glimmer of satisfaction from this, though. And that is that someone found an important application that warrants novel chip design effort. Maybe there’s life for “Silicon” Valley yet.

notes on self-driving cars

A relaxing trip to work (courtesy Wikimedia)

Short post here. I notice people are writing about self-driving cars a lot. There is a lot of excitement out there about our driverless future.

I have a few thoughts, to expand on at a later day:

I.

Apparently a lot of economic work on driving suggests that a major externality of driving is congestion. Simply, your being on the road slows down other people’s trips and causes them to burn more gas. It’s an externality because it is a cost of driving that you cause but don’t pay.

Now, people are projecting that a future society of driverless cars will make driving cheaper by 1) eliminating drivers (duh) and 2) getting more utilization out of cars. That is, mostly, our cars sit in parking spaces, but in a driverless world, people might not own cars so much anymore, but rent them by the trip. Such cars would be much better utilized and, in theory, cheaper on a per-trip basis.

So, if I understand my micro econ at all, people will use cars more because trips will be cheaper. All else equal, that should increase congestion, since in our model congestion is an externality: the congestion cost of each extra trip isn’t reflected in its price, so nothing pushes back on the extra demand. Et voila, a bad outcome.

II.

But, you say, driverless cars will operate more efficiently, and make more efficient use of the roadways, and so they generate less congestion than stupid, lazy, dangerous, unpredictable human drivers. This may be so, but I will caution with a couple of ideas. First, how much less congestion will a driverless trip cause than a user-operated one? 75% as much? Half? Is this enough to offset the effect mentioned above? Maybe.

But there is something else that concerns me: the difference between soft- and hard-limits.

Congestion, as we experience it today, seems to come on gradually as traffic approaches certain limits. You’ve got cars on the freeway; you add cars; things get slower. Eventually, things somewhat suddenly get a lot slower, but even then only at certain times of the day, in certain weather, etc.

Now enter driverless cars that utilize capacity much more effectively. Huzzah! More cars on the road getting where they want, faster. What worries me is that what is really happening is not that the limits are raised, but that we are operating the system much closer to the existing, real limits. Furthermore, now that automation is sucking out all the marrow from the road bone — the limits become hard walls, not gradual at all.

So, imagine traffic is flowing smoothly until a malfunction causes an accident, or a tire blows out, or there is a foreign object in the road — and suddenly the driverless cars sense the problem, resulting in a full-scale insta-jam, perhaps of epic proportions, in theory, locking up an entire city nearly instantaneously. Everyone is safely stopped, but stuck.

And even scarier than that is the notion that the programmers did not anticipate such a problem, and the car software is not smart enough to untangle it. Human drivers, for example, might, in an unusual situation, use shoulders or make illegal u-turns in order to extricate themselves from a serious problem. That’d be unacceptable in a normal situation, but perhaps the right move in an abnormal one. Have you ever had a cop at the scene of an accident wave at you to do something weird? I have.

Will self-driving cars be able to improvise? This is an AI problem well beyond that of “merely” driving.

III.

Speaking of capacity and efficiency, I’ll be very interested to see how we make trade-offs of these versus safety. I do not think technology will make these trade-offs go away at all. Moving faster, closer will still be more dangerous than going slowly far apart. And these are the essential ingredients in better road capacity utilization.

What will be different will be how and when such decisions are made. In humans, the decision is made implicitly by the driver moment by moment. It depends on training, disposition, weather, light, fatigue, even mood. You might start out a trip cautiously and drive more recklessly later, like when you’re trying to eat fast food in your car. The track record for humans is rather poor, so I suspect  that driverless cars will do much better overall.

But someone will still have to decide what the right balance of safety and efficiency is, and it might be taken out of the hands of passengers. This could go different ways. In a liability-driven culture we may end up with a system that is safer but maybe less efficient than what we have now (call it “little old lady mode”), or we could end up with decisions by others forcing us to take on more risk than we’d prefer if we want to use the road system.

IV.

I recently read in the June IEEE Spectrum (no link, print version only) that some people are suggesting that driverless cars will be a good justification for the dismantlement of public transit. Wow, that is a bad idea of epic proportions. If, in the first half of the 21st century, the world not only continues to embrace car culture, but  doubles down  to the exclusion of other means of mobility, I’m going to be ill.

 

*   *   *

 

That was a bit more than I had intended to write. Anyway, one other thought is that driverless cars may be farther off than we thought. In a recent talk, Chris Urmson, the director of the Google car project, explains that the driverless cars of our imaginations — the fully autonomous, all-conditions, all-mission cars — may be 30 years off or more. What will come sooner is a succession of technologies that reduce driver workload.

So, I suspect we’ll have plenty of time to think about this. Moreover, the nearly 7% of our workforce that works in transportation will have some time to plan.

 

minor annoyances: debug-printing enums

This is going to be another programming post.

One thing that always annoys me when working on a project in a language like C++ is that when I’m debugging, I’d like to print messages with meaningful names for the enumerated types I’m using.

The classic way to do it is something like this:
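Say you have an enum like this (invented here for illustration); the classic printer is a switch with one case per value:

```cpp
enum Color { red, green, blue };

// Classic debug-printing: one case per enumerator, each returning its name.
const char *color_name(Color c) {
    switch (c) {
        case red:   return "red";
        case green: return "green";
        case blue:  return "blue";
    }
    return "unknown";
}
```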

Note that I have perhaps too-cleverly left out the break statements because each case returns.

But this has problems:

  • repetitive typing
  • maintenance. Whenever you change the enum, you have to remember to change the debug function.

It just feels super-clunky.

I made a little class in C++ that I like a bit better, because you only have to write the wrapper code once, even if you use it on a bunch of different enums. Also, you can hide the code part in another file and never see or think about it again.
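Here’s a sketch of the idea (names are mine, details elided): a little template that wraps a map from enum values to their names, plus one static const map per enum.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <utility>

// Generic printer: wraps a map from enum values to their names.
template <typename E>
class EnumPrinter {
public:
    explicit EnumPrinter(std::map<E, std::string> names) : names_(std::move(names)) {}
    std::string operator()(E e) const {
        auto it = names_.find(e);
        return it != names_.end() ? it->second : "<unknown>";
    }
private:
    std::map<E, std::string> names_;
};

enum Color { red, green, blue };

// One static const map per enum; C++11 brace-initialization keeps it tidy.
static const EnumPrinter<Color> print_color({
    {red, "red"}, {green, "green"}, {blue, "blue"}});

int main() {
    std::cout << print_color(green) << "\n";   // prints "green"
}
```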

C++11 lets you initialize those maps pretty nicely, and they are static const, so you don’t have to worry about clobbering them or having multiple copies. But overall, it still blows because you have to type those identifiers no fewer than three times: once in the definition and twice in the printer thing.

Unsatisfactory.

I Googled a bit and learned about how Boost provides some seriously abusive preprocessor macros, including one that can loop. I don’t know what kind of dark preprocessor magic Boost uses, but it works. Here is the template and some macros:
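In sketch form (using BOOST_PP_SEQ_FOR_EACH and BOOST_PP_STRINGIZE; the helper names here are arbitrary):

```cpp
#include <boost/preprocessor/seq/for_each.hpp>
#include <boost/preprocessor/stringize.hpp>
#include <map>
#include <string>

// The template part: look up an enum value in a per-enum name map.
template <typename E>
const std::map<E, std::string> &enum_names();   // each enum provides a specialization

template <typename E>
std::string enum_to_string(E e) { return enum_names<E>().at(e); }

// The macro part: emit "enumerator," and "{enumerator, \"enumerator\"}," respectively.
#define ENUM_ENTRY(r, data, elem)     elem,
#define ENUM_NAME_PAIR(r, data, elem) {elem, BOOST_PP_STRINGIZE(elem)},

// Define the enum and the enum_names<> specialization from one list of enumerators.
#define DEFINE_PRINTABLE_ENUM(Name, SEQ)                                   \
    enum Name { BOOST_PP_SEQ_FOR_EACH(ENUM_ENTRY, _, SEQ) };               \
    template <> const std::map<Name, std::string> &enum_names<Name>() {    \
        static const std::map<Name, std::string> names = {                 \
            BOOST_PP_SEQ_FOR_EACH(ENUM_NAME_PAIR, _, SEQ) };               \
        return names;                                                      \
    }
```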

And here’s how you use it:
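Continuing the sketch, the enumerators get listed exactly once, in Boost’s “sequence” syntax:

```cpp
#include <iostream>

// One list of enumerators, written once; the macro expands to the enum,
// the name map, and the printer specialization.
DEFINE_PRINTABLE_ENUM(Color, (red)(green)(blue))

int main() {
    std::cout << enum_to_string(green) << "\n";   // prints "green"
}
```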

Now I only have to list out the enumerators one time! Not bad. However, it obviously only works if you control the enum. If you are importing someone else’s header with the definition, it still has the maintenance problem of the other solutions.

I understand that the C++ template language is Turing-complete, so I suspect this can be done entirely with templates and no macros, but I wouldn’t have the foggiest idea how to start. Perhaps one of you does?

simple string operations in $your_favorite_language

I’ve recently been doing a small project that involves Python and Javascript code, and I keep tripping up on the differing syntax of their join()  functions. (As well as semicolons, tabs, braces, of course.) join()  is a simple function that joins an array of strings into one long string, sticking a separator in between, if you want.

So, join(["this","that","other"], "_") returns "this_that_other". Pretty simple.

Perl has join() as a built-in, and it has an old-school, non-object interface:
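Something like:

```perl
my @parts = ("this", "that", "other");
my $s = join("_", @parts);    # "this_that_other"
```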

Python is object-orienty, so it has an object interface:
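Along the lines of:

```python
parts = ["this", "that", "other"]
s = "_".join(parts)    # "this_that_other"
```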

What’s interesting here is that join is a member of the string class, and you call it on the separator string. So you are asking a "," to join up the things in that array. OK, fine.

Javascript does it exactly the reverse. Here, join is a member of the array class:
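For instance:

```javascript
const parts = ["this", "that", "other"];
const s = parts.join("_");    // "this_that_other"
```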

I think I slightly prefer Javascript in this case, since calling member functions of the separator just “feels” weird.

I was surprised to see that C++ does not include join in its standard library, even though it has the underlying pieces: <vector>  and <string>. I made up a little one like this:
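Roughly this (a sketch):

```cpp
#include <string>
#include <vector>

// Join a vector of strings with a separator, JavaScript-style (array first).
std::string join(const std::vector<std::string> &parts, const std::string &sep) {
    std::string result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) result += sep;   // one extra compare per item; Boost dodges this
        result += parts[i];
    }
    return result;
}
```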

You can see I took the Javascript approach. By the way, this is how they do it in Boost. Boost avoids the extra compare for the separator each time by handling the first list item separately.

Using it is about as easy as the scripting languages:
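For example:

```cpp
std::vector<std::string> parts = {"this", "that", "other"};
std::string s = join(parts, "_");    // "this_that_other"
```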

I can live with that, though the copy on return is just a C++ism that will always bug me.

Finally, I thought about what this might look like back in ye olden times, when we scraped our fingers on stone keyboards, and I came up with this:
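Something like this sketch (names are placeholders):

```c
#include <stddef.h>
#include <string.h>

/* Join n C strings with a separator.  Pass dst == NULL to get the length
 * needed (not counting the terminating null); pass a buffer to do the copy. */
size_t join(char *dst, const char *const *parts, size_t n, const char *sep)
{
    size_t len = 0;
    size_t seplen = strlen(sep);
    for (size_t i = 0; i < n; ++i) {
        if (i != 0) {
            if (dst) memcpy(dst + len, sep, seplen);
            len += seplen;
        }
        size_t plen = strlen(parts[i]);
        if (dst) memcpy(dst + len, parts[i], plen);
        len += plen;
    }
    if (dst) dst[len] = '\0';
    return len;
}
```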

Now that’s no beauty queen. The function does double duty to make it a bit easier to allocate for the resulting string. You call it first without a target pointer and it will return the size you need (not including the terminating null). Then you call it again with the target pointer for the actual copy.

Of course, if any of the strings in that array are not terminated, or if you don’t pass in the right length, you’re going to get hurt.

Anyway, I must have been bored. I needed a temporary distraction.

 

The answer is always the same, regardless of question. (Civil Aviation Edition)

The Wall Street Journal had an editorial last week suggesting that the US air traffic control system needs to be privatized.

It’s not a new debate, and though I will get into some specifics of the discussion below, what really resonated for me is how religious and ideological the belief is that corporations just do everything better. It’s not like the WSJ made any attempt whatsoever to list (and even dismiss) counter-arguments to ATC privatization. It’s almost as if the notion that there could be some justification for a publicly funded and run ATC has just never occurred to them.

It reminded me of a similar discussion, in a post in an energy blog I respect, lamenting the “dysfunction” in California’s energy politics, particularly from the CPUC.

What both pieces seemed to have in common is a definition of dysfunction that hews very close to “not the outcome that a market would have produced.” That is to say, they see the output of non-market (that is, political) processes as fundamentally inferior and inefficient, if not outright illegitimate. Of course, the outcomes from political processes can be inefficient and dysfunctional, but this is hardly a law of nature.

For my loyal reader (sadly, not a typo), none of this is news, but it still saddens me that so many potentially interesting problems (like how best to provision air traffic control services) break down on such tired ideological grounds: do you want to make policy based on one-interested-dollar per vote or one-interested-person per vote?

I want us to be much more agnostic and much more empirical in these kinds of debates. Sometimes markets get good/bad outcomes, sometimes politics does.

For example, you might never have noticed that you can’t fly Lufthansa or Ryanair from San Francisco to Chicago. That’s because there are “cabotage” laws in the US that bar foreign carriers from offering service between US cities. Those laws are blatantly anti-competitive, and the flying public is definitely harmed by them. This is a political outcome I don’t particularly like, due in part to Congress paying better attention to the airlines than to the passengers. Yet, I’m not quite ready to suggest that politics does not belong in aviation.

Or, in terms of energy regulation, it’s worth remembering that we brought politics into the equation a very long time ago because “the market” was generating pretty crappy outcomes, too. What I’m saying is that neither approach has exclusive rights to dysfunction.

OK. Let’s get back to ATC and the WSJ piece.

In it, the WSJ makes frequent reference to Canada’s ATC organization, NavCanada, which was privatized during a budget crunch a few years back and has performed well since then. This is in contrast to an FAA that has repeatedly “failed to modernize.”

But the US is not Canada, and our air traffic situation is very different. A lot of planes fly here! Anyone who has spent any serious time looking at our capacity problems knows that the major source of delay in the US is insufficient runways and terminal airspace, not control capabilities per se. That is to say, modernizing the ATC system so that aircraft could fly more closely using GPS position information doesn’t really buy you all that much if the real crunch is access to the airport. If you are really interested, check out this comparison of US and European ATC performance (the two have very different sources of flight delays). The solution in the US is pouring more concrete in more places, not necessarily a revamped ATC. (It’s not that ATC equipment could not benefit from revamping, only that it is not the silver bullet promised.)

Here’s another interesting mental exercise: imagine you have developed new technology to improve the throughput of an ATC facility by 30%, but the hitch is that when you deploy the technology, there will be a diminution in performance during the switchover, as human learning, inevitable hiccups, and the need to temporarily run the old and new systems in parallel take their toll. Now imagine that you want to deploy that technology at a facility that is already operating near its theoretical maximum capacity. See a problem there? It’s not an easy thing.

Another issue in the article regards something called ADS-B (Automatic Dependent Surveillance – Broadcast), a system by which aircraft broadcast their GPS-derived position. Sounds pretty good, and yet the US has taken a long time to get it going widely. (It’s not required on all aircraft until 2020.) Why? Well, one reason is that a lot of the potential cost savings from switching to ADS-B would come from the retirement of expensive, old primary radars that “paint” aircraft with radio waves and sense the reflected energy. Thing is, primary radars can see metal objects in the sky, while ADS-B receivers only see aircraft that are broadcasting their position. You may have heard how, in recent hijackings, transponders were disabled by the pilot; so, though the system is cool, it certainly cannot alone replace the existing surveillance systems. The benefits are not immediate and large, and it leaves some important problems unsolved. Add in the high cost of equipage, and it was an easy target to delay. But is that a sign of dysfunction or of good decision-making?

All of which is to say that I’m not sure a privately run organization, facing similar constraints, would make radically different decisions than the FAA has.

Funding the system is an interesting question, too. Yes, a private organization that can charge fees has a reliable revenue stream and is thus able to go to financial markets to borrow for investment. This is in contrast to the FAA, which has had a hard time funding major new projects because of constant congressional budget can-kicking. Right now the FAA is operating on an extension of its existing authorization (from 2012), and a second extension is pending, with a real reauthorization still behind that. OK, so score one for a private organization. (Unless we can make Congress function again, at least.)

But what happens to privatized ATC if there is a major slowdown in air travel? Do investments stop, or is service degraded due to cost cutting, or does the government end up lending a hand anyway? And how might an airline-fee-based ATC operate differently from one that ostensibly serves the public? Even giving privatization proponents the benefit of the doubt that a privatized ATC would be more efficient and better at cost saving, would such an organization be as good at spending more money when an opportunity comes along to make flying safer, faster, or more convenient for passengers? How about if the costs of such changes fall primarily on the airlines, through direct equipage costs and ATC fees? Or, imagine a scenario where most airlines fly large aircraft between major cities, and an upstart starts flying lots of small aircraft between small cities. Would a privatized ATC or a publicly funded ATC better resist the airlines’ anti-competitive pressure to erect barriers to newcomers?

I actually don’t know the answers. The economics of aviation are somewhat mysterious to me, as they probably are to you, unless you’re an economist or an operations researcher. But I’m pretty sure Scott McCartney of the WSJ knows even less.