Apparently, both Elon Musk and Neil deGrasse Tyson believe that we are probably living in a more advanced civilization’s computer simulation.
Now, I’m no philosopher, so I can’t weigh in on whether I really exist, but it does occur to me that if this is a computer simulation, it sucks. First, we have cruelty, famine, war, natural disasters, disease. On top of that, we do not have flying cars, or flying people, or teleportation for that matter.
Seriously, whoever is running this advanced civilization simulation must be into some really dark shit.
I think we all make mental models constantly — simplifications of the world that help us understand it. And for services on the Internet, our mental models are probably very close — logically, if not in implementation — to the reality of what those services do. If not, how could we use them?
I like to imagine how the service works, too. I don’t know why I do this, but it makes me feel better about the universe. For a lot of things, to a first approximation, the what and the how are sufficiently close that they are essentially the same model. And sometimes a model of how it works eludes me entirely.
For example, my model of email is that an email address is the combination of a username and a system name. My mail server looks up the destination mail server and IP-routes my blob of text to it, where that server routes it to the appropriate user’s “mailbox,” which is a file. Which is indeed how it works, more or less, with lots of elision of what I’m sure are important details.
I’ve also begun sorting my mental models of Internet companies and services into a taxonomy that has subjective meaning for me, based on how meritorious and/or interesting they are. Here’s a rough draft:
- As in real life: Very glad these exist, but nobody deserves a special pat on the back for them. I’ll add most matchmaking services, too.
- Non-obvious, but simple and/or elegant (e.g., Google Search’s PageRank): High regard. Basically, this sort of thing has been the backbone of Internet value to date.
- Not obvious / inscrutable: Lack of popularity kills these. Not much to talk about.
- Obvious and simple: Society rewards these, but technically they are super-boring to me.
- Non-obvious and complex (e.g., natural language, machine translation, face recognition): Potentially very exciting, but not really very pervasive or economically important just yet. Potentially creepy and may represent the end of humanity’s reign on earth.
Google search is famously straightforward. You’re searching for some “thing,” and Google is combing a large index for that “thing.” Back in the Altavista era, that “thing” was just keywords on a page. Google’s first innovation was to use a site’s own popularity (as measured by who links to it and the rank of those linkers) to help sort the results. I wonder how many people had some kind of mental model of how Google worked that was different from that of Altavista — aside from the simple fact that it worked much “better.” The thing about Google’s PageRank was that it was quite simple, and quite brilliant, because, honestly, none of the rest of us thought of it. So kudos to them.
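The whole trick fits in a toy program (a sketch of the idea, certainly not Google’s actual code):

#include <cstdio>
#include <vector>

int main() {
    // links[i] = the pages that page i links to (a tiny made-up web)
    std::vector<std::vector<int>> links = {{1, 2}, {2}, {0}, {2}};
    const int n = (int)links.size();
    const double d = 0.85;  // the usual damping factor
    std::vector<double> rank(n, 1.0 / n);

    // power iteration: each page repeatedly passes its rank
    // to the pages it links to
    for (int iter = 0; iter < 50; iter++) {
        std::vector<double> next(n, (1.0 - d) / n);
        for (int i = 0; i < n; i++)
            for (int j : links[i])
                next[j] += d * rank[i] / links[i].size();
        rank = next;
    }
    for (int i = 0; i < n; i++)
        std::printf("page %d: %.3f\n", i, rank[i]);
}

A popular page’s links count for more, because its own rank is what it passes along; that is the elegant, recursive part.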
There have been some Internet services I’ve tried over the years that I could not quite understand. I’m not talking about how they work under the hood, but how they appear to work from my perspective. Remember Google “Buzz?” I never quite understood what that was supposed to be doing.
Facebook, in its essence, is pretty simple, too, and I think we all formed something of a working mental model for what we think it does. Here’s mine, written up as SQL code (a minimal sketch; the names are illustrative). First, the system is composed of a few tables:
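-- a minimal sketch; table and column names are illustrative
CREATE TABLE users   (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE friends (user_id INTEGER, friend_id INTEGER);
CREATE TABLE posts   (post_id INTEGER PRIMARY KEY, user_id INTEGER,
                      posted_at TIMESTAMP, content TEXT);

-- "the wall": recent posts by you and your friends
SELECT p.*
FROM posts p
WHERE p.user_id = :me
   OR p.user_id IN (SELECT friend_id FROM friends WHERE user_id = :me)
ORDER BY p.posted_at DESC;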
You could build an FB clone with that code alone. It is eye-rollingly boring and unclever.
Such an implementation would die when you got past a few thousand users or posts, but with a little work and modern databases that automatically shard and replicate, etc., you could probably handle a lot more. Helping FB is the fact that they make no promises about correctness: a post you make may or may not ever appear on your friend’s wall, etc.
I think the ridiculous simplicity of this is why I have never taken Facebook very seriously. Obviously it’s a gajillion-dollar idea, but technically, there’s nothing remotely creative or interesting there. Getting it all to work for a billion users making a billion posts a day is, I’m sure, a huge technical challenge, but not one requiring inspiration. (As an aside, today’s FB wall is not so simple. It uses some algorithm to rank and highlight posts. What’s the algorithm, and why and when will my friends see my post? Who the hell knows?! Does this bother anybody else but me?)
The last category is things that are reasonably obviously useful to lots of people, but how they work is pretty opaque, even if you think about it for a while. That is, things where we can form a mental model of what they are, but mere mortals cannot form one of how they work. Machine translation falls into that category, and maybe all the new machine learning and future AI apps do, too.
It’s perhaps “the” space to watch, but if you ask me the obvious what / simple how isn’t nearly exhausted yet — as long as you can come up with an interesting “why,” that is.
Okay, that’s pretty cool. But then Sundar Pichai has to go ahead and say:
This is roughly equivalent to fast-forwarding technology about seven years into the future (three generations of Moore’s Law).
No, no, no, no, no.
First of all, Moore’s Law is not about performance. It is a statement about transistor density scaling, and this chip isn’t going to move that needle at all — unless Google has invented its own semiconductor technology.
Second, people have been developing special-purpose chips that solve a problem way faster than a general-purpose microprocessor can since the beginning of chip-making. It used to be that pretty much anything computationally interesting could not be done in a processor at all: graphics, audio, modems, you name it, all used to be done in dedicated hardware. Such chips are called application-specific integrated circuits (ASICs) and, in fact, the design and manufacture of ASICs is more or less what gave Silicon Valley its name.
So, though I’m happy that Google has a cool new chip (and that they finally found an application that they believe merits making a custom chip), I wish the tech press weren’t so gullible as to print any dumb thing that a Google rep says.
I’ll take one glimmer of satisfaction from this, though. And that is that someone found an important application that warrants novel chip design effort. Maybe there’s life for “Silicon” Valley yet.
Short post here. I notice people are writing about self-driving cars a lot. There is a lot of excitement out there about our driverless future.
I have a few thoughts, to expand on at a later day:
Apparently a lot of economic work on driving suggests that a major externality of driving is congestion. Simply, your being on the road slows down other people’s trips and causes them to burn more gas. It’s an externality because it is a cost of driving that you cause but don’t pay.
Now, people are projecting that a future society of driverless cars will make driving cheaper by 1) eliminating drivers (duh) and 2) getting more utilization out of cars. That is, mostly, our cars sit in parking spaces, but in a driverless world, people might not own cars so much anymore, but rent them by the trip. Such cars would be much better utilized and, in theory, cheaper on a per-trip basis.
So, if I understand my micro econ at all, people will use cars more because driving will be cheaper. All else equal, that should increase congestion, since, in our model, congestion is an externality: drivers don’t pay for it, so nothing pushes back on the extra trips. Et voilà, a bad outcome.
But, you say, driverless cars will operate more efficiently, and make more efficient use of the roadways, and so they generate less congestion than stupid, lazy, dangerous, unpredictable human drivers. This may be so, but I will caution with a couple of ideas. First, how much less congestion will a driverless trip cause than a user-operated one? 75% as much? Half? Is this enough to offset the effect mentioned above? Maybe.
But there is something else that concerns me: the difference between soft- and hard-limits.
Congestion, as we experience it today, seems to come on gradually as traffic approaches certain limits. You’ve got cars on the freeway; you add cars; things get slower. Eventually, things somewhat suddenly get a lot slower, but even then only at certain times of the day, in certain weather, etc.
Now enter driverless cars that utilize capacity much more effectively. Huzzah! More cars on the road getting where they want, faster. What worries me is that what is really happening is not that the limits are raised, but that we are operating the system much closer to the existing, real limits. Furthermore, now that automation is sucking all the marrow from the road bone — the limits become hard walls, not gradual at all.
So, imagine traffic is flowing smoothly until a malfunction causes an accident, or a tire blows out, or there is a foreign object in the road — and suddenly the driverless cars sense the problem, resulting in a full-scale insta-jam, perhaps of epic proportions, in theory, locking up an entire city nearly instantaneously. Everyone is safely stopped, but stuck.
And even scarier than that is the notion that the programmers did not anticipate such a problem, and the car software is not smart enough to untangle it. Human drivers, for example, might, in an unusual situation, use shoulders or make illegal U-turns in order to extricate themselves from a serious problem. That’d be unacceptable in a normal situation, but perhaps the right move in an abnormal one. Have you ever had a cop at the scene of an accident wave at you to do something weird? I have.
Will self-driving cars be able to improvise? This is an AI problem well beyond that of “merely” driving.
Speaking of capacity and efficiency, I’ll be very interested to see how we make trade-offs of these versus safety. I do not think technology will make these trade-offs go away at all. Moving faster, closer will still be more dangerous than going slowly far apart. And these are the essential ingredients in better road capacity utilization.
What will be different will be how and when such decisions are made. In humans, the decision is made implicitly by the driver moment by moment. It depends on training, disposition, weather, light, fatigue, even mood. You might start out a trip cautiously and drive more recklessly later, like when you’re trying to eat fast food in your car. The track record for humans is rather poor, so I suspect that driverless cars will do much better overall.
But someone will still have to decide what the right balance of safety and efficiency is, and it might be taken out of the hands of passengers. This could go different ways. In a liability-driven culture we may end up with a system that is safer but maybe less efficient than what we have now (call it “little old lady mode”), or we could end up with decisions by others forcing us to take on more risk than we’d prefer if we want to use the road system.
I recently read in the June IEEE Spectrum (no link, print version only) that some people are suggesting that driverless cars will be a good justification for the dismantlement of public transit. Wow, that is a bad idea of epic proportions. If, in the first half of the 21st century, the world not only continues to embrace car culture, but doubles down to the exclusion of other means of mobility, I’m going to be ill.
* * *
That was a bit more than I had intended to write. Anyway, one other thought is that driverless cars may be farther off than we thought. In a recent talk, Chris Urmson, the director of the Google car project, explains that the driverless cars of our imaginations — the fully autonomous, all-conditions, all-mission cars — may be 30 years off or more. What will come sooner is a succession of technologies that will reduce driver workload.
So, I suspect we’ll have plenty of time to think about this. Moreover, the nearly 7% of our workforce that works in transportation will have some time to plan.
One thing that always annoys me when working on a project in a language like C++ is that when I’m debugging, I’d like to print messages with meaningful names for the enumerated types I’m using.
The classic way to do it is something like this:
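// an illustrative enum; the idiom is what matters
enum Event { ev_failed, ev_succeeded };

const char *event_name(Event e)
{
    switch (e) {
    case ev_failed:    return "ev_failed";
    case ev_succeeded: return "ev_succeeded";
    default:           return "unknown";
    }
}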
Note that I have perhaps too-cleverly left out the break statements because each case returns.
But this has problems:
- Maintenance: whenever you change the enum, you have to remember to change the debug function.
- It just feels super-clunky.
I made a little class in C++ that I like a bit better, because you only have to write the wrapper code once, even to use it on a bunch of different enums. Also, you can hide the code part in another file and never see or think about it again.
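The shape of it is something like this (a sketch; the identifiers are illustrative):

#include <map>
#include <string>
#include <utility>

// write this wrapper once...
template <typename E>
class EnumPrinter {
public:
    explicit EnumPrinter(std::map<E, std::string> names)
        : names_(std::move(names)) {}
    std::string operator()(E e) const {
        auto it = names_.find(e);
        return it == names_.end() ? "unknown" : it->second;
    }
private:
    std::map<E, std::string> names_;
};

// ...then use it on any enum
enum Event { ev_failed, ev_succeeded };
static const EnumPrinter<Event> event_name({
    { ev_failed,    "ev_failed" },
    { ev_succeeded, "ev_succeeded" },
});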
C++11 lets you initialize those maps pretty nicely, and they are static const, so you don’t have to worry about clobbering them or having multiple copies. But overall, it still blows because you have to type those identifiers no fewer than three times: once in the definition and twice in the printer thing.
I Googled a bit and learned about how Boost provides some seriously abusive preprocessor macros, including one that can loop. I don’t know what kind of dark preprocessor magic Boost uses, but it works. Here is the template and some macros:
// the enumerators for our enums, in a slightly weird
// parenthesized list form (a Boost.Preprocessor "sequence";
// enumerators beyond the first are illustrative)
#define EVENTS_LIST (ev_failed)(ev_succeeded)(ev_timed_out)
#define MODES_LIST (mode_frobulate)(mode_reticulate)
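The machinery is roughly this (a sketch of the approach; the macro and type names are mine):

#include <boost/preprocessor/seq/for_each.hpp>
#include <map>
#include <string>

// one enumerator -> "ev_failed," (an entry in the enum body)
#define ENUM_ENTRY(r, data, elem) elem,
// one enumerator -> '{ ev_failed, "ev_failed" },' (an entry in the name map)
#define NAME_ENTRY(r, data, elem) { elem, #elem },

#define DEFINE_NAMED_ENUM(Name, SEQ)                              \
    enum Name { BOOST_PP_SEQ_FOR_EACH(ENUM_ENTRY, _, SEQ) };      \
    static const std::map<Name, std::string> Name##_names = {     \
        BOOST_PP_SEQ_FOR_EACH(NAME_ENTRY, _, SEQ) };

DEFINE_NAMED_ENUM(Event, EVENTS_LIST)
DEFINE_NAMED_ENUM(Mode, MODES_LIST)

// e.g., Event_names.at(ev_failed) == "ev_failed"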
Now I only have to list out the enumerators one time! Not bad. However, it obviously only works if you control the enum. If you are importing someone else’s header with the definition, it still has the maintenance problem of the other solutions.
I understand that the C++ template language is Turing-complete, so I suspect this can be done entirely with templates and no macros, but I wouldn’t have the foggiest idea how to start. Perhaps one of you does?
Most languages have join() functions. (As well as semicolons, tabs, and braces, of course.)
join() is a simple function that joins an array of strings into one long string, sticking a separator in between, if you want.
"this_that_other" . Pretty simple.
join() as a built-in, and it has an old-school non object interface.
Python is object-orienty, so it has an object interface:

",".join(["this", "that", "other"])

What’s interesting here is that join is a member of the string class, and you call it on the separator string. So you are asking the string "," to join up the things in that array. OK, fine.
I was surprised to see that C++ does not include join in its standard library, even though it has the underlying pieces in <string>. I made up a little one like this:
Now that’s no beauty queen. The function does double-duty to make it a bit easier to allocate space for the resulting string. You call it first without a target pointer and it will return the size you need (not including the terminating null). Then you call it again with the target pointer for the actual copy.
There are a lot of good rants out there already, so I don’t think I can really add much, but I just want to say that this does sadden me. It’s not about analog v. digital per se, but about simple v. complex and open v. closed.
The headphone jack is a model of simplicity. Two signals and a ground. You can hack it. You can use it for other purposes besides audio. You can get a “guzinta” or “guzoutta” adapter to match pretty much anything in the universe old or new — and if you can’t get it, you can make it. Also, it sounds Just Fine.
Now, I’m not just being anti-change. Before the 1/8″ stereo jack, we had the 1/4″ stereo jack. And before that we had mono jacks, and before that, strings and cans. And all those changes have been good. And maybe this change will be good, too.
But this transition will cost us something. For one, it just won’t work as well. USB is actually a fiendishly complex specification, and you can bet there will be bugs. Prepare for hangs, hiccups, and snits. And of course, none of the traditional problems with headphones are eliminated: loose connectors, dodgy wires, etc. On top of this, there will be, sure as the sun rises, digital rights management, and multiple attempts to control how and when you listen to music. Prepare to find headphones that only work with certain brands of players and vice versa. (Apple already requires all manufacturers of devices that want to interface digitally with the iThings to buy and use a special encryption chip from Apple — under license, natch.)
And for nerd/makers, who just want to connect their hoozyjigger to their whatsamaducky, well, it could be the end of the line entirely. For the time being, while everyone has analog headphones, there will be people selling USB-C audio converter thingies — a clunky, additional lump between devices. But as “all digital” headphones become more ubiquitous, those adapters will likely disappear, too.
Of course, we’ll always be able to crack open a pair of cheap headphones and steal the signal from the speakers themselves … until the neural interfaces arrive, that is.
EDIT: 4/28 8:41pm: Actually, the USB-C spec does allow analog on some of the pins as a “side-band” signal. Not sure how much uptake we’ll see of that particular mode.
Last week, I came across this interesting piece on the perils of using “big data” to draw conclusions about the world. It analyzes, among other things, the situation of Google Flu Trends, the much heralded public health surveillance system that turned out to be mostly a predictor of winter (and has since been withdrawn).
It seems to me that big data is a fun place to explore for patterns, and that’s all good, clean fun — but the moment you think you have discovered something new is when the actual work really starts. I think “data scientists” are probably on top of this problem, but are most of the people going on about big data actually data scientists?
I really do not have all that much to add to the article, but I will amateurishly opine a bit about statistical inferencing generally:
I’ve taken several statistics courses over my life (high school, undergrad, grad). In each one, I thought I had a solid grasp of the material (and got an “A”), until I took the next one, where I realized that my previous understanding was embarrassingly incorrect. I see no particular reason to think this pattern would ever stop if I took ever more stats classes. The point is, stats is hard. Big data does not make stats easier.
If you throw a bunch of variables at a model, it will find some that look like good predictors. This is true even if the variables are totally and utterly random and unrelated to the dependent variable (see the try-it-at-home experiment below). Poking around in big data, unfortunately, only encourages people to do this and perhaps draw conclusions when they should not. So, if you are going to use big data, do have a plan in advance. Know what effect size would be “interesting” and disregard things well under that threshold, even if they appear to be “statistically significant.” Determine in advance how much power (and thus, observations) you should have to make your case, and sample from your ginormous set to a more appropriate size (there’s a little power calculation after this list).
Big data sets seem like they were mostly created for other purposes than statistical inferencing. That makes them a form of convenience data. They might be big, but are the variables present really what you’re after? And was this data collected scientifically, in a manner designed to minimize bias? I’m told that collecting a quality data set takes effort (and money). If that’s so, it seems likely that the quality of your average big data set is low.
A lot of big data comes from log files from web services. That’s a lame place to learn about anything other than how the people who use those web services think while they’re using them. It tells you nothing about people who don’t use the service, or even about the users while they’re doing something other than using that web service. Just sayin’.
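About that “plan in advance” point: R will even do the power arithmetic for you. A quick sketch (the effect size and power targets here are made-up examples):

# how many observations per group would I need to detect a difference
# of 0.2 standard deviations, with 80% power, at the 5% level?
power.t.test(delta = 0.2, sd = 1, sig.level = 0.05, power = 0.8)
# -> about 394 per group; millions of rows mostly buy you the power
#    to "detect" effects too small to care about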
Well, anyway, I’m perhaps out of my depth here, but I’ll leave you with this quick experiment, in R:
# 10,000 observations of 200 candidate predictors, all uniform noise
rows = 10000
vars = 200
x = data.frame(replicate(vars, runif(rows, 0, 1)))
# the dependent variable: also pure noise, unrelated to anything in x
y = runif(rows, 0, 1)
# regress y on all 200 noise variables and inspect the stars
a.mod = lm(y ~ ., x)
summary(a.mod)
It generates 10,000 observations of 201 variables, each generated from a uniform random distribution on [0,1]. Then it runs an OLS model using one variable as the dependent and the remaining 200 as independents. R is even nice enough to put friendly little asterisks next to variables that have low p-values.
When I run it, I get 10 variables that appear to be better than “statistically significant at the 5% level” — even though the data is nothing but pure noise. This is just what one should expect from random noise: 200 variables × 0.05 = 10 false positives.
Of course, the R² of the resulting model is ridiculously low (that is, the 200 variables together have low explanatory power). Moreover, the effect sizes of the variables are small. All as it should be — but you do have to know to look. And in a more subtle case, you can imagine what happens if you build a model with a bunch of variables that do have explanatory power, and a bunch more that are crap. Then you will see a nice R² overall, but some of your crap will still pop up.
I’ve known for some time that the semiconductor (computer chip) business has not been the most exciting place, but it still surprised me and bummed me out to see that Intel was laying off 11% of its workforce. There are lots of theories about what is happening in non-cloudy-software-appy tech, but I think fundamentally, the money is being drained out of “physical” tech businesses. The volumes are there, of course. Every gadget has a processor — but that processor doesn’t command as much margin as it once did.
A lot of people suggest that the decline in semiconductors is a result of coming to the end of the Moore’s Law epoch. The processors aren’t getting better as fast as they used to, and some argue (incorrectly) hardly at all. This explains the decline, because without anything new and compelling on the horizon, people do not upgrade.
But in order for that theory to work, you also have to assume that the demand for computation has leveled off. This, I think, is almost as monumental a shift as Moore’s Law ending. Where are the demanding new applications? In the past we always seemed to want more computer (better graphics, snappier performance, etc) and now we somewhat suddenly don’t. It’s like computers became amply adequate right about the same time that they stopped getting much better.
Does anybody else find that a little puzzling? Is it coincidence? Did one cause the other, and if so, which way does the causality go?
The situation reminds me a bit of “peak oil,” the early-2000s fear that global oil production would peak and there would be massive economic collapse as a result. Well, we did hit a peak in oil production in the 2008-9 time-frame, but it wasn’t from scarcity; it was from low demand in a faltering economy. Since then, production has been climbing again. But with the possibility of electrified transportation tantalizingly close, we may see true peak oil in the years ahead, driven by diminished demand rather than diminished supply.
I am not saying that we have reached “peak computer.” That would be absurd. We are all using more CPU instructions than ever, be it on our phones, in the cloud, or in our soon-to-be-internet-of-thinged everything. But the ever-present pent up demand for more and better CPU performance from a single device seems to be behind us. Which, for just about anyone alive but the littlest infants, is new and weird.
If someone invents a new activity to do on a computing device that is both popular and ludicrously difficult, that might put us back into the old regime. And given that Moore’s Law is sort of over, that could make for exciting times in Silicon Valley (or maybe Zhongguancun), as future performance will require the application of sweat and creativity, rather than just riding a constant wave. (NB: there was a lot of sweat and creativity required to keep the wave going.)
I’ve been thinking a lot lately about success and innovation. Perhaps it’s because of my lack of success and innovation.
Anyway, I’ve been wondering how the arrow of causality goes with those things. Are companies successful because they are innovative, or are they innovative because they are successful?
This is not exactly a chicken-and-egg question. Google is successful and innovative. It’s pretty obvious that innovation came first. But after a few “game periods,” the situation becomes more murky. Today, Google can take risks and reach further forward into the technology pipeline for ideas than a not-yet-successful entrepreneur could. In fact, a whole lot of their innovation seems not to affect their bottom line much, in part because it’s very hard to grow a new business at the scale of their existing cash cows. This explains (along with impatience and the opportunity to invest in their high-returning existing businesses) Google’s penchant for drowning many projects in the bathtub.
I can think of other companies that had somewhat similar behavior over history. AT&T Bell Labs and IBM TJ Watson come to mind as places that were well funded due to their parent companies’ enormous success (success derived, at least in part, from monopoly or other market power). And those places innovated. A lot. As in Nobel Prizes, patents galore, etc. But despite the productive output of those labs, I don’t think they ever contributed very much to the companies’ success. I mean, the transistor! The solar cell! But AT&T didn’t pursue those businesses, because it already had a huge working business that didn’t have much to do with them. Am I wrong about that assessment? I hope someone more knowledgeable will correct me.
Anyway, that brings me back to the titans of today, Google, Facebook, etc. And I’ll continue to wonder out loud:
are they innovating?
is the innovation similar to their predecessors?
are they benefiting from their innovation?
if not, who does, and why do they do it?
So, this gets back to my postulate, which is that, much more often than not, success drives innovation, and not the reverse. That it ever happens the other way is rare and special.
Perhaps a secondary postulate is that large, successful companies do innovate, but they have weak incentives to act aggressively on those innovations, and so their creative output goes underutilized longer than it might if it had been in the hands of a less successful organization.