DIY Scale Dependence

What’s the big deal about “scale?” It’s a word that I’ve written about before here, and one that certain types of ecologists can’t seem to stop talking about. But it can be an infuriatingly hard word to pin down, given that it can have more than one meaning, even in technical usage. And the fact that scale-dependent thinking applies to such a staggeringly wide range of phenomena, while a testament to its relevance, hardly helps in nailing down what it means. As a grad-student friend of mine, no slouch when it comes to quantitative reasoning, asked me recently, “What’s the deal with ‘scale?’ It all just seems kinda theoretical to me.”

Let’s take a concrete example. Say you’re interested in how seabirds forage for small fish, and so you put small GPS recorders on some birds which record their position every second. Examining the tracks, one of them looks like this:

Since I don’t have any real data, I had to make some up. For simplicity’s sake, I used a random walk: at each step, the bird moves a random distance from its current position.[1] The path starts on the right side near the middle, and then wanders around in a counterclockwise loop, ending up in the top right. This isn’t a terrible approximation of a bird swooping around its foraging grounds, searching for fish. If you use R, you can reproduce the above using these commands.

set.seed(10)
x <- cumsum(rnorm(500, 0, 5))
y <- cumsum(rnorm(500, 0, 5))
plot(x, y, type='o', cex=0.2, xlab='x (meters)', ylab='y (meters)')

An important question in foraging ecology is whether the animal is taking in more energy from eating than it is spending on running/swimming/flying around looking for food. To get an estimate of how many calories the bird has burned, you calculate how far it has flown. Figure out how much the x and y change between each time step, and use the Pythagorean theorem to get the distance traveled.

distance <- sum(sqrt(diff(x)^2 + diff(y)^2))

This gives a total distance flown of 3114.4 meters.

But suppose that because of budget cuts to the National Science Foundation, your research grant was actually 30% smaller. As a result, you could only afford a cheaper version of the GPS tag, which records the bird's position every 2 seconds instead of every one second. No big deal, right? Calculate the distance, carry on:

# R indices start at 1 (an index of 0 is silently dropped), so start at 2
subsample <- seq(2, 500, by=2)
x.2 <- x[subsample]
y.2 <- y[subsample]
distance.2 <- sum(sqrt(diff(x.2)^2 + diff(y.2)^2))

Whoa! That gives a total distance of only 2256.0 meters. What happened? The bird flew the exact same path, but when you sample every two seconds instead of one, it appears to have flown 858 meters less. How does that work? Looking at the new path overlaid on the old one helps:

plot(x, y, col='grey', type='o', cex=0.2, xlab='x (meters)', ylab='y (meters)')
lines(x.2, y.2, type='o', cex=0.2)

Because it skips every other observation, the new path (in black) avoids some of the random squiggliness in the old one (grey). The total measured length is 28% shorter. This is important, because if you were calculating the energy use by the bird, you would get a very different answer with a 1-second and 2-second resolution. Your conclusions would be incomplete at best, and wrong at worst.
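Here is a minimal sketch of the same idea wrapped in a function, assuming a hypothetical helper called path_length (my name, not something from a package). It relies on the triangle inequality: cutting a corner can never make a path longer, so a resampled track can only shrink or stay the same. A perfectly straight track has no corners to cut, so its length does not depend on the sampling interval at all.

```r
# Hypothetical helper: total path length when keeping every `step`-th fix,
# starting from the first one.
path_length <- function(x, y, step = 1) {
  keep <- seq(1, length(x), by = step)
  sum(sqrt(diff(x[keep])^2 + diff(y[keep])^2))
}

# A perfectly straight track: 501 fixes, 1 meter apart along the x axis.
straight.x <- 0:500
straight.y <- rep(0, 501)
straight.1 <- path_length(straight.x, straight.y, step = 1)
straight.2 <- path_length(straight.x, straight.y, step = 2)

# The random-walk track from the post.
set.seed(10)
x <- cumsum(rnorm(500, 0, 5))
y <- cumsum(rnorm(500, 0, 5))
walk.1 <- path_length(x, y, step = 1)
walk.2 <- path_length(x, y, step = 2)
```

The two straight-line lengths are identical, while walk.2 is shorter than walk.1. Note that this helper keeps the odd-numbered fixes, so walk.2 will not exactly match the figure quoted above, which was computed from the even-numbered ones.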

But if you're a curious person, you might start wondering what the calculated distance would be at other resolutions. Instead of waiting around passively for the next round of corporate tax cuts before enacting your experiment, you decide to be proactive, and rush to your computer. After dashing off an angry email to your representative, you run a simple simulation, resampling the bird's track at different resolutions, and then plotting the total distance measured as a function of sampling scale.

scales <- 1:200
distances <- rep(0, length(scales))
for (i in seq_along(scales)) {
  subsample <- seq(1, 500, by=scales[i])
  x.sub <- x[subsample]
  y.sub <- y[subsample]
  distances[i] <- sum(sqrt(diff(x.sub)^2 + diff(y.sub)^2))
}
plot(scales, distances)

This graph shows very clearly what we mean when we talk about "scale-dependence." The distance measured depends, in a very real sense, on the scale of measurement. Fortunately, this relationship appears quite regular and predictable. It is, in fact, an example of a power law. Power laws are one of those mathematical relationships that show up all over the place in nature. Ratio of heart rate to body size? Velocity spectrum in a turbulent flow? Distribution of income among the richest Americans? Gravity? All power laws. In our case, the relationship looks like this:

D = a \, s^b

...which means, simply, that the distance measured is proportional to the scale of measurement raised to a power b. These constants, a and b, are easy to estimate. Taking the log of both sides of the equation, we can transform it from a power law to a straight line:

\ln(D) = \ln(a) + b \ln(s)

and from there, we can do a regular linear regression using R's "lm" function.

reg.distance <- lm(log(distances) ~ log(scales))
summary(reg.distance)

In this fit, because it is a log-transform of the real relationship, the slope (-0.528) is actually the value of the exponent in the power law, and the y-intercept (8.260) is the natural log of the coefficient a. We transform it back to get the true power law relation.

D = e^{8.260} \, s^{-0.528}  = 3867.331 \times s^{-0.528}
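Rather than copying the numbers out of the summary table by hand, you can pull them from the fitted model. This is a sketch that rebuilds the simulation so it runs on its own; the names fit, a, b, and fitted.distances are mine, and coef() is the standard R accessor for model coefficients.

```r
# Rebuild the simulated track and the distance-vs-scale curve from the post.
set.seed(10)
x <- cumsum(rnorm(500, 0, 5))
y <- cumsum(rnorm(500, 0, 5))
scales <- 1:200
distances <- sapply(scales, function(s) {
  keep <- seq(1, 500, by = s)
  sum(sqrt(diff(x[keep])^2 + diff(y[keep])^2))
})

# Fit the straight line on log-log axes.
reg.distance <- lm(log(distances) ~ log(scales))

# coef() returns c(intercept, slope) for this model.
fit <- unname(coef(reg.distance))
a <- exp(fit[1])   # back-transform the intercept, ln(a), to get a
b <- fit[2]        # the slope is already the exponent

# Distances the fitted power law predicts at each sampled scale.
fitted.distances <- a * scales^b
```

Comparing fitted.distances against distances (for example, by plotting one over the other) is a quick way to see where the power law describes the track well and where it starts to drift.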

This relationship can be generalized a bit further as follows:

\frac{D_1}{D_2} = \left( \frac{s_1}{s_2} \right) ^ {-0.528}

What this says is that if you change the sampling resolution by some factor, the measured distance does not change by that same factor: it changes by that factor raised to the -0.528 power. Assuming the relationship is the same for the other bird tracks (which we need to check), we now have a general expression for how the measured distance changes with the scale of measurement. So, within the range of scales where the relationship holds, we can predict what the distance would have been had we measured at another scale.
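As a quick worked example of the ratio form, here is a sketch using a hypothetical helper, rescale_distance (my name), which rearranges the expression above to D_2 = D_1 (s_2 / s_1)^b. It takes the 1-second distance and the fitted exponent quoted in this post as given.

```r
# Hypothetical helper: convert a distance measured at scale s1 into the
# distance expected at scale s2, for a power law with exponent b.
rescale_distance <- function(D1, s1, s2, b) {
  D1 * (s2 / s1)^b
}

# The 1-second distance from the post, re-expressed at 2-second resolution,
# using the exponent from the regression above.
predicted.2 <- rescale_distance(3114.4, s1 = 1, s2 = 2, b = -0.528)

# Going back the other way recovers the original distance.
recovered.1 <- rescale_distance(predicted.2, s1 = 2, s2 = 1, b = -0.528)
```

The prediction comes out at roughly 2160 meters, compared with the 2256.0 meters actually measured at 2-second resolution. It lands in the right neighborhood but is not exact, which is about what you should expect from a regression fitted across all 200 scales of a single noisy track.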

In the real world, scaling relationships like this won't always appear, and when they do, they will usually only apply over a particular range of scales. Still, if you can deduce a scaling relationship, it's a powerful tool for reasoning about your problem, and may clarify what used to look like inconsistencies. Moral of the (somewhat artificial) story? Sample at multiple scales, or at high enough resolution that you can resample your data at a lower resolution, as I did here. And keep your mind open to the idea of different patterns and processes at different scales—what you see at first is not always the whole story!

[1] In physics, this kind of random walk is known as Brownian motion, and has all kinds of interesting properties.
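One of those properties gives a back-of-the-envelope check on the fitted exponent. This is only a sketch, and it assumes the idealized walk used in this post: independent Gaussian steps with standard deviation \sigma = 5 meters on each axis, N = 499 one-second steps, and roughly N/s segments when sampling every s seconds.

```latex
% Over s steps, the displacement on each axis is Gaussian with standard
% deviation \sigma \sqrt{s}, so the straight-line segment length \ell_s
% is Rayleigh-distributed, with mean
E[\ell_s] = \sigma \sqrt{s} \, \sqrt{\frac{\pi}{2}}

% The resampled path is made of about N/s such segments, so
E[D(s)] \approx \frac{N}{s} \, \sigma \sqrt{\frac{\pi s}{2}}
        = N \sigma \sqrt{\frac{\pi}{2}} \; s^{-1/2}
```

Under these assumptions the expected exponent is -1/2, and the expected distance at s = 1 is about 3127 meters, close to the 3114.4 meters measured. The fitted values (-0.528 and 3867) differ somewhat from this idealized expectation, which is not surprising for a regression on a single finite track.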


5 Responses to DIY Scale Dependence

  1. gabe says:

    This exists in my stuff too. For example: there has been endless debate over whether American slavery was capitalist.

    On the one hand: it was profitable, fully integrated into the world market, and involved one class of people confiscating all the stuff made by another class of people, selling the stuff, and putting a fraction of the proceeds back into the people who made the stuff in the first place, so that they can eat, and survive to make more stuff. Other parts of the proceeds go toward a better or simply larger process for making more stuff. That looks, basically, like capitalism.

    On the other hand: it’s not just profit that makes capitalism capitalism, it’s the centrality of the “wage relationship.” In feudalism, the lord had to rely on extra-economic coercion to squeeze stuff out of the peasant. If the lords had no swords, the peasants would just keep all the stuff they made. Confiscation of surplus value rested on the threat of lordly violence. (Indeed, all the feudal hierarchy really amounted to was a system for distributing upward the value created by peasants, to increasingly powerful parasites: to local lords, thence to barons, dukes, kings, etc.) But a capitalist doesn’t need guns to make workers work; they don’t have land, so to eat, they have to work. The economy itself does the coercion. This, of course, leaves slavery — definitionally violent and coercive — looking pretty profoundly non-capitalist. And even the “peaceful” or “ordinary” features of the relationship between master and slave are intellectually troubling on this count as well: in America, compared to other slave systems, it was notably “paternalist.” This isn’t to say benevolent; rather, the politics of the plantation consisted of masters granting privileges to slaves in order to procure labor discipline and prevent rebellion; in turn, slaves learned to work this system, strategically being undisciplined or rebellious, both because they were after the small-time gains that they could win this way, and, of course, because they actually were rebellious. Here too we have a set of economic relationships that are regulated by distinctly non-market forces.

    Yet if we see it on a different scale, the picture snaps into focus. As European empires spread over the globe (a phenomenon driven by the rise of capitalism), they invariably seemed to generate apparently non-capitalist labor systems on their margins. What does it mean that cotton mills in Lancashire, industrial capitalism of the classic sort, were made possible by the “non-capitalist” slave plantations of Mississippi and Alabama? Or that, as the English poor were uprooted from the land and concentrated in the cities, becoming industrial workers, the cheap calories that fueled their workdays came from sugar in their tea, from cane grown in Jamaica? (This is the cultural origin of British tea-drinking, incidentally.) Or, for that matter, that many enormous fortunes of merchant capitalists (including the endowment of Brown University) were made by transporting human beings from West Africa into bondage in the “non-capitalist” slave systems of the New World?

    At this level of scale — sometimes called “world-systems analysis” — the question of whether the plantation itself was capitalist seems to fade away. It becomes clear, as one historian puts it, that plantation slavery may not have been “of capitalism,” but it was surely “in it.” We might go a step further, and say that slavery may not have been strictly capitalist, but there is no imagining capitalism without slavery, or slavery without capitalism.

  2. Carly Miller says:

    Hi,

    I wanted to ask you a question about your site. Would you mind emailing me: carlymiller687@gmail.com

    Thanks,
    Carly

  3. Mom says:

    Smokin’, Sam (and Gabe).

  4. Al Dove says:

    Does this apply to other movement patterns like Levy walks? I assume it does. If so, how does one pick the right temporal scale of data collection? This is an important practical consideration for someone like me who deploys these tags from time to time.

    • Sam says:

      Yes, it will apply for Levy walks…basically, for any kind of rough or convoluted path. Choosing a scale for data analysis or collection will depend on your questions. If you’re interested in long seasonal migrations, sampling once a day might be enough. If you’re interested in the time or distance between food patches, you just need to sample frequently enough to recognize when the animal has gone into “foraging mode.” And at a small enough scale, the animal’s path will probably smooth out and stop looking like a random walk. If you sample (or resample) at multiple scales, you may be able to find an empirical scaling relation that allows you to extrapolate to a scale you didn’t sample at (with the caveat that the scaling relation is probably only valid over a certain range of scales). Of course this might all be a moot point if the tag can only get a fix once a day, or whenever the animal surfaces.
