Here are the concerns I have put to Professors Oberholzer and Strumpf.
Secondary Issues: Your Instruments
In Table 11 your instruments are different for each album. In Table 12, your instruments have only 17 observations, one for each week. That doesn’t seem like much information with which to explain downloading behavior for 670 albums. I am particularly concerned about the German school holiday variable. To start, I am surprised that the coefficient was even positive. I looked at German school holidays and I see that there are usually 12 days in October plus the typical Christmas holiday. Yet according to Table 3, October is when downloads were lowest. Is there something else going on here? Would you mind providing data on the number of German kids on vacation for each of your 17 weeks?
How important are the files of German school children to American downloaders? We really do not know. You only provide data on the total files of Germans used by Americans.
You suggest that these files are more available to Americans when German school children are on vacation. That is because you assume that German school kids leave their computers on when they are at home. But there are two problems with this story. First, Germany is 7 or so hours ahead of most American time zones, so Americans are sleeping while German school kids are at school. When the German high schooler wakes up at 10 am on his vacation days, even the most resilient American night owls are heading off to bed. The availability of these files doesn’t seem terribly relevant to Americans when they are asleep. Perhaps you would then suggest that Americans download files while sleeping. But then German school kids could just as well leave their computers on while at school. There are numerous possibilities.
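The time-zone arithmetic behind this point can be checked with a short sketch. It assumes standard-time UTC offsets (CET for Germany; EST, CST, MST and PST for the US) and ignores daylight saving time. The placeholder date and the helper name `us_local_hours` are my own illustrative choices, not anything taken from the paper.

```python
from datetime import datetime, timedelta, timezone

# Assumed standard-time offsets from UTC (daylight saving time is ignored).
CET = timezone(timedelta(hours=1), "CET")
US_ZONES = {
    "Eastern": timezone(timedelta(hours=-5), "EST"),
    "Central": timezone(timedelta(hours=-6), "CST"),
    "Mountain": timezone(timedelta(hours=-7), "MST"),
    "Pacific": timezone(timedelta(hours=-8), "PST"),
}

# Arbitrary placeholder date; with fixed offsets only the hour matters.
PLACEHOLDER_DATE = (2000, 1, 15)


def us_local_hours(german_hour):
    """Map a German wall-clock hour to the local hour in each US zone."""
    german_time = datetime(*PLACEHOLDER_DATE, german_hour, tzinfo=CET)
    return {
        name: german_time.astimezone(tz).hour
        for name, tz in US_ZONES.items()
    }


if __name__ == "__main__":
    # A German high schooler waking up at 10 am on a vacation day:
    print(us_local_hours(10))
    # → {'Eastern': 4, 'Central': 3, 'Mountain': 2, 'Pacific': 1}
```

Under these assumed offsets, 10 am in Germany falls between 1 am and 4 am across the continental US, which is the overlap problem described above.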
Plus, German university students who live at the university during term are, when they come home for vacation, likely not to be making their files available, since they are not using their computers, which presumably stay at school with no reason to be turned on (after all, you assume the high school kids turn off their computers when not in use). So this is a countervailing influence, and we don’t know which is stronger. Also, one would think that the university students’ computers at school are more likely to have high-speed connections, making them more influential than their numbers would suggest. Since your time sample includes the Christmas holidays, going home for the holidays is a real possibility. Also, I thought most European countries charge by the minute for local calls, so German kids wouldn’t be expected to make their files available at home for hours on end anyway.
I also understood that, given the way these peer-to-peer systems work, if some of the computers disappear, new computers take their place in the network. In other words, an American user might have contact with 4000 or so nodes, and if some of these nodes close down, computers further out on the fringe become part of his available network. That is how the workings of these networks have been explained to me (see the papers by Krishnan et al.).
Staying with the German kids, and the problems with this instrument, for one more moment: you assume that a decrease in files from German kids shifts the supply curve for downloaders, making downloads more difficult. But the supply of MP3s is not like the supply in ordinary markets. Markets occur when goods are scarce, which is why the price is positive. The supply of MP3s might be sufficiently large that there is no scarcity for most files. In that case the files available from German school children may not affect downloads any more than everyone in Germany taking a deep breath would affect the air available for breathing in the US. In other words, there may well be excess capacity in the system, and removing German school children from the system may have no impact. You do get a positive coefficient on the German holiday variable, but that may just be due to the fact that some holidays are common to both countries, and Christmas is a big holiday in both.
I have less to say about your congestion variables. Since the congestion variable is measured weekly, how much variation during the week does it hide? I do wonder whether Internet usage is like phone usage, heaviest during the day when businesses are in operation. I would have thought that music downloading was likely to take place at home during off-peak hours. If so, the congestion variables, which are affected by daytime activity (in Europe and the US), might not have much meaning for US downloading at all (when Europe is asleep and American businesses are closed). Another concern comes from the fact that you say that peer-to-peer networks use up 25% of all Internet bandwidth. If peer-to-peer activity does occur during the day, then is it possible that, when the Internet is congested, it is due to a spike in peer-to-peer file sharing? There seems to be a potential simultaneity problem with this variable, which is an instrument being used to solve a simultaneity problem with another variable.
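To make the simultaneity worry concrete, here is a minimal simulation sketch. It is a toy model of my own, not your specification: every name and parameter value is an assumption. The true effect of downloads on sales is set to zero, an unobserved popularity shock drives both, and congestion is modeled as partly caused by downloading itself. Under these assumptions a truly exogenous instrument should recover an estimate near zero, while the congestion-style instrument should converge to about 1/3.

```python
import random

TRUE_EFFECT = 0.0  # assumed true effect of downloads on sales


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def iv_estimate(z, x, y):
    """Simple one-instrument IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


def simulate(n=20000, seed=0):
    """Return (IV with exogenous instrument, IV with endogenous instrument)."""
    rng = random.Random(seed)
    z_exog, z_endog, downloads, sales = [], [], [], []
    for _ in range(n):
        popularity = rng.gauss(0, 1)   # unobserved; drives downloads and sales
        shifter = rng.gauss(0, 1)      # hypothetical truly exogenous shifter
        d = shifter + popularity + rng.gauss(0, 1)
        s = TRUE_EFFECT * d + popularity + rng.gauss(0, 1)
        congestion = d + rng.gauss(0, 1)  # congestion caused partly by downloads
        z_exog.append(shifter)
        z_endog.append(congestion)
        downloads.append(d)
        sales.append(s)
    return (
        iv_estimate(z_exog, downloads, sales),
        iv_estimate(z_endog, downloads, sales),
    )


if __name__ == "__main__":
    valid, contaminated = simulate()
    print(f"exogenous instrument:  {valid:.3f}")
    print(f"congestion instrument: {contaminated:.3f}")
```

The point of the sketch is only that an instrument which responds to the variable it is meant to instrument inherits that variable's correlation with the error term. How large the resulting bias would be in your data is an empirical question.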