This is a short post on a minor but consequential pitfall of social network analyses of film actors.
One thing that has always bothered me about social network analyses of so-called “actor networks” built from IMDb data is that they rest on a simple assumption: because two actors appear in the same film, they know each other.
This is simply not true.
Modern filmmaking techniques and the high cost of actor set time incentivize filmmakers not to have expensive actors on set at the same time unless absolutely necessary. Instead, stand-ins are often used in place of star actors, especially in dialogue scenes, and the footage is later edited to put the two star actors together in the finished product.
So, in theory, two actors can appear in the same film and even in the same scenes but never actually be on set together. Extrapolating, two actors could appear in the same film and never actually meet.
I’ve been waiting to find a solid example and finally found one.
Robert Rodriguez (@Rodriguez), the writer-director-producer best known for his films Sin City, From Dusk Till Dawn, Once Upon a Time in Mexico, and Spy Kids, was interviewed on the Tim Ferriss Show and described exactly this situation occurring during Sin City. Rodriguez describes Sin City as one of the most rapidly executed projects he ever worked on, going from initial concept and collaboration with Frank Miller to actually shooting the film in a matter of months. In fact, Rodriguez describes shooting scenes for Sin City with actor Mickey Rourke in which Rodriguez or another crew member stood in for the villain, who at that time hadn’t been cast. Rutger Hauer was later cast, and the complementary footage was shot for those scenes. According to Rodriguez, Rourke and Hauer claim they never met, despite appearing together in a Sin City scene in which Rourke’s character appears to have his hands on Hauer’s throat.
The lesson is one that every good data scientist and computational modeler should keep in mind: justify all assumptions, and always include, or at least consult, subject-matter experts who know the system and data being studied!
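To make the assumption concrete, here is a minimal Python sketch of how a co-appearance network is typically derived from cast lists, and how the Sin City case slips through. The function name `co_appearance_edges`, the tiny `casts` dictionary, and the `never_met` set are all hypothetical illustrations of mine, not an IMDb API or any particular study's method. The `never_met` annotation stands in for the kind of knowledge you would only get from a subject-matter expert.

```python
from itertools import combinations


def co_appearance_edges(casts):
    """Return every pair of actors who share at least one film.

    `casts` maps a film title to a list of actor names (hypothetical
    input format). Each pair is stored as an alphabetically sorted tuple.
    """
    edges = set()
    for actors in casts.values():
        # Every pair of cast members in a film becomes an edge,
        # whether or not the two were ever on set together.
        for a, b in combinations(sorted(set(actors)), 2):
            edges.add((a, b))
    return edges


# Toy cast list containing only the two actors named in this post.
casts = {"Sin City": ["Mickey Rourke", "Rutger Hauer"]}

edges = co_appearance_edges(casts)

# Hypothetical expert annotation: pairs reported never to have met.
never_met = {("Mickey Rourke", "Rutger Hauer")}

# Edges that survive once the expert knowledge is applied.
acquaintance_edges = edges - never_met

print(edges)
print(acquaintance_edges)
```

The naive projection produces a Rourke–Hauer edge from cast data alone, and that edge disappears only once outside knowledge is brought in. Nothing in the cast list itself can tell the two cases apart.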