During social interactions, how do we predict what other people are going to do next? One view is that we use our own motor experience to simulate and predict other people's actions. For example, when we see Sally look at a coffee cup or grasp a hammer, our own motor system provides a signal that anticipates her next action. Previous research has typically examined such gaze- and grasp-based simulation processes separately, and it is not known whether similar cognitive and brain systems underpin the perception of object-directed gaze and grasp. Here we use functional magnetic resonance imaging to examine the extent to which gaze perception and grasp perception rely on common or distinct brain networks. Using a 'peeping window' protocol, we controlled what an observed actor could see and grasp. The actor could peep through one window to see if an object was present and reach through a different window to grasp the object. However, the actor could not peep and grasp at the same time. We compared gaze and grasp conditions where an object was present with matched conditions where the object was absent. When participants observed the actor gaze at an object, left anterior inferior parietal lobule (aIPL) and parietal operculum showed a greater response than when the object was absent. In contrast, when participants observed the actor grasp an object, premotor, posterior parietal, fusiform and middle occipital brain regions showed a greater response than when the object was absent. These results point towards a division in the neural substrates for different types of motor simulation. We suggest that left aIPL and parietal operculum are involved in a predictive process that signals a future hand interaction with an object based on another person's eye gaze, whereas a broader set of brain areas, including parts of the action observation network, are engaged during observation of an ongoing object-directed hand action.