One key ability of human brain is invariant object recognition, which refers to rapid and accurate recognition of objects in the presence of variations such as size, rotation and position. Despite decades of research into the topic, it remains unknown how the brain constructs invariant representations of objects. Providing brain-plausible object representations and reaching human-level accuracy in recognition, hierarchical models of human vision have suggested that, human brain implements similar feed-forward operations to obtain invariant representations. However, conducting two psychophysical object recognition experiments on humans with systematically controlled variations of objects, we observed that humans relied on specific (diagnostic) object regions for accurate recognition which remained relatively consistent (invariant) across variations; but feed-forward feature-extraction models selected view-specific (non-invariant) features across variations. This suggests that models can develop different strategies, but reach human-level recognition performance. Moreover, human individuals largely disagreed on their diagnostic features and flexibly shifted their feature extraction strategy from view-invariant to view-specific when objects became more similar. This implies that, even in rapid object recognition, rather than a set of feed-forward mechanisms which extract diagnostic features from objects in a hard-wired fashion, the bottom-up visual pathways receive, through top-down connections, task-related information possibly processed in prefrontal cortex.