Business process monitoring techniques have been investigated in depth over the last decade to give organizations insight into their processes. Recently, a new stream of work in predictive business process monitoring has leveraged deep learning techniques to unlock the business value held in process execution event logs. These works use Recurrent Neural Networks, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, and suffer from information loss and reduced accuracy because they use only the last hidden state (as the context vector) to predict the next event. Moreover, traces in operational processes may be very long, which makes these methods ill-suited to analyzing them. In addition, when predicting the next events in a running case, some of the previous events should be given higher priority than others. To address these shortcomings, we present a novel approach inspired by the attention mechanism used in Natural Language Processing and, in particular, in Neural Machine Translation. Our approach uses all hidden states, rather than only the last one, to predict future behavior and the outcome of individual activities more accurately. Experimental evaluation on real-world event logs shows that using attention mechanisms in the proposed approach leads to more accurate predictions.
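
To illustrate the difference between using only the last hidden state and attending over all hidden states, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes dot-product scoring with the last hidden state as the query; the scoring function, the function names `softmax` and `attention_context`, and the toy hidden-state values are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(hidden_states, query):
    """Weighted sum of ALL hidden states (assumed dot-product scoring).

    hidden_states: one vector per event in the running case.
    query: the vector each hidden state is scored against.
    Returns (weights, context).
    """
    # Score every hidden state against the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    # The context vector is the attention-weighted average of the states.
    dim = len(query)
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Toy hidden states for a three-event trace (illustrative values only).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]

# Baseline: the context vector is just the last hidden state.
last_state_context = hidden_states[-1]

# Attention: earlier events contribute in proportion to their weights.
weights, context = attention_context(hidden_states, query=hidden_states[-1])
```

In this sketch, events whose hidden states align more closely with the query receive larger weights, so earlier events can still influence the prediction even in a long trace.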