Novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers (Vaswani et al., 2017).
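To illustrate what "masked" self-attention can mean on a graph, here is a minimal sketch in plain Python. It is not the architecture's actual layer. It assumes scaled dot-product scoring in the style of Vaswani et al. (2017), uses no learned projections, and lets each node attend to itself as well as its neighbours. The function name `masked_self_attention` and the toy graph are hypothetical.

```python
import math


def masked_self_attention(features, adjacency):
    """Scaled dot-product self-attention restricted to graph neighbours.

    features:  list of n node feature vectors (lists of floats); they act as
               queries, keys and values alike (assumption: no learned weights).
    adjacency: n x n matrix of 0/1; adjacency[i][j] == 1 lets node i attend
               to node j. Each node also attends to itself (assumption).
    """
    n = len(features)
    d = len(features[0])
    updated = []
    for i in range(n):
        # The mask: only neighbours of i (and i itself) receive a score.
        nbrs = [j for j in range(n) if adjacency[i][j] or j == i]
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(d)
            for j in nbrs
        ]
        # Numerically stable softmax over the unmasked scores only.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # New representation: attention-weighted sum of neighbour features.
        updated.append(
            [sum(w * features[j][k] for w, j in zip(weights, nbrs)) for k in range(d)]
        )
    return updated


# Toy path graph 0 - 1 - 2 with two-dimensional node features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
updated = masked_self_attention(features, adjacency)
```

Because node 0 is not connected to node 2, changing node 2's features leaves `updated[0]` unchanged. This is the effect of the mask: the graph structure decides which pairs of nodes exchange information.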