greatx.datasets

GraphDataset

class GraphDataset(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)[source]

Bases: InMemoryDataset

A series of datasets used in GreatX. Each dataset is stored in npz format and consists of a single graph.

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • name (str) – The name of the dataset. See available_datasets() for all available datasets.

  • transform (Optional[Callable], optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object is transformed before every access. Defaults to None.

  • pre_transform (Optional[Callable], optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object is transformed once, before being saved to disk. Defaults to None.
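
The difference between transform and pre_transform is only in when each hook runs. The sketch below illustrates that timing with a toy, pure-Python stand-in. ToyInMemoryDataset, add_one and double are hypothetical names introduced for illustration; this is not the GreatX or PyTorch Geometric implementation, and it stores plain integers rather than Data objects.

```python
from typing import Any, Callable, List, Optional


class ToyInMemoryDataset:
    """Hypothetical stand-in that mimics when the two hooks are applied."""

    def __init__(self, raw_items: List[Any],
                 transform: Optional[Callable] = None,
                 pre_transform: Optional[Callable] = None):
        self.transform = transform
        # pre_transform runs once, at "processing" time; its result is what
        # a real dataset would cache on disk.
        if pre_transform is not None:
            raw_items = [pre_transform(item) for item in raw_items]
        self._stored = list(raw_items)

    def __len__(self) -> int:
        return len(self._stored)

    def __getitem__(self, idx: int) -> Any:
        item = self._stored[idx]
        # transform runs on every access and never alters the stored copy.
        return item if self.transform is None else self.transform(item)


# Count how often each hook is invoked.
calls = {"pre": 0, "post": 0}


def add_one(x: int) -> int:
    calls["pre"] += 1
    return x + 1


def double(x: int) -> int:
    calls["post"] += 1
    return x * 2


dataset = ToyInMemoryDataset([10], transform=double, pre_transform=add_one)
first = dataset[0]   # add_one already ran once; double runs now
second = dataset[0]  # double runs again; add_one does not
```

After the two accesses, add_one has been called once and double twice. This is why expensive, deterministic preprocessing usually belongs in pre_transform, while cheap or randomized steps belong in transform.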

Example

>>> from greatx.datasets import GraphDataset
>>> import torch_geometric.transforms as T
>>> GraphDataset.available_datasets() # see all available datasets.
['cora', 'citeseer', 'pubmed', ...]
>>> dataset = GraphDataset(root='.', name='Cora')
>>> data = dataset[0] # there is only one graph

Note

We follow the setting of Nettack, from the “Adversarial Attacks on Neural Networks for Graph Data” paper, which keeps only the largest connected component of each graph.

For more details of these datasets, see https://github.com/EdisonLeeeee/GraphData
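
To make the largest-connected-component setting concrete, here is a minimal pure-Python sketch that finds the largest component of a small undirected graph and relabels its nodes. The function largest_connected_component and the toy edge list are assumptions made for illustration; this is not the preprocessing code used by GreatX or Nettack.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def largest_connected_component(num_nodes: int,
                                edges: List[Tuple[int, int]]) -> List[int]:
    """Return the sorted node ids of the largest connected component."""
    # Build an undirected adjacency list.
    neighbors: Dict[int, Set[int]] = {v: set() for v in range(num_nodes)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seen: Set[int] = set()
    best: List[int] = []
    for start in range(num_nodes):
        if start in seen:
            continue
        # Breadth-first search collects one component.
        seen.add(start)
        component = [start]
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.append(nxt)
                    queue.append(nxt)
        # On ties, the component found first is kept.
        if len(component) > len(best):
            best = component
    return sorted(best)


# Toy graph: component {0, 1, 2}, component {3, 4}, and isolated node 5.
edges = [(0, 1), (1, 2), (3, 4)]
kept = largest_connected_component(6, edges)

# Relabel the kept nodes to 0..k-1 and drop edges outside the component.
relabel = {old: new for new, old in enumerate(kept)}
kept_edges = [(relabel[u], relabel[v]) for u, v in edges
              if u in relabel and v in relabel]
```

One practical consequence of this setting is that node and edge counts can be smaller than those reported for the same dataset elsewhere, since nodes outside the largest component are discarded.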

static available_datasets() → List[str][source]

Return the names of all available datasets.