Working with Data

The Data API provides a simple way to store and manage datasets within a notebook.

Datasets are identified by a unique name and remain available in the sandbox until they are removed or the sandbox is restarted.

A dataset can contain any supported data structure, including DataFrames.

Data
├── sales
├── customers
└── products

Creating a Dataset

Use data.set(key, value) to create a new dataset, where key is the dataset name and value is the data to store.

var sales = [
  [ 'id', 'product', 'revenue' ],
  [   1 ,       'A',      120  ],
  [   2 ,       'B',       95  ],
  [   3 ,       'C',      180  ],
];

data.set('sales', sales);

The dataset is stored under the name sales.

If a dataset with the same name already exists, it is replaced.
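Because data.set replaces an existing dataset without warning, it can help to check for the name first. The sketch below is one way to do that, not a built-in feature: setIfAbsent is an illustrative helper, and createDataStandIn is a hypothetical in-memory stand-in for the notebook's data object, included only so the example runs outside the notebook. Inside a notebook you would skip the stand-in and use the provided data object.

```javascript
// Hypothetical in-memory stand-in for the notebook's `data` object.
// It models only the calls used in this sketch.
function createDataStandIn() {
  var store = new Map();
  return {
    set: function (key, value) { store.set(key, value); },
    item: function (key) {
      return {
        exists: function () { return store.has(key); },
        get: function () { return store.get(key); }
      };
    }
  };
}

var data = createDataStandIn();

// Illustrative helper (not part of the Data API): stores the value only
// when no dataset uses the name yet, and reports whether it was stored.
function setIfAbsent(key, value) {
  if (data.item(key).exists()) {
    return false; // keep the existing dataset untouched
  }
  data.set(key, value);
  return true;
}

var created = setIfAbsent('sales', [['id', 'revenue'], [1, 120]]);
var replaced = setIfAbsent('sales', [['id', 'revenue'], [9, 999]]);
```

Here the second call leaves the original sales dataset in place and returns false, so the caller can decide whether to pick another name.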

Listing Available Datasets

Use data.keys() to list all datasets currently available in the notebook.

var dataKeys = data.keys();
notebook.log(dataKeys);

Result:

[ 'sales', 'customers', 'products' ]

You can also check the number of stored datasets:

notebook.log(data.size());

Retrieving a Dataset

Use data.item(key).get() to retrieve a dataset by name.

var sales = data.item('sales').get();

You can verify that a dataset exists before accessing it:

notebook.log(data.item('sales').exists());
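The existence check and the retrieval can be combined into a small helper. This guide does not say what get() returns for a missing dataset, so the sketch avoids relying on that behavior. getOrDefault is an illustrative helper, and createDataStandIn is a hypothetical stand-in for the notebook's data object so the example can run on its own.

```javascript
// Hypothetical in-memory stand-in for the notebook's `data` object.
// It models only the calls used in this sketch.
function createDataStandIn() {
  var store = new Map();
  return {
    set: function (key, value) { store.set(key, value); },
    item: function (key) {
      return {
        exists: function () { return store.has(key); },
        get: function () { return store.get(key); }
      };
    }
  };
}

var data = createDataStandIn();
data.set('sales', [['id', 'revenue'], [1, 120]]);

// Illustrative helper (not part of the Data API): returns the dataset
// when it exists, otherwise the fallback supplied by the caller.
function getOrDefault(key, fallback) {
  var item = data.item(key);
  return item.exists() ? item.get() : fallback;
}

var sales = getOrDefault('sales', []);     // the stored dataset
var returns = getOrDefault('returns', []); // no such dataset, so the fallback
```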

Managing Datasets

The Data Item API exposes operations that apply to a single dataset rather than the entire collection.

Renaming a Dataset

Rename an existing dataset:

data.item('sales').renameTo('orders');

Copying a Dataset

Create a duplicate of an existing dataset:

data.item('sales').copyTo('sales_backup');

Removing a Dataset

Delete a dataset from the notebook:

data.item('sales').remove();
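Copying and removing can be combined so that a dataset is backed up before it is deleted. The following is a minimal sketch under stated assumptions: removeWithBackup and the "_backup" suffix are illustrative, and createDataStandIn is a hypothetical stand-in for the notebook's data object. The stand-in stores the same value under the new name; whether the real copyTo makes an independent deep copy is not stated in this guide.

```javascript
// Hypothetical in-memory stand-in for the notebook's `data` object.
// It models only the calls used in this sketch.
function createDataStandIn() {
  var store = new Map();
  return {
    set: function (key, value) { store.set(key, value); },
    keys: function () { return Array.from(store.keys()); },
    item: function (key) {
      return {
        exists: function () { return store.has(key); },
        // Assumption: the copy is simply stored under the new name.
        copyTo: function (newKey) { store.set(newKey, store.get(key)); },
        remove: function () { store.delete(key); }
      };
    }
  };
}

var data = createDataStandIn();
data.set('sales', [['id', 'revenue'], [1, 120]]);

// Illustrative helper (not part of the Data API): copies the dataset to
// "<key>_backup", removes the original, and returns the backup name.
function removeWithBackup(key) {
  var item = data.item(key);
  if (!item.exists()) {
    return null; // nothing to back up or remove
  }
  var backupKey = key + '_backup';
  item.copyTo(backupKey);
  item.remove();
  return backupKey;
}

var backupKey = removeWithBackup('sales');
```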

Clearing All Datasets

Remove every dataset currently stored in the notebook:

data.clear();

Use this operation with care: it removes every stored dataset in a single call.
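One cautious pattern is to collect the current datasets into a plain object before clearing, so they can be stored again if needed. This is a sketch, not a documented feature: snapshotAll is an illustrative helper, createDataStandIn is a hypothetical stand-in for the notebook's data object, and the approach assumes that values returned by get() stay usable after data.clear().

```javascript
// Hypothetical in-memory stand-in for the notebook's `data` object.
// It models only the calls used in this sketch.
function createDataStandIn() {
  var store = new Map();
  return {
    set: function (key, value) { store.set(key, value); },
    keys: function () { return Array.from(store.keys()); },
    size: function () { return store.size; },
    clear: function () { store.clear(); },
    item: function (key) {
      return {
        get: function () { return store.get(key); }
      };
    }
  };
}

var data = createDataStandIn();
data.set('sales', [['id', 'revenue'], [1, 120]]);
data.set('customers', [['id', 'name'], [1, 'Ada']]);

// Illustrative helper (not part of the Data API): gathers every dataset
// into a plain object keyed by dataset name.
function snapshotAll() {
  var snapshot = {};
  data.keys().forEach(function (key) {
    snapshot[key] = data.item(key).get();
  });
  return snapshot;
}

var snapshot = snapshotAll();
data.clear();
var sizeAfterClear = data.size();

// Store the datasets again from the snapshot.
Object.keys(snapshot).forEach(function (key) {
  data.set(key, snapshot[key]);
});
var sizeAfterRestore = data.size();
```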

Next Step

While datasets can be created directly from code, most analyses start from external files.

The next section shows how to import CSV data through the Rawlytics Notebook interface and make it available to your notebook.