Working with large amounts of data

If you're using a ListView to display information from a data source, you must consider how the performance of your app is affected by the amount of data. Small amounts of data can be loaded as a complete set during initialization with little to no performance impact. Large sets of data must be managed differently to avoid start-up delays, slow scrolling, and other indicators of poor performance. Cascades provides several classes to help manage the flow of large amounts of data from external data sources into a data model, for display in a ListView.

To display data from a large data set, you can use the DataQuery and AsyncDataModel classes and their subclasses. The DataQuery class provides an interface to fetch data from a specific data source, such as a local SQL database. AsyncDataModel provides a background caching mechanism that fetches data from the data source before it's needed for display by a ListView. Because this prefetch mechanism separates the presentation of data in a ListView from the retrieval of data from the data source, the ListView performs better.

The following diagram illustrates how DataQuery and AsyncDataModel fetch data for a ListView.

Sequence diagram showing how AsyncDataModel fetches data.

For small sets of data, you can use the SimpleQueryDataModel class. This class loads data from a DataQuery and provides a flat list of data. A data update triggers SimpleQueryDataModel to perform a full reload from the data source, so this class should be used for small data sets only.

Using DataQuery and AsyncDataModel

The DataQuery class gives you a simple interface to fetch data items from a data source and return a list of DataItem objects that have QVariant payloads. You can unpack these payloads to retrieve the data. The DataQuery class has subclasses that provide additional control over data retrieval:

  • HeaderDataQuery: Use with two-level models to fetch a list of HeaderDataItem objects and their corresponding DataItem objects.
  • SqlDataQuery: Use to fetch a list of DataItem objects from an SQL database using a SELECT statement.
  • SqlHeaderDataQuery: Use to fetch a list of HeaderDataItem objects and their corresponding DataItem objects from an SQL database using custom queries.

The AsyncDataModel class allows you to maintain a cache window, or a copy of a subset of items, over a data source and manage that cache by fetching data in the background. AsyncDataModel inherits the data model functions of QueryDataModel and adds functions to set the cache.

All updates to the AsyncDataModel cache are performed on the same thread where the AsyncDataModel object is created. Your app should access the AsyncDataModel object from the same thread.

To better understand how AsyncDataModel uses caching to interact with a ListView, here are a few diagrams that show how the cache loads and updates data, assuming a cache size of 200 (the default capacity) and the use of overall revisions. As a user scrolls through a list, the list must continually update itself using data from the cache.

Initial cache load

The cache loads the first 200 items from the data source when load() is called. The cache now contains items 0 to 199.

Diagram showing initial cache load.

Request near the end of the cache

A cache move occurs when the ListView requests data near the end of the cache, for example, around 20% from the edge (item 160 in this example). The AsyncDataModel retrieves the next 60 items from the data source to center the cache at item 160. The cache now contains items 60 to 259 and this is enough data for the ListView to update without the user noticing. The AsyncDataModel reports this change to the ListView.

Diagram showing a cache move.
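The window arithmetic in this example can be checked with a short, self-contained C++ sketch. It doesn't use or reproduce the Cascades API: the CacheWindow type and the nearEdge() and centeredWindow() functions are hypothetical names introduced for illustration. The 20% threshold, the total item count, and the clamping at the ends of the data source are assumptions based on the example figures in this section.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper type: an inclusive range of item indexes.
struct CacheWindow {
    int first;
    int last;
};

// Returns true when `index` is inside the window and closer than
// `edgePercent` percent of the window size to either end.
bool nearEdge(const CacheWindow& window, int index, int edgePercent) {
    int size = window.last - window.first + 1;
    int margin = size * edgePercent / 100;
    bool inside = index >= window.first && index <= window.last;
    return inside &&
        (index - window.first < margin || window.last - index < margin);
}

// Returns a window of up to `cacheSize` items centered on `index`,
// clamped so that it stays within [0, totalCount - 1].
CacheWindow centeredWindow(int index, int cacheSize, int totalCount) {
    assert(cacheSize > 0 && totalCount > 0);
    int size = std::min(cacheSize, totalCount);
    int first = index - cacheSize / 2;
    first = std::max(0, std::min(first, totalCount - size));
    return CacheWindow{first, first + size - 1};
}
```

With a cache size of 200 and an assumed data source of 1000 items, nearEdge() reports item 160 as near the edge of the window 0 to 199, and centeredWindow(160, 200, 1000) gives the window 60 to 259, which matches the figures above.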

Change in the data source

A cache refresh occurs when the source data changes. The dataChanged() signal is emitted and AsyncDataModel retrieves the 200 items that are above and overlapping the current cache position, including items that may be currently visible in the ListView. This ensures the cache and the ListView contain the most recent information when the user scrolls up or down. The AsyncDataModel reports this change to the ListView.

Diagram showing a change to the data source.

Request for items not in the cache

A cache miss occurs when the ListView requests items that aren't in the cache (in this example, a request for item 260 causes a cache miss). This could be due to the user scrolling rapidly through the list. When this happens, AsyncDataModel retrieves the next 100 items to center the cache at item 260. The cache now contains items 160 to 359. The AsyncDataModel reports this change to the ListView.

When the query request to retrieve data from the data source is started, any intermediate queries are discarded.

Diagram showing a cache miss.
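This section doesn't describe how AsyncDataModel discards intermediate queries. One way to picture the idea is a single pending slot in which a newer request replaces an older one that hasn't started yet. The following standalone C++ sketch models only that idea; the PendingRequest class and its functions are hypothetical and aren't part of the Cascades API.

```cpp
#include <cassert>

// Hypothetical model of a single pending-request slot. A newer request
// replaces an older one that has not started yet.
class PendingRequest {
public:
    PendingRequest() : m_index(-1), m_requested(0), m_started(0) {}

    // Records a request for the window around `index`, replacing any
    // request that is still waiting to start.
    void request(int index) {
        assert(index >= 0);
        m_index = index;
        ++m_requested;
    }

    // Starts the most recent waiting request and returns its index,
    // or returns -1 when nothing is waiting.
    int start() {
        int index = m_index;
        if (index >= 0) {
            m_index = -1;
            ++m_started;
        }
        return index;
    }

    // Number of requests that were replaced before they could start.
    int discarded() const {
        int waiting = (m_index >= 0) ? 1 : 0;
        return m_requested - m_started - waiting;
    }

private:
    int m_index;      // index of the waiting request, or -1 if none
    int m_requested;  // total number of requests recorded
    int m_started;    // total number of requests started
};
```

If a fast scroll produces requests for items 210, 240, and 260 before any query starts, only the request for item 260 is started and the other two are counted as discarded.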

If you're not using revisions or keys, updates due to a cache move, refresh, or miss may cause significant disruption in the ListView display.

To use DataQuery and AsyncDataModel with a ListView, you define the data model using the attachedObjects property of the containing UI control and populate its associated query property. Next, you use the dataModel property of the ListView to specify the id of the data model that it uses. This is shown in the following code sample:

import bb.cascades 1.2
import bb.cascades.datamanager 1.2
 
Page {
    content: Container {
        layout: StackLayout {}
        ListView {
            id: myListView
            dataModel: dm
            listItemComponents: [
                ListItemComponent {
                    StandardListItem {
                        title: ListItemData.firstname + " " + 
                            ListItemData.lastname
                        imageSource: ListItemData.image
                        description: ListItemData.title
                    }
                }
            ]
        }
        attachedObjects: [
            AsyncDataModel {
                id: dm
                query: SqlDataQuery {
                    source: "sql/contacts1k.db"
                    query: "select firstname, lastname, title, image
                        from contact order by lastname"
                    countQuery: "select count(*) from contact"
                    onError: console.log("query error: " + code + ", "
                        + message)
                }
                onLoaded: console.log("initial model data is loaded")
            }
        ]
    }
    onCreationCompleted: {
        dm.load();
    }
}

Here, the AsyncDataModel with id dm uses an SqlDataQuery to retrieve data. As the user scrolls through the ListView, AsyncDataModel prefetches the next set of data from the SQL database when scrolling nears the limit of the cache. In this way, the data is displayed with minimal impact to performance.

Retrieving hierarchical data

Information that's stored in a data model is often organized hierarchically (that is, stored as headers with child data items). Headers represent sorting keys, such as names or categories, and data items represent detailed information that falls under the headers. As seen in the image below, this is logically organized in the data model as a tree of items.

Diagram showing a data model hierarchy.

For this type of structure, headers are separate from data items and your app must retrieve each separately.

Displaying headers and data items in a ListView requires a two-level AsyncDataModel. The AsyncDataModel class has a subclass for this purpose:

  • AsyncHeaderDataModel: Use to maintain a cache window over a data source using a header query and a data item query.

To retrieve both headers and data items for a ListView, you define the AsyncHeaderDataModel using the attachedObjects property of the containing UI control and populate its associated query property with the HeaderDataQuery to use. Next, you use the dataModel property of the ListView to specify the id of the data model that it uses. This is shown in the following code sample:

import bb.cascades 1.2
import bb.cascades.datamanager 1.2
 
Page {
    content: Container {
        layout: StackLayout {}
        ListView {
            layout: StackListLayout {
                headerMode: ListHeaderMode.Sticky
            }
            id: myListView
            dataModel: dm
            listItemComponents: [
                ListItemComponent {
                    type : "header"
                    Header {
                        title: ListItemData.header
                    }
                },
                ListItemComponent {
                    type : ""
                    StandardListItem {
                        title: ListItemData.firstname + " " + 
                            ListItemData.lastname
                        imageSource: ListItemData.image
                        description: ListItemData.title
                    }
                }
            ]
        }
        attachedObjects: [
            AsyncHeaderDataModel {
                id: dm
                cacheSize: 200
                query: SqlHeaderDataQuery {
                    source: "sql/contacts1k.db"
                    query: "select firstname, lastname, title, image 
                        from contact order by lastname, firstname"
                    countQuery: "select count(*) from contact"
                    headerQuery: "select substr(lastname, 1, 1) as
                        header, count(*) from contact group by header"
                    onError: console.log("query error: " + code + ", "
                        + message)
                }
                onLoaded: console.log("initial model data is loaded")
            }
        ]
    }
    onCreationCompleted: {
        dm.load();
    }
}

Here, SqlHeaderDataQuery, which is a subclass of DataQuery, is used to retrieve data from an SQL database. Two queries are defined: headerQuery, which returns the headers, and query, which returns the data items. SqlHeaderDataQuery retrieves and caches the full list of headers for both the initial load and cache refreshes. As the user scrolls through the ListView, AsyncHeaderDataModel fetches the next set of data items from the SQL database when scrolling nears the limit of the cache. In this way, the data is displayed with minimal impact to performance.

When using HeaderDataQuery and AsyncHeaderDataModel, you must do the following:

  • Perform an additional query for headers that includes a unique key value for each header data item (see HeaderDataItem::setKeyID()) and the count of child items for the header (see HeaderDataItem::setChildCount()).
  • Provide the data items in an order that aligns with the headers.

Header items don't need a revision because the headers are fully loaded on each cache refresh.
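The reason the data items must be ordered to match the headers can be seen in a small standalone C++ sketch. If each header reports only a child count, the position of a data item under its header follows from its offset in the ordered data query result. The toIndexPath() function below is a hypothetical illustration of that reasoning and doesn't show how AsyncHeaderDataModel is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Maps the offset of a data item in the ordered query result to a
// (header index, child index) pair, using only the child count that
// each header reports. Returns (-1, -1) when the offset is out of range.
std::pair<int, int> toIndexPath(const std::vector<int>& childCounts,
                                int offset) {
    assert(offset >= 0);
    for (std::size_t h = 0; h < childCounts.size(); ++h) {
        if (offset < childCounts[h]) {
            return {static_cast<int>(h), offset};
        }
        // Skip past all of the children of this header.
        offset -= childCounts[h];
    }
    return {-1, -1};
}
```

For headers with child counts of 3, 2, and 4, the data item at offset 4 maps to header 1, child 1. This mapping is correct only if the data query returns items in the same order as the header query groups them, which is why the sample sorts the data items by lastname and groups the headers by the first letter of lastname.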

Transforming data

The DataQueryDecorator and DataModelDecorator classes allow you to transform data for use in other parts of your application. For example, you can transform data to group names under alphabetical headers ("A", "B", "C", and so on) or convert strings for better presentation ("RESULT" to "result") in a ListView.

Diagram showing how decorators work.

To extend the behavior of existing queries operating on a data source, you can use the DataQueryDecorator class to transform results from a standard DataQuery. For example, if you want to change the returned data to include an image path that's specific to some condition found in the query result, override the processResults() function of DataQueryDecorator. First, extend the DataQueryDecorator class:

#include <bb/cascades/datamanager/DataQueryDecorator>

class DataQualityDataQueryDecorator:
    public bb::cascades::datamanager::DataQueryDecorator {
    Q_OBJECT

public:
    DataQualityDataQueryDecorator(QObject* parent = 0);
    virtual ~DataQualityDataQueryDecorator();

    // This function injects the proper image path into the data,
    // based on the data quality field value.
    void processResults(
      QList<bb::cascades::datamanager::DataItem>* results);
};

Then, override the processResults() function with the actions you want to perform on the data. Here, the code iterates through each query result and sets the image path (map["dataQualityImage"]) based on the value of a result field (map["data_quality"]):

void DataQualityDataQueryDecorator::processResults(
        QList<bb::cascades::datamanager::DataItem>* results) {

    int count = results->size();

    for (int i = 0; i < count; i++) {
        QVariantMap map = (*results)[i].payload().toMap();
        QString dataQuality = map["data_quality"].toString();

        // Check the quality field value, and inject the proper image
        // file based on the value found.
        if (dataQuality == "Correct" ||
          dataQuality == "Complete and Correct") {
            map["dataQualityImage"] = "data_correct.png";
        } else {
            map["dataQualityImage"] = "data_incorrect.png";
        }

        // Save the newly processed data.
        (*results)[i].setPayload(map);
    }
}

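To use this class as part of your AsyncDataModel in QML, first register it with the QML type system (for example, by calling qmlRegisterType() in your C++ startup code) so that QML can create it. Then declare it as the query to perform and declare the original SqlDataQuery in its query property. Here, we use an example SQL query: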

AsyncDataModel {
    id: dataQualityModel
    query: DataQualityDataQueryDecorator {
        query: SqlDataQuery {
            source: "sql/discogs_medium.db"
            query: "select name, data_quality, primary_image 
                    from artist"
            countQuery: "select count(*) from artist"
            onDataChanged: console.log("data changed: revision=" +
                             revision)
            onError: console.log("SQL query error: " + code +
                       ", " + message)
        }
    }
    onLoaded: console.log("initial model data is loaded")
}

If you want to change items retrieved from a DataModel, you can use the DataModelDecorator class and override the functions appropriate to the data you want to transform. For example, you can use DataModelDecorator to expand or collapse each header based on the user selecting it. First, extend the DataModelDecorator class:

#include <bb/cascades/datamanager/DataModelDecorator>

class ExpandableDataModelDecorator:
    public bb::cascades::datamanager::DataModelDecorator {
    Q_OBJECT

public:
    ExpandableDataModelDecorator(QObject* parent = 0);
    virtual ~ExpandableDataModelDecorator();

    // Adds an "expanded" value (true or false) to the data
    // of a header, depending on whether the header index
    // equals the expanded index. Calls the data() function
    // of the DataModel that it wraps.
    Q_INVOKABLE virtual QVariant data(
        const QVariantList &indexPath);

    // For a header, returns a non-zero child count only
    // when the header index equals the expanded index.
    // Calls the childCount() function of the DataModel
    // that it wraps.
    virtual int childCount(const QVariantList& indexPath);

    // For a header, reports children only when the header
    // index equals the expanded index. Calls the
    // hasChildren() function of the DataModel that it wraps.
    virtual bool hasChildren(const QVariantList& indexPath);

public Q_SLOTS:

    // This function toggles expanding or collapsing
    // of header items. It sets the expanded index
    // and emits the itemsChanged() signal once the
    // index has been set to expand or collapse.
    void expandHeader(const QVariantList& indexPath,
        bool expand);

private:
    // The index of the currently expanded header,
    // or -1 if no header is expanded.
    int m_expandedIndex;

    // This is a helper function that verifies the index
    // path and returns whether the header is expandable
    // (that is, not already expanded).
    bool isExpandable(const QVariantList& indexPath) const;
};

Then, override the data() function to set a flag (map["expanded"]) that reflects whether the returned data is expanded or not:

QVariant ExpandableDataModelDecorator::data(
  const QVariantList& indexPath) {

    QVariant data =
      bb::cascades::datamanager::DataModelDecorator::data(
        indexPath);
    if (data.isValid()) {
        QVariantMap map = data.value<QVariantMap>();
        if (indexPath.size() == 1) {
            map["expanded"] =
              (indexPath[0].toInt() == m_expandedIndex);
        }
        return map;
    }
    return data;
}

To perform the expansion, create a function that sets the item index to either collapsed or expanded and emits a signal that the items have changed:

void ExpandableDataModelDecorator::expandHeader(
  const QVariantList& indexPath, bool expand) {

    if (indexPath.size() == 1 && expand) {
        int index = indexPath[0].toInt();
        // Selecting the expanded header again collapses it.
        // Selecting any other header expands that one instead.
        m_expandedIndex = (index == m_expandedIndex) ? -1 : index;
        emit itemsChanged(
          bb::cascades::DataModelChangeType::AddRemove);
    }
}

Override the childCount() and hasChildren() functions of DataModelDecorator to return accurate results based on whether the header is expanded or collapsed. When the header is collapsed (that is, it can still be expanded), childCount() should return 0 and hasChildren() should return false.

int ExpandableDataModelDecorator::childCount(
  const QVariantList& indexPath) {

    if (isExpandable(indexPath)) {
        return 0;
    }
    return 
      bb::cascades::datamanager::DataModelDecorator::childCount(
        indexPath);
}

bool ExpandableDataModelDecorator::hasChildren(
  const QVariantList& indexPath) {

    if (isExpandable(indexPath)) {
        return false;
    }
    return
      bb::cascades::datamanager::DataModelDecorator::hasChildren(
        indexPath);
}

Both of these functions use a helper function called isExpandable() that determines whether a header can be expanded (that is, whether it's a header that isn't already expanded):

bool ExpandableDataModelDecorator::isExpandable(
        const QVariantList& indexPath) const {
    return indexPath.size() == 1 && 
      indexPath[0].toInt() != m_expandedIndex;
}

Here's how you include this class as part of your QML code (in this example, it's incorporated into an ActionItem but you can use any type of object you want):

ActionItem {
    ActionBar.placement: ActionBarPlacement.InOverflow
    onTriggered: {
        expandableModel.load()
        listView.dataModel = expandableDecorator
        listView.selectionChanged.connect(
            expandableDecorator.expandHeader)
    }

    attachedObjects: [
        // Decorator for allowing the expansion/contraction of
        // header items.
        ExpandableDataModelDecorator {
            id: expandableDecorator
            model: AsyncHeaderDataModel {
                id: expandableModel
                query: SqlHeaderDataQuery {
                    source: "sql/discogs_medium.db"
                    query: "select title as name, primary_image,
                            master_genre.genre from master_genre,
                            master where
                            master_genre.master_id=master.id
                            order by genre"
                    countQuery: "select count(*) from master_genre,
                                 master where
                                 master_genre.master_id=master.id"
                    headerQuery: "select master_genre.genre as
                                  header, count(*) from master_genre,
                                  master where
                                  master_genre.master_id=master.id
                                  group by header"
                    onDataChanged: console.log(
                        "data changed: revision=" + revision)
                    onError: console.log("SQL query error: " + code
                               + ", " + message)
                }
            }
        }
    ]
}

In this code sample, calling the expandHeader() function, created earlier, allows the user selection to expand or collapse a header:

listView.selectionChanged.connect(
    expandableDecorator.expandHeader)

To learn more about DataQueryDecorator and DataModelDecorator, see the List decorators sample app on the Sample apps page.

Best practices

If you're retrieving and presenting data using a ListView, you may want to consider the following practices when designing your app.

Use item-level keys

If you expect the source data to change, then item-level keys are an important consideration. Item-level keys allow AsyncDataModel to track the location of each item in its cache and to report the movement of items to a ListView, even as items are inserted, removed, or updated in the data source. If item-level keys aren't used, changes in source data may result in jarring visual changes in the ListView.
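To see why keys make this tracking possible, consider the following standalone C++ sketch. It compares the keys in a cache window before and after a change and reports each item as removed, inserted, or moved. The ItemChange type and the diffByKey() function are hypothetical names used for illustration, and the string keys are an assumption. The sketch doesn't reproduce the algorithm that AsyncDataModel uses.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record of how one keyed item changed position.
struct ItemChange {
    std::string key;
    int from;  // position before the update, or -1 if newly inserted
    int to;    // position after the update, or -1 if removed
};

// Compares two ordered lists of unique keys and reports removals
// first, then insertions and moves in the order of the new list.
std::vector<ItemChange> diffByKey(const std::vector<std::string>& before,
                                  const std::vector<std::string>& after) {
    std::unordered_map<std::string, int> oldPos;
    std::unordered_map<std::string, int> newPos;
    for (int i = 0; i < static_cast<int>(before.size()); ++i) {
        oldPos[before[i]] = i;
    }
    for (int i = 0; i < static_cast<int>(after.size()); ++i) {
        newPos[after[i]] = i;
    }
    // The comparison is only meaningful if every key is unique.
    assert(oldPos.size() == before.size());
    assert(newPos.size() == after.size());

    std::vector<ItemChange> changes;
    for (int i = 0; i < static_cast<int>(before.size()); ++i) {
        if (newPos.count(before[i]) == 0) {
            changes.push_back({before[i], i, -1});  // removed
        }
    }
    for (int i = 0; i < static_cast<int>(after.size()); ++i) {
        auto it = oldPos.find(after[i]);
        if (it == oldPos.end()) {
            changes.push_back({after[i], -1, i});  // inserted
        } else if (it->second != i) {
            changes.push_back({after[i], it->second, i});  // moved
        }
    }
    return changes;
}
```

Without keys, the two lists can only be compared position by position, so a single insertion at the top makes every following item look different. With keys, the same insertion is reported as one new item plus a shift of the existing items.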

Here's an example of an SQL statement that returns a unique key for every item:

"SELECT key_id, ...FROM ..."

Use overall revisions

The overall revision allows AsyncDataModel to ensure that different versions of source data are not mixed in the same cache. If different versions of source data are mixed, the same item could appear twice or some items could be missed. Using an overall revision is important when data changes are expected in the source data. An overall revision represents the current, unique state of the data source and is used to recognize changes to the data source. The overall revision must be incremented (or otherwise changed) each time a change occurs in the data source. If overall revisions are not returned, then, to be conservative, each query refreshes the entire cache, which is inefficient.

Here's an example of an SQL statement that returns the revision from a table with a single column and single row:

"SELECT revision.revision FROM revision"

If you don't expect database changes to occur, the best performance is achieved by returning a constant revision number. For example:

"SELECT 1"

For more information, see the Revision API reference.
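The effect of the overall revision on caching can be summarized as a simple decision, sketched here in standalone C++. The CacheAction type and the actionFor() function are hypothetical names, and the sketch is a simplified model of the behavior described in this section, not the AsyncDataModel implementation.

```cpp
#include <cassert>

// Hypothetical outcome of applying a fetched result to the cache.
enum class CacheAction { Merge, FullRefresh };

// Decides how a fetched result is applied to the cache.
// `hasRevision` is false when the query returned no overall revision.
CacheAction actionFor(bool hasRevision, long cachedRevision,
                      long fetchedRevision) {
    if (!hasRevision) {
        // Nothing proves that the versions match, so be conservative.
        return CacheAction::FullRefresh;
    }
    if (fetchedRevision != cachedRevision) {
        // The data source changed: don't mix versions in one cache.
        return CacheAction::FullRefresh;
    }
    // Same version: the fetched items can join the cached items.
    return CacheAction::Merge;
}
```

A constant revision, such as the one returned by "SELECT 1", always takes the last branch, which is why it gives the best performance when the data source never changes.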

Use item-level revisions

To avoid missing updates or warnings about missing item-level revisions, each data query should return a revision for each data item. This revision represents the current, unique state of the item and is used to recognize changes to the data item. The item's revision must be incremented (or otherwise changed) each time the item changes in the data source. The overall revision should be updated as well to indicate that the data source as a whole has changed.

Here's an example of an SQL statement that returns a revision field:

"SELECT ..., revision_id FROM ..."

For more information, see DataItem::revision().
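As an illustration of how item-level revisions help recognize updates, the following standalone C++ sketch compares the revision stored for each cached item with the revision returned by a new query. The updatedKeys() function, the string keys, and the integer revisions are assumptions made for this example and aren't taken from the Cascades API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Returns the keys of items that exist in both maps but whose
// revision differs, in ascending key order. Items that appear only
// in `fetched` are new rather than updated, so they are not reported.
std::vector<std::string> updatedKeys(
        const std::map<std::string, int>& cached,
        const std::map<std::string, int>& fetched) {
    std::vector<std::string> updated;
    for (const auto& entry : fetched) {
        auto it = cached.find(entry.first);
        if (it != cached.end() && it->second != entry.second) {
            updated.push_back(entry.first);
        }
    }
    return updated;
}
```

If an item changes in the data source but its revision stays the same, this comparison can't detect the change, which is the kind of missed update that item-level revisions are meant to prevent.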

Use the dataChanged() signal

To avoid missing updates that another process makes to the data source, you can write custom code that emits the dataChanged() signal when the data changes. For more information, see DataQuery::dataChanged().

Last modified: 2013-12-21
