Collections 2.0

Collections 2.0, or coll2 for short, is an evolution of the collections concept already implemented in XMMS2. Coll2 distinguishes between medialists (ordered lists of entries that may contain duplicates) and mediasets (unordered sets of entries without duplicates). It also adds operators to order a mediaset (turning it into a medialist), to turn a medialist into a mediaset (thus removing duplicates and ordering), and to limit a medialist. All this gives the user more power when querying the media library. In addition, coll2 adds a new query system that allows more advanced queries with less code for the client programmer.
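
The medialist/mediaset distinction can be illustrated with plain Python containers. This is only a sketch: a list stands in for a medialist and a set for a mediaset, and the helper names to_mediaset and order are invented for illustration; they are not part of the XMMS2 API.

```python
# Illustrative model only: a list stands in for a medialist, a set for a
# mediaset. The helper names are hypothetical, not XMMS2 API calls.

def to_mediaset(medialist):
    """Drop duplicates and ordering, as the Mediaset operator is described to do."""
    return set(medialist)


def order(mediaset, key):
    """Turn a mediaset into a medialist by sorting its entries."""
    return sorted(mediaset, key=key)


playlist = [7, 3, 7, 1, 3]               # medialist: ordered, duplicates allowed
ids = to_mediaset(playlist)              # mediaset: no duplicates, no order
by_id = order(ids, key=lambda mid: mid)  # a medialist again
print(sorted(ids), by_id)
```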

This page describes coll2 as implemented in the xmms2-cippo git repository.

Operators

A collection is built up of collection operators. Coll2 adds some new operators and removes some old ones. The operators removed are _QUEUE and _PARTYSHUFFLE.

API breakage

Client code dealing with xmmsv_coll_t directly will most probably break, as the collection operator types have changed. Clients written in anything other than Ruby or C will also break, because the bindings for other languages have not been ported yet. Clients that use coll-parser to create their collections and otherwise leave them alone will be fine.

Operator list

Universe

Type identifier XMMS_COLLECTION_TYPE_UNIVERSE
Result

All mediaids in the medialibrary

Attributes

none

Operands

none


Idlist

Type identifier XMMS_COLLECTION_TYPE_IDLIST
Result

A medialist containing the mediaids in the idlist found within the operator.

Attributes
  • type: The type of playlist. Typically list, queue or pshuffle. The default value is list.
  • any
Operands

Zero or one


Reference

Type identifier XMMS_COLLECTION_TYPE_REFERENCE
Result

All mediaids in the collection identified by the namespace and reference attributes

Attributes
  • namespace: The namespace of the referenced collection, e.g. Playlists or Collections
  • reference: The name of the referenced collection, e.g. Muse, Never Played, That grand Opeth album or just Default
Operands

none (except on the daemon-side, where the referenced collection can be attached as an operand to the reference-operator)


Complement

Type identifier XMMS_COLLECTION_TYPE_COMPLEMENT
Result

All mediaids in the medialibrary, except those in the operand.

Attributes

none

Operands

Exactly one


Intersection

Type identifier XMMS_COLLECTION_TYPE_INTERSECTION
Result

The mediaids that appear in all operands. It keeps the ordering of the first operand.

Attributes

none

Operands

One or more


Union

Type identifier XMMS_COLLECTION_TYPE_UNION
Result

The mediaids that appear in any of the operands. If all operands are ordered, the medialists are concatenated.

Attributes

none

Operands

One or more
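
The ordering rules of Intersection and Union described above can be sketched as follows. This is a model under assumptions, not the daemon's implementation: lists stand in for medialists, sets for mediasets, and the function names are hypothetical.

```python
# Hypothetical helpers; lists model medialists, sets model mediasets.

def intersection(first, *rest):
    """Keep the entries of the first operand that occur in every other operand.

    The result follows the ordering of the first operand.
    """
    others = [set(operand) for operand in rest]
    return [mid for mid in first if all(mid in other for other in others)]


def union(*operands):
    """Concatenate if every operand is ordered, otherwise build a plain set."""
    if all(isinstance(operand, list) for operand in operands):
        return [mid for operand in operands for mid in operand]
    return set().union(*operands)


print(intersection([5, 2, 9], {2, 9}, [9, 2, 7]))  # [2, 9]
print(union([1, 2], [2, 3]))                       # [1, 2, 2, 3]
```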


Has

Type identifier XMMS_COLLECTION_TYPE_HAS
Result

The operand, but only with the mediaids that have a property with a given field name and source. Only source-preferred properties are examined.

Attributes
  • field: The name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • source-preference: The source-preference that needs to be applied, e.g. ("server", "plugin/id3v2", "client*", "plugin*"). It is a list of strings. Within a string, '*' matches any string (including the empty string) and '?' matches any single character. If not set, the server default is used.
Operands

Exactly one
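
How a source-preference list might select among several sources can be sketched with Python's fnmatch module. The function name preferred_source is hypothetical, and the sketch assumes that patterns are tried in list order and the first matching source wins. Note that fnmatch also gives '[' a special meaning, which the description above does not mention.

```python
from fnmatch import fnmatchcase


def preferred_source(sources, preference):
    """Return the available source matching the earliest pattern, or None.

    Assumption: patterns are tried in list order and the first match wins.
    """
    for pattern in preference:
        for source in sources:
            if fnmatchcase(source, pattern):
                return source
    return None


preference = ["server", "plugin/id3v2", "client*", "plugin*"]
print(preferred_source(["plugin/mad", "plugin/id3v2"], preference))  # plugin/id3v2
print(preferred_source(["plugin/mad", "client/foo"], preference))    # client/foo
```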


Match

Type identifier XMMS_COLLECTION_TYPE_MATCH
Result

The operand, but only with the mediaids that have a property with a given field name and appropriate value according to glob-matching. Only source-preferred properties are examined.

Attributes
  • type: value to filter on values, id to filter on ids. Defaults to value.
  • field: If type is value, then the name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • value: A string.
  • collation: A collation to be used when comparing the value from the medialib and the value attribute. The possible values are NOCASE and BINARY. The default-value is NOCASE.
  • source-preference: The source-preference that needs to be applied. See the Has operator. If not set, the server default is used.
Operands

Exactly one
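
A rough model of the Match filter, assuming its glob syntax uses the same '*' and '?' wildcards described for source-preference (the page does not spell this out). The library is modelled as a dict from mediaid to a property dict with source preference already applied; the names match and glob_match are hypothetical.

```python
from fnmatch import fnmatchcase


def glob_match(value, pattern, collation="NOCASE"):
    """Glob-compare a property value with a pattern under a collation."""
    if collation == "NOCASE":
        value, pattern = value.casefold(), pattern.casefold()
    elif collation != "BINARY":
        raise ValueError("collation must be NOCASE or BINARY")
    return fnmatchcase(value, pattern)


def match(operand, field, pattern, collation="NOCASE"):
    """Keep the mediaids whose property `field` glob-matches `pattern`."""
    return {
        mid
        for mid, props in operand.items()
        if field in props and glob_match(props[field], pattern, collation)
    }


library = {
    1: {"artist": "Muse"},
    2: {"artist": "The White Stripes"},
    3: {"title": "Some title"},
}
print(match(library, "artist", "mu*"))            # {1}
print(match(library, "artist", "mu*", "BINARY"))  # set()
```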


Token

Type identifier XMMS_COLLECTION_TYPE_TOKEN
Result

The operand, but only with the mediaids that have a property with a given field name and appropriate value according to token-matching. Only source-preferred properties are examined.

Attributes
  • field: The name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • value: A string.
  • collation: A collation to be used when comparing the value from the medialib and the value attribute. The possible values are NOCASE and BINARY. The default-value is NOCASE.
  • source-preference: The source-preference that needs to be applied. See the Has operator. If not set, the server default is used.
Operands

Exactly one


Equals

Type identifier XMMS_COLLECTION_TYPE_EQUALS
Result

The operand, but only with the mediaids that have a property with a given field name and value. Only source-preferred properties are examined.

Attributes
  • type: value to filter on values, id to filter on ids. Defaults to value.
  • field: If type is value, then the name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • value: A string.
  • collation: A collation to be used when comparing the value from the medialib and the value attribute. The possible values are NOCASE, BINARY and NATCOLL. The default-value is NOCASE.
  • source-preference: The source-preference that needs to be applied. See the Has operator. If not set, the server default is used.
Operands

Exactly one


Not-Equal

Type identifier XMMS_COLLECTION_TYPE_NOTEQUAL
Result

The operand, but only with the mediaids that have a property with a given field name and a value different from the one specified. Only source-preferred properties are examined.

Attributes
  • type: value to filter on values, id to filter on ids. Defaults to value.
  • field: If type is value, then the name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • value: A string.
  • collation: A collation to be used when comparing the value from the medialib and the value attribute. The possible values are NOCASE, BINARY and NATCOLL. The default-value is NOCASE.
  • source-preference: The source-preference that needs to be applied. See the Has operator. If not set, the server default is used.
Operands

Exactly one


Smaller

Type identifier XMMS_COLLECTION_TYPE_SMALLER
Result

The operand, but only with the mediaids that have a property with a given field name and a value smaller than the one specified. Only source-preferred properties are examined.

Attributes
  • type: value to filter on values, id to filter on ids. Defaults to value.
  • field: If type is value, then the name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • value: A string.
  • collation: A collation to be used when comparing the value from the medialib and the value attribute. The possible values are NOCASE, BINARY and NATCOLL. The default-value is NATCOLL.
  • source-preference: The source-preference that needs to be applied. See the Has operator. If not set, the server default is used.
Operands

Exactly one


Smaller-Equal

Type identifier XMMS_COLLECTION_TYPE_SMALLEREQ
Result

The operand, but only with the mediaids that have a property with a given field name and a value smaller than or equal to the one specified. Only source-preferred properties are examined.

Attributes
  • type: value to filter on values, id to filter on ids. Defaults to value.
  • field: If type is value, then the name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • value: A string.
  • collation: A collation to be used when comparing the value from the medialib and the value attribute. The possible values are NOCASE, BINARY and NATCOLL. The default-value is NATCOLL.
  • source-preference: The source-preference that needs to be applied. See the Has operator. If not set, the server default is used.
Operands

Exactly one


Greater

Type identifier XMMS_COLLECTION_TYPE_GREATER
Result

The operand, but only with the mediaids that have a property with a given field name and a value greater than the one specified. Only source-preferred properties are examined.

Attributes
  • type: value to filter on values, id to filter on ids. Defaults to value.
  • field: If type is value, then the name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • value: A string.
  • collation: A collation to be used when comparing the value from the medialib and the value attribute. The possible values are NOCASE, BINARY and NATCOLL. The default-value is NATCOLL.
  • source-preference: The source-preference that needs to be applied. See the Has operator. If not set, the server default is used.
Operands

Exactly one


Greater-Equal

Type identifier XMMS_COLLECTION_TYPE_GREATEREQ
Result

The operand, but only with the mediaids that have a property with a given field name and a value greater than or equal to the one specified. Only source-preferred properties are examined.

Attributes
  • type: value to filter on values, id to filter on ids. Defaults to value.
  • field: If type is value, then the name of the property, e.g. artist, album or title. If this is not specified the name of the property is not taken into account.
  • value: A string.
  • collation: A collation to be used when comparing the value from the medialib and the value attribute. The possible values are NOCASE, BINARY and NATCOLL. The default-value is NATCOLL.
  • source-preference: The source-preference that needs to be applied. See the Has operator. If not set, the server default is used.
Operands

Exactly one


Order

Type identifier XMMS_COLLECTION_TYPE_ORDER
Result

A medialist, with the mediaids in the operand sorted according to a value depending on the type attribute and the mediaid. If the operand is also an order operator, the values it generates are used for secondary sorting.

Attributes
  • direction: ASC or DESC for ascending and descending ordering. The default-value is ASC.
  • collation: A collation to be used when the values are strings. The default-value is NATCOLL. (Not implemented yet.)
  • type: Defines what to order by. Possible values are value, id or random. The default value is value.
  • field: If type=value, an obligatory attribute, defining which property to order by.
  • seed: If type=random, an optional seed attribute, defining the seed to use when randomizing so that the same randomized order can be queried multiple times. (Not implemented yet.)
Operands

Exactly one
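
The secondary-sorting rule for nested order operators can be modelled with a stable sort: applying the inner order first and the outer order second leaves the inner ordering intact among entries that tie on the outer key. The sketch below assumes type=value, ignores collation (plain string comparison is used) and works on made-up sample data; the function name order is hypothetical.

```python
def order(medialist, library, field, direction="ASC"):
    """Sort mediaids by one property; Python's sort is stable, so ties keep their order."""
    return sorted(
        medialist,
        key=lambda mid: library[mid][field],
        reverse=(direction == "DESC"),
    )


# Made-up sample library: mediaid -> properties.
library = {
    1: {"artist": "Muse", "album": "Showbiz"},
    2: {"artist": "Muse", "album": "Absolution"},
    3: {"artist": "Beatles", "album": "White Album"},
}

inner = order([1, 2, 3], library, "album")  # operand: the secondary key
outer = order(inner, library, "artist")     # outer operator: the primary key
print(inner, outer)                         # [2, 1, 3] [3, 2, 1]
```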


Limit

Type identifier XMMS_COLLECTION_TYPE_LIMIT
Result

A medialist equal to the operand, but without the first start entries and containing length or fewer entries.

Attributes
  • start: a non-negative integer. If not set it will start at 0.
  • length: a non-negative integer. If not set it will take all entries after start.
  • type: Defines what kind of windowing should be applied. Possible values are position or value. The default value is position.
  • fields: If type=value, an obligatory attribute, defining which properties should be combined to create the unique key used for limitation.
Operands

Exactly one (a medialist)
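
For type=position, the start and length attributes behave like a slice. A minimal sketch, with a hypothetical function name and without modelling type=value:

```python
def limit(medialist, start=0, length=None):
    """Skip the first `start` entries, then keep at most `length` entries."""
    if start < 0 or (length is not None and length < 0):
        raise ValueError("start and length must be non-negative")
    end = None if length is None else start + length
    return medialist[start:end]


entries = [10, 20, 30, 40, 50]
print(limit(entries, start=1, length=3))   # [20, 30, 40]
print(limit(entries, start=3))             # [40, 50]
print(limit(entries, start=4, length=10))  # [50]
```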


Mediaset

Type identifier XMMS_COLLECTION_TYPE_MEDIASET
Result

A mediaset containing the mediaids in the operand, with duplicates and ordering removed.

Attributes

none

Operands

Exactly one

Querying

Coll2 adds a new query mechanism, xmmsc_coll_query, in addition to the old xmmsc_coll_query_ids and xmmsc_coll_query_infos. It is more powerful; the two older query calls are in fact implemented on top of it. It takes two arguments: a collection and a fetch specification. The collection is the same as before, but the fetch specification is new.

Fetch Specification

The fetch specification tells XMMS2 what to fetch and how to store it. It is built from nested dicts of different types; the "type" attribute of each dict defines its type.


Metadata

Type value metadata
Description

Fetches data from the current cluster and returns the fetched data

Attributes
  • get: A non-empty list of "id", "field", "value" or "source". If the list has more than one item, the data corresponding to the first n-1 items is used as keys in nested dicts. The data corresponding to the last item is aggregated using the aggregation function and stored at the deepest level. All entries of the list must be distinct. For example, if get is ("value", "id"), fields is ("artist") and aggregate is list, then something like this can be returned: { "The White Stripes" = (1, 2, 9), "Muse" = (3, 8, 10) }.
  • aggregate: A string specifying the aggregation function. Can be one of "first", "list", "set", "avg", "sum", "min", "max" and "random". Defaults to "first".
  • fields: A list of fields to get. All entries of the list must be distinct. If this is not set, or an empty list is provided, it will get all properties.
  • source-preference: The source preference to use when fetching the info. See the Has operator. If not set, the inherited source preference is used.
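
The interplay of get, fields and aggregate can be sketched in Python. This is a model, not the daemon's code: each row is a dict with the keys "id", "field", "value" and "source", source preference is assumed to have been applied already, and the names fetch_metadata and row are invented for the example.

```python
import random
from statistics import mean

AGGREGATES = {
    "first": lambda values: values[0],
    "list": list,
    "set": set,
    "avg": mean,
    "sum": sum,
    "min": min,
    "max": max,
    "random": random.choice,
}


def fetch_metadata(rows, get, fields=None, aggregate="first"):
    """Nest rows by the first n-1 items of `get` and aggregate the last one."""
    if not get or len(set(get)) != len(get):
        raise ValueError("get must be a non-empty list of distinct items")
    if fields:  # unset or empty means: all properties
        rows = [r for r in rows if r["field"] in fields]
    *keys, last = get
    buckets = {}  # tuple of key values -> collected values of the last item
    for r in rows:
        buckets.setdefault(tuple(r[k] for k in keys), []).append(r[last])
    combine = AGGREGATES[aggregate]
    if not keys:  # a single item in get: no nesting at all
        return combine(buckets[()]) if buckets else None
    result = {}
    for path, values in buckets.items():
        level = result
        for key in path[:-1]:
            level = level.setdefault(key, {})
        level[path[-1]] = combine(values)
    return result


def row(mid, field, value, source="plugin/id3v2"):
    return {"id": mid, "field": field, "value": value, "source": source}


rows = ([row(i, "artist", "The White Stripes") for i in (1, 2, 9)]
        + [row(i, "artist", "Muse") for i in (3, 8, 10)]
        + [row(1, "title", "Some title")])
by_artist = fetch_metadata(rows, ["value", "id"], fields=["artist"], aggregate="list")
print(by_artist)  # {'The White Stripes': [1, 2, 9], 'Muse': [3, 8, 10]}
```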


Cluster list

Type value cluster-list
Description

Clusters the current collection into subcollections and returns a list with one entry per subcollection. Each entry is created using the fetch-spec dict in attribute data. In the case that cluster-by is "position", every entry in the collection will get its own subset. If it is "id", every song will get its own subset. In the case of "value", all songs having a specific property in common will form a subset.

Attributes
  • cluster-by: One of the strings "id", "position", or "value". Defaults to "value".
  • cluster-field: If cluster-by is set to "value", the name of the property which will be used.
  • source-preference: The source preference to use when fetching the info. See the Has operator. If set, it is inherited by child fetch-specs.
  • data: A fetch-spec dict specifying what to fetch for every cluster.


Cluster dict

Type value cluster-dict
Description

Clusters the current collection into subcollections and returns a dict where the datum to cluster by is used as the key and the value is according to the fetch-spec in attribute data. If the datum to cluster by does not exist, then the string (No value) will be used.

Attributes
  • cluster-by: One of the strings "id", "position", or "value". Defaults to "value".
  • cluster-field: If cluster-by is set to "value", the name of the property which will be used.
  • source-preference: The source preference to use when fetching the info. See the Has operator. If set, it is inherited by child fetch-specs.
  • data: A fetch-spec dict specifying what to fetch for every cluster.
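
Clustering by value can be sketched as grouping mediaids by one property and running the inner fetch on each group. In this model the callable fetch stands in for the fetch-spec in data (len mimics a count spec), clusters are kept in first-seen order (an assumption), and cluster-by "id" and "position" are not modelled. All function names are hypothetical.

```python
NO_VALUE = "(No value)"


def cluster(ids, library, field):
    """Group mediaids by library[id][field]; a missing property maps to None."""
    groups = {}
    for mid in ids:
        groups.setdefault(library[mid].get(field), []).append(mid)
    return groups


def cluster_dict(ids, library, field, fetch):
    """One dict entry per cluster, keyed by the clustered value."""
    return {
        NO_VALUE if key is None else key: fetch(subset)
        for key, subset in cluster(ids, library, field).items()
    }


def cluster_list(ids, library, field, fetch):
    """One list entry per cluster."""
    return [fetch(subset) for subset in cluster(ids, library, field).values()]


library = {
    1: {"artist": "Beethoven"},
    2: {"artist": "Beatles", "album": "1"},
    3: {"artist": "Beatles", "album": "White Album"},
    4: {"artist": "Beatles", "album": "White Album"},
}
print(cluster_dict([1, 2, 3, 4], library, "album", len))
# {'(No value)': 1, '1': 1, 'White Album': 2}
print(cluster_list([1, 2, 3, 4], library, "album", len))  # [1, 1, 2]
```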


Organize

Type value organize
Description

Organizes data in a dict.

Attributes
  • data: A dict with key-value pairs. The keys of the dict are used as keys in the returned dict, and the values are fetch specifications used to fetch the value associated with each key.
  • source-preference: If set, it is inherited by child fetch-specs.


Count

Type value count
Description

Returns the number of entries in the current set.

Attributes

none
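
Organize and count are the simplest fetch-spec types, and together they show the recursive nature of the specification. A minimal sketch with Python dicts standing in for the fetch-spec dicts; the function name query is hypothetical and only these two types are handled.

```python
def query(ids, spec):
    """Evaluate a fetch-spec that consists only of organize and count dicts."""
    kind = spec["type"]
    if kind == "count":
        return len(ids)
    if kind == "organize":
        # Each value in spec["data"] is itself a fetch-spec.
        return {key: query(ids, inner) for key, inner in spec["data"].items()}
    raise ValueError(f"unsupported fetch-spec type: {kind!r}")


spec = {
    "type": "organize",
    "data": {
        "tracks": {"type": "count"},
        "nested": {"type": "organize", "data": {"again": {"type": "count"}}},
    },
}
print(query([3, 4], spec))  # {'tracks': 2, 'nested': {'again': 2}}
```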

Examples

These examples should clarify the fetch specification described above. Below, '{ key = value }' means a dict with key set to value, and '( a, b )' means a list with the values a and b.

Get all fields in the media-library

Call xmmsc_coll_query (universe, {"type" = "metadata", "get" = ("field"), "aggregate" = "set"})
Result ("artist", "album", "title", ...)

Get a full id-field-source-value dict

Call xmmsc_coll_query (universe, {"type" = "metadata", "get" = ("id", "field", "source", "value")})
Result
{
1234 = {
  status = {
    server = 1
    },
  title = {
    plugin/id3v2 = "Some title"
    },
  lmod = {
    plugin/gvfs = 1252686313
    },
  ...
},
2345 = {
  status = {
    server = 1
    },
  title = {
    plugin/id3v2 = "Some other title"
    },
  lmod = {
    plugin/gvfs = 1252672341
    },
  ...
}, ...
}

Cluster by album and get some info

Call
xmmsc_coll_query (universe,
{
  "type" = "cluster-dict",
  "cluster-field" = "album",
  "data" = {
    "type" = "organize",
    "data" = {
      "tracks" = { "type" = "count" },
      "duration" = { "type" = "metadata", "fields" = ("duration"), "get" = ("value"), "aggregate" = "sum" },
      "titles" = { "type" = "metadata", "fields" = ("title"), "get" = ("value"), "aggregate" = "list" }
    }
  }
})
Result
{
  "OK Computer" = {
    "duration" = 3207806, 
    "tracks" = 12,
    "titles" = ("Exit Music (For a Film)", "Karma Police", "Let Down", "No Surprises", "Fitter Happier",
                "Airbag", "Electioneering", "Climbing Up the Walls", "Subterranean Homesick Alien",
                "Paranoid Android", "The Tourist", "Lucky")
  },
  "The Illusion of Safety" = {
    "duration" = 2308120,
    "tracks" = 13,
    "titles" = ("So Strange I Remember You", "Where Idols Once Stood", "A Subtle Dagger", "Deadbolt",
                "Kill Me Quickly", "The Red Death", "Betrayal Is a Symptom", "To Awake and Avenge the Dead",
                "A Living Dance Upon Dead Minds", "Trust", "The Beltsville Crucible", "In Years to Come",
                "See You in the Shallows")
  },
  ...
}

xmmsc_coll_query pseudo-code

xmmsc_coll_query (coll, spec) {
    if (spec["type"] == "metadata") {
        Fetch all data for all songs in coll.
        /* This might look something like this:
           | ID  | field  | value         | source        |
           +-----+--------+---------------+---------------+
           |   1 | url    | file:///ho... | server        |
           |   1 | artist | Beethoven     | plugin/id3v2  |
           |   1 | artist | BadId3v1Thing | plugin/mad    |
           |   2 | album  | 1             | plugin/id3v2  |
           |   2 | artist | Beatles       | plugin/id3v2  |
           |   3 | album  | White Album   | plugin/id3v2  |
           |   3 | artist | Beatles       | plugin/id3v2  |
           |   4 | album  | White Album   | plugin/id3v2  |
           |   4 | artist | Beatles       | plugin/id3v2  |
        */

        Remove all rows with field not in spec["fields"].
        /* For instance if spec["fields"] == ("artist"):
           | ID  | field  | value         | source        |
           +-----+--------+---------------+---------------+
           |   1 | artist | Beethoven     | plugin/id3v2  |
           |   1 | artist | BadId3v1Thing | plugin/mad    |
           |   2 | artist | Beatles       | plugin/id3v2  |
           |   3 | artist | Beatles       | plugin/id3v2  |
           |   4 | artist | Beatles       | plugin/id3v2  |
        */

        Prune rows according to source-preference.
        /* With spec["source-preference"] = ("server", "plugin/id3v2", "*") we might get
           | ID  | field  | value         | source        |
           +-----+--------+---------------+---------------+
           |   1 | artist | Beethoven     | plugin/id3v2  |
           |   2 | artist | Beatles       | plugin/id3v2  |
           |   3 | artist | Beatles       | plugin/id3v2  |
           |   4 | artist | Beatles       | plugin/id3v2  |
        */

        Remove the columns that are not in spec["get"] and reorder columns
        according to spec["get"].
        /* With spec["get"] = ("field", "value", "id") we would obtain:
           | field  | value         | ID  |
           +--------+---------------+-----+
           | artist | Beethoven     |   1 |
           | artist | Beatles       |   2 |
           | artist | Beatles       |   3 |
           | artist | Beatles       |   4 |
        */

        Combine rows that are the same in the first spec["get"].length - 1
        columns specified in spec["get"]. For combining use the aggregation
        from spec["aggregate"].
        /* For instance if spec["aggregate"] == "list":
           | field  | value         | (ID)    |
           +--------+---------------+---------+
           | artist | Beethoven     | (1)     |
           | artist | Beatles       | (2,3,4) |
        */

        Return this table as a dict.
        /* In this example:
           { artist = { Beethoven = (1), Beatles = (2,3,4), ... } }
        */
    } else if (spec["type"] == "cluster-list" || spec["type"] == "cluster-dict") {
        Depending on the values of spec["cluster-by"], spec["cluster-field"]
        and spec["source-preference"], gather some data for every element of the
        collection coll.
        /* For instance, with spec["cluster-by"] == "value" and
           spec["cluster-field"] == "album", we might get the following:
           | ID  | album       |
           +-----+-------------+
           |   1 | (null)      |
           |   2 | 1           |
           |   3 | White Album |
           |   4 | White Album |
        */

        Merge the rows with the identical collected data.
        /* E.g.:
           | (ID)  | album       |
           +-------+-------------+
           | (1)   | (null)      |
           | (2)   | 1           |
           | (3,4) | White Album |
        */

        For every row execute a xmmsc_coll_query with the (ID)-column as the
        collection parameter and spec["data"] as the spec parameter.
        /*
           - XXX = xmmsc_coll_query ( idlist(1), spec["data"] );
           - YYY = xmmsc_coll_query ( idlist(2), spec["data"] );
           - ZZZ = xmmsc_coll_query ( idlist(3,4), spec["data"] );
           | (ID)  | album       | result |
           +-------+-------------+--------+
           | (1)   | (null)      | XXX    |
           | (2)   | 1           | YYY    |
           | (3,4) | White Album | ZZZ    |
        */

        if (spec["type"] == "cluster-dict") {
            Remove (ID)-column.
            /*
               | album       | result |
               +-------------+--------+
               | (null)      | XXX    |
               | 1           | YYY    |
               | White Album | ZZZ    |
            */

            Return this table as a dict.
            /*
               {"(No value)" = XXX, "1" = YYY, "White Album" = ZZZ}
            */
        } else {
            If necessary for every row execute queries for sorting.
            /* For instance if spec["order-by"] == ({"fields" = ("artist")},
                                                    {"fields" = ("album")}),
               we might obtain:
               | (ID)  | album       | result | artist    | album       |
               +-------+-------------+--------+-----------+-------------+
               | (1)   | (null)      | XXX    | Beethoven | (null)      |
               | (2)   | 1           | YYY    | Beatles   | 1           |
               | (3,4) | White Album | ZZZ    | Beatles   | White Album |
            */

            Sort according to the directions in spec["order-by"].
            /* In our case we use the default direction ASC (ascending):
               | (ID)  | album       | result | artist    | album       |
               +-------+-------------+--------+-----------+-------------+
               | (2)   | 1           | YYY    | Beatles   | 1           |
               | (3,4) | White Album | ZZZ    | Beatles   | White Album |
               | (1)   | (null)      | XXX    | Beethoven | (null)      |
            */

            Remove all columns but result.
            /*
               | result |
               +--------+
               | YYY    |
               | ZZZ    |
               | XXX    |
            */

            Return this table as a list.
            /*
               (YYY, ZZZ, XXX)
            */
        }
    } else if (spec["type"] == "organize") {
        for each spec["data"] as key => innerspec {
            result[key] = xmmsc_coll_query (coll, innerspec);
        }
        return result
    } else if (spec["type"] == "count") {
        Return number of elements of collection coll.
    } else {
        CRY :'(.
    }
}

Try it out yourself

The code is now pretty stable, and you can try out coll2 together with S4 by using the master branch of the xmms2-cippo git repository. When you run the new xmms2d for the first time, it will try to convert your current XMMS2 media library with the tool 'sqlite2s4'. It will rename your old library to 'medialib.db.obsolete' (assuming your media library was named 'medialib.db') and create a new S4 database called 'medialib.s4'. To revert, simply move 'medialib.db.obsolete' back to 'medialib.db' and change the line <property name="path">{path}/medialib.s4</property> to <property name="path">{path}/medialib.db</property>.