Realm-cocoa: distinct query in Realm database

Created on 7 Nov 2014 · 70 Comments · Source: realm/realm-cocoa

Hello, Realm is a super fast DB,
but a distinct query is badly needed, e.g. schedules = Schedules.objectsWhere("areas.id = '(idAreas)' distinct")

P-2-Expected T-Feature

All 70 comments

We agree that this would be a useful feature and will consider adding it to Realm's querying capabilities.

In the meantime, you'll have to do this yourself:

let schedules = Schedule.allObjects()
var uniqueIDs = [String]()
var uniqueSchedules = [Schedule]()
for schedule in schedules {
  let schedule = schedule as Schedule
  let scheduleID = schedule.areas.id
  if !contains(uniqueIDs, scheduleID) {
    uniqueSchedules.append(schedule)
    uniqueIDs.append(scheduleID)
  }
}
println(uniqueSchedules) // Schedule objects with unique `areas.id` values

+1

+1 here :)

+1
This would be especially useful when the number of sections in a grouped tableView is dynamic. Iterating over all objects to find distinct sectionIdentifier properties after every change notification seems extremely inefficient, especially for large datasets.

It would be even better if this could be built into RLMResults or another similar class, and potentially support change/fine-grained notifications in the future.

yes, something like @distinctUnionOfObjects is badly needed

+1 - 10,000+ records with a lot of duplicates, just need to know the distinct/unique values in a column!

+1

+1

+1

+1

For Swift 1.2 (due to the contains func) you could do something like this

import Foundation
import RealmSwift

extension Results {

    func uniqueValueForObject<U : Equatable>(objectKey: String, paramKey: String, type: U.Type)->[U]{

        var uniqueValues : [U] = [U]()

        for obj in self {
            let obj = obj as T
            if(obj.valueForKey(objectKey) != nil){
                let uniqueObj: Object = obj.valueForKey(objectKey) as! Object
                if(uniqueObj.valueForKey(paramKey) != nil){
                    let uniqueObjValue = uniqueObj.valueForKey(paramKey) as! U
                    if !contains(uniqueValues, uniqueObjValue) {
                        uniqueValues.append(uniqueObjValue)
                    }
                }
            }

        }
        return uniqueValues
    }

    func uniqueValue<U : Equatable>(paramKey: String, type: U.Type)->[U]{

        var uniqueValues : [U] = [U]()

        for obj in self {
            let obj = obj as T
            if(obj.valueForKey(paramKey) != nil){
                let uniqueValue = obj.valueForKey(paramKey) as! U
                if !contains(uniqueValues, uniqueValue) {
                    uniqueValues.append(uniqueValue)
                }
            }

        }
        return uniqueValues
    }

    func uniqueObject(paramKey: String)->[Object]{

        var uniqueObjects : [Object] = [Object]()

        for obj in self {
            let obj = obj as T
            if(obj.valueForKey(paramKey) != nil){
                let uniqueObj : Object = obj.valueForKey(paramKey) as! Object
                if !contains(uniqueObjects,uniqueObj) {
                    uniqueObjects.append(uniqueObj)
                }
            }

        }
        return uniqueObjects
    }

}

Then you can use it like this...

Let's say we have two very simple models, Item and SubItem:

class SubItem : Object {
     dynamic var name : String = ""
}

class Item : Object {
     dynamic var name : String = ""
     // To-one link to SubItem, so the "subItem" key returns an Object as the extension expects
     dynamic var subItem : SubItem?
}

To use the extension one would use:

Realm().objects(Item).uniqueValue("name", type: String.self)
Realm().objects(Item).uniqueValueForObject("subItem", paramKey: "name", type: String.self)
Realm().objects(Item).uniqueObject("subItem")

Here's an updated version of the extension above, using Swift 2.0 (and Swift 2.0 compatible version of Realm)

@jpsim and @segiddins - thoughts on this implementation?

extension Results {

    func uniqueValueForObject<U : Equatable>(objectKey: String, paramKey: String, type: U.Type)->[U]{

        var uniqueValues : [U] = [U]()

        for obj in self {

            if let o = obj.valueForKeyPath(objectKey) {

                if let v = o.valueForKeyPath(paramKey){

                    if(!uniqueValues.contains(v as! U)){
                        uniqueValues.append(v as! U)
                    }

                }
            }

        }
        return uniqueValues
    }

    func uniqueValue<U : Equatable>(paramKey: String, type: U.Type)->[U]{

        var uniqueValues : [U] = [U]()

        for obj in self {

            if let val = obj.valueForKeyPath(paramKey) {

                if (!uniqueValues.contains(val as! U)) {
                    uniqueValues.append(val as! U)
                }

            }

        }
        return uniqueValues
    }

    func uniqueObject(paramKey: String)->[Object]{

        var uniqueObjects : [Object] = [Object]()

        for obj in self {

            if let val = obj.valueForKeyPath(paramKey) {
                let uniqueObj : Object = val as! Object
                if !uniqueObjects.contains(uniqueObj) {
                    uniqueObjects.append(uniqueObj)
                }
            }

        }
        return uniqueObjects
    }

}

Instead of doing all that jazz above, I took the functional programming approach:

// Query all users
let allUsers = Realm().objects(User)
// Map out the user types
let allTypes = map(allUsers) { $0.type }
// Fun part: start with an empty array [], add the element to the reduced array if it's not already in, else add an empty array
let distinctTypes = reduce(allTypes, []) { $0 + (!contains($0, $1) ? [$1] : [] ) }

Or for a nice one-liner

let distinctTypes = reduce(map(Realm().objects(User)) { $0.type! }, []) { $0 + (contains($0, $1) ? [] : [$1] ) }

Couple of notes here:
Suffixed map/reduce (Realm().objects(User).map { $0.type } for example) doesn't play nicely with the Results object, but the standalone map() and reduce() work fine.
I'm not sure about performance compared with the solution above, but for brevity and concision it's a winner :D
And finally, I'm not sure how efficiently Realm handles object/property loading. Objects are supposedly lazily loaded, so there shouldn't be any real benefit to having the library itself do this over something as simple as the one-liner above (aside from indexed lookups, perhaps?).
The benefits of distinct queries in the library would be more along the lines of aggregations, but I'm not sure that's the route Realm is going right now.

Edit: Updated to remove the need for map, bringing efficiency up an order

let distinctTypes = reduce(Realm().objects(User), []) { $0 + (!contains($0, $1.type) ? [$1.type] : [] ) }

@kevinmlong Thanks for sharing!

or, more efficient (but stringly typed):

let distinctTypes = Set(Realm().objects(User).valueForKey("type") as! [String])

+1

@apocolipse it's not very efficient to use reduce to build arrays, see e.g. http://airspeedvelocity.net/2015/08/03/arrays-linked-lists-and-performance/

I'm kinda confused here. :confounded:
I use the latest version of RealmSwift & when I try to use @distinctUnionOfObjects in valueForKeyPath I get this error:

*** Terminating app due to uncaught exception 'NSUnknownKeyException', reason: '[<RLMAccessor_v0_KXStatus 0x16574e00> valueForUndefinedKey:]: this class is not key value coding compliant for the key @distinctUnionOfObjects.'

On Realm's NSPredicate Cheatsheet it's documented as an available feature & yet here this issue is still open.

Am I doing something wrong, or is this feature still not in production?

@kexoth We don't support @distinctUnionOfObjects in queries yet, everything with the pink dot next to it is what we currently support.

@kishikawakatsumi this is the fastest response on Github Issues ever!! :+1:
Thanks for the info, I missed that detail in the documentation, but I see it now :sweat_smile:

+1

Support for this has been added in Realm's core, so it's no longer blocked. We do need to implement it in cocoa though, probably via @distinctUnionOfObjects.

Any info about when it will be implemented in realm-cocoa?

We've made lots of progress in core on this recently, but no one's actively working on exposing it via the Objective-C or Swift APIs. So no timeline we can share.

+1 for this.

The number of times I've used Realm on a project then regretted it when somewhat fundamental tools like this are lacking... It's a bit grating :)
Love the work you guys are doing though, just probably worthwhile waiting another half year or so next time.

+1

Started moving an app from CoreData to RealmSwift and the first query to convert uses request.returnDistinctResults on a table with approximately 100k+ rows. ;)

+9

+1

+1

+1

This needs a detailed API design to be proposed before we can move forward

In this category the functionality is implemented in a simple way at a high level. Until the guys from Realm implement the functionality at a low level, I don't think it's a bad solution:

https://github.com/torcelly/RLMObjectDistinct

@torcelly It's important to note that that solution loses the lazy-loading functionality of RLMResults since it iterates over all objects in the results.

Totally agree. The performance isn't optimal and the value returned is an NSArray. As I mentioned before, it's a temporary solution until the Realm guys add the feature. However, it could be valid in some cases, and the measured times are good even for a huge RLMResults count.

Added your comment to the README. Thanks man!

You could also improve performance from O(n^2) to O(n) by making uniqueIDs an NSMutableSet instead of an NSMutableArray. 🙂
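The same idea can be sketched in plain Swift (modern syntax, independent of Realm). This is a minimal sketch, not the category's actual code: `firstPerKey` and the `Schedule` struct are hypothetical names used only for illustration. A `Set` tracks the keys already seen, so each membership check is O(1) on average, while the result array keeps the original order.

```swift
// Hypothetical stand-in for a model object; not a Realm type.
struct Schedule: Equatable {
    let title: String
    let areaID: String
}

// Returns the first element for each distinct key, preserving input order.
// `seen.insert(_:)` reports whether the key was newly inserted, which makes
// the membership test O(1) on average instead of a linear array scan.
func firstPerKey<S: Sequence, Key: Hashable>(
    _ items: S,
    key: (S.Element) -> Key
) -> [S.Element] {
    var seen = Set<Key>()
    var result: [S.Element] = []
    for item in items where seen.insert(key(item)).inserted {
        result.append(item)
    }
    return result
}

let schedules = [
    Schedule(title: "Morning", areaID: "a1"),
    Schedule(title: "Noon", areaID: "a2"),
    Schedule(title: "Evening", areaID: "a1"),
]
let uniqueSchedules = firstPerKey(schedules, key: { $0.areaID })
print(uniqueSchedules.map { $0.title })
```

Like the category discussed above, this still visits every element once, so it gives up lazy loading when applied to a results collection; only the duplicate check gets cheaper.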

It would be a great way to improve the performance. However, several methods add sorting, and we wouldn't be able to access the indexes. I could return an NSMutableSet in the methods without sorting, but I don't know if the category would lose consistency. Thoughts?

uniqueIDs is never returned, so this shouldn't be an issue, no? uniqueArray would not change type.

My mistake! you're right. I thought you were talking about uniqueArray. Changed and committed. Thanks for the feedback.

+1

+1

Any updates?

+1

+1

+1

+1

+1

+1

Any updates? This seems to be a dead end, as neither RBQ nor the Realm results controller works with Swift 3!

Work on this is ongoing here: realm/realm-object-store#235.

However, we still need to design how this feature should be exposed in our Objective-C and Swift APIs. Unfortunately, NSPredicate's @distinctUnionOfObjects isn't appropriate as that only returns the unique values of the property being "uniqued" (e.g. @distinctUnionOfObjects.name returning an array of unique names, not one object for each unique name).

A new method on Results (e.g. extension Results { func distinct(property: String) -> Self }) is an obvious way to expose this, but like I said, we haven't discussed it internally and no one from the community has stepped up to make a proposal.
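To make the distinction concrete, here is a plain-Swift sketch with no Realm involved. The `Person` struct and both helper functions are hypothetical: the first mirrors the kind of result `@distinctUnionOfObjects.name` yields (unique values), the second what a `distinct(property:)`-style method would be expected to return (one object per unique value). Keeping the first occurrence in input order is an assumption of this sketch, not a statement about how Realm would choose.

```swift
// Hypothetical model used only for illustration; not a Realm type.
struct Person: Equatable {
    let name: String
    let age: Int
}

// Unique *values* of the property: the objects themselves are lost.
func distinctNames(_ people: [Person]) -> [String] {
    var seen = Set<String>()
    return people.map { $0.name }.filter { seen.insert($0).inserted }
}

// One *object* per unique value (first occurrence wins in this sketch).
func distinctByName(_ people: [Person]) -> [Person] {
    var seen = Set<String>()
    return people.filter { seen.insert($0.name).inserted }
}

let people = [
    Person(name: "Ann", age: 30),
    Person(name: "Bob", age: 25),
    Person(name: "Ann", age: 41),
]
let names = distinctNames(people)      // names only
let persons = distinctByName(people)   // full objects, one per name
```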

+9

+100500

We have sortedResultsUsingKeyPath:, could we possibly have distinctedResultsUsingKeyPath: ?

PS: Without having looked it up, I'm not sure if 'distincted' is the correct past-participle for 'distinct' (English isn't my first language).

Sorted isn't used as a verb in the past tense but rather as an adjective: "the results are sorted".

So the equivalent naming for distinct would be distinctResultsUsingKeyPath:: "the results are distinct".

But I think the current implementation doesn't support nested key paths, so they can only be applied to direct properties of an object, so we might not call it a "key path" either, simply just "property".

I think the objectstore part doesn't support nested keypaths, but that would be pretty easy to fix. The core distinct logic does support all the same things as sorting.

+1

Please implement it at last

Hi
Any status? 220000 rows. Queries are fast, but getting distinct values is very slow.

Any updates? This issue was opened in 2014, and it seems "Core" has already finished the distinct API.

I need it so badly, pls implement it ;)

@FreudGit what do you mean? So, is there any indirect way to implement it?

@0xdatou you can perform a distinct operation in memory by copying data out of a Realm: https://github.com/realm/realm-cocoa/issues/1103#issuecomment-125381124

This issue is tracking adding built-in support in Realm itself, with auto-updating Results consistent with all our other querying capabilities.

The table I want to use distinct on has about 100,000 objects, and I also need distinct values to build the sections in the table view that displays the data. Post-query filtering/mapping to figure out the section count and titles is a hit on performance, because the user can filter the data on the fly using a search bar. Currently that means it needs to re-map the sections and go over those 100,000 records with every keystroke.

Is there any other way to keep this performing well without the use of distinct? Or can we expect this feature soon?
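For reference, the post-query mapping described in this comment amounts to something like the following plain-Swift sketch. The `Entry` struct, its `section` property, and `sectionSummary` are hypothetical names, and no Realm API is used. It still visits every record on each call, which is exactly the cost a built-in distinct query would avoid.

```swift
// Hypothetical row model used only for illustration; not a Realm type.
struct Entry {
    let title: String
    let section: String
}

// Groups entries by section in a single pass and returns the section
// titles, sorted alphabetically, together with their row counts.
func sectionSummary(_ entries: [Entry]) -> [(title: String, count: Int)] {
    let grouped = Dictionary(grouping: entries, by: { $0.section })
    return grouped
        .map { (title: $0.key, count: $0.value.count) }
        .sorted { $0.title < $1.title }
}

let entries = [
    Entry(title: "Apple", section: "A"),
    Entry(title: "Banana", section: "B"),
    Entry(title: "Avocado", section: "A"),
]
let summary = sectionSummary(entries)
```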

@krummler you might want to check out this pull request.

@qubblr Thanks, sounds great, but so far this hasn't been implemented/merged? Is there any alternative or temporary solution?

With Realm-Core 3.0+ the behavior seems well-determined enough so that this could be made available in Cocoa, just like in Realm-Java

@jpsim Will this be implemented in Cocoa version? :)

Fixed by #5467.

When will the next release be in which this is included?

» When will the next release be in which this is included?
+1
