Cht-core: Confirm validation can catch duplicate registrations

Created on 9 Oct 2014  路  37Comments  路  Source: medic/cht-core

Registrations with the same name and same lmp/dob within 3 months should not validate. i.e. ANCR #jina Sally # dob1 or ANCR# jina Sally# lmp1 submitted within 3 months of each other should fail validation.

1 - High Bug

All 37 comments

Also wondering if we can support/optionally confirm it is coming from the same phone number. Because name + lmp/dob might not be unique enough. Initially the name+dob/lmp+ 3 months time is good enough.

Closed: development is believed to be complete. Keeping acceptance-testing tag. Reopen if acceptance testing fails.

@mandric, with which config/validation rules is this? Is there a commit associated to this issue?

Assigning to @mandric for acceptance testing.

Will do @abbyad, basically need to test the uniqueWithin validation rule on a duplicate pregnancy registration.

Tested on alpha, still need to debug this, doesn't work with the following config (see medic-data-private/mch/dist/kujua-settings.json for full settings):

{
  "property": "age_in_years",
  "rule": "exists ? uniqueWithin('patient_name', 'age_in_years', '3 months') : optional",
  "message": "Jina la mama mjamzito imesajili hapo awali. Tafadhali hakikisha hujasajili huyu mama hapo awali."
}

and messages:

ANCR #n Sally # age21
ANCR #n Sally # age21

hi frand @abbyad, please triage before the end of this sprint. :parrot_conga::parrot_conga::parrot_conga:

Moving this to acceptance testing, and we can close if it works.

Do we have/need a specific config (or host) we can test this on?

Is this testable on any instance/config @abbyad ?

Any clues @abbyad on how to AT this ... @mandric ?

@ngaruko you would need to setup the right form and confguration and then send in some test messages. Check out https://github.com/medic/medic-webapp/issues/651#issuecomment-66848358 there is a snippet of validation configuration here that you could use as an example if you wanted to AT this. If there isn't tests covering this there probably should be.

@abbyad assigning to you for AT.

I was not able to get this to work, even with a simple config:

      "validations": {
        "join_responses": true,
        "list": [
          {
            "property": "patient_id",
            "rule": "uniqueWithin('patient_id', '1 week')",
            "message": [
              {
                "content": "Was caught as a duplicate with uniqueWithin()",
                "locale": "en"
              }
            ]
          }
        ]
      },

According to the old help test:

uniqueWithin: Used to specify one or more fields which must be unique within a given time frame, eg: "uniqueWithin('patient_id', '1 months')

Could this just be implemented in a couch db e.g. sms_received?

{
  "_id": hash(chw_phone_number + sms_content),
  "dates_accepted_messages_were_received": [ "2017-10-12T07:01:59.839Z" ]
}

Then when a message is received, hash it and check in the DB if an identical message has been received within a particular timeframe.

  • If it's a recent duplicate, discard it.
  • If it's a duplicate, but the previous message was outside the defined timeframe

    1. append the date to dates_accepted_messages_were_received

    2. process the message

  • If it's not a duplicate

    1. create a new doc in the sms_received db

    2. process the message

It might also be possible to make a view on the current db for the SMS hashes and dates, achieving the same thing.

@abbyad In your testing I assume the 'patient_id' was in the 'fields' property? eg:

{ _id: 'xyz', fields: { patient_id: '01234' } }

The code currently only supports fields at the top level, not nested. I'm tempted to change the code so it uses doc[fieldName] || doc.fields.[fieldName]. This is a slight hack because it's hardcoding the fields bit, but this is by far the most common location for additional fields. Also this matches the lucene index which adds all fields values to the top level, so fields.patient_id: '01234' ends up as patient_id:'01234' in the index.

Does this make sense to you?

Yeah, that makes sense. That means the feature stopped working when we moved form fields from top level to inside fields years ago, right?

That's right. It would still work for things like the form code but not for the form fields.

This also applies to the unique function...

Code review please @SCdF . Let's not merge it until we decide on merging api and sentinel - just leave it assigned to me.

@abbyad It turns out my assessment was wrong and the uniqueWithin code was correct. The problem was the lucene request was failing silently. I'm not sure how long this has been broken.

Back to you @garethbowen.

I don't want to derail solving this ticket, but I'm wondering if we want to consider scheduling in dropping lucene, and replacing it either with couchdb views, or with postgres (and an improved couch2pg that runs in realtime). We seem to use lucene in really weird ways that sound non-optimal for the problem it's supposed to solve (though I could be misjudging lucene here)

Yeah possibly. We actually have the medic-client/reports_by_freetext view which would probably work here, for example, it has keys like: patient_id:51943. I'll look into it...

@SCdF I've updated the PR with a non-lucene approach. It's a little more fiddly but does mean we can remove lucene from sentinel altogether so overall is much better IMO. Please review again.

@garethbowen as discussed let's re-jig it a bit so it is a bit more optimal in getting full docs to avoid possible pathologic performance issues

@SCdF Re-re-jigged - please re-re-review!

@garethbowen I'll leave you to merge

Merged manually.

I still can't get this to work. Here is the config portion added:

"validations": {
  "join_responses": true,
  "list": [
    {
      "property": "patient_id",
      "rule": "uniqueWithin('patient_id', '1 week')",
      "message": [
        {
          "content": "Was caught as a duplicate with uniqueWithin()",
          "locale": "en"
        }
      ]
    },
    {
      "property": "patient_id",
      "rule": "unique('patient_id')",
      "message": [
        {
          "content": "Was caught as a duplicate with unique()",
          "locale": "en"
        }
      ]
    }
  ]
},

I can see the validation error for unique but not uniqueWithin

@abbyad What instance did you test on? What report did you expect to fail validation?

Tested on alpha.dev, with the D form.

Code review please @SCdF

Merged into master.

I am seeing unique work and uniqueWithin work in some cases, but not all. Will debug a bit more and post more details shortly.

After debugging with @abbyad I think the problem is to do with the validation view request key parameter not being lowercased but the view emit lowercasing the keys. Confirm, patch, backport.

Note for future acceptance testing... if you want to validate for a patient_id being unique for a particular form you need to add the 'form' as a parameter. Otherwise it will look for patient_id across all reports.

unique('form', 'patient_id')

Code review please @dianabarsan . Also needs to be backported to 2.15.x.

I can confirm that this works well in 2.15.0-beta.2.

Updated the documentation via https://github.com/medic/medic-docs/commit/b1d6fcec3c7e6728ebda51f91c9692c12b7bd652 to clarify use.

Was this page helpful?
0 / 5 - 0 ratings