How do I remove Duplicate Records from a dataset?

These instructions may be helpful if an import from your HMIS implementation fails and the Validation tab shows an error message regarding duplicate IDs.

According to the HUD CSV Exchange Format guidance, each ID field should be unique: only one record should be assigned each ID.  While this issue should be resolved by your HMIS vendor, there may be times when you are in a pinch and cannot wait for the vendor.  If this occurs, here are instructions on how to identify and remove duplicate records.

To see if this is your issue…

If you click on the “Validation” tab, scroll down, and see something along the lines of what is shown below, then you have an issue with duplicate IDs for the table and the data element that is listed.  In the example below, the issue is with the IncomeBenefitsID from the IncomeBenefits.csv table.
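
If you would rather confirm the problem outside the warehouse, the check can be sketched in a few lines of Python run against the unzipped file. This is only an illustration, not part of the warehouse itself: the function name find_duplicate_ids is made up for this example, and the file and column names are taken from the IncomeBenefits example above, so substitute whatever table and data element your Validation tab lists.

```python
import csv
from collections import Counter


def find_duplicate_ids(csv_path, id_column):
    """Return a dict mapping each repeated ID to the number of rows that use it.

    Hypothetical helper for illustration; csv_path and id_column are whatever
    table and data element the Validation tab reports.
    """
    # utf-8-sig strips a byte-order mark if the export includes one.
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        counts = Counter(row[id_column] for row in reader)
    # Keep only the IDs that appear on more than one row.
    return {value: n for value, n in counts.items() if n > 1}


# Example call (assumes the file has been unzipped into the current folder):
#   find_duplicate_ids("IncomeBenefits.csv", "IncomeBenefitsID")
```

An empty result means every ID in that column is used by exactly one row.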

Addressing the issue…

The easiest way to remove the duplicates is to unzip the file locally, open up the table, and follow these instructions.  Once done, you must format the date fields following these instructions before saving the table.  The files will then need to be rezipped and reloaded into the warehouse.
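
If you are comfortable with scripting, the unzip, de-duplicate, and rezip steps can also be sketched in Python. Treat this as a rough alternative under stated assumptions rather than the supported method: the function names remove_duplicate_ids and zip_folder are invented for this example, it assumes that keeping the first row for each ID is acceptable for your data, and it assumes the warehouse expects the CSV files at the top level of the zip. Because every value is read and written as text, date fields should pass through unchanged, though it is still worth confirming that before reloading. Quoting in the output may differ from your vendor's export.

```python
import csv
import zipfile
from pathlib import Path


def remove_duplicate_ids(src_path, dst_path, id_column):
    """Copy src_path to dst_path, keeping only the first row for each ID.

    Returns the number of rows that were dropped. Hypothetical helper;
    keeping the first occurrence is an assumption, not a HUD rule.
    """
    seen = set()
    dropped = 0
    with open(src_path, newline="", encoding="utf-8-sig") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            key = row[id_column]
            if key in seen:
                # This ID was already written once, so skip the repeat.
                dropped += 1
                continue
            seen.add(key)
            writer.writerow(row)
    return dropped


def zip_folder(folder, zip_path):
    """Zip every .csv file in folder into zip_path, with no subfolders."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for csv_file in sorted(Path(folder).glob("*.csv")):
            zf.write(csv_file, arcname=csv_file.name)
```

Write the cleaned table to a separate folder along with the untouched tables, then call zip_folder on that folder and save the zip somewhere else so the archive does not end up inside itself.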