Below is what I think the data model for capturing unspsc codes should look like.
table name : unspsc_codes
columns:
id number not null (internal id within your system that can be referenced in other tables)
key varchar(8 ) not null (even though it's currently 6 chars, giving extra buffer)
code varchar(8 ) not null (even though key and code are numbers, using varchar so that these codes can be stored in existing product classification data models)
level number not null (this refers to level 1, 2, 3 and 4 for the 4 levels of UNSPC)
parent_id number (this refers back to the parent's id)
title varchar(256) not null (this is the title of the code. even though titles are shorter, for multi-byte languages and for future expansion, keeping it 256)
description varchar(2048 ) (same comment as above)
For those using these codes in multiple languages, an additional column to use is
language varchar(3) not null
that captures the title and description for each language.
Apart from these above commonly known elements of unspsc codes, there are 2 more to capture. They are
value_code varchar(10) not null (segments are organized such that they reflect how the value is added progressively to the products. While this can be derived from the segment codes, storing this explicitly may come handy for certain types of analysis and even creating non-function based indexes)
business_function varchar(4) (even though this should probably not null, I have seen a mention of this business function in the unspsc basics pdf file but haven't seen any further information. Perhaps it's yet to come. I am just adding it to the data model anyway but keeping it null. Also, per UNSPC, the length is 2. Planning for the future is always good).
Now, the code itself can be stored in 3 formats
1. the entire 8 letter code for each level (1000000)
2. the entire 8 letter code with a delimiter (10.00.00.00)
3. just two letter parts at each level (10, 00, 00, 00)
I would prefer going with 1 because this is the standard while 2 is not. Also, 1 gives the code immediately for lower levels while 3 would require traversing up the hierarchy using the parent_id
Apart from all the above columns, I would like to add a few more for the purpose of audit. These are
unspsc_version varchar(10) (this indicates the unspsc version that first introduced this code)
creation_date datetime (this indicates when was the first time this code got created in the system)
update_date datetime (since UNSPSC codes could have updates, we need to track the latest update to the code)
deleted_flag char(1) (indicates if this code has been deleted by UNSPSC. You still want to keep such codes as your existing transactions might be referencing these codes)
deleted_unspsc_version varchar(1) (the unspsc version when this code got deleted)
I will be updating this article in the future based on my further study of the UNSPSC classification
