I am working in a project where database items are not deleted, but only marked as deleted. Something like this:
id name deleted
--- ------- --------
1 Thingy1 0
2 Thingy2 0
3 Thingy3 0
I would like to be able to define something like a UNIQUE constraint on the name
column. Seems easy, right?
Let's imagine a scenario in which "Thingy3" is deleted, and a new one is created (perhaps years later). We get:
id name deleted
--- ------- --------
1 Thingy1 0
2 Thingy2 0
3 Thingy3 1
...
100 Thingy3 0
From the user's point of view, he deleted an item and created a new one. Much like deleting a file, and creating a new file. So it's obvious to him that the new item is unrelated and unattached to any data connected to the old item.
That's already handled, since the DB only cares about the id
, and since the new item has an id
of 100 instead of 3, they are utterly different.
My difficulty arises when I want to prevent the user from creating another "Thingy3" item. If I had a UNIQUE constraint that only looked at items that aren't marked deleted
, then I would have solved one problem.
(Of course, then I'd have to deal with what happens when someone does an undo of the delete...)
So, how can I define that sort of a constraint?
-
Not sure about SQLServer2005, but you can define compound constrainst/indexes
CREATE UNIQUE INDEX [MyINDEX] ON [TABLENAME] ([NAME] , [DELETED])
As pointed out by SteveWeet, this will only allow you to delete/create twice.
Steve Weet : This will only allow one delete of a 'Thingy3' you can't create it, delete it, create it again and then delete it again.Robert Gould : Good point, this won't allow N updates, fixing it -
For example, you can add an illegal character (*) to the deleted name. But you still have problems undeleting a deleted item. So probably the better idea is to prohibit double names even if they are deleted.
You can clean the deleted records after an amount of time (or move them to a separate table).
belugabob : Modifying the user entered data is always going to be problematic, as you can't necessarily predict whether the data will end with an illegal character already. Admittedly, the character will only be added to 'deleted' records, so that introduces a certain element of predictability. Either way, I prefer Steve Weet's solution, as it support undeletes. -
Create a unique constraint on (name, deleted). This will mean you can only have one deleted per name, however.
The obvious work-around for that works under ANSI-92 but not on MS SQLServer: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=299229
-
For a simple table called [Test] with columns ID(int), Filter(nvarchar), Deleted(bit)
ALTER TABLE [dbo].[Test] ADD CONSTRAINT [DF_Test_Deleted] DEFAULT (0) FOR [Deleted], CONSTRAINT [IX_Test] UNIQUE NONCLUSTERED ( [filter], [Deleted] ) ON [PRIMARY]
belugabob : This approach become problematic when a record is 'deleted' a record with the same 'Filter' value is added and the an attempt to 'delete' this record is made - as the constraint is then violated. -
Add a random hash after the unique name. Something that is easily reversible. Possibly separate with an underscore or some other character.
Edit after comment: You could simply add underscore and the current timestamp.
scraimer : Brilliant! Heck, I could even change the name to be something like "Thingy3 (deleted on 1/1/2000 #1)" for extra info to the user! -
You could add the id value to the end of the name when a record is deleted, so when someone deletes id 3 the name becomes Thingy3_3 and then when they delete id 100 the name becomes Thingy3_100. This would allow you to create a unique composite index on the name and deleted fields but you then have to filter the name column whenever you display it and remove the id from the end of the name.
Perhaps a better solution would be to replace your deleted column with a deleted_at column of type DATETIME. You could then maintain a unique index on name and deleted at, with a non-deleted record having a null value in the deleted_at field. This would prevent the creation of multiple names in an active state but would allow you to delete the same name multiple times.
You obviously need to do a test when undeleting a record to ensure that there is no row with the same name and a null deleted_at field before allowing the un-delete.
You could actually implement all of this logic within the database by using an INSTEAD-OF trigger for the delete. This trigger would not delete records but would instead update the deleted_at column when you deleted a record.
The following example code demonstrates this
CREATE TABLE swtest ( id INT IDENTITY, name NVARCHAR(20), deleted_at DATETIME ) GO CREATE TRIGGER tr_swtest_delete ON swtest INSTEAD OF DELETE AS BEGIN UPDATE swtest SET deleted_at = getDate() WHERE id IN (SELECT deleted.id FROM deleted) AND deleted_at IS NULL -- Required to prevent duplicates when deleting already deleted records END GO CREATE UNIQUE INDEX ix_swtest1 ON swtest(name, deleted_at) INSERT INTO swtest (name) VALUES ('Thingy1') INSERT INTO swtest (name) VALUES ('Thingy2') DELETE FROM swtest WHERE id = SCOPE_IDENTITY() INSERT INTO swtest (name) VALUES ('Thingy2') DELETE FROM swtest WHERE id = SCOPE_IDENTITY() INSERT INTO swtest (name) VALUES ('Thingy2') SELECT * FROM swtest DROP TABLE swtest
The select from this query returns the following
id name deleted_at 1 Thingy1 NULL 2 Thingy2 2009-04-21 08:55:38.180 3 Thingy2 2009-04-21 08:55:38.307 4 Thingy2 NULL
So within your code you can delete records using a normal delete and let the trigger take care of the details. The only possible issue (That I could see) was that deleting already deleted records could result in duplicate rows, hence the condition in the trigger to not update the deleted_at field on an already deleted row.
belugabob : Voted this up, because it was exactly what I was considering for my own application. Trying to think of any downsides, but it seems solid enough so far.scraimer : I love the "deleted_at" idea! -
The problem with a compound unique constraint is that it's not possible to have multiple records with the same name that are deleted. This means that the system will break once you delete a third record. I wouldn't recommend appending stuff to the names because, theoretically, a duplicate situation could arise. Also, by doing so, you are basically corrupting the data in the database as well as adding cryptic business logic to the data itself.
The only possible solution, database wide, is to add a trigger that checks that the inserted/updated data is valid. It's also possible to put the checks outside of the database, into code.
-
It might be worth considering using a "recycle bin" table. Instead of keeping the old records in the same table with a flag, move them to its own table with its own constraints. For instance, in the active table you do have a UNIQUE constraint on name, but in the recycle bin table you don't.
-
Instead of deleted column use end_date column. When user deletes a record add the current date in end_date column. Any records where end_date column is NULL are your current records. Define a unique constraint on two columns name and end_date. Due to this constraint you never have a scenario where valid record name is duplicated. Any time user wants to undelete a record, you need to set the end_date column to null and if this violates the unique constraint then you show a message user to user that the same name already exists.
scraimer : That's what Steve Weet already said at http://stackoverflow.com/questions/771197/what-to-do-when-i-want-to-use-database-constraints-but-only-mark-as-deleted-inste/771337#771337 -
The type of constraint you require is a table-level
CHECK
constraint i.e. a CHECK constraint consisting of a subquery which testsNOT EXISTS
(or equivalent) for the table e.g.CREATE TABLE Test ( ID INTEGER NOT NULL UNIQUE, name VARCHAR(30) NOT NULL, deleted INTEGER NOT NULL, CHECK (deleted IN (0, 1)) ); ALTER TABLE Test ADD CONSTRAINT test1__unique_non_deleted CHECK ( NOT EXISTS ( SELECT T1.name FROM Test AS T1 WHERE T1.deleted = 0 GROUP BY T1.Name HAVING COUNT(*) > 1 ) ); INSERT INTO Test (ID, name, deleted) VALUES (1, 'Thingy1', 0) ; INSERT INTO Test (ID, name, deleted) VALUES (2, 'Thingy2', 0) ; INSERT INTO Test (ID, name, deleted) VALUES (3, 'Thingy3', 1) ; INSERT INTO Test (ID, name, deleted) VALUES (4, 'Thingy3', 1) ; INSERT INTO Test (ID, name, deleted) VALUES (5, 'Thingy3', 0) ; INSERT INTO Test (ID, name, deleted) VALUES (6, 'Thingy3', 0) ;
The last
INSERT
(ID = 6) will cause the constraint to bite and theINSERT
will fail. Q.E.D....ooh, nearly forgot to mention: SQL Server doesn't yet support
CHECK
constraints with that contain a subquery (I tested the above on ACE/JET, a.k.a. ). While you could use aFUNCTION
I've read this is unsafe due to SQL Server testing constraints on a row-by-row basis (see David Portas' Blog). Until this full SQL-92 feature is supported in SQL Server, my preferred workaround is to use the same logic in a trigger.
0 comments:
Post a Comment