Hi all.
I am looking for a simple, efficient, and accurate method to create a unique ID for a DVD or Blu-ray optical disc. I am not sure where to begin, as this is not something I have attempted before. I also have a few requirements for how I'd like it to work:
1: Must be able to tell the difference between a DVD and a Blu-ray disc. This cannot be hardware/drive based, because of more complex cases like combo DVD/Blu-ray reading/burning drives.
2: Needs to uniquely identify different movie titles (Batman vs Star Wars), while treating two physical copies of the SAME title as the same disc (I put in my Batman Blu-ray and someone else puts in their Batman Blu-ray, and both should match, because they are the same movie).
3: Possibly need a way to detect that a submitted file was not generated from a disc, to help prevent duplicate entries and erroneous entries by troublemakers (you NEED to own the disc to submit).
#3 I can probably work out on my own, but I am looking for help/guidance on #1 and #2. Is there any part of commercially stamped DVD and Blu-ray discs that guarantees I can correctly identify two copies of the same movie, yet differentiate them from a later release such as a "Special Edition" or "Director's Cut"?
As for identifying them in the DB and keeping entries unique, I was thinking of a simple MD5 of the unique part of the disc (Disc Title?) + an MD5 of the name the user entered for the title?
If anyone can offer ideas, or guidance it would be greatly appreciated.
What prompted me to think about starting this project
A while ago I set up an HTPC. As I have an on/off relationship with video encoding and the Doom9 forums, one thing that has struck me is that there is no reliable source for DVD and Blu-ray chapters on the Internet. With the advent of containers such as MKV (mp4 too?) that support chapters containing both the traditional time stops and text names, this is a hugely neglected field.
There is one project out there, called "Chapter Extractor", which can both pull chapter timings from a disc and reference Internet databases for timing/name information. However, it is done in "free time" like most projects these days, and is IMHO woefully lacking, although I do use it when possible. It falls short in a lot of areas, namely a robust system to identify accurate records and a simple way to upload or update existing records. It also has weird, undocumented rules for what it considers a valid vs invalid name (for the record itself).
For instance, I painstakingly copied the chapter names from my Princess Bride: 25th Anniversary Edition Blu-ray. I saved my local chapters file and the program uploaded it. Then I realized I had a typo in one of the chapter names. So I popped my disc back in and opened it, the program found and matched my entry for the disc in the database, and I corrected the typo and re-saved my file. Saving triggers an upload, yet it appears I can't even update/correct record entries that came from MY account/API key!!
So I'm thinking about making my own database. It'll be local to start with, and if I get far enough to implement a better, more functional user experience, then I'll start thinking about maintaining an online database, or something like that.
Hashing Optical Media for Unique IDs ???
-
- Addict
- Posts: 1675
- Joined: Sun Dec 12, 2010 12:36 am
- Location: Somewhere in the midwest
- Contact:
Re: Hashing Optical Media for Unique IDs ???
Nobody has any input? 

Re: Hashing Optical Media for Unique IDs ???
DVDs and Blu-rays are just files on a disc. Can't you just total their filesizes or something?
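A minimal sketch of the "total the file sizes" idea, in Python. The function name `total_file_size` and the choice to pass the mounted disc's root path as `root` are assumptions made for illustration, not anything from an existing tool.

```python
import os


def total_file_size(root):
    """Sum the sizes, in bytes, of every regular file below `root`.

    `root` is assumed to be the mount point or drive root of the disc.
    Only directory metadata is read; file contents are never opened.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total
```

Because this only reads directory entries, it should be far cheaper than hashing file contents, though it still has to walk every folder on the disc.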
Re: Hashing Optical Media for Unique IDs ???
That's just the thing. I don't know the best way to go about doing this. It's all new to me. That's why I am looking for advice.
But ideally this needs to be something that works fast: you put a disc in a drive and within a couple of seconds it returns a matching record, as the software I wish to "compete" with can currently do.
So I don't want to have to parse the entire disc, especially a Blu-ray, which may have dozens of files a few levels deep.
I am starting to believe I probably just want to hash the disc "type" (DVD or BRD, based on directory structure) and the disc title. If need be, I could adapt it in the future to assign multiple hashes to one database record (e.g. you have the same disc and the same chapter names, but something changed with the pressing: files, file names, file count, etc.).
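Detecting the type from the directory structure could look roughly like the Python sketch below. It relies on DVD-Video discs carrying a top-level VIDEO_TS folder and Blu-ray movie discs carrying a top-level BDMV folder. The function name `detect_disc_type`, the "DVD"/"BRD" return strings, and checking BDMV first are assumptions chosen for illustration.

```python
import os


def detect_disc_type(root):
    """Return "BRD", "DVD", or None based on the disc's top-level folders.

    `root` is assumed to be the mount point or drive root of the disc.
    Folder names are compared case-insensitively, since some systems
    mount discs with lower-case names.
    """
    with os.scandir(root) as entries:
        folders = {entry.name.upper() for entry in entries if entry.is_dir()}
    if "BDMV" in folders:       # Blu-ray movie structure
        return "BRD"
    if "VIDEO_TS" in folders:   # DVD-Video structure
        return "DVD"
    return None
```

Only the root directory is listed, so this stays fast regardless of how many files the disc holds. A disc that somehow carried both folders would be reported as "BRD" here; that ordering is arbitrary.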
Re: Hashing Optical Media for Unique IDs ???
Maybe you can get some inspiration from this
http://en.wikipedia.org/wiki/CDDB
Windows (x64)
Raspberry Pi OS (Arm64)
Re: Hashing Optical Media for Unique IDs ???
Thanks, I will read that link right now.
Re: Hashing Optical Media for Unique IDs ???
I think the problem is the matter of speed. Maybe a mix of physical media properties and file sizes would help.
Unfortunately it is not so easy to get the medium information. I know of two ways:
1. Using the SPTI interface and 2. using the IMAPI interface. However, both ways require a lot of work.
You can search for both on Google / Yahoo / MSN; for IMAPI look here. The advantage of both methods is speed.
Sure, you could even calculate checksums (md5/sha2/ripemd ...) of the files, but this would take much more time.
I do not know whether this information helps you further; unfortunately, I have no other ideas.
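For comparison, the slower file-checksum approach mentioned above might be sketched in Python as follows. The function name `file_sha256` and the 1 MiB chunk size are assumptions for illustration; SHA-256 stands in for the "sha2" family named above.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so that multi-gigabyte video files
    never have to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Every byte of the file has to come off the disc, which is why this takes much longer on optical media than reading metadata alone.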
PureBASIC v5.41 LTS , Windows v8.1 x64
Forget UNICODE - Keep it BASIC !
Re: Hashing Optical Media for Unique IDs ???
I am thinking I will first try a simple hash of the disc type + the disc label (volume label).
So, a quick scan for folders that are mutually exclusive between the DVD and Blu-ray formats, then store the type in a variable.
Then query the disc label. Type+Label = Hash. Maybe I will also query the size of a file that is only found on one type of disc or the other. Type+Label+String(filesize) = Hash.
This should work for differentiating between different versions of a movie. Including the type (DVD/BRD) as a pseudo-salt will also guarantee that DVD and BRD titles are distinguished, even in the highly improbable (99.9%) case that all other data match.
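The Type+Label+String(filesize) scheme could be sketched in Python like this. Everything named here is an assumption for illustration: the function `disc_id`, the `MARKER_FILES` table (VIDEO_TS/VIDEO_TS.IFO for DVD, BDMV/index.bdmv for Blu-ray), and the "|" separator, which is only there so that neighbouring fields cannot run together. The volume label is passed in as a parameter because reading it is platform-specific.

```python
import hashlib
import os

# Assumed marker files: one well-known file per format whose size is
# mixed into the hash. The exact choice is illustrative.
MARKER_FILES = {
    "DVD": os.path.join("VIDEO_TS", "VIDEO_TS.IFO"),
    "BRD": os.path.join("BDMV", "index.bdmv"),
}


def disc_id(root, disc_type, label):
    """Return an MD5 hex digest of disc type + volume label + marker file size.

    `root` is the disc's mount point, `disc_type` is "DVD" or "BRD",
    and `label` is the volume label obtained elsewhere.
    Note: the marker path is looked up with the exact case shown above.
    """
    size = os.path.getsize(os.path.join(root, MARKER_FILES[disc_type]))
    key = "|".join((disc_type, label, str(size)))
    return hashlib.md5(key.encode("utf-8")).hexdigest()
```

For example, `disc_id("D:\\", "DVD", "BATMAN")` would read one file size and hash a short string, so it should finish almost instantly. MD5 is used here only as an identifier, not for security.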
Re: Hashing Optical Media for Unique IDs ???
CDs actually have serial numbers which can be read. However, I don't know about their uniqueness. They are most likely not unique, but they could be a good starting point, especially if you want to differentiate between two CDs with the same content and name.
There are even more values that can be read and hashed into an ID, such as whether it's a CD-R. I couldn't find a good source of information about it just now. Try searching for it on Google.
http://www.codeproject.com/Articles/169 ... er-of-a-CD
Re: Hashing Optical Media for Unique IDs ???
This might be the case for CDs, but I have tried numerous search terms and haven't been able to find anything similar for DVD/Blu-ray media. What little mention I've found also appears to relate only to burned discs, not retail pressed discs. I am aware that media produced for burning have manufacturer/product IDs, but that isn't quite the same as what I need.
It would be nice if there was a way to get a fingerprint like that, but I'm not sure if there is a better way than hashing at this point.