SHA-1 Hashes in DAT-o-MATIC

Post bug reports and suggestions for the website, forums and DOM here.
Buddybenj
Posts: 134
Joined: 29 Jan 2015 23:00

SHA-1 Hashes in DAT-o-MATIC

Post by Buddybenj » 31 Dec 2017 06:37

The DATs have SHA-1 hashes, so why aren't those hashes on the DAT-o-MATIC website (only CRC32 and MD5 are listed)?

This came to my attention based on this tweet: https://twitter.com/byuu_san/status/946528051428560897 Note however that byuu's saying SHA-2 is necessary because SHA-1 has known collisions is cryptographically flawed reasoning; SHA-1 is still secure against second-preimage attacks, which is all that matters for our purposes.

xuom2
High Council
Posts: 908
Joined: 22 May 2008 18:45

Re: SHA-1 Hashes in DAT-o-MATIC

Post by xuom2 » 31 Dec 2017 14:45

Authors are not visible to the public. I think it's time to open this info, at least for authors outside our group.

About SHA-2, no problem for me: if anyone has time to send me a list of MD5 hashes and the corresponding SHA-2 hashes, I will add that data to the DB.

Ready for the next step: dumper's DNA sequence.

Whovian9369
Datter
Posts: 70
Joined: 09 Sep 2016 18:36

Re: SHA-1 Hashes in DAT-o-MATIC

Post by Whovian9369 » 31 Dec 2017 17:03

I would definitely be up for extracting and hashing some of the smaller but complete packs that I have on hand (like GB, GBC, SNES, etc.).

My only concern is whether tools like CLRMamePro or RomCenter have been updated to use and check the newer checksums (SHA{2,5}, etc...), and, if not, whether the new entries (SHA2) in the DATs would confuse the heck out of them.

(Also, can we get static links that always link to the newest DAT for automatic downloading via CLRMP? It gets a little time consuming to click every single floppy, haha. This is just me hoping, at this point :P)

Buddybenj
Posts: 134
Joined: 29 Jan 2015 23:00

Re: SHA-1 Hashes in DAT-o-MATIC

Post by Buddybenj » 31 Dec 2017 18:56

I too would be up for computing SHA-256 hashes for many ROM sets. But my point was that the SHA-1 hashes, which are already part of the DATs, should be displayed on the DAT-o-MATIC website (and, contrary to byuu's claim, SHA-1 is sufficient for our purposes).
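For anyone who wants to help with the hashing, here is a minimal sketch of computing all four digests in a single pass over a file. Python is my own choice here (nobody in the thread named a tool), and `hash_file` and `chunk_size` are hypothetical names, not part of any No-Intro tooling.

```python
import hashlib
import zlib


def hash_file(path, chunk_size=1 << 20):
    """Return CRC32, MD5, SHA-1 and SHA-256 of a file, reading it once in chunks."""
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large ROMs never need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
            md5.update(chunk)
            sha1.update(chunk)
            sha256.update(chunk)
    return {
        "crc32": format(crc & 0xFFFFFFFF, "08x"),
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "sha256": sha256.hexdigest(),
    }
```

Calling `hash_file` on each extracted ROM and printing the `md5` and `sha256` values side by side would give the kind of MD5-to-SHA-2 list that xuom2 asked for.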

I didn't know the authors were stored in the database. (Wouldn't "dumper" be a more accurate name by the way?) I see no reason this shouldn't be public. If dumpers within No-Intro want to remain anonymous (as I think you were implying), then the website could at least list the author as "No-Intro member".

Whovian9369
Datter
Posts: 70
Joined: 09 Sep 2016 18:36

Re: SHA-1 Hashes in DAT-o-MATIC

Post by Whovian9369 » 31 Dec 2017 19:30

Buddybenj wrote:I too would be up for computing SHA-256 hashes for many ROM sets. But my point was that the SHA-1 hashes, which are already part of the DATs, should be displayed on the DAT-o-MATIC website (and, contrary to byuu's claim, SHA-1 is sufficient for our purposes).
Honestly, for the SHA1 thing, yes, it is sufficient for what we're doing at the moment. But what about the future, when SHA2 or SHA5 is the only secure one? *Shrugs*

Better safe than sorry, in my opinion, here.
Last edited by Whovian9369 on 01 Jan 2018 10:41, edited 1 time in total.

Buddybenj
Posts: 134
Joined: 29 Jan 2015 23:00

Re: SHA-1 Hashes in DAT-o-MATIC

Post by Buddybenj » 31 Dec 2017 23:07

SHA-5 doesn't exist. You're thinking of SHA-2, which includes SHA-224, SHA-256, SHA-384, and SHA-512. If we used SHA-2, we'd probably use SHA-256, as that's the most common (and what byuu himself uses).
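As a quick illustration, assuming Python's standard `hashlib` module (my choice of tool, not something used in the thread), the four SHA-2 variants differ in digest length. `SHA2_BITS` is just a hypothetical name for the lookup table.

```python
import hashlib

# Digest size in bits for each SHA-2 variant named above.
SHA2_BITS = {
    name: hashlib.new(name).digest_size * 8
    for name in ("sha224", "sha256", "sha384", "sha512")
}

for name, bits in SHA2_BITS.items():
    print(name, bits)
```

The number in each name is simply the output length in bits, so a SHA-256 hash is 64 hex characters long.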

It's true that it's better to be safe than sorry, and thus I'd be fine adding SHA-256 hashes. But using SHA-1 hashes is already conservative IMO, as I think we are a long way from computationally feasible SHA-1 second-preimage attacks; in fact, I don't think there are currently any practical ones for MD5, though that could change in the near future.

hydr0x
Dumper
Posts: 874
Joined: 25 May 2008 15:31

Re: SHA-1 Hashes in DAT-o-MATIC

Post by hydr0x » 01 Jan 2018 16:59

Buddybenj wrote:I see no reason this shouldn't be public.
Maybe that it's completely illegal to share your dumps with anyone in most countries, in some cases even to create them? That's what byuu doesn't get. Sure, the data only proves someone "also" dumped the game and got this checksum. It doesn't definitely prove he was the one to share it, but still, it's thin legal ground.

xuom2
High Council
Posts: 908
Joined: 22 May 2008 18:45

Re: SHA-1 Hashes in DAT-o-MATIC

Post by xuom2 » 01 Jan 2018 17:13

Exactly. Once I'm ready, I will ask dumpers if they want to appear on DOM pages.

About SHA-2 and other info (publisher, game type, tech info, ...): it can be added in the "media" table. I'm working on an idea of "opening" such edits to the general public, with a "confirm" needed from a datter. The problem is never with the DB, but with finding people who are able to add such data.

norkmetnoil577
Datter
Posts: 21
Joined: 20 Aug 2016 21:30

Re: SHA-1 Hashes in DAT-o-MATIC

Post by norkmetnoil577 » 01 Jan 2018 22:08

Yes, I think showing SHA-1 on the pages is a good idea and makes sense, since we have them in the DB already. As for other hashes, I think SHA256 is a fine idea for the future. It is currently not implemented in CMP Dir2Dat or other tools, but we could always store it in the DB and not in the DATs until it is. I am not too concerned, though, as a false match would have to match not just one of the 3 weaker hashes but all 3 to go undetected, which is very, very unlikely.
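To make the "all 3" point concrete, here is a rough sketch in Python of a check that only accepts a file when CRC32, MD5 and SHA-1 all agree with the stored values. The `entry` dictionary layout and the name `matches_entry` are assumptions for illustration; this is not how CMP or DoM actually store or compare hashes.

```python
import hashlib
import zlib


def matches_entry(data, entry):
    """Return True only if CRC32, MD5 and SHA-1 of data all equal the entry's values."""
    computed = {
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }
    # A single mismatching hash is enough to reject the file.
    return all(computed[name] == entry[name].lower() for name in computed)
```

A bad or altered file would have to collide on every one of the three hashes at once to slip through a check like this.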

I don't think the lack of public dumper names is a very valid criticism, since dumping is so sensitive (liability issues, privacy as we discussed, especially since dumpers of rare things can be involved in auctions etc. with their identities). Having the name in public view, even though it is probably a pseudonym, is unnecessary. What should really matter is the number of redumps (do other people get matching hashes?) and the tool used to dump. I think byuu wants to use the dumper identity as a proxy for quality of dumping tools, i.e. "all of my byuu dumps are clean but XYZ's aren't", which can be true, but we already have a field in DoM for the dumping tool/software used. It's used a lot in NDS/3DS and could/should perhaps be used more in the older systems, and I think that's a much fairer way to judge whether a dump is "clean" than to say that we have to go by reputation alone. Another question is dumps sourced from dumping projects/groups, like "XYZ Preservation Society" or TOSEC or Redump or something; these should be noted and are maybe less personal, but I still think the tools matter most.

Obviously if one user is faking dumps or something the members who see the full DB can deal with it.

xuom2
High Council
Posts: 908
Joined: 22 May 2008 18:45

Re: SHA-1 Hashes in DAT-o-MATIC

Post by xuom2 » 01 Jan 2018 22:17

now SHA-1 is shown instead of MD5 :mrgreen:

Screwtape
Posts: 25
Joined: 29 Aug 2017 08:46

Re: SHA-1 Hashes in DAT-o-MATIC

Post by Screwtape » 03 Jan 2018 01:48

Buddybenj wrote:Note however that byuu's saying SHA-2 is necessary because SHA-1 has known collisions is cryptographically flawed reasoning; SHA-1 is still secure against second-preimage attacks, which is all that matters for our purposes.
While there are no currently-known second-preimage attacks, SHA1 is broken to the point that academic cryptographers probably won't be investigating it too closely in the future; any problems they found would be seen as "kicking a dead horse" rather than "bold new discovery". The prudent choice would be to move to SHA2 (or even SHA3) as the primary file identifier, but at the very least there should be a field in the database and No-Intro should ask for the SHA2 hash when people submit new dumps and redumps.
norkmetnoil577 wrote:I think byuu wants to use the dumper identity as a proxy for quality of dumping tools, i.e. "all of my byuu dumps are clean but XYZ's aren't"
The concern I've heard byuu express is people reporting false information: there's been a number of games in GoodTools that were marked as "verified" that turned out to be corrupt or modified to work in emulators or whatever. Whoever "verified" that dump was clearly untrustworthy, and all their submissions should be removed from the DB, or at least inspected more closely... but because GoodTools doesn't record the dumper identity, we have no idea what other submissions might have been affected.

Perhaps if the DAT-o-MATIC listed the SHA256 of each dumper that verified a particular game hash? That shouldn't be traceable back to any real-world person, but would allow people to figure out which dumps were well-attested and which dumps would be affected by any discovered falsity. Instead of a raw SHA256 of the username, you might want to prefix it with a unique string like "NoIntroDatOMatic" so that the hashes won't be found in rainbow tables, etc. Or, just give every datter a UUID or something.

Buddybenj
Posts: 134
Joined: 29 Jan 2015 23:00

Re: SHA-1 Hashes in DAT-o-MATIC

Post by Buddybenj » 03 Jan 2018 02:41

Screwtape wrote:Perhaps if the DAT-o-MATIC listed the SHA256 of each dumper that verified a particular game hash? That shouldn't be traceable back to any real-world person, but would allow people to figure out which dumps were well-attested and which dumps would be affected by any discovered falsity. Instead of a raw SHA256 of the username, you might want to prefix it with a unique string like "NoIntroDatOMatic" so that the hashes won't be found in rainbow tables, etc. Or, just give every datter a UUID or something.
If we did this, then the moment the dumper of one dump was known/leaked, all of his dumps would be publicly known. For instance, since we know Callis dumped this Super Game Boy ROM, we would know all his other dumps. And even if Callis didn't reveal he was the dumper, it would've been obvious since he is (I presume) the only one who has this rare beta cartridge.

Also, if you did this, you definitely would want the prefix (salt) to be secret; otherwise, I could just hash (for example) "NoIntroDatOMaticCallis" to find all of Callis' dumps.
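A small sketch of the difference, in Python. Using HMAC for the secret version is my own substitution for a "secret prefix" (it is the standard way to key a hash), and `public_prefix_id`, `keyed_id` and `secret_key` are hypothetical names.

```python
import hashlib
import hmac


def public_prefix_id(name, prefix="NoIntroDatOMatic"):
    # Public prefix: anyone who guesses the name can recompute this ID.
    return hashlib.sha256((prefix + name).encode("utf-8")).hexdigest()


def keyed_id(name, secret_key):
    # Keyed hash (HMAC-SHA256): cannot be recomputed without the secret key.
    return hmac.new(secret_key, name.encode("utf-8"), hashlib.sha256).hexdigest()
```

With the public prefix, `public_prefix_id("Callis")` is exactly the SHA-256 of "NoIntroDatOMaticCallis", which anyone can compute. The keyed version stops that guessing attack, but it does nothing about the first problem: once one dump is attributed to a person, every dump with the same ID is attributed too.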

xuom2
High Council
Posts: 908
Joined: 22 May 2008 18:45

Re: SHA-1 Hashes in DAT-o-MATIC

Post by xuom2 » 03 Jan 2018 16:53

The concern I've heard byuu express is people reporting false information: there's been a number of games in GoodTools that were marked as "verified" that turned out to be corrupt or modified to work in emulators or whatever.
It happens. For example, we have a trusted dumper who did 400+ dumps, and the latest was a bad dump.

What we consider "trusted" is a dumper who is in contact with our staff, and not an anonymous source.
It is "trusted for No-Intro". He may send false data, but not intentionally.
Maybe "not vandal" is better than "trusted"?

What the public should consider is the term "verified", which counts the number of redumps. The higher the verified value, the lower the probability of false information.

KingMike
Posts: 387
Joined: 22 Sep 2012 16:36

Re: SHA-1 Hashes in DAT-o-MATIC

Post by KingMike » 03 Jan 2018 19:06

Buddybenj wrote:And even if Callis didn't reveal he was the dumper, it would've been obvious since he is (I presume) the only one who has this rare beta cartridge.
Many years ago, someone on eBay listed a Run Saber proto cart with an anti-dumping clause ordering the buyer to sign an NDA or whatever you want to call it.
People on a forum where it was pointed out joked, "I buy the game and rent it to my brother for a dollar, let him dump and release it" (although, unsurprisingly, I don't think anyone bought that).

(my guess why the seller would even care was they were aware that even owning the proto itself was kind of a legally gray situation and didn't want to attract further attention and possibly Atlus/Hori caring to track them down on ownership of a then decade-old obscure game beta :P )

TheShadowRunner
Posts: 105
Joined: 14 Oct 2012 15:46

Re: SHA-1 Hashes in DAT-o-MATIC

Post by TheShadowRunner » 04 Jan 2018 17:46

xuom2 wrote:now SHA-1 is shown instead of MD5 :mrgreen:
oh come on, please...
I do use md5 hash daily so it's a pain. Please restore showing the MD5.
