| | #1 (permalink) |
| duh Join Date: May 2002 Location: Boiler Up
Posts: 601
| Compare 2 mp3's to see if they are the same
Hey guys, I'm about to start working on a project soon. It's going to involve a lot of mp3 files from a lot of different people. What I want to do is cut down on duplicate mp3s as much as possible. Obviously, file name is an easy way to get rid of dupes. I've found MD5 will catch tracks that have different file names but are the exact same file. However, after changing the ID3 tag, MD5 doesn't work anymore. I have a feeling MD5 is probably gonna be it, but can anyone think of more ways to possibly cut down on dupes even more, even mp3s with different sizes? I was thinking about comparing song duration, along with some of the file name and some of the ID3 stuff, blah blah blah. |
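A rough sketch of the "duration plus some of the file name" idea from the post above, in Python. Everything here is an assumption made for illustration: the function names `normalize_name` and `candidate_groups` are hypothetical, the tracks are plain `(file_name, duration_in_seconds)` pairs that you would have to collect yourself, and the 2-second bucket is an arbitrary choice. It only flags candidates for a closer look; it does not prove two files are the same song.

```python
import re
from collections import defaultdict

def normalize_name(name: str) -> str:
    """Lowercase the name, drop the extension, a leading 'NN - ' track
    number, and punctuation. Crude on purpose (hypothetical helper)."""
    stem = name.rsplit(".", 1)[0].lower()
    # Strip a leading track number only when a separator follows it,
    # so a title like "99 Problems" keeps its number.
    stem = re.sub(r"^\d+\s*[-._]\s*", "", stem)
    return re.sub(r"[^a-z0-9]+", " ", stem).strip()

def candidate_groups(tracks, bucket_seconds=2):
    """Group (file_name, duration_seconds) pairs that share a normalized
    name and fall into the same duration bucket."""
    groups = defaultdict(list)
    for name, duration in tracks:
        key = (normalize_name(name), round(duration / bucket_seconds))
        groups[key].append(name)
    # Only groups with more than one file are possible duplicates.
    return [names for names in groups.values() if len(names) > 1]
```

Two tracks whose durations sit on opposite sides of a bucket boundary will be missed, so treat this as a first pass before anything more expensive.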
| | |
| | #2 (permalink) |
| Fuckin' 07er Join Date: May 2007 Location: Philly
Posts: 337
| Are these the same songs collected from different users, or just from yourself? So many people download music now and encode at different bitrates that, even with the same ID3 tags and bitrate, the files will most likely differ from one user to another, for example if one downloaded the track and the other ripped it from CD. |
| | |
| | #3 (permalink) |
| Registered User Join Date: Feb 2006
Posts: 1,634
+7 Internets | Even if you wrote a program to do an MD5 of just the audio data, ignoring the ID3/APE tags, you would still run into the problem of a million different people using a million different MP3 encoders at a million different bitrates. As such, the only plausible way of detecting duplicates would be some kind of acoustic fingerprinting; I don't know if there are any free software solutions for doing this (and unless you are looking for a summer project, it would probably be a Difficult Problem to write a good homegrown solution.) EDIT: I asked a friend and he recommended this. It includes source and a good explanation of the process. Last edited by Fog : 10-20-2007 at 09:39 PM. |
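For reference, the "MD5 of just the audio data" idea mentioned above could look roughly like this in Python. This is a minimal sketch under stated assumptions: it only skips a leading ID3v2 tag and a trailing 128-byte ID3v1 tag, it does not handle APE tags, and the function names `strip_id3` and `audio_md5` are made up for the example.

```python
import hashlib

def _syncsafe_to_int(b: bytes) -> int:
    # ID3v2 stores the tag size as four bytes with 7 usable bits each.
    return (b[0] & 0x7F) << 21 | (b[1] & 0x7F) << 14 | (b[2] & 0x7F) << 7 | (b[3] & 0x7F)

def strip_id3(data: bytes) -> bytes:
    """Return the bytes between a leading ID3v2 tag and a trailing
    ID3v1 tag, if either is present (APE tags are NOT handled)."""
    start, end = 0, len(data)
    if end >= 10 and data[:3] == b"ID3":
        start = 10 + _syncsafe_to_int(data[6:10])
        if data[5] & 0x10:  # footer flag: 10 extra bytes after the tag
            start += 10
        start = min(start, end)
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]

def audio_md5(data: bytes) -> str:
    """MD5 over everything except the ID3 tags (hypothetical helper)."""
    return hashlib.md5(strip_id3(data)).hexdigest()
```

This only makes two copies of the same encode match after their tags have been edited. As the post says, the same song from a different encoder or bitrate will still hash differently.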
| | |
| | #6 (permalink) |
| Registered User Join Date: Mar 2002 Location: Orange County, California
Posts: 34
+1 Internets | Shareaza added support for audio data only hashing 4 years ago. You could download mp3s from multiple sources regardless of any ID3 modifications. Too bad distributed/decentralized p2p is basically dead or otherwise useless. |
| | |
| | #8 (permalink) |
| duh Join Date: May 2002 Location: Boiler Up
Posts: 601
| Someone should try this foosic. Quite easy to use, but sadly it doesn't work all that well. Code:
|
| | |
| | #12 (permalink) |
| You can betray me Join Date: Dec 2002 Location: Houston
Posts: 8,673
+20 Internets | Sorry, semi-related derail incoming. Does anyone know how I can find out if an mp3 is V0/V2? All I know how to do is look at the basic bitrate of something. Is anyone able to tell me what this is, too?
Encoder : EAC (Secure mode) / LAME 3.92
Codec : LAME 3.97
Bitrate : VBR ~251K/s 44100Hz Joint Stereo
ID3-Tag : ID3v2.3 |
| | |
| | #13 (permalink) | ||
| Registered User Join Date: Feb 2006
Posts: 1,634
+7 Internets | Quote:
Quote:
| ||
| | |
| | #14 (permalink) |
| Registered User Join Date: Feb 2006
Posts: 1,634
+7 Internets | It sounds like it does to me. I don't understand the fundamental difference between people sharing a set of files on a searchable P2P network and someone downloading portions of a file from multiple users sharing it, and people seeding a set of torrents in a searchable torrent database and someone downloading portions of a file from multiple users seeding it. |
| | |