Update May 6, 2015: This script has not been tested with the new Photos application in Mac OS X Yosemite 10.10.3.
Short Story:
I had several years of photos from which I needed to identify and remove the duplicates. Instead of manually combing through 12,000 images (read the Long Story below) and before carpal tunnel set in, I needed a script to help me out. My situation may or may not be unique, so this script may not work 100% out-of-the-box for you, but it should get you started.
To use:
- Download and unzip the script
- Double-click the script to open in Script Editor
- Go into iPhoto and select a group of photos you want to compare
- Switch back to Script Editor and run the script
- Don’t Touch Anything! Just let the script finish; it could take a while if you are comparing a lot of photos
- After the script is done, go back into iPhoto and search for “duplicate”
- You can highlight all the duplicates and delete them or move them some place safe
Photos are considered a duplicate if:
- both heights match
- both widths match
- the photo dates in iPhoto match; this is typically the EXIF creation date
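Here is a rough Python sketch of that matching rule. It is not the AppleScript itself, just an illustration of the idea: the `Photo` record, its field names and the sample file names are all made up for the example, and in practice the values would come from iPhoto.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Photo:
    # Hypothetical record; the real values would be read from iPhoto.
    name: str
    width: int
    height: int
    date: datetime


def duplicate_groups(photos):
    """Group photos whose width, height and date all match."""
    groups = defaultdict(list)
    for photo in photos:
        groups[(photo.width, photo.height, photo.date)].append(photo)
    # Only groups with more than one photo contain duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample data: the first two entries match on all three criteria.
photos = [
    Photo("IMG_0001.jpg", 1600, 1200, datetime(2004, 6, 1, 10, 30, 0)),
    Photo("IMG_0001_copy.jpg", 1600, 1200, datetime(2004, 6, 1, 10, 30, 0)),
    Photo("IMG_0002.jpg", 1600, 1200, datetime(2004, 6, 1, 10, 31, 0)),
    Photo("IMG_0003.jpg", 1200, 1600, datetime(2004, 6, 1, 10, 30, 0)),
]

groups = duplicate_groups(photos)
for group in groups:
    print([photo.name for photo in group])
```

Note that the third and fourth photos are not grouped: one differs in date, the other has its width and height swapped.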
Long Story:
About a year ago I was editing down my iPhoto library of about 6000 images, just getting rid of those out-of-focus shots and the ones of my wife’s feet (a curiously large number of these). After a long night of editing, the next morning I awoke to start again, but when I ran iPhoto there was nothing in the library.
It was all gone!
I couldn’t find anything anywhere. Could I restore from a backup? Ooh nooo. I had erased my backup drive the day before in preparation for moving the unwanted photos onto the backup drive and then making a new backup of my iPhoto Library. So I had no backup.
Not really funny. These were all the shots of my boys being born, first steps, first birthdays, first everything. I was up sh*t creek and it put a serious hurt in my stomach. At least I knew what to do: do nothing on the computer, boot from the Mac OS X install DVD and use Disk Utility to make a byte-for-byte copy of my internal hard disk. I could use this disk image to recover the images, hopefully.
So I tried several image recovery utilities and finally settled on PhotoRescue for Mac. I mounted the disk image of my internal disk and set PhotoRescue to the task. About 9 hours later (not a typo), PhotoRescue gave me several folders of recovered JPEGs, TIFFs, GIFs and PNGs. I tossed all but the JPEGs. I felt a little better at this point.
But when I looked in the JPEG folder there were over 12,000 images! Huh? Well, PhotoRescue does not discriminate; it recovers ALL images, including thumbnails, web graphics, pron (you’ve been warned). Frankly, it was unbelievable and overwhelming.
So I set about dividing the images into folders: ones that I knew were junk and ones that I might want to keep. First, I eliminated everything below about 120K. I knew that my oldest digital camera was around 3M pixels and it saved a file that was typically > 200K, so those images below 120K were most likely thumbnails and web images. That cut my stack almost in half.
Next I looked for images > 3M. These were corrupted image files: while they looked OK in Preview, I knew there was no way a 1200×1600 image was 40M. Just a consequence of PhotoRescue’s recovery routine. I can live with that, believe me. So I tossed everything > 3M because my current 6M pixel camera images are under 2M in size.
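I did this sorting by hand, but the two size cut-offs are easy to sketch in Python. This is only an illustration: the function name and labels are made up, and it assumes "K" means 1024 bytes and "M" means 1024 × 1024 bytes.

```python
# Cut-offs from the text; the byte conversions are an assumption.
MIN_BYTES = 120 * 1024        # below this: most likely thumbnails or web images
MAX_BYTES = 3 * 1024 * 1024   # above this: most likely corrupted recoveries


def classify_by_size(size_bytes):
    """Return a label for a recovered file based on its size alone."""
    if size_bytes < MIN_BYTES:
        return "junk-small"
    if size_bytes > MAX_BYTES:
        return "junk-large"
    return "keep"


# Made-up file sizes for illustration.
print(classify_by_size(45 * 1024))          # a thumbnail-sized file
print(classify_by_size(900 * 1024))         # a typical camera JPEG
print(classify_by_size(40 * 1024 * 1024))   # a 40M "1200x1600" file
```

For real files, the size could be read with `pathlib.Path(path).stat().st_size` before passing it in.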
This left me with about 6,000 images that I imported into a new iPhoto Library. From the looks of it, all my images were there! What a relief, but the bad news was nothing was rotated properly, and there were many, many duplicates. Thousands of duplicates to be exact. After I rotated all the images so that I could view them properly, I set about removing the duplicates.
The good news about removing the duplicates was that they were fairly easy to spot. When I imported all the recovered images into iPhoto it apparently used the EXIF data to date stamp each photo instead of using the photo file’s creation date, which was set by PhotoRescue to the day I performed the recovery. So all my photos were dated properly; I just had to look for each photo that matched the one next to it (they were sorted by date) and delete one of them.
A closer look at the duplicate photos revealed that while they had the same height, width and date/time, they varied in file size. I was not able to determine why the file sizes varied, as the images themselves looked identical, but my best guess is that the difference came from rotating them: iPhoto considers a rotation an “edit”, so it makes a copy of the original and adds some iPhoto-specific data (no verification on this, though). So hey, if you are going to keep one, why not keep the smaller of the image files? So that’s what I was doing.
After hours and days of removing duplicates, I decided there had to be a better way. A bit of searching for “applescript iphoto remove duplicates” led me to Brattoo Propaganda Software’s Duplicate Annihilator. I tried the demo and it works very well. But there was one thing I wanted to do that Duplicate Annihilator could not, and that is mark the larger of the duplicate files. Duplicate Annihilator marks duplicate files by date/time, which I am sure is what most people want to do. So definitely check it out.
So Duplicate Annihilator’s minor missing feature led me to write my own AppleScript to do pretty much the same. The script is pretty simple and requires no additional libraries or command line voodoo. But I will say that coding in Ruby for the past year-and-a-half really reminds me why I don’t like AppleScript. AS gets the job done, but it’s so much more work, frankly it’s confusing, and if you don’t do it often it’s a lot of work getting your head around AS’s nomenclature.
For you fellow Rubyists, there is rb-appscript, which would have made my pain a little easier, but it relies on Ruby and having the rb-appscript gem installed, and that would be too much for most casual Mac users. So AppleScript won this round, but only because I knew I wanted to share the script with others.
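The keep-the-smaller rule can be sketched in a few lines of Python. Again, this is only an illustration under assumptions: the function name, the `(name, size)` tuples and the sample sizes are invented, and the group is assumed to already contain photos that match each other.

```python
def split_keeper_and_duplicates(group):
    """Given (name, file_size_bytes) pairs for matching photos, keep the
    smallest file and return the names of the rest to be marked.
    If sizes tie, the first of the tied files is kept."""
    keep_index = min(range(len(group)), key=lambda i: group[i][1])
    keeper = group[keep_index][0]
    to_mark = [name for i, (name, _) in enumerate(group) if i != keep_index]
    return keeper, to_mark


# Made-up example: two copies of the same shot with different file sizes.
group = [("IMG_0001.jpg", 1_450_000), ("IMG_0001_copy.jpg", 1_390_000)]
keeper, to_mark = split_keeper_and_duplicates(group)
print("keep:", keeper)
print("mark as duplicate:", to_mark)
```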
Good luck to all you photo recoverers. I’ve been down your road before.
I have a large library, like many people in this forum. If you select only a portion of the library, does the script only compare/search for duplicates in the selected photos? I have had some problems with the time/date stamps matching up, so I do not think it is identifying anything outside of the selected files, but I want to make sure I am interpreting this correctly.
How do you know when the script is finished running?
There isn’t a notification; it just stops. Though, typically, you will see the script running icon in your menu bar towards the right-hand side. Once the script is done, that icon will disappear.
Hope this helps.
[…] another solution and found a script that wrote the word “duplicate” on the photos considered duplicates. The idea was to select the photos I wanted to compare and run the […]
I moved iPhone photos onto my iPad using wifi transfer, and also a lot from a Samsung camera in a similar way, for easier viewing while on holiday. Now I’ve put them all onto my MacBook from both. I hope this will help me sort out the duplicates. I didn’t want to delete as I went, just in case.
I found this a very useful script. Having set up a smart album to show all the photos, I found I couldn’t delete anything in this album until I discovered I had to use Alt-Cmd-Del to do this.
It would be nice to be able to produce a smart album that compared the duplicates with the originals, and this could possibly be done by adding an “original” keyword to the latter. I don’t suppose you’ve tried this?
Oh so very helpful! Thank you so much for the tip!
Like many others, I have spent hours trying to find out the best way to rid these duplicates, however I am on a very old iMac5,1 with no updated software or operating systems (or mac knowledge either!) that allowed me to use the programs every other website / forum suggested.
You are a pure work of genius Mister. Your instructions could not have been clearer, – I had never seen or used this ‘script’ program before but it was 100% every time (I tried a few scenarios to make sure it was doing what I wanted haha)
Thank you sooo much for sharing this with us, although I have wasted hours up until now, you have saved me many hours more! Your children are very lucky to have you as their go-to man 🙂
Kindest thanks and Regards,
Emily Stahrrtrail*
Nice script. Thank you.
Does the “duplicates” entry into the comments field overwrite current comments in the field? Or does it add (append) to any existing comments?
Hi, I too had loads of pictures. I downloaded a couple of apps from the App Store. I found Duplicate Cleaner for iPhoto more convenient and easy to use. Maybe you can try it too.
Went through the recent update and the system uses the new version of “Photos” instead of “iPhoto”. I went to run the script and it informed me that my library had been migrated to “Photos” and asked me if I wanted to open this in “iPhoto” or “Photos”. My question is, will this script still work with the new “Photos” library? Thanks!
Thank you for sharing this – it was a painless way to declutter my iPhoto by half in just a couple of minutes.
Does it tag “duplicate” on the original photo as well?
No, it only puts the “duplicate” tag on the duplicate.
I clicked on it and it says safari cannot open this file