@Gabe -- unfortunately we had to do this manually via the lead database.
We created a smartlist to identify all of the leads (IIRC, the filters were something like "lead was created = today" and "lead source original = empty"), then sorted by email address so the duplicates sat next to each other. You can combine up to ten at a time.
We had at least 10K, so yeah, it took a long, long time. The duplicates had in fact synced to SFDC as well, so we may have lost some data when combining them, but it was a risk we took.
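For anyone facing the same cleanup, here is a rough Python sketch of how you might plan the manual passes from an exported copy of the smartlist before touching the lead database. It is not a Marketo API call and it does not merge anything; it only groups leads by email and splits each group into passes of at most ten. The `id` and `email` field names, the case-insensitive email matching, and the idea of keeping the first lead as the survivor in every pass are all assumptions for illustration.

```python
from collections import defaultdict

MERGE_LIMIT = 10  # per the post: up to ten leads can be combined at a time


def group_duplicates(leads):
    """Group lead ids by normalized email; keep only emails with duplicates.

    `leads` is a list of dicts with hypothetical "id" and "email" keys,
    e.g. rows read from an exported list.
    """
    groups = defaultdict(list)
    for lead in leads:
        # Assumption: emails match regardless of case or stray whitespace.
        email = lead["email"].strip().lower()
        if email:
            groups[email].append(lead["id"])
    return {email: ids for email, ids in groups.items() if len(ids) > 1}


def merge_batches(ids, limit=MERGE_LIMIT):
    """Split one duplicate group into merge passes of at most `limit` leads.

    Assumption: the first id is kept as the surviving record and is
    included again in every later pass.
    """
    survivor, rest = ids[0], ids[1:]
    step = limit - 1
    return [[survivor] + rest[i:i + step] for i in range(0, len(rest), step)]
```

With this layout, a group of 25 duplicates needs three passes rather than one, which gives a sense of why 10K leads took so long by hand.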
While this may be an SFDC problem, it seems problematic that Marketo would keep creating new leads after a sync failure when they all have the same email address.