Differences

This shows you the differences between two versions of the page.

people:solar:unique-password-count [2021/03/04 13:23]
solar [Pwned Passwords (HIBP)] dropped the hypothesis as it's now known to be wrong per Troy Hunt's clarification
people:solar:unique-password-count [2021/03/04 14:05] (current)
solar [Pwned Passwords (HIBP)] added total vs. unique for RockYou-sized subsets of Pwned Passwords
Line 96:
 ===== Pwned Passwords (HIBP) =====
  
-[[https://haveibeenpwned.com/Passwords|Pwned Passwords]] is a curated and regularly updated collection of leaked/breached plaintext passwords redistributed in form of SHA-1 and NTLM hashes for the purpose of detecting and preventing password reuse.  It was introduced in August 2017 (many years later than the above analysis of RockYou).  Version 7 released in November 2020 contains over 613 million (613584246) hashes of unique passwords.  Helpfully, included with each hash is "a count of how many times that password had been seen in the source data breaches."  Adding those up yields over 3.65 billion (3650716681).
+[[https://haveibeenpwned.com/Passwords|Pwned Passwords]] is a curated and regularly updated collection of leaked/breached plain text passwords redistributed in form of SHA-1 and NTLM hashes for the purpose of detecting and preventing password reuse.  It was introduced in August 2017 (many years later than the above analysis of RockYou).  Version 7 released in November 2020 contains over 613 million (613584246) hashes of unique passwords.  Helpfully, included with each hash is "a count of how many times that password had been seen in the source data breaches."  Adding those up yields over 3.65 billion (3650716681).
  
-Extrapolation from RockYou using the formulas above gives 795 to 1225 million unique, with mean for the four formulas at 935 million.  This is the opposite from what we saw in the Adobe leak - Pwned Passwords appear to be significantly worse than RockYou's (fewer unique).
+Extrapolation from RockYou using the formulas above gives 795 to 1225 million unique, with mean for the four formulas at 935 million.  This is the opposite from what we saw in the Adobe leak - Pwned Passwords appear to be significantly worse than RockYou (fewer unique).
+
+To confirm that it's indeed Pwned Passwords being more repetitive than RockYou rather than the extrapolation failing at these numbers of passwords, let's take the first 14344391 lines (same as RockYou unique password count) from pwned-passwords-ntlm-ordered-by-hash-v7.txt and add up the counts on those.  Turns out they correspond to 81548722 original passwords (including duplicates), which is 2.5x higher than RockYou's.  (Going with the last 14344391 lines instead gives 84209109, which is similar enough.  Ideally, we'd shuffle the file first, but since there's no reason to expect password complexity or number of occurrences in a plain text leak to correlate with NTLM hash value, these shortcut approaches work just as well.  Sorting by hash value effectively //is// random shuffling of the password counts.)
+
+Going the other way, it takes about 5.6M lines from pwned-passwords-ntlm-ordered-by-hash-v7.txt, which is about 2.5x lower than RockYou's unique password count of 14.3M, to achieve RockYou's original password count of 32.6M (including duplicates).
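
The two checks described in the added paragraphs could be reproduced roughly as follows. This is a minimal Python sketch, not the script used on the wiki page. It assumes each line of the file has the form ''HASH:COUNT'', and the function names ''counts'', ''total_for_first'' and ''lines_needed'' are hypothetical, introduced here only for illustration.

```python
import itertools
from typing import Iterable, Iterator


def counts(lines: Iterable[str]) -> Iterator[int]:
    """Yield the occurrence count from each 'HASH:COUNT' line (assumed format)."""
    for line in lines:
        line = line.strip()
        if line:
            # The count is whatever follows the last colon on the line.
            yield int(line.rsplit(":", 1)[1])


def total_for_first(lines: Iterable[str], n_unique: int) -> int:
    """Add up the counts on the first n_unique lines (total including duplicates)."""
    return sum(itertools.islice(counts(lines), n_unique))


def lines_needed(lines: Iterable[str], target_total: int) -> int:
    """Return how many lines it takes for the running total to reach target_total."""
    running = 0
    for n, count in enumerate(counts(lines), start=1):
        running += count
        if running >= target_total:
            return n
    raise ValueError("input ends before the target total is reached")
```

Passing an open file handle for pwned-passwords-ntlm-ordered-by-hash-v7.txt to ''total_for_first'' with 14344391 corresponds to the first check. Passing it to ''lines_needed'' with RockYou's total password count (about 32.6M) corresponds to the second. Both functions stream the file line by line, so the whole file never has to fit in memory.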
 ===== Perl script =====
  
people/solar/unique-password-count.txt · Last modified: 2021/03/04 14:05 by solar
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported