Log file analysis


A collection of grep and awk commands for log analysis
1. Analyze the log file for 2012-05-04 and list the top 20 most accessed URLs (field 11 in this log format), sorted by count
cat access.log | grep '04/May/2012' | awk '{print $11}' | sort | uniq -c | sort -nr | head -20
List the client IPs that accessed pages whose URL contains www.abc.com
cat access_log | awk '($11 ~ /www\.abc\.com/) {print $1}' | sort | uniq -c | sort -nr
2. Get the 10 IP addresses with the most requests (this can also be narrowed down by time)
cat linewow-access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
1. Get the top 10 IP addresses by number of requests
cat access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
cat access.log | awk '{counts[$(11)] += 1}; END {for (url in counts) print counts[url], url}'
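The two counting styles above (sort piped into uniq -c, and an awk associative array) should give the same tallies. The following is a minimal sketch on invented sample data; the `sample_log` function, the variable names and every log line are hypothetical, and the field positions assume the Apache combined log format.

```shell
# Hypothetical sample data in Apache combined log format (all values invented).
sample_log() {
cat <<'EOF'
10.0.0.1 - - [04/May/2012:10:00:01 +0800] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.2 - - [04/May/2012:10:00:02 +0800] "GET /a.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
10.0.0.1 - - [04/May/2012:10:00:03 +0800] "GET /a.php HTTP/1.1" 404 512 "-" "Mozilla/5.0"
10.0.0.1 - - [04/May/2012:10:01:04 +0800] "GET /b.exe HTTP/1.1" 200 300000 "-" "Mozilla/5.0"
10.0.0.3 - - [04/May/2012:10:01:05 +0800] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.2 - - [04/May/2012:10:01:06 +0800] "GET /b.exe HTTP/1.1" 200 300000 "-" "Mozilla/5.0"
EOF
}

# Variant 1: sort + uniq -c, then strip the padding uniq -c puts before the count.
by_uniq=$(sample_log | awk '{print $1}' | sort | uniq -c | sort -nr | awk '{print $1, $2}')

# Variant 2: count in an awk associative array, then sort by count.
by_awk=$(sample_log | awk '{counts[$1]++} END {for (ip in counts) print counts[ip], ip}' | sort -nr)

echo "$by_uniq"
echo "$by_awk"
```

The awk variant reads the input once and needs no pre-sort, which may matter on large logs; the uniq variant is easier to type from memory.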
2. Most visited files or pages (top 20), and the number of distinct client IPs
cat access.log | awk '{print $11}' | sort | uniq -c | sort -nr | head -20
awk '{print $1}' access.log | sort -n -r | uniq -c | wc -l
Query the top 10 IPs within specific time periods:
cat wangsu.log | egrep '06/Sep/2012:14:35|06/Sep/2012:15:05' | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
3. List the largest transferred exe files (commonly used when analyzing download sites)
cat access.log | awk '($7 ~ /\.exe/) {print $10 " " $1 " " $4 " " $7}' | sort -nr | head -20
4. List exe files larger than 200000 bytes (about 200 KB) and the number of times each was requested
cat access.log | awk '($10 > 200000 && $7 ~ /\.exe/) {print $7}' | sort -n | uniq -c | sort -nr | head -100
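To see how the size and extension filter behaves, here is a small sketch run on invented data. The `sample_log` function, the paths and the byte counts are all hypothetical; field 7 is assumed to be the request path and field 10 the response size, as in the Apache common and combined formats.

```shell
# Hypothetical sample lines (all values invented); $7 = path, $10 = bytes sent.
sample_log() {
cat <<'EOF'
10.0.0.1 - - [04/May/2012:10:00:01 +0800] "GET /dl/setup.exe HTTP/1.1" 200 350000
10.0.0.2 - - [04/May/2012:10:00:02 +0800] "GET /dl/setup.exe HTTP/1.1" 200 350000
10.0.0.3 - - [04/May/2012:10:00:03 +0800] "GET /dl/tool.exe HTTP/1.1" 200 250000
10.0.0.4 - - [04/May/2012:10:00:04 +0800] "GET /dl/tiny.exe HTTP/1.1" 200 1500
10.0.0.5 - - [04/May/2012:10:00:05 +0800] "GET /index.html HTTP/1.1" 200 900000
EOF
}

# Keep only .exe requests above 200000 bytes, then count occurrences per file.
big_exe=$(sample_log \
  | awk '($10 > 200000 && $7 ~ /\.exe/) {print $7}' \
  | sort | uniq -c | sort -nr \
  | awk '{print $1, $2}')

echo "$big_exe"
```

The small exe and the large HTML page are both dropped, because both conditions must hold. Note that `/\.exe/` matches ".exe" anywhere in the path; anchoring it as `/\.exe$/` would be stricter but would miss URLs with a query string.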
5. If the last field of each log record is the page transfer time, list the pages that took the longest to deliver to the client
cat access.log | awk '($7 ~ /\.php/) {print $NF " " $1 " " $4 " " $7}' | sort -nr | head -100
6. List the most time-consuming pages (more than 60 seconds) and the number of times each occurred
cat access.log | awk '($NF > 60 && $7 ~ /\.php/) {print $7}' | sort -n | uniq -c | sort -nr | head -100
7. List the files whose transfer took longer than 30 seconds
cat access.log | awk '($NF > 30) {print $7}' | sort -n | uniq -c | sort -nr | head -20
8. Total website traffic (in GB)
cat access.log | awk '{sum += $10} END {print sum/1024/1024/1024}'
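The traffic total can be checked on a tiny invented log whose sizes add up to a known value. In this sketch the `sample_log` function and all log lines are hypothetical. Apache writes "-" in the size field when no bytes were sent, and awk treats that string as 0 in arithmetic, so such lines do not distort the sum.

```shell
# Hypothetical sample (all values invented): 1 GiB + 0.5 GiB + one "-" entry.
sample_log() {
cat <<'EOF'
10.0.0.1 - - [04/May/2012:10:00:01 +0800] "GET /big.iso HTTP/1.1" 200 1073741824
10.0.0.2 - - [04/May/2012:10:00:02 +0800] "GET /half.iso HTTP/1.1" 200 536870912
10.0.0.3 - - [04/May/2012:10:00:03 +0800] "GET /index.html HTTP/1.1" 304 -
EOF
}

# Sum field 10 and convert bytes to GB; printf fixes the output to 2 decimals.
total_gb=$(sample_log | awk '{sum += $10} END {printf "%.2f\n", sum/1024/1024/1024}')

echo "total traffic: ${total_gb} GB"
```

Using printf instead of print avoids awk's default number formatting, which can switch to scientific notation for large or fractional values.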
9. List requests that returned 404
awk '($9 ~ /404/)' access.log | awk '{print $9, $7}' | sort
10. Count requests by HTTP status code
cat access.log | awk '{counts[$(9)] += 1}; END {for (code in counts) print code, counts[code]}'
cat access.log | awk '{print $9}' | sort | uniq -c | sort -rn
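As a quick illustration of the status code tally, the sketch below runs the awk array version on invented data and sorts the result by count. The `sample_log` function and its lines are hypothetical; field 9 is assumed to hold the status code.

```shell
# Hypothetical sample (all values invented); $9 = HTTP status code.
sample_log() {
cat <<'EOF'
10.0.0.1 - - [04/May/2012:10:00:01 +0800] "GET / HTTP/1.1" 200 1024
10.0.0.2 - - [04/May/2012:10:00:02 +0800] "GET /a.php HTTP/1.1" 200 2048
10.0.0.3 - - [04/May/2012:10:00:03 +0800] "GET /b.php HTTP/1.1" 200 2048
10.0.0.1 - - [04/May/2012:10:00:04 +0800] "GET /missing HTTP/1.1" 404 512
10.0.0.2 - - [04/May/2012:10:00:05 +0800] "GET /gone HTTP/1.1" 404 512
10.0.0.3 - - [04/May/2012:10:00:06 +0800] "GET /crash.php HTTP/1.1" 500 128
EOF
}

# Tally status codes in an awk array, then order by the count column, descending.
status_counts=$(sample_log \
  | awk '{counts[$9]++} END {for (code in counts) print code, counts[code]}' \
  | sort -k2,2nr)

echo "$status_counts"
```

The extra sort is there because awk's `for (key in array)` loop visits keys in an unspecified order.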
11. Requests per second (concurrency):
awk '{if ($9 ~ /200|30|404/) COUNT[$4]++} END {for (a in COUNT) print a, COUNT[a]}' access.log | sort -k 2 -nr | head -n10
12. Bandwidth statistics
cat apache.log | awk '{if ($7 ~ /GET/) count++} END {print "client_request=" count}'
cat apache.log | awk '{BYTE += $11} END {print "client_kbyte_out=" BYTE/1024 "KB"}'
Find the 10 most active IPs on a given day
cat /tmp/access.log | grep "20/Mar/2011" | awk '{print $3}' | sort | uniq -c | sort -nr | head
See what the IP with the most connections that day was doing:
cat access.log | grep "10.0.21.17" | awk '{print $8}' | sort | uniq -c | sort -nr | head -n 10
Find the minutes with the most requests
awk '{print $1}' access.log | grep "20/Mar/2011" | cut -c 14-18 | sort | uniq -c | sort -nr | head
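The command above cuts characters 14 to 18 out of the first field, which implies a log format whose first field is the bracketed timestamp. The sketch below does the same thing for the Apache combined format, where the timestamp is assumed to be field 4; the `sample_log` function and its lines are invented for illustration.

```shell
# Hypothetical sample (all values invented); $4 looks like [20/Mar/2011:10:15:01
sample_log() {
cat <<'EOF'
10.0.0.1 - - [20/Mar/2011:10:15:01 +0800] "GET / HTTP/1.1" 200 100
10.0.0.2 - - [20/Mar/2011:10:15:20 +0800] "GET / HTTP/1.1" 200 100
10.0.0.3 - - [20/Mar/2011:10:15:59 +0800] "GET / HTTP/1.1" 200 100
10.0.0.1 - - [20/Mar/2011:10:16:05 +0800] "GET / HTTP/1.1" 200 100
10.0.0.2 - - [20/Mar/2011:10:16:40 +0800] "GET / HTTP/1.1" 200 100
10.0.0.1 - - [20/Mar/2011:10:17:00 +0800] "GET / HTTP/1.1" 200 100
10.0.0.1 - - [21/Mar/2011:10:15:00 +0800] "GET / HTTP/1.1" 200 100
EOF
}

# Characters 14-18 of the timestamp field are HH:MM; count requests per minute.
busiest=$(sample_log \
  | grep "20/Mar/2011" \
  | awk '{print substr($4, 14, 5)}' \
  | sort | uniq -c | sort -nr \
  | awk '{print $1, $2}')

echo "$busiest"
```

The request from 21/Mar/2011 is filtered out by grep before the minute is extracted, so it does not inflate the 10:15 bucket.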
Appendix: view TCP connection states
netstat -nat | awk '{print $6}' | sort | uniq -c | sort -rn
netstat -n | awk '/^tcp/ {++S[$NF]}; END {for (a in S) print a, S[a]}'
netstat -n | awk '/^tcp/ {++state[$NF]}; END {for (key in state) print key, "\t", state[key]}'
netstat -n | awk '/^tcp/ {++arr[$NF]}; END {for (k in arr) print k, "\t", arr[k]}'
netstat -n | awk '/^tcp/ {print $NF}' | sort | uniq -c | sort -rn
netstat -ant | awk '{print $NF}' | grep -v '[a-z]' | sort | uniq -c
netstat -ant | awk '/ip:80/ {split($5, ip, ":"); ++S[ip[1]]} END {for (a in S) print S[a], a}' | sort -n
netstat -ant | awk '/:80/ {split($5, ip, ":"); ++S[ip[1]]} END {for (a in S) print S[a], a}' | sort -rn | head -n 10
awk 'BEGIN {printf ("http_code\tcount_num\n")} {COUNT[$10]++} END {for (a in COUNT) printf a "\t\t" COUNT[a] "\n"}'
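The state-counting variants above all tally the last column of each tcp line. Because live netstat output differs on every machine, this sketch feeds the same awk program a captured sample instead; the `sample_netstat` function and every address in it are invented, and the column layout assumes Linux net-tools output.

```shell
# Hypothetical capture of `netstat -n` style output (all addresses invented).
sample_netstat() {
cat <<'EOF'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 192.168.1.10:80         203.0.113.5:51234       ESTABLISHED
tcp        0      0 192.168.1.10:80         203.0.113.5:51235       ESTABLISHED
tcp        0      0 192.168.1.10:80         198.51.100.7:40000      ESTABLISHED
tcp        0      0 192.168.1.10:80         198.51.100.7:40001      TIME_WAIT
tcp        0      0 192.168.1.10:80         203.0.113.5:51236       TIME_WAIT
udp        0      0 0.0.0.0:68              0.0.0.0:*
EOF
}

# Count lines per TCP state ($NF = last field); header and udp lines are skipped.
states=$(sample_netstat \
  | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}' \
  | sort)

echo "$states"
```

On a real host, replace `sample_netstat` with `netstat -n` (or `netstat -ant` to include listening sockets).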
2. Find the 20 IPs with the most requests (commonly used to locate attack sources):
netstat -anlp | grep 80 | grep tcp | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head -n20
netstat -ant | awk '/:80/ {split($5, ip, ":"); ++A[ip[1]]} END {for (i in A) print A[i], i}' | sort -rn | head -n20
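A pattern such as `/:80/` or `grep 80` matches anywhere on the line, so it also picks up remote ports like 8080 and the listening socket itself. The sketch below is one possible tightening, not the original author's command: it tests the local address column for a trailing `:80` and skips LISTEN lines. The `sample_netstat` function and all addresses are invented, and IPv4 addresses in Linux net-tools layout are assumed.

```shell
# Hypothetical capture of `netstat -ant` style output (all addresses invented).
sample_netstat() {
cat <<'EOF'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 192.168.1.10:80         203.0.113.5:51234       ESTABLISHED
tcp        0      0 192.168.1.10:80         203.0.113.5:51235       ESTABLISHED
tcp        0      0 192.168.1.10:80         203.0.113.5:51236       TIME_WAIT
tcp        0      0 192.168.1.10:80         198.51.100.7:40000      ESTABLISHED
tcp        0      0 192.168.1.10:80         198.51.100.7:40001      TIME_WAIT
tcp        0      0 192.168.1.10:22         192.0.2.9:8080          ESTABLISHED
EOF
}

# $4 = local address, $5 = remote address, $6 = state.
# Count remote IPs for connections whose local port is exactly 80.
top_remote=$(sample_netstat \
  | awk '$4 ~ /:80$/ && $6 != "LISTEN" {split($5, ip, ":"); ++A[ip[1]]}
         END {for (i in A) print A[i], i}' \
  | sort -rn)

echo "$top_remote"
```

The SSH connection from 192.0.2.9 is ignored even though its remote port contains "80". Splitting on ":" only works for IPv4; IPv6 addresses would need a different way to strip the port.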
3. Sniff port 80 with tcpdump to see which source accesses it the most
tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr | head -20
4. Find the addresses with the most TIME_WAIT connections
netstat -n | grep TIME_WAIT | awk '{print $5}' | sort | uniq -c | sort -rn | head -n20
5. Find the addresses with the most SYN connections
netstat -an | grep SYN | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | more
6. Find the process listening on a given port
netstat -ntlp | grep 80 | awk '{print $7}' | cut -d/ -f1
