close to 100% CPU load

Resolved

Not sure if this is the same problem as the other report, but my ClearOS 6.5 server has been running at close to 100% for a number of days now. I am a newbie at managing a server, so I have tried to replicate what I found on this forum to test where the load is coming from.

I think it is related to system-mysqld. top tells me that it is not driving the load to 100% by itself, but it is almost continuously at the top of the list.



top - 23:30:31 up  3:04,  3 users,  load average: 2.41, 1.99, 2.17

Tasks: 391 total,   2 running, 389 sleeping,   0 stopped,   0 zombie

Cpu(s): 11.9%us,  2.7%sy,  0.0%ni,  0.0%id, 85.4%wa,  0.0%hi,  0.0%si,  0.0%st

Mem:   1030072k total,  1016564k used,    13508k free,     1696k buffers

Swap:  2064376k total,  1418040k used,   646336k free,   101332k cached



  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

 2676 system-m  20   0  686m 537m 3116 S 12.6 53.5   5:45.86 system-mysqld

19078 root      20   0  2988 1240  828 R  0.7  0.1   0:00.13 top

   16 root      20   0     0    0    0 R  0.3  0.0   0:24.71 kblockd/0

   29 root      20   0     0    0    0 S  0.3  0.0   0:39.52 kswapd0

 3860 root      20   0 22908  13m  13m S  0.3  1.3   0:15.73 pmacctd

 3867 root      20   0 22100 9884 5848 S  0.3  1.0   0:07.91 pmacctd

 4457 snort     20   0  297m  13m 3880 S  0.3  1.4   0:34.59 snort

 4729 root      20   0 30956 1864 1460 S  0.3  0.2   0:03.47 X

 4851 root      20   0  3428  200  164 S  0.3  0.0   0:31.87 snortsam

 4872 clearcon  20   0  349m 9228 4108 S  0.3  0.9   0:29.38 gconsole

 4937 plex      20   0  203m 6064 1536 S  0.3  0.6   0:11.08 Plex DLNA Serve

    1 root      20   0  2948  892  784 S  0.0  0.1   0:01.15 init

    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd

In /var/log/mysqld.log I find this information:



140502 20:25:33 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

140502 20:27:48 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql

140502 20:27:48  InnoDB: Initializing buffer pool, size = 8.0M

140502 20:27:48  InnoDB: Completed initialization of buffer pool

140502 20:27:49  InnoDB: Started; log sequence number 0 152806009

140502 20:27:49 [Note] Event Scheduler: Loaded 0 events

140502 20:27:49 [Note] /usr/libexec/mysqld: ready for connections.

Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution

Not sure if it has anything to do with it, but the dashboard in the UI takes ages to load, in particular the charts on memory and CPU usage.

Any help to get me to the next step is appreciated.

Regards, Ronald

In Database

Friday, May 02 2014, 09:37 PM

Responses (77)

Ben Chambers
Saturday, May 03 2014, 12:53 PM

Hi Ronald,

Yup, your load is high, but the sluggishness (e.g. of dashboard loading) is caused by the extreme wait times on I/O activity (85.4%wa). You're out of memory and hitting swap, which only makes the situation worse.
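
The 85.4%wa figure comes from the Cpu(s) line of the top output in the first post. As a minimal sketch, the number can be pulled out of that line with a small shell function. The function name iowait_from_top is made up for illustration, and the parsing assumes the older procps layout shown in this thread (newer versions of top print the line differently).

```shell
# Print the I/O-wait percentage from a top(1) "Cpu(s):" summary line on stdin.
# Assumes the older procps layout shown in this thread ("85.4%wa").
iowait_from_top() {
    awk '/^Cpu\(s\):/ {
        n = split($0, fields, ",")
        for (i = 1; i <= n; i++) {
            if (fields[i] ~ /%wa/) {
                gsub(/[^0-9.]/, "", fields[i])   # keep only the number
                print fields[i]
            }
        }
    }'
}

# Usage on a live system (one batch iteration of top):
#   top -b -n 1 | iowait_from_top
```

A value that stays high while %us and %sy stay low points at the disk (here, swap) rather than at the CPU itself.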

I'd like to see the output of

ps afxw

I'll bet you have more than one script running that is trying to update your reports (parsing out the log file to the system MySQL database).

If you know how to kill processes, you can do that... a Windows TM reboot would also clear them out.

Is your software up to date? There was a release about 3-4 weeks ago that prevents multiple instances of some of the report scripts from running.

Finally...more memory never hurts...at 1G...you're right on the edge.

B.

Ronald
Saturday, May 03 2014, 02:02 PM

Thanks Ben, I will look into adding some more memory. You are right, it is small. This is because I started with a small redundant box to see if I would like running a server on ClearOS, and as I really like the ClearOS setup, the usage is growing.

Here is my output:

[root@stevenaar ~]# ps afxw

  PID TTY      STAT   TIME COMMAND

    2 ?        S      0:00 [kthreadd]

    3 ?        S      0:00  \_ [migration/0]

    4 ?        S      0:02  \_ [ksoftirqd/0]

    5 ?        S      0:00  \_ [migration/0]

    6 ?        S      0:00  \_ [watchdog/0]

    7 ?        S      0:08  \_ [events/0]

    8 ?        S      0:00  \_ [cgroup]

    9 ?        S      0:00  \_ [khelper]

   10 ?        S      0:00  \_ [netns]

   11 ?        S      0:00  \_ [async/mgr]

   12 ?        S      0:00  \_ [pm]

   13 ?        S      0:00  \_ [sync_supers]

   14 ?        S      0:00  \_ [bdi-default]

   15 ?        S      0:00  \_ [kintegrityd/0]

   16 ?        S      1:30  \_ [kblockd/0]

   17 ?        S      0:00  \_ [kacpid]

   18 ?        S      0:00  \_ [kacpi_notify]

   19 ?        S      0:00  \_ [kacpi_hotplug]

   20 ?        S      0:00  \_ [ata_aux]

   21 ?        S      0:00  \_ [ata_sff/0]

   22 ?        S      0:00  \_ [ksuspend_usbd]

   23 ?        S      0:00  \_ [khubd]

   24 ?        S      0:00  \_ [kseriod]

   25 ?        S      0:00  \_ [md/0]

   26 ?        S      0:00  \_ [md_misc/0]

   27 ?        S      0:00  \_ [linkwatch]

   28 ?        S      0:00  \_ [khungtaskd]

   29 ?        D      1:37  \_ [kswapd0]

   30 ?        SN     0:00  \_ [ksmd]

   31 ?        S      0:00  \_ [aio/0]

   32 ?        S      0:00  \_ [crypto/0]

   37 ?        S      0:00  \_ [kthrotld/0]

   39 ?        S      0:00  \_ [kpsmoused]

   40 ?        S      0:00  \_ [usbhid_resumer]

   71 ?        S      0:00  \_ [kstriped]

   99 ?        S      0:00  \_ [ttm_swap]

  100 ?        S<     0:07  \_ [kslowd000]

  101 ?        S<     0:07  \_ [kslowd001]

  121 ?        S      0:00  \_ [scsi_eh_0]

  122 ?        S      0:00  \_ [usb-storage]

  153 ?        S      0:00  \_ [scsi_eh_1]

  154 ?        S      0:00  \_ [scsi_eh_2]

  157 ?        S      0:00  \_ [scsi_eh_3]

  158 ?        S      0:01  \_ [usb-storage]

  180 ?        S      0:00  \_ [scsi_eh_4]

  181 ?        S      0:00  \_ [scsi_eh_5]

  307 ?        S      0:09  \_ [kdmflush]

  309 ?        S      0:00  \_ [kdmflush]

  326 ?        D      0:27  \_ [jbd2/dm-0-8]

  327 ?        S      0:00  \_ [ext4-dio-unwrit]

  706 ?        S      0:00  \_ [kdmflush]

  750 ?        S      0:00  \_ [jbd2/sda1-8]

  751 ?        S      0:00  \_ [ext4-dio-unwrit]

  752 ?        S      0:00  \_ [jbd2/dm-2-8]

  753 ?        S      0:00  \_ [ext4-dio-unwrit]

  754 ?        S      0:00  \_ [jbd2/sdc1-8]

  755 ?        S      0:00  \_ [ext4-dio-unwrit]

  756 ?        S      0:00  \_ [jbd2/sdb-8]

  757 ?        S      0:00  \_ [ext4-dio-unwrit]

  810 ?        S      0:01  \_ [kauditd]

  925 ?        S      0:07  \_ [flush-253:0]

 1194 ?        S      0:00  \_ [rpciod/0]

 1797 ?        S      0:00  \_ [lockd]

 1798 ?        S      0:00  \_ [nfsd4]

 1799 ?        S      0:00  \_ [nfsd4_callbacks]

 1800 ?        S      0:00  \_ [nfsd]

 1801 ?        S      0:00  \_ [nfsd]

 1802 ?        S      0:00  \_ [nfsd]

 1803 ?        S      0:00  \_ [nfsd]

 1804 ?        S      0:00  \_ [nfsd]

 1805 ?        S      0:00  \_ [nfsd]

 1806 ?        S      0:00  \_ [nfsd]

 1807 ?        S      0:00  \_ [nfsd]

 4442 ?        S      0:00  \_ [bluetooth]

25986 ?        S      0:00  \_ [flush-8:32]

    1 ?        Ss     0:01 /sbin/init

  405 ?        S<s    0:00 /sbin/udevd -d

  703 ?        S<     0:00  \_ /sbin/udevd -d

 4544 ?        S<     0:00  \_ /sbin/udevd -d

  972 ?        S<sl   0:05 auditd

  990 ?        Ss     0:00 /sbin/portreserve

  997 ?        Ssl    0:04 /usr/sbin/nslcd

 1010 ?        Sl     0:06 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5

 1175 ?        Ss     0:00 rpcbind

 1200 ?        Ss     0:00 rpc.statd -p 662 -o 2020

 1226 ?        Ss     0:00 dbus-daemon --system

 1237 ?        S      0:01 avahi-daemon: running [stevenaar.local]

 1238 ?        Ss     0:00  \_ avahi-daemon: chroot helper

 1253 ?        Ss     0:01 cupsd -C /etc/cups/cupsd.conf

 1278 ?        Ss     0:00 /usr/sbin/acpid

 1287 ?        Ssl    0:00 hald

 1288 ?        S      0:00  \_ hald-runner

 1333 ?        S      0:00      \_ hald-addon-input: Listening on /dev/input/event3 /dev/input/event0 /dev/input/event1

 1337 ?        S      0:00      \_ hald-addon-acpi: listening on acpid socket /var/run/acpid.socket

 1742 ?        Ssl    0:12 /usr/sbin/slapd -h ldap://127.0.0.1/ ldaps://127.0.0.1 ldaps://192.168.2.111/  -u ldap

 1757 ?        Ss     0:00 winbindd

 1787 ?        S      0:00  \_ winbindd

 3879 ?        S      0:00  \_ winbindd

 1792 ?        Ss     0:00 rpc.mountd -p 892

 1828 ?        Ss     0:00 rpc.idmapd

 1839 ?        Ssl    0:12 /usr/sbin/nscd

 1945 ?        S      0:00 /bin/sh /usr/libexec/ipsec/_plutorun --debug  --uniqueids yes --force_busy no --nocrsend no --strictcrlpolicy no --nat_traversal yes --keep_alive  --p

 1950 ?        S      0:00  \_ /bin/sh /usr/libexec/ipsec/_plutorun --debug  --uniqueids yes --force_busy no --nocrsend no --strictcrlpolicy no --nat_traversal yes --keep_alive

 1952 ?        Sl     0:00  |   \_ /usr/libexec/ipsec/pluto --nofork --secretsfile /etc/ipsec.secrets --ipsecdir /etc/ipsec.d --use-netkey --uniqueids --nat_traversal --virtual_

 2177 ?        S      0:00  |       \_ _pluto_adns

 1951 ?        S      0:00  \_ /bin/sh /usr/libexec/ipsec/_plutoload --wait no --post

 1946 ?        S      0:00 logger -s -p daemon.error -t ipsec__plutorun

 1964 ?        S      0:02 /usr/sbin/dnsmasq -s nl

 2005 ?        Ssl    3:09 java -Xmx80m -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=10 -Djava.library.path=/opt/pcmonitor/bin/../native -jar /opt/pcmonitor/bin/../lib/pcmonitor

 2187 ?        S      0:16 /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid

 2237 ?        S      0:02 arpwatch -u arpwatch -e - -i eth1 -f /var/lib/arpwatch/arp_eth1.dat

 2246 ?        Ssl    0:53 clearsyncd

 2319 ?        Ss     0:00 /usr/sbin/sshd

31433 ?        Ss     0:00  \_ sshd: root@notty

31437 ?        Ss     0:00  |   \_ /usr/libexec/openssh/sftp-server

32358 ?        Ss     0:00  \_ sshd: root@notty

32362 ?        Ss     0:00  |   \_ /usr/libexec/openssh/sftp-server

 4318 ?        Ss     0:00  \_ sshd: root@notty

 4320 ?        Ss     0:00  |   \_ -bash

26762 ?        Ss     0:00  \_ sshd: root@pts/0

26826 pts/0    Ss     0:00      \_ -bash

27573 pts/0    R+     0:00          \_ ps afxw

 2327 ?        Ss     0:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g

 2362 ?        Ssl    0:50 clamd

 2399 ?        S      0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=m

 2501 ?        Sl     0:30  \_ /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --s

 2553 ?        S      0:00 /bin/sh /usr/clearos/sandbox/usr/bin/mysqld_safe --defaults-file=/usr/clearos/sandbox/etc/my.cnf --datadir=/var/lib/system-mysql --socket=/var/lib/sys

 2676 ?        Sl    25:33  \_ /usr/clearos/sandbox/usr/libexec/system-mysqld --defaults-file=/usr/clearos/sandbox/etc/my.cnf --basedir=/usr/clearos/sandbox/usr --datadir=/var/l

 3571 ?        Ss     0:01 /usr/lib/cyrus-imapd/cyrus-master -d

 3637 ?        S      0:00  \_ imapd -s

 3638 ?        S      0:00  \_ imapd -s

 3642 ?        S      0:00  \_ imapd -s

 3643 ?        S      0:00  \_ imapd -s

 3644 ?        S      0:00  \_ imapd -s

 3645 ?        S      0:00  \_ imapd -s

27493 ?        D      0:00  \_ ctl_cyrusdb -c

 3581 ?        Ss     0:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam

 3582 ?        S      0:00  \_ /usr/sbin/saslauthd -m /var/run/saslauthd -a pam

 3583 ?        S      0:00  \_ /usr/sbin/saslauthd -m /var/run/saslauthd -a pam

 3584 ?        S      0:00  \_ /usr/sbin/saslauthd -m /var/run/saslauthd -a pam

 3585 ?        S      0:00  \_ /usr/sbin/saslauthd -m /var/run/saslauthd -a pam

 3633 ?        S      0:02 idled

 3722 ?        Ss     0:00 /usr/libexec/postfix/master

 3731 ?        S      0:00  \_ qmgr -l -t fifo -u

20813 ?        S      0:00  \_ pickup -l -t fifo -u

 3734 ?        Ss     0:03 proftpd: (accepting connections)

 3767 ?        Ss     0:00 crond

 2499 ?        S      0:00  \_ CROND

 2500 ?        Zs     0:00      \_ [sh] <defunct>

 3797 ?        Ss     0:00 squid -f /etc/squid/squid.conf

 3800 ?        S      0:12  \_ (squid) -f /etc/squid/squid.conf

 3801 ?        S      0:00      \_ (pam_auth)

 3802 ?        S      0:00      \_ (pam_auth)

 3803 ?        S      0:00      \_ (pam_auth)

 3804 ?        S      0:00      \_ (pam_auth)

 3805 ?        S      0:00      \_ (pam_auth)

 3806 ?        S      0:00      \_ (pam_auth)

 3807 ?        S      0:00      \_ (pam_auth)

 3808 ?        S      0:00      \_ (pam_auth)

 3809 ?        S      0:00      \_ (pam_auth)

 3810 ?        S      0:00      \_ (pam_auth)

 3811 ?        S      0:00      \_ (pam_auth)

 3812 ?        S      0:00      \_ (pam_auth)

 3813 ?        S      0:00      \_ (pam_auth)

 3815 ?        S      0:00      \_ (pam_auth)

 3816 ?        S      0:00      \_ (pam_auth)

 3817 ?        S      0:00      \_ (unlinkd)

 3829 ?        S      0:01 /usr/bin/perl /usr/share/BackupPC/bin/BackupPC -d

 3835 ?        SN     0:00  \_ /usr/bin/perl /usr/share/BackupPC/bin/BackupPC_trashClean

 3842 ?        Ss     0:05 nmbd -D

 3845 ?        S      0:00  \_ nmbd -D

27571 ?        D      0:00  \_ nmbd -D

 3860 ?        Ss     1:26 pmacctd: Core Process [default]

 3865 ?        S      0:41  \_ pmacctd: MySQL Plugin [inbound]

26731 ?        S      0:00  |   \_ pmacctd: MySQL Plugin -- DB Writer [inbound]

26990 ?        S      0:00  |   \_ pmacctd: MySQL Plugin -- DB Writer [inbound]

 3867 ?        S      0:37  \_ pmacctd: MySQL Plugin [outbound]

26991 ?        S      0:00      \_ pmacctd: MySQL Plugin -- DB Writer [outbound]

 3870 ?        Ss     0:00 smbd -D

 3894 ?        S      0:00  \_ smbd -D

12116 ?        S      0:06  \_ smbd -D

 3896 ?        Ss     0:00 dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3898 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3899 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3900 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3901 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3902 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3903 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3904 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3905 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3907 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3908 ?        S      0:00  \_ dansguardian-av -c /etc/dansguardian-av/dansguardian.conf

 3914 ?        Sl     0:27 /usr/libexec/dropbox/dropbox study

 3917 ?        Sl     0:25 /usr/libexec/dropbox/dropbox linda

 4096 ?        Ds     2:29 /usr/bin/monitorix -c /etc/monitorix.conf -p /var/run/monitorix.pid

 4310 ?        S      0:00 su -s /bin/sh plex -c . /etc/sysconfig/PlexMediaServer; cd /usr/lib/plexmediaserver; ./'Plex Media Server' > /dev/null 2>&1

 4316 ?        Ss     0:00  \_ sh -c . /etc/sysconfig/PlexMediaServer; cd /usr/lib/plexmediaserver; ./'Plex Media Server' > /dev/null 2>&1

 4323 ?        Sl     0:15      \_ ./Plex Media Server

 4406 ?        SNl    2:06          \_ Plex Plug-in [com.plexapp.system] /var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Plug-ins/Framework.bundle/Content

 4937 ?        Sl     0:53          \_ /usr/lib/plexmediaserver/Plex DLNA Server

 4350 ?        Ss     0:02 /usr/sbin/webconfig

 4355 ?        S      0:24  \_ /usr/sbin/webconfig

 4356 ?        S      0:28  \_ /usr/sbin/webconfig

 4357 ?        S      0:22  \_ /usr/sbin/webconfig

 5026 ?        S      0:29  \_ /usr/sbin/webconfig

 5028 ?        D      0:22  \_ /usr/sbin/webconfig

 5083 ?        D      0:29  \_ /usr/sbin/webconfig

10681 ?        S      0:28  \_ /usr/sbin/webconfig

10684 ?        S      0:26  \_ /usr/sbin/webconfig

10685 ?        S      0:28  \_ /usr/sbin/webconfig

10985 ?        S      0:22  \_ /usr/sbin/webconfig

13488 ?        S      0:25  \_ /usr/sbin/webconfig

13491 ?        S      0:24  \_ /usr/sbin/webconfig

13492 ?        S      0:24  \_ /usr/sbin/webconfig

17255 ?        S      0:07  \_ /usr/sbin/webconfig

 4416 ?        Ss     0:00 pptpd

 4457 ?        Ssl    2:48 snort -i eth1 -u snort -g snort -D -c /etc/snort.conf

 4511 ?        Ssl    1:26 /usr/bin/transmission-daemon -b -t -a *.*.*.* -e /var/log/transmission/transmission.log

 4520 ?        Ss     0:00 /usr/bin/openvt -fwc 1 -- /bin/login -f clearconsole

 4528 ?        Ss     0:00  \_ login -- clearconsole

 4619 tty1     Ss+    0:00      \_ -bash

 4652 tty1     Sl+    0:59          \_ /usr/sbin/tconsole

 4661 ?        Ss     0:00              \_ /bin/sh /usr/bin/startx

 4717 ?        S      0:00                  \_ xinit /var/lib/clearconsole//.xinitrc -- /usr/bin/X :0 -auth /var/lib/clearconsole//.serverauth.4661

 4729 tty7     Ss+    0:22                      \_ /usr/bin/X :0 -auth /var/lib/clearconsole//.serverauth.4661

 4866 ?        Ss     0:00                      \_ sh /var/lib/clearconsole//.xinitrc

 4871 ?        S      0:00                          \_ /usr/bin/ratpoison

 4872 ?        Sl     2:40                          \_ /usr/lib/gconsole/gconsole

 4537 tty2     Ss+    0:00 /sbin/mingetty /dev/tty2

 4539 tty3     Ss+    0:00 /sbin/mingetty /dev/tty3

 4542 tty4     Ss+    0:00 /sbin/mingetty /dev/tty4

 4545 tty5     Ss+    0:00 /sbin/mingetty /dev/tty5

 4549 tty6     Ss+    0:00 /sbin/mingetty /dev/tty6

 4552 ?        Sl     0:00 /usr/sbin/console-kit-daemon --no-daemon

 5011 ?        S      0:00 dbus-launch --autolaunch 6e5ee68e56495e8f74db175c0000001d --binary-syntax --close-stderr

 5018 ?        Ss     0:00 /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session

 5021 ?        S      0:00 /usr/libexec/gconfd-2

 2567 ?        Ss     0:00 syswatch

 2858 ?        Ss     0:31 snortsam /etc/snortsam.conf

 4363 ?        Ss     0:00 /usr/sbin/httpd

 4368 ?        S      0:00  \_ /usr/sbin/httpd

 4369 ?        S      0:00  \_ /usr/sbin/httpd

 4370 ?        S      0:00  \_ /usr/sbin/httpd

 4371 ?        S      0:00  \_ /usr/sbin/httpd

 4372 ?        S      0:00  \_ /usr/sbin/httpd

 4373 ?        S      0:00  \_ /usr/sbin/httpd

 4374 ?        S      0:00  \_ /usr/sbin/httpd

 4375 ?        S      0:00  \_ /usr/sbin/httpd

Nick Howitt
Saturday, May 03 2014, 04:00 PM

IDS (snort) and the Proxy (squid) are resource hogs. You are using 1.4GB of swap as well as 1GB of RAM. I'd expect your system to be crawling! I'd suggest at least 4GB RAM and if you can, increasing your swap (not so easy).

PS Can you put your output between [ code ] and [ /code ] removing the spaces between [ and ]. Then you get nicely formatted posts of screen dumps and files.

Ronald
Saturday, May 03 2014, 04:25 PM

Ok, as far as I am aware the only updates to the server have been the ones pushed by automatic updates, hence my server is up to date. YUM UPDATE and UPGRADE also return nothing.

I am wondering what has changed in the latest updates to make the CPU go mad. It is a server that sits in my home environment, and I have not opened it up to any load other than my own.

I will try to get some additional RAM, but what can I do in the meantime to bring the usage down?

Regards, Ronald

Nick Howitt
Saturday, May 03 2014, 05:39 PM

Try stopping various services and see the effect. You can either look at the dashboard memory usage or do something like:
egrep 'Mem|Cache|Swap' /proc/meminfo
The first things I'd look at are the proxy (squid) and the IDS (snort).
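
To compare readings before and after stopping a service, the egrep output can be boiled down to a single line. This is only a sketch: the function name swap_summary is made up, and it relies on the SwapTotal and SwapFree fields that /proc/meminfo reports in kB.

```shell
# Summarise swap use from /proc/meminfo-style text on stdin.
# Prints: "<used_kB> <percent_used>" (percentage rounded down).
swap_summary() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END {
            if (total > 0)
                printf "%d %d\n", total - free, int((total - free) * 100 / total)
            else
                print "0 0"
        }'
}

# Usage on a live system:
#   swap_summary < /proc/meminfo
```

Running it once before and once after stopping squid or snort gives two numbers that are easy to compare.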

I've just tried playing around with the "top" command. Run "top" then "M". This gives you top by memory usage. Then hit "f" then you can deselect some of the columns and add in things like Swapped Size etc.

Tony Ellis
Saturday, May 03 2014, 06:35 PM

In addition to Nick's suggestions, here are two more things to try that might provide useful info...

What's eating CPU?

watch -n 5 'ps axf | awk "{ if ( \$3 !~ /S/ ) { print; } }"'
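
The filter above prints every process whose STAT column contains no capital S, so it shows running (R) processes as well as those stuck in uninterruptible sleep (D), which is usually disk I/O. As a rough complement, the sketch below tallies processes by state letter. The function name count_by_state is made up, and it assumes plain "ps ax" column order (PID TTY STAT TIME COMMAND).

```shell
# Count processes by the first letter of the STAT column (3rd column of
# "ps ax" output) read from stdin. D = uninterruptible sleep, usually disk I/O.
count_by_state() {
    awk 'NR > 1 { states[substr($3, 1, 1)]++ }
         END { for (s in states) print s, states[s] }' | sort
}

# Usage on a live system:
#   ps ax | count_by_state
```

A steadily growing D count alongside high %wa suggests processes are queueing for the disk rather than competing for CPU.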

In the following, watch the "si" and "so" columns under swap for excessive swap activity. To change the command: the first number is the interval in seconds between updates (5 in this example) and the second number is the number of iterations (32 in this example).

vmstat -w 5 32

"man ps" and "man vmstat" for more on these commands...
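
If the vmstat output is saved to a file, the swap columns can be averaged afterwards. The sketch below assumes the standard column order, where si and so are the 7th and 8th fields after two header lines. The function name swap_rate_avg is made up. Note that the first data line from vmstat reports averages since boot, so it skews short runs.

```shell
# Average the si and so columns (fields 7 and 8) of vmstat output on stdin,
# skipping the two header lines. Prints "<avg_si> <avg_so>" as whole numbers.
swap_rate_avg() {
    awk 'NR > 2 { si += $7; so += $8; n++ }
         END { if (n) printf "%d %d\n", si / n, so / n; else print "0 0" }'
}

# Usage on a live system:
#   vmstat 5 32 | swap_rate_avg
```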

Ronald
Sunday, May 04 2014, 10:43 AM

I have been working with the tools both of you proposed and concluded that the two major swap and memory consumers are webconfig and system-mysqld. I killed the latter first, and memory and swap usage immediately dropped back to normal, and so did the wait time. Then I killed webconfig too and, not surprisingly, everything dropped even further.

Then I started both again, and after a few minutes the server was crawling again. I used htop to take a further look at the tree of mysql usage and looked in mysqld.log too. I found that the InnoDB engine was creating most of the log entries.
With Google as my friend I found a couple of posts that talk about the excessive load that mysql creates on a server, and the InnoDB engine in particular. As I have an app running on the server that uses the InnoDB engine, I did not want to take the risk of switching it off entirely and working with MyISAM only.

I copied a couple of lines from a post that seems to address a problem similar to mine and put these in /etc/my.cnf:



## If open-files-limit is set very low, MySQL may increase on its own. Either

## way, increase this if MySQL gives 'too many open files' errors. Setting

## this above 65535 could be unwise (MySQL may crash).

open-files-limit                = 20000



### Cache

thread-cache-size               = 16

table-open-cache                = 4096

table-definition-cache          = 512



## Generally, it is unwise to set the query cache to be larger than 64-128M 

## as the costs associated with maintaining the cache outweigh the performance

## gains. A far superior solution would be to implement memcached, though this

## requires modifying the application, among other things.

query-cache-type                = 1

query-cache-size                = 32M

query-cache-limit               = 1M



### Per-thread Buffers

sort-buffer-size                = 1M

read-buffer-size                = 1M

read-rnd-buffer-size            = 2M

join-buffer-size                = 1M



### Temp Tables

tmp-table-size                  = 64M 

max-heap-table-size             = 64M



### Networking

back-log                        = 100

max-connections                 = 50

max-connect-errors              = 10000

max-allowed-packet              = 16M

interactive-timeout             = 600

wait-timeout                    = 180

net_read_timeout        = 30

net_write_timeout       = 30

# This value is the size of the listen queue for incoming TCP/IP connections.

back_log            = 128





#### Storage Engines

## Set this to force MySQL to use a particular engine / table-type

## for new tables. This setting can still be overridden by specifying

## the engine explicitly in the CREATE TABLE statement.

default-storage-engine         = MyISAM



## Makes sure MySQL does not start if InnoDB fails to start. This helps

## prevent ugly silent failures.

innodb                          = FORCE



### MyISAM

## Not sure what to set this to?

## Try running a 'du -sch /var/lib/mysql/*/*.MYI'

## This will give you a good estimate on the size of all the MyISAM indexes.

## (The buffer may not need to be set that high, however)

key-buffer-size                 = 2M

## This setting controls the size of the buffer that is allocated when 

## sorting MyISAM indexes during a REPAIR TABLE or when creating indexes 

## with CREATE INDEX or ALTER TABLE.

myisam-sort-buffer-size         = 2M



### InnoDB

## Note: While most settings in MySQL can be set at run-time, many InnoDB

## variables cannot be set at runtime and require restarting MySQL

###

## These settings control how much RAM InnoDB will use. Generally, when using

## mostly InnoDB tables, the innodb-buffer-pool-size should be as large as

## is possible without swapping or starving other processes of RAM. The other 

## two settings usually do not need to be changed, but can help for very large 

## datasets.

innodb-buffer-pool-size         = 285M

innodb-log-buffer-size          = 8M
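
As a rough sanity check on settings like these, a common rule of thumb adds the global buffers to the per-connection buffers multiplied by max-connections. The sketch below applies it to the values posted above. It is an estimate under that rule of thumb, not an exact accounting of what mysqld will allocate, and the function name mysql_worst_case_mb is made up.

```shell
# Rough worst-case memory estimate (in MB):
# global buffers + max_connections * per-thread buffers.
mysql_worst_case_mb() {
    global_mb=$1       # sum of global buffers
    per_thread_mb=$2   # sum of per-connection buffers
    max_conn=$3        # max-connections
    echo $(( global_mb + per_thread_mb * max_conn ))
}

# Global: key-buffer-size 2 + query-cache-size 32
#         + innodb-buffer-pool-size 285 + innodb-log-buffer-size 8
global=$(( 2 + 32 + 285 + 8 ))
# Per thread: sort 1 + read 1 + read-rnd 2 + join 1
per_thread=$(( 1 + 1 + 2 + 1 ))

mysql_worst_case_mb "$global" "$per_thread" 50   # prints 577
```

On a box with about 1 GB of RAM that worst case is more than half of physical memory before any in-memory temporary tables (up to 64M each with these settings) are counted, so smaller values may be worth trying.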

To be sure, I rebooted the server (I was not sure if just restarting system-mysqld would do the trick), and since then my machine has been OK again. I will try to add some RAM to it, but as it is PC133 SDRAM I am not sure how much I can physically add given the limitations of the board.

I will edit this message if the problem reoccurs after a couple of days.

Regards, Ronald

Nick Howitt
Sunday, May 04 2014, 11:54 AM

You're beyond what I can really help with now except I can say that /etc/my.cnf is for mysql and /usr/clearos/sandbox/etc/my.cnf is for system-mysql so you may want to make changes to the latter. Also see the note here if you do make any changes.
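
One way to see which file a running daemon was started with is to read the --defaults-file option off its command line. In the ps listing earlier in the thread, the system-mysqld line carries --defaults-file=/usr/clearos/sandbox/etc/my.cnf. A small sketch follows, with a made-up function name defaults_file_of.

```shell
# Print the value of --defaults-file= from process command lines on stdin
# (one path per matching line). Lines without the option print nothing.
defaults_file_of() {
    sed -n 's/.*--defaults-file=\([^ ]*\).*/\1/p'
}

# Usage on a live system ("ww" stops ps truncating long command lines):
#   ps axww | grep '[s]ystem-mysqld' | defaults_file_of
```

The bracket in '[s]ystem-mysqld' keeps the grep process itself out of the results.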

Ronald
Sunday, May 04 2014, 03:44 PM

Thanks Nick, I applied these to this cnf file too. So far so good. I have not touched the log files or sizes as referred to in the link.
I will keep this forum posted if this does not work.

Regards, Ronald

rob youngquist
Wednesday, May 07 2014, 01:35 PM

Hi Nick. I was having the same issue. My RAM is low but my company does not want me shutting down the server.
I did raise the max child processes because our user connections were higher than the default, and that helped big time.

I was wondering if you know what I am looking at in the GUI under System > Resource > Processes and Page output.
Is this the " dansguardian child process " count, or the overall processes on the actual Linux box?

And is the Page output the actual " page swapping " stats, so I can tell whether it goes up due to the increased child processes / RAM usage?

Nick Howitt
Wednesday, May 07 2014, 04:38 PM

I'm afraid I've no idea what you are looking at. Are you running the Pro version? I suspect that even if I did know what you were looking at, I wouldn't know the answer.

rob youngquist
Wednesday, May 07 2014, 05:47 PM

It is ClearOS Enterprise 5.2. The Web GUI has a System > Resource Reports page where you can select Processes and Page outputs. I was just wondering if this was " ALL " running processes on our server or just the " dansguardian " child processes. It would be nice if it is the child processes, because then I could see the history of child process counts instead of the real-time count from

" ps aux | grep dansguardian-av | wc -l "

which only shows me the current processes at that time.
If so, the GUI, which shows daily, weekly, monthly etc. stats, would let me see the max, avg and current child processes, which would be very useful.
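
Independently of what the GUI graph measures, a history of the child count can be built by logging the count at intervals. The sketch below is one way to do that. The function name count_procs and the log path in the usage comment are made up for illustration. Unlike the ps | grep | wc pipeline, which usually counts the grep process too, it matches on the command name only.

```shell
# Count processes whose command name starts with the given prefix, reading
# "ps -eo comm="-style output (one command name per line) from stdin.
count_procs() {
    awk -v name="$1" 'index($1, name) == 1 { n++ } END { print n + 0 }'
}

# Usage on a live system, e.g. from cron every few minutes
# (the log file path is only an example):
#   echo "$(date '+%F %T') $(ps -eo comm= | count_procs dansguardian)" >> /var/log/dg-count.log
```

The resulting file holds one timestamped count per line, from which a daily maximum or average can be worked out later.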

Nick Howitt
Wednesday, May 07 2014, 06:23 PM

Have you looked at the "top" command from earlier in the thread to monitor memory usage by process? It is a surprisingly powerful tool.

Ben Chambers
Wednesday, May 07 2014, 06:29 PM

I saw this on a buddy's server just today... Pete told me it was caused by a tool that watches network traffic and imports it into a system-mysql table.

If you run:

ps afxw | grep pmacct

and see 1 or more entries, this is likely what is causing the load.

For now, I just removed this (and the network report unfortunately) until we get a fix out:

service pmacctd stop
service system-mysql stop
killall -9 pmacctd
yum remove pmacct
service system-mysql start

Load dropped right away to 0 after this.

B.

rob youngquist
Wednesday, May 07 2014, 08:17 PM

Is " top " historical, so that I can see averages or the max for an entire day, week or month?
Or is it just real time?
With the web GUI, System > Resource Reports > Processes, I can see all that in a graph as well as current usage.
I just can't find out if " Processes " refers to ALL processes on the server itself, or to " dansguardian " child processes.

Also, the response of " network traffic " tool adding to CPU usage is very helpful.

So thank you both for your input.

I am still hoping to get an answer on my initial question on the Web interfaces stats.

Nick Howitt
Wednesday, May 07 2014, 09:00 PM

top is real time only - I believe. I only found the other options looking at the man pages last weekend. You could check them as well.

zombu2
Monday, June 02 2014, 12:07 AM

well here is what i did to make system-mysqld behave

reset your stats and graphs
system-database reset

get your password for mysql
cat /var/clearos/system_database/reports

then log into mysql
/usr/clearos/sandbox/usr/bin/mysql -u reports -p reports

in mysql enter
alter table network_detail engine=innodb;

type exit and then restart mysql
service system-mysqld restart

been 4 days now and no runaway sql process , the system is nice and calm

and before anyone asks no i did not uninstall network detail etc.
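
Before or after the ALTER TABLE step above, it can help to see which tables are not on InnoDB. The sketch below filters a table/engine listing. The function name non_innodb_tables is made up, and the usage comment assumes the database is called reports (as the login command above suggests) and that information_schema is readable by that user.

```shell
# List tables that are not using InnoDB, given "table<TAB>engine" lines on stdin.
non_innodb_tables() {
    awk -F '\t' 'tolower($2) != "innodb" { print $1 }'
}

# Usage on a live system (prompts for the reports password):
#   /usr/clearos/sandbox/usr/bin/mysql -u reports -p -N -B -e \
#     "SELECT TABLE_NAME, ENGINE FROM information_schema.TABLES WHERE TABLE_SCHEMA = 'reports'" \
#     | non_innodb_tables
```

An empty result after the change would indicate that every table in that database reports InnoDB as its engine.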

Evince
Monday, June 02 2014, 12:20 PM

Thanks, good information.

Nick Howitt
Tuesday, August 05 2014, 09:05 PM

I've just hit this one

[root@server ~]# ps afxw | grep pmacct

 5559 pts/0    S+     0:00          \_ grep pmacct

17054 ?        Ss    41:11 pmacctd: Core Process [default]

17059 ?        S     22:56  \_ pmacctd: MySQL Plugin [inbound]

 3351 ?        S      0:00  |   \_ pmacctd: MySQL Plugin -- DB Writer [inbound]

 4109 ?        S      0:00  |   \_ pmacctd: MySQL Plugin -- DB Writer [inbound]

 5119 ?        S      0:00  |   \_ pmacctd: MySQL Plugin -- DB Writer [inbound]

17062 ?        S     19:30  \_ pmacctd: MySQL Plugin [outbound]

 5118 ?        S      0:00      \_ pmacctd: MySQL Plugin -- DB Writer [outbound]

17063 ?        Ss    20:19 pmacctd: Core Process [default]

17065 ?        S      1:43  \_ pmacctd: MySQL Plugin [inbound]

17066 ?        S      1:04  \_ pmacctd: MySQL Plugin [outbound]

Is there any ClearOS fix in the pipeline?
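
To put a number on how much CPU time the pmacctd processes in a listing like the one above have used, the TIME column can be totalled. This is a sketch only: the function name pmacct_cpu_seconds is made up, and it assumes the minutes:seconds TIME format that "ps afxw" prints here.

```shell
# Sum the TIME column (4th field, minutes:seconds) of pmacctd lines in
# "ps afxw" output on stdin and print the total in seconds.
pmacct_cpu_seconds() {
    awk '/pmacctd:/ {
        split($4, t, ":")
        total += t[1] * 60 + t[2]
    }
    END { print total + 0 }'
}

# Usage on a live system:
#   ps afxw | pmacct_cpu_seconds
```

The pattern "pmacctd:" does not match the "grep pmacct" line, so the grep process is left out of the sum. Fed the listing above, it prints 6403, i.e. a little under 107 minutes of CPU time.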

Nick Howitt
Wednesday, August 06 2014, 06:14 PM

I could not find a bug report, so I have filed bug 1885. In the meantime I've removed the app-network-detail-report packages and pmacct. If following Ben's instructions above, note that it is system-mysqld and not system-mysql.

Legacy Disabled
Thursday, August 07 2014, 04:31 PM

Hi Nick,

This issue should have been resolved in a recent update (here's the tracker link). Does the /var/clearos/network_detail_report/purge_external file exist on your system? If not, then the update was not applied. If so, then something in the update did not work (?).

Short version: The network detail report table for external connections was not getting purged. This caused the report engine to hammer the MySQL system.

Nick Howitt
Thursday, August 07 2014, 05:09 PM

Hi Peter,

I'll have to reinstall to check. Do I need to clean anything out first?

Nick Howitt
Thursday, August 07 2014, 05:14 PM

I've just checked and, strangely, the file does exist even though I've uninstalled the reports!

Legacy Disabled
Thursday, August 07 2014, 06:00 PM

The file should still exist after an uninstall, so that's normal. It looks like the upgrade was performed but the MySQL issue still persisted. We'll keep an eye out for this issue.

Nick Howitt
Thursday, August 07 2014, 06:12 PM

I'll reinstall and see what happens. I think it only maxed out one cpu core so I did not really notice it having a huge impact.

Do you have a comment on zombu2's solution further up the thread?

Nick Howitt
Thursday, August 07 2014, 06:36 PM

Bad news: I again have a 100% load, generally on one core but sometimes split between 2 or 3. Can I help to diagnose?

Intelliant
Friday, August 08 2014, 08:51 AM

I am facing the same issue too. I can provide any details needed to help diagnose it.

Legacy Disabled
Friday, August 08 2014, 02:36 PM

Nick Howitt wrote:

Do you have a comment on zombu2's solution further up the thread?

It's a different way to get to the same issue -- the network_detail_external table is getting too large.

Legacy Disabled
Friday, August 08 2014, 02:56 PM

It's normal to see MySQL working hard on an update, but it shouldn't last for more than 30-ish seconds. Every 15 minutes, the pmacct daemon does a data dump, while the other reports update every 5 minutes. Are you seeing a sustained 100% usage? What's the 15-minute load? Are you running version 1.5.27 of app-network-detail?

Nick Howitt
Friday, August 08 2014, 04:39 PM

Hi Peter,
Yes I see 100% sustained usage. From top:
load average: 1.13, 1.06, 1.01
I have 2 real cores and 2 virtual cores. If I watch this in htop, the system-mysqld load varies between one core at 100% and two cores totalling 100%.
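
A load average only means something relative to the number of cores, so a small sketch for normalising it may help. The helper name `load_per_core` is made up for illustration; the figures in the example call are the 15-minute load and the 4 logical cores quoted above.

```shell
# Hypothetical helper: divide a load average by the number of logical cores.
# Usage: load_per_core LOAD CORES
load_per_core() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Example with the figures quoted above: prints 0.25
load_per_core 1.01 4

# On a live Linux system the inputs could come from /proc/loadavg and nproc:
#   load_per_core "$(cut -d' ' -f3 /proc/loadavg)" "$(nproc)"
```

A result around 0.25 on four logical cores is roughly one core kept busy all the time, which fits what htop shows for system-mysqld here.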

Yes, I am running 1.5.27, reinstalled from yum yesterday afternoon:
[root@server ~]# rpm -qa | grep app-network-d
app-network-detail-report-1.5.27-1.v6.noarch
app-network-detail-report-core-1.5.27-1.v6.noarch

Nick Howitt
Saturday, August 09 2014, 01:04 PM

FWIW, following this post my network_detail_external table has 770196 records. [strike]Following Dave Loper's comments in bug 1885, could the purge routine just be purging from network_detail and not network_detail_external?[/strike]

Also attached is my load average graph. The start of the problem is obvious as is the bit where I temporarily uninstalled the packages.

Attachments:

laod_average.png

Nick Howitt
Saturday, August 09 2014, 08:31 PM

Hmm,
I've just worked out how to browse the reports in phpMyAdmin (where is root's password kept, I had to use reports'?) and, looking at the oldest entry in network_detail_external, it is dated 26 Jul 2014 so it looks like the data deletion routine is working. Is there any way I can do an sql query to count the records by day? I wonder if the 1.5.27 update suddenly increased the number of records being logged.

Peter Finch
Sunday, August 10 2014, 02:36 AM

First: There are now several threads on the system-mysqld excessive cpu usage for reports, and I have posted replies to several of those over the past 6 months. We probably need one thread but which one? I picked this one since it was most recently updated.

Second: I apologize for the length of this post but I wanted to make all this diagnostic information available.

Third: System is Community 6.5.0 (Final) with all released updates through today. And I do have the purge file referenced earlier in this thread with a size of 0 and date of Aug 2. (/var/clearos/network_detail_report/purge_external)

Peter

Let’s see what top shows hogging the CPU:

top - 18:00:12 up  5:28,  3 users,  load average: 1.17, 1.17, 1.21
Tasks: 277 total,   2 running, 275 sleeping,   0 stopped,   0 zombie
Cpu(s): 21.0%us, 29.6%sy,  0.0%ni, 47.1%id,  2.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8050980k total,  7898036k used,   152944k free,  3585296k buffers
Swap:  8191992k total,        4k used,  8191988k free,  2775176k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2091 system-m  20   0 2071m 105m 6472 S 199.8  1.3 373:54.90 system-mysqld
   78 root      39  19     0    0    0 S  0.3  0.0   0:17.07 kipmi0
 1746 ldap      20   0 1155m  84m 5284 S  0.3  1.1   0:24.97 slapd
 2192 ftp       20   0  147m 2112  812 S  0.3  0.0   0:07.43 proftpd
 2223 root      20   0  170m  20m  824 S  0.3  0.3   0:28.34 l7-filter
 2869 clearcon  20   0  657m 126m  33m S  0.3  1.6   0:22.56 gconsole
    1 root      20   0 21448 1556 1252 S  0.0  0.0   0:00.59 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
. . .

Two of my 4 3GHz cores are fully consumed by system-mysqld. (Yes I let top run for a while and it stayed at 200%.)

So my clearos system, on a very fast box, is absolutely loafing (hardly any cpu usage) beyond the reporting system. Why would/should reports do this? It shouldn't.

Login to MySQL as reports. (If you don't know where to find the password you probably shouldn't be doing this.)

/usr/clearos/sandbox/usr/bin/mysql -ureports -p???

Once logged in, select the reports database for your queries:

use reports;

Display the tables in the reports schema:

mysql> show tables;
+-------------------------+
| Tables_in_reports       |
+-------------------------+
| network                 |
| network_detail          |
| network_detail_external |
| proxy                   |
| proxy_domains           |
| resource                |
+-------------------------+
6 rows in set (0.00 sec)

How many rows are in each of these tables?

mysql> SELECT table_name, table_rows
    -> FROM INFORMATION_SCHEMA.TABLES
    -> WHERE TABLE_SCHEMA = 'reports';
+-------------------------+------------+
| table_name              | table_rows |
+-------------------------+------------+
| network                 |     653394 |
| network_detail          |    1617329 |
| network_detail_external |     885436 |
| proxy                   |    2013755 |
| proxy_domains           |          0 |
| resource                |     185377 |
+-------------------------+------------+
6 rows in set (0.20 sec)

Two of these tables have over a million rows. Let’s see how old the oldest record is for each:

mysql> SELECT timestamp FROM proxy ORDER BY timestamp ASC LIMIT 1;
+---------------------+
| timestamp           |
+---------------------+
| 2013-12-19 10:53:07 |
+---------------------+
1 row in set (0.90 sec)

mysql> SELECT stamp_inserted FROM network_detail ORDER BY stamp_inserted ASC LIMIT 1;
+---------------------+
| stamp_inserted      |
+---------------------+
| 2013-12-21 10:00:00 |
+---------------------+
1 row in set (0.49 sec)

Looks like rows in these tables are not being regularly pruned by the reports purge process as oldest records are from Dec 2013.

Which tables have indexes? Maybe we are querying large tables without indexes.

mysql> SELECT DISTINCT
    ->     TABLE_NAME,
    ->     INDEX_NAME
    -> FROM INFORMATION_SCHEMA.STATISTICS
    -> WHERE TABLE_SCHEMA = 'reports';
+---------------+------------+
| TABLE_NAME    | INDEX_NAME |
+---------------+------------+
| network       | PRIMARY    |
| network       | iface      |
| network       | timestamp  |
| proxy_domains | PRIMARY    |
| proxy_domains | ip         |
| proxy_domains | timestamp  |
| proxy_domains | hostname   |
| resource      | PRIMARY    |
+---------------+------------+
8 rows in set (0.00 sec)

Interestingly, there are no indexes on the two largest tables.

Let’s see what queries the system is currently running. Time is number of seconds in current state.

mysql> show FULL processlist;
+-----+---------+-----------------+---------+---------+------+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Id  | User    | Host            | db      | Command | Time | State    | Info                                                                                                                                                                                             |
+-----+---------+-----------------+---------+---------+------+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  21 | reports | localhost:42686 | reports | Query   |    0 | Locked   | UPDATE `network_detail_external` SET packets=packets+1, bytes=bytes+72, stamp_updated=NOW() WHERE FROM_UNIXTIME(1407603000) = stamp_inserted AND ip_dst='ff02::1:ff4c:29a4'                      |
|  68 | reports | localhost:47829 | reports | Query   |    2 | Locked   | INSERT INTO `network_detail_external` (stamp_updated, stamp_inserted, ip_dst, packets, bytes) VALUES (FROM_UNIXTIME(1407607201), FROM_UNIXTIME(1407606600), 'ff02::1:ff08:ade2', 1, 72)          |
| 115 | reports | localhost:53665 | reports | Query   |    1 | Locked   | UPDATE `network_detail_external` SET packets=packets+1, bytes=bytes+72, stamp_updated=NOW() WHERE FROM_UNIXTIME(1407610200) = stamp_inserted AND ip_dst='ff02::1:ff04:df52'                      |
| 133 | reports | localhost:54854 | reports | Query   |    3 | Locked   | INSERT INTO `network_detail_external` (stamp_updated, stamp_inserted, ip_dst, packets, bytes) VALUES (FROM_UNIXTIME(1407611701), FROM_UNIXTIME(1407610800), 'ff02::1:ff26:64f6', 2, 144)         |
| 162 | reports | localhost:59025 | reports | Query   |    1 | Locked   | UPDATE `network_detail_external` SET packets=packets+1, bytes=bytes+72, stamp_updated=NOW() WHERE FROM_UNIXTIME(1407613800) = stamp_inserted AND ip_dst='ff02::1:ff15:46d2'                      |
| 189 | reports | localhost:33090 | reports | Query   |    0 | Locked   | INSERT INTO `network_detail_external` (stamp_updated, stamp_inserted, ip_dst, packets, bytes) VALUES (FROM_UNIXTIME(1407616201), FROM_UNIXTIME(1407615000), 'ff02::1:ffef:f2bc', 1, 72)          |
| 210 | reports | localhost:35334 | reports | Query   |    1 | Locked   | INSERT INTO `network_detail_external` (stamp_updated, stamp_inserted, ip_dst, packets, bytes) VALUES (FROM_UNIXTIME(1407618001), FROM_UNIXTIME(1407616800), 'ff02::1:ffbc:5708', 1, 72)          |
| 236 | reports | localhost:37628 | reports | Query   |    1 | Locked   | UPDATE `network_detail_external` SET packets=packets+1, bytes=bytes+72, stamp_updated=NOW() WHERE FROM_UNIXTIME(1407618600) = stamp_inserted AND ip_dst='ff02::1:ff43:ef6c'                      |
| 243 | reports | localhost       | reports | Query   |    0 | NULL     | show FULL processlist                                                                                                                                                                            |
| 247 | reports | localhost:38735 | reports | Query   |    4 | Updating | UPDATE `network_detail_external` SET packets=packets+1, bytes=bytes+72, stamp_updated=NOW() WHERE FROM_UNIXTIME(1407619800) = stamp_inserted AND ip_dst='ff02::1:ff53:20e'                       |
| 256 | reports | localhost:39442 | reports | Query   |    2 | Updating | UPDATE network_detail SET ip='???', hostname='wemoswitchc70.local.lan', username='', device_vendor='', device_type='' WHERE (ip_src='192.168.10.170' OR ip_dst='192.168.10.170') AND ip IS NULL |
+-----+---------+-----------------+---------+---------+------+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
11 rows in set (0.00 sec)

So there are a bunch of queries running. The last two are active updates and are sucking up the CPU. The others are inserts and updates that are blocked while they wait for the tables to be unlocked when the two update queries complete. Adding indexes may help the updates but may also slow down inserts on large tables...
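
For anyone repeating this check, counting threads per State gives a quick picture of how many queries are stuck behind a lock. This is only a sketch: `count_states` is a made-up helper name, and it assumes the tab-separated layout the mysql client produces with --batch (header line first, State in the seventh column).

```shell
# Hypothetical helper: count threads per State from tab-separated
# "SHOW FULL PROCESSLIST" output read on stdin.
count_states() {
    awk -F'\t' 'NR > 1 { n[$7]++ } END { for (s in n) printf "%s %d\n", s, n[s] }' | sort
}

# Demo on three made-up rows in the same column layout.
printf 'Id\tUser\tHost\tdb\tCommand\tTime\tState\tInfo\n21\treports\tlocalhost\treports\tQuery\t0\tLocked\tUPDATE ...\n68\treports\tlocalhost\treports\tQuery\t2\tLocked\tINSERT ...\n247\treports\tlocalhost\treports\tQuery\t4\tUpdating\tUPDATE ...\n' | count_states

# Possible use on the server, with the client path used earlier in the thread:
#   /usr/clearos/sandbox/usr/bin/mysql -ureports -p --batch -e 'SHOW FULL PROCESSLIST' | count_states
```

A persistently high Locked count alongside one or two long-running Updating rows would match the pattern described above.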

I welcome feedback from devs...


Nick Howitt
Sunday, August 10 2014, 07:31 AM

I've worked out how to summarise by dates in sql now and for network_detail_external I get:

mysql> SELECT count(*), DATE(stamp_inserted) DateOnly FROM `network_detail_external` GROUP BY DateOnly;
+----------+------------+
| count(*) | DateOnly   |
+----------+------------+
|    54780 | 2014-07-26 |
|    54253 | 2014-07-27 |
|    53489 | 2014-07-28 |
|    57178 | 2014-07-29 |
|    54701 | 2014-07-30 |
|    52011 | 2014-07-31 |
|    51586 | 2014-08-01 |
|    54122 | 2014-08-02 |
|   110018 | 2014-08-03 |
|    45995 | 2014-08-04 |
|    52052 | 2014-08-05 |
|    32959 | 2014-08-06 |
|    11383 | 2014-08-07 |
|    56079 | 2014-08-08 |
|    51203 | 2014-08-09 |
|    16981 | 2014-08-10 |
+----------+------------+
16 rows in set (1.92 sec)

Unfortunately my history does not go back as far as the 1.5.27 update so I can't prove anything about the latest update. The dip on 07 Aug was when I had removed the network detail report.

For network_detail I get:

mysql> SELECT count(*), DATE(stamp_inserted) DateOnly FROM `network_detail` GROUP BY DateOnly;
+----------+------------+
| count(*) | DateOnly   |
+----------+------------+
|       46 | 2014-05-18 |
|      789 | 2014-05-19 |
|      862 | 2014-05-20 |
|      698 | 2014-05-21 |
|      822 | 2014-05-22 |
|      746 | 2014-05-23 |
|      853 | 2014-05-24 |
|      925 | 2014-05-25 |
|     1112 | 2014-05-26 |
|     1016 | 2014-05-27 |
|      834 | 2014-05-28 |
|      873 | 2014-05-29 |
|      915 | 2014-05-30 |
|      861 | 2014-05-31 |
|      973 | 2014-06-01 |
|      856 | 2014-06-02 |
|      740 | 2014-06-03 |
|      824 | 2014-06-04 |
|      782 | 2014-06-05 |
|      939 | 2014-06-06 |
|     1027 | 2014-06-07 |
|     1097 | 2014-06-08 |
|      868 | 2014-06-09 |
|      797 | 2014-06-10 |
|      828 | 2014-06-11 |
|      824 | 2014-06-12 |
|     1051 | 2014-06-13 |
|     1147 | 2014-06-14 |
|      746 | 2014-06-15 |
|      826 | 2014-06-16 |
|      856 | 2014-06-17 |
|      863 | 2014-06-18 |
|      831 | 2014-06-19 |
|      749 | 2014-06-20 |
|     1016 | 2014-06-21 |
|      902 | 2014-06-22 |
|      957 | 2014-06-23 |
|      668 | 2014-06-24 |
|      587 | 2014-06-25 |
|      537 | 2014-06-26 |
|      662 | 2014-06-27 |
|      613 | 2014-06-28 |
|      524 | 2014-06-29 |
|      612 | 2014-06-30 |
|      501 | 2014-07-01 |
|      576 | 2014-07-02 |
|      446 | 2014-07-03 |
|      547 | 2014-07-04 |
|      673 | 2014-07-05 |
|      796 | 2014-07-06 |
|      616 | 2014-07-07 |
|      545 | 2014-07-08 |
|      666 | 2014-07-09 |
|      643 | 2014-07-10 |
|      834 | 2014-07-11 |
|      948 | 2014-07-12 |
|      882 | 2014-07-13 |
|      688 | 2014-07-14 |
|      652 | 2014-07-15 |
|      749 | 2014-07-16 |
|      678 | 2014-07-17 |
|      647 | 2014-07-18 |
|      757 | 2014-07-19 |
|      621 | 2014-07-20 |
|      739 | 2014-07-21 |
|      813 | 2014-07-22 |
|      562 | 2014-07-23 |
|      531 | 2014-07-24 |
|      625 | 2014-07-25 |
|      577 | 2014-07-26 |
|      410 | 2014-07-27 |
|      441 | 2014-07-28 |
|      519 | 2014-07-29 |
|      515 | 2014-07-30 |
|      530 | 2014-07-31 |
|      481 | 2014-08-01 |
|      668 | 2014-08-02 |
|      509 | 2014-08-03 |
|      551 | 2014-08-04 |
|      639 | 2014-08-05 |
|      383 | 2014-08-06 |
|      211 | 2014-08-07 |
|      768 | 2014-08-08 |
|      408 | 2014-08-09 |
|      100 | 2014-08-10 |
+----------+------------+
85 rows in set (0.05 sec)

This looks reasonable but I am getting different results from Peter Finch. My tables seem to get purged and my network_detail_external is much larger than network_detail.


Tim Burgess
Monday, August 11 2014, 04:43 PM

I'll try and have a poke around with this tonight.

Do you also have app-network-map installed and configured? From the SQL queries above, it may be tripping over a large amount of IPv6 traffic... Nick, do you see similar queries if you run 'show FULL processlist;'?

On a quiet system post-install I have none. To alleviate the 'runaway' nature of the script, try increasing the time interval in cron (/etc/cron.d/app-network-detail-report), which is currently scheduled at 5-minute intervals:
*/5 * * * * root /usr/sbin/networkdetail2db >/dev/null 2>&1
12 3 * * * root /usr/sbin/networkdetailpurge >/dev/null 2>&1
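
As a rough sketch of that change, the interval could be rewritten with sed rather than by hand. The helper name `set_detail_interval` and the backup location are assumptions for illustration; the cron lines in the demo are the two shown above.

```shell
# Hypothetical helper: change the networkdetail2db schedule from every
# 5 minutes to every N minutes. Reads the cron file on stdin, writes the
# result to stdout, and leaves every other line untouched.
set_detail_interval() {
    sed "s|^\*/5 \(.*networkdetail2db.*\)\$|*/$1 \1|"
}

# Demo on the two cron lines quoted above.
set_detail_interval 15 <<'EOF'
*/5 * * * * root /usr/sbin/networkdetail2db >/dev/null 2>&1
12 3 * * * root /usr/sbin/networkdetailpurge >/dev/null 2>&1
EOF

# Possible use on the server (keep a backup first):
#   cp /etc/cron.d/app-network-detail-report /root/app-network-detail-report.bak
#   set_detail_interval 15 < /root/app-network-detail-report.bak > /etc/cron.d/app-network-detail-report
```

Only the networkdetail2db line is rewritten; the nightly purge entry passes through unchanged.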

Nick Howitt
Monday, August 11 2014, 05:12 PM

Hi Tim,

I do have app-network-map installed. I've had a little browse through system-mysql and I did not notice any IPv6 traffic. I've tried exporting the table to a CSV file to have a further look, but my version of Excel (2003) won't load it as it is way too big.

Here is my process list:

mysql> show FULL processlist;
+-----+---------+-----------------+---------+---------+------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Id  | User    | Host            | db      | Command | Time | State    | Info                                                                                                                                                                      |
+-----+---------+-----------------+---------+---------+------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 169 | reports | localhost:56274 | reports | Query   |    1 | Updating | UPDATE `network_detail_external` SET packets=packets+2, bytes=bytes+244, stamp_updated=NOW() WHERE FROM_UNIXTIME(1407775200) = stamp_inserted AND ip_dst='49.205.148.242' |
| 170 | reports | localhost:56276 | reports | Query   |    1 | Locked   | UPDATE `network_detail_external` SET packets=packets+1, bytes=bytes+131, stamp_updated=NOW() WHERE FROM_UNIXTIME(1407775200) = stamp_inserted AND ip_src='88.222.186.3'   |
| 179 | reports | localhost       | NULL    | Query   |    0 | NULL     | show FULL processlist                                                                                                                                                     |
+-----+---------+-----------------+---------+---------+------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

Note it is shorter than when I looked yesterday, perhaps because I rebooted earlier today to move the kernel on, but my 15-minute load average is now up to 1.17 (out of 4).

I'll try your approach to cron first but as I am on holiday soon I may have to remove it again.

Does your post indicate you are also having the same issue?

[edit]
I've just loaded the table in Access (which I don't really know) and I am really surprised at the amount of traffic logged and the amount of single byte traffic (pings?) - 526775 out of 878400 packets.
[/edit]

[edit2]
Also 235684 records with NULL hostname
[/edit2]

[edit3]
Changing the cron job did nothing (or very little), and, yes, I did restart crond after the edit.
I notice /var/log/system is getting a lot of:

Aug 11 19:20:01 server networkdetail2db: Unable to start script - currently running.
Aug 11 19:40:01 server networkdetail2db: Unable to start script - currently running.
Aug 11 19:50:01 server networkdetail2db: Unable to start script - currently running

Looking at the system log it seems like one error message is missed after about 50 mins. This means that the script is taking that long to execute!
[/edit3]
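
To see how often the script is being skipped, the messages can be counted per hour. A minimal sketch, assuming syslog-style lines like the ones quoted above; the helper name `count_skipped_runs` is made up for illustration.

```shell
# Hypothetical helper: count "currently running" skips of networkdetail2db
# per hour, reading syslog-style lines on stdin.
count_skipped_runs() {
    grep 'networkdetail2db: Unable to start script' |
        awk '{ split($3, t, ":"); n[$1 " " $2 " " t[1] ":00"]++ }
             END { for (k in n) print k, n[k] }' |
        sort
}

# Demo on the three lines quoted above: prints "Aug 11 19:00 3"
count_skipped_runs <<'EOF'
Aug 11 19:20:01 server networkdetail2db: Unable to start script - currently running.
Aug 11 19:40:01 server networkdetail2db: Unable to start script - currently running.
Aug 11 19:50:01 server networkdetail2db: Unable to start script - currently running
EOF

# Possible use on the server:
#   count_skipped_runs < /var/log/system
```

Hours in which nearly every scheduled run is skipped suggest that a single run of the script is lasting most of that hour.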


Tim Burgess
Monday, August 11 2014, 09:53 PM

Hi Nick, no I don't have the same symptoms here, but my table is only small so far.

FYI you can browse the system-database contents using the phpMyAdmin web interface (https://clearosip:81/mysql). This is a new feature in ClearOS 6.6 beta 1, select the 'system database' from the drop down and login with your system-mysql root password.

I also have lots of IP entries in network_detail_external, but not as many as you... Do you by any chance also host torrents? This might explain the large growth of very small packet connections.

Tim Burgess
Monday, August 11 2014, 10:36 PM

Some more poking around... I also see lots of NULL IP records

I've been experimenting with adding table indexes on the columns used in the MySQL queries to improve query performance. My hunch is that with large numbers of TCP connections the database size grows rapidly, and so does the time taken to query and update the host information in it, hence my question about torrents. The queries usually appear to use WHERE stamp_inserted, ip_dst OR ip_src... but as I don't have the symptoms I can't tell if these improve things.

If you are happy to test could you try running the following on the reports database?

ALTER TABLE `network_detail_external` ADD INDEX(`ip_src`);
ALTER TABLE `network_detail_external` ADD INDEX(`ip_dst`);
ALTER TABLE `network_detail_external` ADD INDEX(`stamp_inserted`);

Tim Burgess
Monday, August 11 2014, 11:02 PM

Something else to try...

Modifying the pmacctd daemon configuration (I incorrectly thought it was the cron job earlier, but that just calls a script to update the mappings in network_detail, which is by comparison much smaller than network_detail_external).

With reference to:-
http://wiki.pmacct.net/OfficialExamples

You should have two configlets generated by ClearOS in /etc/pmacctd/pmacctd_ethX.conf, where ethX is your WAN interface, which will log to network_detail_external.

In here are two parameters, sql_refresh_time: 900 and sql_history: 10m. Try reducing the first to, say, 600 so that only INSERTs and not UPDATEs are carried out once per SQL timeslot...
Alternatively, drop it right down to 60, which will reduce the amount of traffic kept in memory (at the expense of more frequent but smaller database inserts), to try and reduce the size of the data being dumped into system-mysql at any one time.

You may need to run 'service pmacctd restart' to implement
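
A sketch of how that edit could be scripted, assuming the configlets use pmacct's usual "key: value" lines. The helper name `set_pmacct_option` is made up, and whether the ClearOS-generated files write the key with a plugin suffix such as [inbound] is not shown in this thread, so the pattern accepts both forms.

```shell
# Hypothetical helper: set a "key: value" option in a pmacct-style config.
# Reads the config on stdin and writes the result to stdout; also matches
# keys that carry a plugin suffix such as sql_refresh_time[inbound].
# Usage: set_pmacct_option KEY VALUE
set_pmacct_option() {
    sed "s|^\($1\(\[[^]]*]\)*\)[[:space:]]*:.*\$|\1: $2|"
}

# Demo on the two parameters quoted above.
set_pmacct_option sql_refresh_time 600 <<'EOF'
sql_refresh_time: 900
sql_history: 10m
EOF

# Possible use on the server (ethX left as a placeholder, as in the post):
#   cp /etc/pmacctd/pmacctd_ethX.conf /root/pmacctd_ethX.conf.bak
#   set_pmacct_option sql_refresh_time 600 < /root/pmacctd_ethX.conf.bak > /etc/pmacctd/pmacctd_ethX.conf
#   service pmacctd restart
```

Since the files are generated by ClearOS, a manual change like this may be overwritten when they are regenerated, so it is best treated as a test rather than a permanent fix.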

Tim Burgess
Monday, August 11 2014, 11:43 PM

P.S. The pmacctd daemon gobbles up CPU... on a 50Mbps download from my LAN it accounts for nearly 50% of my CPU time, cumulatively more than snort, which is doing packet inspection as well!

18403 snort    30 10 469m  174m 4152 R 43.1 3.6 6:19.41 snort
17679 squid    30 10 158m  86m  3664 R 29.2 1.8 2:09.92 squid
13389 root     20 0  61156 10m  9968 S 19.2 0.2 0:46.42 pmacctd
13398 root     20 0  64692 14m  4140 S 10.6 0.3 0:25.17 pmacctd
13394 root     20 0  64692 14m  4140 S 10.3 0.3 0:24.75 pmacctd
30272 dansguar 30 10 128m  16m  1340 R 10.0 0.3 0:14.85 dansguardian-av
13397 root     20 0  61156 10m  9.8m S  9.0 0.2 0:10.88 pmacctd

EDIT: Adding "plugin_buffer_size: 1024" to both pmacctd_ethX.conf files seems to have significantly reduced the usage, as per the pmacct FAQ. Some room for tuning?
http://wiki.pmacct.net/OfficialFAQs