If you use StatsStore/Analytics on Solaris 11 you should check out the new GitHub oraclesolaris-contrib repository. For more details see Tanmay's blog post.
Robert Milkowski's blog
Friday, September 25, 2020
Tuesday, December 03, 2019
Checking/modifying file permissions on an underlying directory
If you have a file system (NFS, etc.) mounted on top of a directory and you need to see the file permissions, ACLs, etc. of the underlying directory rather than of the mounted file system, then:
Solaris:
# mount -F lofs -o nosub /some/path/ /mnt/fix
Linux:
# mount -B /some/path/ /mnt/fix
If you access /mnt/fix now you won't see any mounted filesystems on top of any directories, just the underlying fs.
I used it in the past and forgot about it, needed it yesterday and found it here.
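If you need this on both platforms, the two commands above can be wrapped in a small helper. This is only a sketch: the function name underlying_view_cmd is made up for illustration, and it merely builds the command line (it does not run it, which would require root).

```python
import platform


def underlying_view_cmd(path, mountpoint, system=None):
    """Return the mount command (argv list) that exposes `path` without
    any file systems mounted on top of its subdirectories."""
    system = system or platform.system()
    if system == "SunOS":
        # Solaris: loopback mount with sub-mounts disabled
        return ["mount", "-F", "lofs", "-o", "nosub", path, mountpoint]
    if system == "Linux":
        # Linux: plain (non-recursive) bind mount
        return ["mount", "-B", path, mountpoint]
    raise NotImplementedError("no known equivalent for " + system)


print(" ".join(underlying_view_cmd("/some/path/", "/mnt/fix", system="Linux")))
```

Remember to unmount /mnt/fix once the permissions have been fixed.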
Monday, October 21, 2019
ONTAP 9.6+: REST API: netapp_ontap: performance of get_collection() vs. get()
NetApp's ONTAP 9.6 introduced a new REST API along with a new Python module, netapp_ontap.
#!/bin/python
import time
import getpass
import netapp_ontap
from netapp_ontap import config
from netapp_ontap.host_connection import HostConnection
from netapp_ontap.resources import QuotaRule

# host and user must be set to your filer's address and account name
password = getpass.getpass()
config.CONNECTION = HostConnection(host, user, password)

start = time.time()
# Fast variant: request the needed field in the collection call itself
quota_rules = QuotaRule.get_collection(fields='type')
# Slow variant: fetch bare records here and uncomment l.get() in the loop
# quota_rules = QuotaRule.get_collection()
total = 0
for l in quota_rules:
    # l.get(fields='type')
    total = total + 1
end = time.time()
print(total)
print(end-start)
With 471 quota rules on my filer it takes about 4s with get_collection(fields='type') vs. ~40s when calling l.get(fields='type') for each rule being processed. So if you are after all the entries, it is quicker to pass the required fields to get_collection() than to call get() on each returned resource.
I did a little bit of debugging and it seems that each get() results in a new TCP/HTTPS connection being established, which is likely the main reason for the much worse performance. In contrast, get_collection() gets all 471 results in a single HTTP GET.
There seems to be a bug with regard to re-using connections though, as it shouldn't have to establish a new session for each get().
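To make the difference in round trips concrete, here is a toy model that only counts simulated requests. FakeFiler and FakeRule are made-up stand-ins and not part of netapp_ontap; the only thing taken from the measurements above is that get_collection() is a single request while every get() is one more.

```python
class FakeFiler:
    """Toy stand-in for a filer; it only counts simulated HTTP requests."""

    def __init__(self, n_rules):
        self.n_rules = n_rules
        self.requests = 0

    def get_collection(self, fields=None):
        # One request returns every record, optionally with extra fields.
        self.requests += 1
        return [FakeRule(self, fields) for _ in range(self.n_rules)]


class FakeRule:
    def __init__(self, filer, fields):
        self.filer = filer
        self.type = "tree" if fields == "type" else None

    def get(self, fields=None):
        # One additional request per record.
        self.filer.requests += 1
        self.type = "tree"


def requests_bulk(n_rules):
    filer = FakeFiler(n_rules)
    for rule in filer.get_collection(fields="type"):
        assert rule.type is not None
    return filer.requests


def requests_per_record(n_rules):
    filer = FakeFiler(n_rules)
    for rule in filer.get_collection():
        rule.get(fields="type")
    return filer.requests


print(requests_bulk(471), requests_per_record(471))
```

With 471 rules the per-record variant issues 472 requests instead of one, and if each of them also sets up a fresh TCP/TLS session, the roughly tenfold slowdown is not surprising.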
Thursday, June 13, 2019
DTrace: nfsv4 provider and utf8string
The nfsv4 provider provides some structures with component4 type which is defined as:
typedef struct {
        uint_t utf8string_len;
        char *utf8string_val;
} utf8string;
typedef utf8string component4;
So for example, to print NFSv4 file renames you have to do:
nfsv4:::op-rename-start
{
        /* utf8string is not NUL-terminated: copy it and terminate it explicitly */
        this->a = (char *)alloca(args[2]->oldname.utf8string_len + 1);
        bcopy(args[2]->oldname.utf8string_val, this->a, args[2]->oldname.utf8string_len);
        this->a[args[2]->oldname.utf8string_len] = '\0';

        this->b = (char *)alloca(args[2]->newname.utf8string_len + 1);
        bcopy(args[2]->newname.utf8string_val, this->b, args[2]->newname.utf8string_len);
        this->b[args[2]->newname.utf8string_len] = '\0';

        printf("NFSv4 rename: %s\n", strjoin(this->a, strjoin(" -> ", this->b)));
}
Ideally DTrace (strjoin(), etc.) should deal with utf8string type automatically.
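The reason for the copy-and-terminate dance is that a utf8string carries a length and a pointer to bytes that are not NUL-terminated. The same structure can be mimicked outside of DTrace with Python's ctypes. This is only an illustration: the Utf8String class and the sample buffer are made up, and only the two field names come from the typedef above.

```python
import ctypes


class Utf8String(ctypes.Structure):
    # Mirrors the C typedef: a length plus a pointer to unterminated bytes.
    _fields_ = [
        ("utf8string_len", ctypes.c_uint),
        ("utf8string_val", ctypes.POINTER(ctypes.c_char)),
    ]


def to_bytes(s):
    """Copy exactly utf8string_len bytes; never rely on a NUL terminator."""
    return ctypes.string_at(s.utf8string_val, s.utf8string_len)


# The 7-byte name "old.txt" is followed directly by unrelated bytes,
# as it could be in a real buffer.
backing = ctypes.create_string_buffer(b"old.txtXYZ")
name = Utf8String(7, ctypes.cast(backing, ctypes.POINTER(ctypes.c_char)))

length_aware = to_bytes(name)
c_string_read = ctypes.string_at(name.utf8string_val)  # reads until a NUL

print(length_aware)
print(c_string_read)
```

The length-aware read returns just the name, while the C-string style read runs on into whatever happens to follow it, which is what a plain %s would do with utf8string_val.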
Linux Load Averages
Linux measures load average differently from other operating systems. In a nutshell, it includes both CPU and disk I/O, and more. Brendan has an excellent blog entry explaining in much more detail how it works.
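As a very rough illustration of the difference: Linux counts tasks in uninterruptible sleep (state D, typically waiting on disk I/O) in addition to running or runnable ones (state R). The toy below just counts task states from a made-up list; the function names are invented, and real load averages are damped averages over time rather than an instantaneous count.

```python
from collections import Counter


def linux_style_load(task_states):
    """Tasks that count towards a Linux-style load figure:
    'R' (running or runnable) plus 'D' (uninterruptible sleep)."""
    counts = Counter(task_states)
    return counts["R"] + counts["D"]


def cpu_only_load(task_states):
    """Tasks that count when only CPU demand is considered."""
    return Counter(task_states)["R"]


# Hypothetical snapshot: 2 runnable, 3 blocked on I/O, 2 sleeping normally
states = ["R", "R", "D", "D", "D", "S", "S"]
print(linux_style_load(states), cpu_only_load(states))
```

So a box with idle CPUs but a stuck NFS mount or a slow disk can still show a high load average on Linux.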
Friday, May 03, 2019
Testing ZFS/L2ARC
Solaris 11.4 - setting zfs_arc_collect_check=0 via mdb (takes immediate effect) or via /etc/system makes ZFS start feeding L2ARC immediately. Note that this can negatively impact ARC performance, so use it with care. This is useful for testing if you want to push some data into L2ARC sooner, especially on large memory systems. The variable is checked by the arc_can_collect() function (if it returns 1 then L2ARC can be fed, if it returns 0 it can't).
Monday, March 11, 2019
DTrace stop() action
The stop() action in DTrace stops an entire process... well, actually it doesn't. It stops a single thread in a multi-threaded process, which surprised me as I always thought it stopped the entire process. Now, this is actually very useful, though a stopall() action which would stop all threads could be useful as well :)
Update: this is getting more complicated now: the way the stop() action behaves depends on the type of probe it is called from. For example, if called from a syscall provider probe it will stop just the thread which called the syscall, but if called from a PID provider probe it will stop the entire process with all its threads. This is getting confusing...
btw: pstop PID stops the entire process, while pstop PID/LWPID stops a single thread
Friday, March 01, 2019
DTrace %Y print format with nanoseconds
A small but useful extension to DTrace is now available in Solaris 11.4 SRU6. It makes it easy to print the current date with optional nanosecond resolution. It is disabled by default for backward compatibility.
To enable it you need to add the timedecimals option to dtrace:
# dtrace -q -x timedecimals -n syscall::open*:entry \
'{printf("%Y %s called %s()\n", walltimestamp, execname, probefunc);}'
2019 Mar 1 11:50:48.774114445 firefox called openat64()
2019 Mar 1 11:50:49.149290513 dtrace called openat()
2019 Mar 1 11:50:49.149283375 dtrace called openat()
2019 Mar 1 11:50:50.030217373 firefox called openat64()
2019 Mar 1 11:50:49.974253263 firefox called openat64()
2019 Mar 1 11:50:50.114684381 VBoxService called openat()
^C
You can also specify the number of decimal places to be printed, for example:
# dtrace -q -x timedecimals=2 -n syscall::open*:entry \
'{printf("%Y %s called %s()\n", walltimestamp, execname, probefunc);}'
2019 Mar 1 11:56:51.09 VBoxService called openat()
2019 Mar 1 11:56:51.09 VBoxService called openat()
2019 Mar 1 11:56:51.45 dtrace called openat()
^C
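Note that the lines in the first sample are not strictly in timestamp order (49.149290513 is printed before 49.149283375), presumably because the records come from different CPUs. With the extra precision it is easy to post-process the output and sort it. A minimal sketch, assuming the line layout shown above; the helper name sort_key is made up:

```python
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def sort_key(line):
    """Build a sortable key from a '%Y' timestamp with optional decimals."""
    year, mon, day, clock, _rest = line.split(None, 4)
    hms, _, frac = clock.partition(".")
    hour, minute, second = (int(x) for x in hms.split(":"))
    # Pad the fraction to nanoseconds so that '.09' and '.090000000' compare equal
    nanos = int(frac.ljust(9, "0")) if frac else 0
    return (int(year), MONTHS[mon], int(day), hour, minute, second, nanos)


sample = [
    "2019 Mar 1 11:50:48.774114445 firefox called openat64()",
    "2019 Mar 1 11:50:49.149290513 dtrace called openat()",
    "2019 Mar 1 11:50:49.149283375 dtrace called openat()",
    "2019 Mar 1 11:50:50.030217373 firefox called openat64()",
    "2019 Mar 1 11:50:49.974253263 firefox called openat64()",
    "2019 Mar 1 11:50:50.114684381 VBoxService called openat()",
]

ordered = sorted(sample, key=sort_key)
for line in ordered:
    print(line)
```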
Friday, November 23, 2018
RAID-Z improvements and cloud device support
Solaris 11.4 introduced a few new ZFS pool versions with interesting new features or enhancements:
# zpool upgrade -v
...
38 Xcopy with encryption
39 Resilver restart enhancements
40 New deduplication support
41 Asynchronous dataset destroy
42 Reguid: ability to change the pool guid
43 RAID-Z improvements and cloud device support
44 Device removal
...
The RAID-Z improvements mean that data is written more efficiently - in some cases a pool can now store more data than before. But even more importantly, the performance (both throughput and IOPS) of RAID-Z is now close to RAID10!
Friday, November 09, 2018
Spectre and Kernel Modules
On Linux one needs to recompile kernel modules to get protection, while on Solaris this is not necessary. Once you are on Solaris 11.4 with Spectre fixes enabled, all kernel modules, even compiled on older Solaris releases, just work and are protected. Nothing special to do there.
Friday, October 12, 2018
bpftrace
Right, finally Linux is getting something similar to DTrace, and useful: see bpftrace. However, for it to be useful in the enterprise it has to be included in Red Hat - I wonder how long it will take though... but maybe around 2020 this will finally happen and then Linux will truly have an equivalent of DTrace, even if 15 years later.
Tuesday, October 02, 2018
Solaris: Spectre v2 & Meltdown fixes
Solaris 11.4 includes fixes for Meltdown and Spectre v2 (fixes for v1 were delivered a few months ago for 11.3 via SRU and are also included in 11.4). What I really like about them is that you can turn them on or off via sxadm. The sxadm command will also report whether your HW requires the fixes and whether they are enabled or not. Additionally, there is an FMA alert generated if your HW should have the fixes enabled but this can't be done due to old microcode - so this way you also get alerting. Very nice integration indeed.
Example output with Solaris running in VirtualBox:
# sxadm status
EXTENSION           STATUS                        FLAGS
aslr                enabled (tagged-files)        u-c--
nxstack             enabled (all)                 u-c--
nxheap              enabled (tagged-files)        u-c--
kpti                enabled                       -kcr-
ibpb                not supported                 -----
ibrs                not supported                 -----
smap                not supported                 -----
The kpti extension is the fix for Meltdown and it is active, while ibpb and ibrs are mitigations for Spectre v2 and are not enabled as they are not supported on this HW.
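If you want to check this across many hosts, the sxadm status output is easy to parse. A rough sketch, assuming the three-column layout shown above (extension name first, flags last, everything in between is the status); parse_sxadm_status is a made-up helper and the sample text is copied from the output above:

```python
SAMPLE = """\
EXTENSION STATUS FLAGS
aslr enabled (tagged-files) u-c--
nxstack enabled (all) u-c--
nxheap enabled (tagged-files) u-c--
kpti enabled -kcr-
ibpb not supported -----
ibrs not supported -----
smap not supported -----
"""


def parse_sxadm_status(text):
    """Map each extension name to its status string and flags."""
    result = {}
    for line in text.splitlines()[1:]:  # skip the header line
        parts = line.split()
        if len(parts) < 3:
            continue
        # The status may contain spaces, so take everything between
        # the first and the last column.
        result[parts[0]] = {"status": " ".join(parts[1:-1]), "flags": parts[-1]}
    return result


status = parse_sxadm_status(SAMPLE)
unsupported = sorted(n for n, v in status.items() if v["status"] == "not supported")
print(unsupported)
```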
Let's see how FMA is reporting an old version of microcode:
# fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Oct 02 14:19:24 383538f1-9268-4a07-9ff8-86be48c02e72 SUNOS-8000-LG  Major

Problem Status            : open
Diag Engine               : software-diagnosis / 0.2
System
    Manufacturer          : unknown
    Name                  : unknown
    Part_Number           : unknown
    Serial_Number         : unknown

System Component
    Manufacturer          : innotek GmbH
    Name                  : VirtualBox
    Part_Number           :
    Serial_Number         : 0
    Firmware_Manufacturer : innotek GmbH
    Firmware_Version      : (BIOS)VirtualBox
    Firmware_Release      : (BIOS)12.01.2006
    Host_ID               : 00482293
    Server_Name           : solaris

----------------------------------------
Suspect 1 of 1 :
   Problem class : alert.oracle.solaris.cpu.firmware.security
   Certainty   : 100%

   FRU
     Status           : Active
     Location         : "/SYS/MB"
     Manufacturer     : unknown
     Name             : unknown
     Part_Number      : unknown
     Revision         : unknown
     Serial_Number    : unknown
     Chassis
        Manufacturer  : Oracle Corporation
        Name          : VirtualBox
        Part_Number   :
        Serial_Number : 0

   Resource
     Status           : Active

Response    : No automated response available

Impact      : Oracle Solaris is not running with Spectre Vulnerability
              Mitigation Enabled

Action      : Update the CPU with Spectre capable microcode.
              Please refer to the associated reference document at
              http://support.oracle.com/msg/SUNOS-8000-LG for the latest
              service procedures and policies regarding this diagnosis.
Tuesday, August 28, 2018
Friday, July 06, 2018
dumpadm -d none
Solaris 11.3 still doesn't support dumpadm -d none. This would be useful in some scenarios, for example when troubleshooting a failed AI installation and trying to restart it without rebooting. The restart will generally fail because the installer won't be able to destroy rpool, as there is a dump device already configured there.
There is a workaround though.
Edit the /etc/dumpadm.conf file and comment out the line containing DUMPADM_DEVICE, then run dumpadm -u.
This will unconfigure the dump device entirely. Then just run zpool destroy rpool and now you can svcadm clear auto-installer.
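If you do this often, the edit step can be scripted. A minimal sketch that only transforms the text of the file: the helper name comment_out_dump_device and the sample file contents are made up for illustration, and you still need to write the result back to /etc/dumpadm.conf and run dumpadm -u yourself.

```python
def comment_out_dump_device(conf_text):
    """Return conf_text with any active DUMPADM_DEVICE line commented out."""
    out = []
    for line in conf_text.splitlines():
        if line.lstrip().startswith("DUMPADM_DEVICE"):
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical file contents, for illustration only
SAMPLE_CONF = (
    "DUMPADM_DEVICE=/dev/zvol/dsk/rpool/dump\n"
    "DUMPADM_SAVDIR=/var/crash\n"
)

print(comment_out_dump_device(SAMPLE_CONF), end="")
```

Lines that are already commented out are left alone, so running it twice is harmless.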
Monday, June 18, 2018
ZFS Raw Send
This finally got integrated into 11.3 SRU 11.3.33.5.0
zfs send compressed data (Bug 15387669)