My Digital Brain: 2009

Wednesday, December 16, 2009

Solaris 10 and VMWare vmxnet ethernet driver

The problem I had been experiencing was that I could not install the VMware vmxnet ethernet driver on a Solaris 10 x86 server running on a VMware vSphere 4.0 server.

The solution to getting the vmxnet interface to work is as follows:

1) install stock e1000 ethernet interface on a Solaris VM
2) install the VMWare 4.0 tools on a Solaris VM
3) halt and power off VM
4) install a second ethernet interface onto a Solaris VM, using the vmxnet3 driver
5) ifconfig vmxnet3s0 plumb
6) ifconfig e1000g0 unplumb
7) mv /etc/hostname.e1000g0 /etc/hostname.vmxnet3s0
8) reboot

The trick is to have two ethernet interfaces (e1000 and vmxnet3) and just unplumb from Solaris the e1000 interface. Don't remove the e1000 interface from the VMWare host configuration.
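
Steps 5 through 7 can be collected into a small helper. This is only a minimal sketch: the function name nic_swap_commands is hypothetical and not part of any Solaris or VMware tooling, and it just prints the commands so they can be reviewed before being run as root on the VM.

```shell
#!/bin/sh
# Print the commands from steps 5-7 for a given pair of interfaces.
# nic_swap_commands is a hypothetical helper; it only echoes, it runs nothing.
nic_swap_commands() {
    old_if=$1   # interface to retire inside Solaris, e.g. e1000g0
    new_if=$2   # interface to bring up, e.g. vmxnet3s0
    echo "ifconfig $new_if plumb"
    echo "ifconfig $old_if unplumb"
    echo "mv /etc/hostname.$old_if /etc/hostname.$new_if"
}

nic_swap_commands e1000g0 vmxnet3s0
```

Printing rather than executing keeps the sketch safe to try on any machine; the reboot from step 8 is left as a manual step.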

Monday, December 14, 2009

Solaris syslog-ng error

If you get an error like:

# svcs syslog-ng
STATE STIME FMRI
maintenance 8:50:44 svc:/system/syslog-ng:default

# svcs -xv syslog-ng
svc:/system/syslog-ng:default (syslog-ng)
State: maintenance since Mon Dec 14 08:50:44 2009
Reason: Start method failed repeatedly, last exited with status 1.
See: http://sun.com/msg/SMF-8000-KS
See: man -M /usr/share/man -s 1M syslog-ng
See: /var/svc/log/system-syslog-ng:default.log
Impact: This service is not running.

# cat /var/svc/log/system-syslog-ng:default.log
[ Dec 14 08:50:42 Disabled. ]
[ Dec 14 08:50:42 Rereading configuration. ]
[ Dec 14 08:50:43 Enabled. ]
[ Dec 14 08:50:43 Executing start method ("/lib/svc/method/svc-syslog-ng") ]
Error parsing command line arguments: Conversion from character set '646' to 'UTF-8' is not supported[ Dec 14 08:50:44 Method "start" exited with status 1 ]
[ Dec 14 08:50:44 Executing start method ("/lib/svc/method/svc-syslog-ng") ]
Error parsing command line arguments: Conversion from character set '646' to 'UTF-8' is not supported[ Dec 14 08:50:44 Method "start" exited with status 1 ]
[ Dec 14 08:50:44 Executing start method ("/lib/svc/method/svc-syslog-ng") ]
Error parsing command line arguments: Conversion from character set '646' to 'UTF-8' is not supported[ Dec 14 08:50:44 Method "start" exited with status 1 ]
[ Dec 14 08:52:06 Rereading configuration. ]

This has nothing to do with the syslog-ng configuration. The cause is merely a missing charset.alias file in /usr/local/lib.

To correct:

# ln -s /usr/lib/charset.alias /usr/local/lib/charset.alias

# svcadm clear syslog-ng

# svcs syslog-ng
STATE STIME FMRI
online 8:59:38 svc:/system/syslog-ng:default
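
If the fix has to be repeated on several hosts, the link step could be wrapped so that it is safe to run more than once. This is a sketch under assumptions: link_charset_alias is a hypothetical name, and the default paths are simply the ones used in the ln command above.

```shell
#!/bin/sh
# Create the charset.alias symlink only if the target location lacks one.
# link_charset_alias is a hypothetical helper; defaults match the ln command above.
link_charset_alias() {
    src=${1:-/usr/lib/charset.alias}
    dst=${2:-/usr/local/lib/charset.alias}
    if [ ! -f "$src" ]; then
        echo "source $src not found" >&2
        return 1
    fi
    # -h catches an existing (possibly dangling) symlink, -e anything else.
    if [ -e "$dst" ] || [ -h "$dst" ]; then
        echo "$dst already exists, leaving it alone"
        return 0
    fi
    ln -s "$src" "$dst" && echo "linked $dst -> $src"
}
```

A possible use would be `link_charset_alias && svcadm clear syslog-ng`, so the service is only cleared once the link is in place.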

Tuesday, December 8, 2009

Change VMware vSphere IP Address

To change the IP Address of a vSphere 4.0 server:

1) check the service console:

# esxcfg-vswif -l

[will return the current IP information and the name of the Service Console, in most cases: vswif0]

2) run the esxcfg-vswif command with the new information

# esxcfg-vswif -i newipaddress -n newsubnetmask vswif0

3) check the following files:

# cat /etc/sysconfig/network-scripts/ifcfg-vswif0

Make sure the new information is in the file.

# cat /etc/sysconfig/network

4) restart the network -- this will disconnect you unless you are on a console:

# service network restart

You should be done at this point.
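
Because a mistyped address in step 2 can cut off the service console, it may be worth checking the values first. The sketch below is an assumption-laden illustration: is_ipv4 and vswif_command are hypothetical helpers, not ESX tools, and the check only confirms a dotted-quad shape (it does not verify that the netmask is contiguous).

```shell
#!/bin/sh
# Return 0 if the argument looks like a dotted-quad IPv4 address (octets 0-255).
# is_ipv4 is a hypothetical helper, not part of ESX.
is_ipv4() {
    case $1 in
        *[!0-9.]*|''|.*|*.|*..*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into octets
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Print (not run) the esxcfg-vswif command once both values pass the check.
vswif_command() {
    is_ipv4 "$1" && is_ipv4 "$2" || { echo "invalid address or netmask" >&2; return 1; }
    echo "esxcfg-vswif -i $1 -n $2 ${3:-vswif0}"
}
```

For example, `vswif_command 192.168.1.10 255.255.255.0` prints the command for vswif0, while an address such as 192.168.1.300 is rejected.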

Monday, November 30, 2009

Citrix XenApp Server Application Installation Best Practice

Prior to installing an application on a Citrix XenApp server:

Start --> run --> change user /install

[install the applications and all updates]

Start --> run --> change user /execute

That will ensure that all of the registry settings will work under XenApp.

Tuesday, October 27, 2009

Astaro not updating

For Unified Threat Management at some of my sites I like to use the Astaro Security Gateway (ASG). Sometimes the update process gets wedged and new firmware won't download to the box.

There is command-line ssh access that allows a user to get into the system, but it's disabled for security reasons. In order to fix the update problem, ssh must be enabled.

I won't go into how to enable ssh or how to log into the system, but I will detail the fix:

Once logged in as root, check the files in /var/up2date/sys -- if there are files there, remove them and then upload the new firmware via scp to /var/up2date/sys.

After the files have been uploaded, type the following commands:

audld.plx

auisys.plx --rpmargs --force

This will verify the digital signature of the update file and if correct, install the firmware update and reboot the ASG.

Problem solved.

Thursday, October 1, 2009

VMware service console memory setting

I had been running VMware ESX Server 3.5 with the default service console memory setting of 272MB. This caused a couple of host migration and snapshot timeouts.

VMware's suggested value is 800MB. To change it, click the Properties... link in the VIC or vSphere Client and change the number.

The setting won't take effect until the system is rebooted, and the Memory Configuration now reflects that.

Tuesday, September 22, 2009

Solaris Live Upgrade

I had never tried Live Upgrade, and my first attempt was a disaster. That's not to say it doesn't work, but my configuration didn't lend itself to a clean Live Upgrade process. I had non-global zones mounted from SAN volumes on /zones, and when I ran Live Upgrade I didn't realize that I needed to size my system for that and add additional SAN volumes to hold copies of the non-global zones as well.

The other major problem was that I didn't have all of the correct patches for Live Upgrade, and I ran into a major ludelete bug that deleted files in /zones and destroyed my non-global zones.

So, now that I understand the issues, here are the commands that I used. The logic is sound, and I've since rebuilt the server and put the non-global zones on the root filesystem so that when I try Live Upgrade again I won't be bitten by these issues.

I added a second SAN volume that's the same size as the server's current boot volume.

1) copy over the Solaris 10 iso and mount it via loopback:

# mkdir /iso; mount -F hsfs -o ro `lofiadm -a /var/tmp/sol-10-u6-ga1-sparc-dvd.iso` /iso

2) format the new volume:

c2t6000D31000089A000000000000000043d0s0 existing root volume
c2t6000D31000089A0000000000000000E9d0s0 new root volume

3) patch the live upgrade distribution

SunSolve document: 206844

4) install the live upgrade packages from the distribution:

# cd /iso/Solaris_10/Tools/Installers
# ./liveupgrade20 -noconsole -nodisplay

5) run the lucreate command to create a copy of the active boot environment

# lucreate -c sol10807 -n sol101008 -m /:c2t6000D31000089A0000000000000000E9d0s0:ufs

[where sol10807 is the name of the original boot disk and sol101008 is the name for the new boot disk]

6) after the new boot disk is created, the upgrade is run against the new boot disk:

# luupgrade -u -n sol101008 -s /iso

7) after the upgrade is done, activate the new boot environment:

# luactivate sol101008

8) init 6

9) test the new boot environment and patch it

10) after testing remove the initial boot environment with the ludelete command, but save the original SAN volume for future upgrades.

In the future I could run patches using Live Upgrade.
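
The core command sequence above can be written down as a reviewable plan before anything is run. This is a minimal sketch, not a tested upgrade script: lu_plan is a hypothetical helper that only echoes the commands, and the arguments in the example are the names used in this post.

```shell
#!/bin/sh
# Print the Live Upgrade command sequence for review before running it.
# lu_plan is a hypothetical helper; it echoes commands and changes nothing.
lu_plan() {
    cur_be=$1     # name to give the current boot environment
    new_be=$2     # name of the new boot environment
    new_slice=$3  # slice that will hold the copy of /
    media=$4      # mount point of the Solaris install image
    echo "lucreate -c $cur_be -n $new_be -m /:$new_slice:ufs"
    echo "luupgrade -u -n $new_be -s $media"
    echo "luactivate $new_be"
    echo "init 6"
}

lu_plan sol10807 sol101008 c2t6000D31000089A0000000000000000E9d0s0 /iso
```

Keeping the plan as plain text makes it easy to check the boot environment names and target slice by eye, which matters given how destructive a wrong slice would be.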

Check out this SUN article.

Tuesday, August 18, 2009

Solaris processor sets and resource pools

This is a follow-up to the earlier posting about setting hard CPU limits under Solaris. That posting didn't address processor sets and resource pools; this one does. The basic steps are as follows:

1. build a processor set with a number of CPUs assigned
2. bind the processor set to a resource pool
3. bind the zone to the same resource pool

1. I’ll create three non-global zones. The first is bound to a two-processor set, and the other two zones are bound to a four-processor set. In the global zone, list the available CPUs:

root@x4140> psrinfo
0 on-line since 08/11/2009 13:18:42
1 on-line since 08/11/2009 13:18:48
2 on-line since 08/11/2009 13:18:50
3 on-line since 08/11/2009 13:18:52
4 on-line since 08/11/2009 13:18:54
5 on-line since 08/11/2009 13:18:56
6 on-line since 08/11/2009 13:18:58
7 on-line since 08/11/2009 13:19:00

2. Enable the resource pool facility on the global zone:

root@x4140> pooladm -e

3. Save the configuration:

root@x4140> pooladm -s

4. List the pools:

root@x4140> pooladm
system default
string system.comment
int system.version 1
boolean system.bind-default true
string system.poold.objectives wt-load

pool pool_default
int pool.sys_id 0
boolean pool.active true
boolean pool.default true
int pool.importance 1
string pool.comment
pset pset_default

pset pset_default
int pset.sys_id -1
boolean pset.default true
uint pset.min 1
uint pset.max 65536
string pset.units population
uint pset.load 6
uint pset.size 8
string pset.comment

cpu
int cpu.sys_id 7
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 4
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 1
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 6
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 3
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 0
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 5
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 2
string cpu.comment
string cpu.status on-line

5. Create the processor set that contains two CPUs:

root@x4140> poolcfg -c 'create pset duo-pset (uint pset.min=1; uint pset.max=4)'

6. Create a resource pool for the processor set:

root@x4140> poolcfg -c 'create pool duo-pool'

7. Assign the resource pool to the processor set:

root@x4140> poolcfg -c 'associate pool duo-pool (pset duo-pset)'

8. Enable the configuration:

root@x4140> pooladm -c

9. List the changes:

root@x4140> pooladm
system default
string system.comment
int system.version 1
boolean system.bind-default true
string system.poold.objectives wt-load

pool duo-pool
int pool.sys_id 1
boolean pool.active true
boolean pool.default false
int pool.importance 1
string pool.comment
pset duo-pset

pool pool_default
int pool.sys_id 0
boolean pool.active true
boolean pool.default true
int pool.importance 1
string pool.comment
pset pset_default

pset duo-pset
int pset.sys_id 1
boolean pset.default false
uint pset.min 1
uint pset.max 4
string pset.units population
uint pset.load 0
uint pset.size 2
string pset.comment

cpu
int cpu.sys_id 3
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 4
string cpu.comment
string cpu.status on-line

pset pset_default
int pset.sys_id -1
boolean pset.default true
uint pset.min 1
uint pset.max 65536
string pset.units population
uint pset.load 2
uint pset.size 4
string pset.comment

cpu
int cpu.sys_id 7
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 4
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 6
string cpu.comment
string cpu.status on-line

cpu
int cpu.sys_id 5
string cpu.comment
string cpu.status on-line

10. Make the changes to the zone:

root@x4140> zonecfg -z zone
zonecfg:zone> set pool=duo-pool
zonecfg:zone> verify
zonecfg:zone> commit
zonecfg:zone> exit

11. Create a second processor set with four CPUs and assign it to a new resource pool:

root@x4140> poolcfg -c 'create pset quad-pset (uint pset.min=1; uint pset.max=4)'
root@x4140> poolcfg -c 'create pool quad-pool'
root@x4140> poolcfg -c 'associate pool quad-pool (pset quad-pset)'
root@x4140> pooladm -c

12. Make changes to the second and third zones:

root@x4140> zonecfg -z zone2
zonecfg:zone2> set pool=quad-pool
zonecfg:zone2> verify
zonecfg:zone2> commit
zonecfg:zone2> exit

root@x4140> zonecfg -z zone3
zonecfg:zone3> set pool=quad-pool
zonecfg:zone3> verify
zonecfg:zone3> commit
zonecfg:zone3> exit

13. List the processors available to the zones:

root@x4140> zlogin zone
[Connected to zone 'zone' pts/3]
Last login: Mon Aug 17 16:25:23 on pts/3
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
# psrinfo
3 on-line since 08/17/2009 16:46:44
4 on-line since 08/17/2009 16:46:46

root@x4140-a> zlogin zone2
[Connected to zone 'zone2' pts/3]
Last login: Mon Aug 17 16:25:39 on pts/3
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
# psrinfo
0 on-line since 08/17/2009 16:46:34
1 on-line since 08/17/2009 16:46:40
2 on-line since 08/17/2009 16:46:42

root@x4140-a> zlogin zone3
[Connected to zone 'zone3' pts/4]
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
# psrinfo
0 on-line since 08/17/2009 16:46:34
1 on-line since 08/17/2009 16:46:40
2 on-line since 08/17/2009 16:46:42

14. Then, if we want to, we can enable the Fair Share Scheduler so that zone2 and zone3 share the processors in the quad-pool, while zone has dedicated access to two CPUs and doesn't share them with any other zone.

root@x4140> poolcfg -c 'modify pool quad-pool (string pool.scheduler="FSS")'
root@x4140> pooladm -c
[reboot zones or run this command]
root@x4140> priocntl -s -c FSS -i class TS
root@x4140> priocntl -s -c FSS -i pid 1

Assign three shares to zone2 and one share to zone3:

root@x4140> zonecfg -z zone2
zonecfg:zone2> add rctl
zonecfg:zone2:rctl> set name=zone.cpu-shares
zonecfg:zone2:rctl> add value (priv=privileged,limit=3,action=none)
zonecfg:zone2:rctl> end
zonecfg:zone2> exit

root@x4140> zonecfg -z zone3
zonecfg:zone3> add rctl
zonecfg:zone3:rctl> set name=zone.cpu-shares
zonecfg:zone3:rctl> add value (priv=privileged,limit=1,action=none)
zonecfg:zone3:rctl> end
zonecfg:zone3> exit

The zone "zone" now runs on its own dedicated and guaranteed CPUs, protected from the workload of the other zones. zone2 and zone3 now share the quad resource pool, with zone2 getting a higher weighting than zone3, and the global zone has the remaining CPU.
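
Since the pset, pool, and association commands follow the same pattern for every pool, they can be generated from a name and the min/max values. This is a sketch only: pool_commands is a hypothetical helper, the <name>-pset and <name>-pool naming simply mirrors the names used above, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Print the poolcfg/pooladm commands that create a processor set and a pool bound to it.
# pool_commands is a hypothetical helper; it only echoes the commands.
pool_commands() {
    name=$1; min=$2; max=$3
    if [ "$min" -gt "$max" ]; then
        echo "pset.min ($min) must not exceed pset.max ($max)" >&2
        return 1
    fi
    echo "poolcfg -c 'create pset $name-pset (uint pset.min=$min; uint pset.max=$max)'"
    echo "poolcfg -c 'create pool $name-pool'"
    echo "poolcfg -c 'associate pool $name-pool (pset $name-pset)'"
    echo "pooladm -c"
}

pool_commands quad 1 4
```

The min/max sanity check is an addition for illustration; it catches the simplest mistake before the configuration reaches poolcfg.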

Thursday, July 30, 2009

Setting hard cpu limits under Solaris zones using zonecfg

Very easy to do. There are two methods:

1) set hard cpu limits
2) set cpu pools

For ease of use I'd recommend going with option 1 and here are the steps to take:

bash-3.00# zlogin myserver
[Connected to zone 'myserver' pts/1]
Last login: Thu Jul 30 11:40:18 on pts/1
Sun Microsystems Inc. SunOS 5.10 Generic January 2005

# psrinfo
4 on-line since 07/04/2009 09:53:11
5 on-line since 07/04/2009 09:53:11
6 on-line since 07/04/2009 09:53:11
7 on-line since 07/04/2009 09:53:11
8 on-line since 07/04/2009 09:53:11
9 on-line since 07/04/2009 09:53:11
10 on-line since 07/04/2009 09:53:11
11 on-line since 07/04/2009 09:53:11
12 on-line since 07/04/2009 09:53:11
13 on-line since 07/04/2009 09:53:11
14 on-line since 07/04/2009 09:53:11
15 on-line since 07/04/2009 09:53:11
16 on-line since 07/04/2009 09:53:11
17 on-line since 07/04/2009 09:53:11
18 on-line since 07/04/2009 09:53:11
19 on-line since 07/04/2009 09:53:11
20 on-line since 07/04/2009 09:53:11
21 on-line since 07/04/2009 09:53:11
22 on-line since 07/04/2009 09:53:11
23 on-line since 07/04/2009 09:53:11
# exit

[Connection to zone 'myserver' pts/1 closed]

bash-3.00# zonecfg -z myserver
zonecfg:myserver> add dedicated-cpu
zonecfg:myserver:dedicated-cpu> set ncpus=4
zonecfg:myserver:dedicated-cpu> end
zonecfg:myserver> exit

Reboot the zone and log back in:

bash-3.00# zlogin myserver
[Connected to zone 'myserver' pts/1]
Last login: Thu Jul 30 11:44:59 on pts/1
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
# psrinfo
4 on-line since 07/04/2009 09:53:11
5 on-line since 07/04/2009 09:53:11
6 on-line since 07/04/2009 09:53:11
7 on-line since 07/04/2009 09:53:11
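
The same change can probably be applied without the interactive prompt by feeding zonecfg a command file. The sketch below assumes zonecfg's -f command-file option and uses a hypothetical helper, dedicated_cpu_script, that only writes the subcommands; it also assumes the zone has no dedicated-cpu resource yet.

```shell
#!/bin/sh
# Emit zonecfg subcommands that add a dedicated-cpu resource with the given CPU count.
# dedicated_cpu_script is a hypothetical helper; it writes text and touches no zone.
dedicated_cpu_script() {
    ncpus=$1
    case $ncpus in
        ''|*[!0-9]*) echo "ncpus must be a positive integer" >&2; return 1 ;;
    esac
    [ "$ncpus" -gt 0 ] || { echo "ncpus must be a positive integer" >&2; return 1; }
    cat <<EOF
add dedicated-cpu
set ncpus=$ncpus
end
commit
EOF
}

dedicated_cpu_script 4
```

One way to use it would be `dedicated_cpu_script 4 > /tmp/dedicated-cpu.cfg` followed by `zonecfg -z myserver -f /tmp/dedicated-cpu.cfg` (the file path is just an example), then rebooting the zone as described above.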

Wednesday, June 10, 2009

Fibre Channel card initialization

I had a new SUN M4000 server with two identical QLOGIC FC cards installed. The first FC interface came up without any problem and I kind of forgot about the second one.

Well, in order to move the server into production I needed to multipath the two cards together using the stmsboot -e command, but the second card wasn't responding. I checked the physical connections between the FC switch and the server, but that wasn't it. I tried to poke the system using the cfgadm command -- no luck...

# cfgadm -c configure c2
# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
[...]
c1 fc-fabric connected configured unknown
c2 fc connected unconfigured unknown

# fcinfo hba-port
HBA Port WWN: xxxxxxxx294
OS Device Name: /dev/cfg/c1
Manufacturer: QLogic Corp.
Model: 375-3355-02
Firmware Version: 4.04.01
FCode/BIOS Version: BIOS: 1.24; fcode: 1.24; EFI: 1.8;
Serial Number: 0402G00-0905674110
Driver Name: qlc
Driver Version: 20080617-2.29
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb 4Gb
Current Speed: 4Gb
Node WWN: xxxxxxxx294
HBA Port WWN: xxxxxxxx594
OS Device Name: /dev/cfg/c2
Manufacturer: QLogic Corp.
Model: 375-3355-02
Firmware Version: 4.04.01
FCode/BIOS Version: BIOS: 1.24; fcode: 1.24; EFI: 1.8;
Serial Number: xxx-xxx
Driver Name: qlc
Driver Version: 20080617-2.29
Type: unknown
State: offline
Supported Speeds: 1Gb 2Gb 4Gb
Current Speed: not established
Node WWN: xxxxxxxx594

# luxadm -e port
/devices/pci@3,700000/SUNW,qlc@0/fp@0,0:devctl CONNECTED
/devices/pci@2,600000/SUNW,qlc@0/fp@0,0:devctl NOT CONNECTED

No commands would bring the card to life so the FC switch wouldn't see anything attached to the port.

After a lot of troubleshooting I took it to basics and halted the system and reset the bus:

{1} ok
{1} ok reset-all
{1} ok probe-scsi-all

Now the second card was responding, so I rebooted the server:

{1} ok boot -rv

and now it's fine...

# cfgadm -al
[...]
c1 fc-fabric connected configured unknown
[bunch of disks]
c2 fc-fabric connected configured unknown
[bunch of disks]
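
To avoid forgetting about a dead port again, the luxadm output can be filtered for ports that are not connected. This is a small sketch: disconnected_ports is a hypothetical helper name, and it assumes the output format shown above, with the device path in the first column.

```shell
#!/bin/sh
# List FC ports that `luxadm -e port` reports as NOT CONNECTED.
# Reads luxadm output on stdin; disconnected_ports is a hypothetical helper.
disconnected_ports() {
    awk '/NOT CONNECTED/ { print $1 }'
}

# Sample input copied from the output shown earlier in this post.
disconnected_ports <<'EOF'
/devices/pci@3,700000/SUNW,qlc@0/fp@0,0:devctl CONNECTED
/devices/pci@2,600000/SUNW,qlc@0/fp@0,0:devctl NOT CONNECTED
EOF
```

On a live system the input would come from `luxadm -e port | disconnected_ports`; an empty result means every port is connected.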

Solaris 10 routes

If you have a Solaris 10 server with several NICs and a different subnet plugged into each NIC, you might set up zones to use a specific NIC. For the non-global zones to route correctly, you need to set up routing in the global zone.

I've found that putting a startup script in /etc/rc3.d/S99add-route doesn't always work.

By the way, I use these commands in the script:

#!/bin/sh
/usr/sbin/ifconfig nxge0 plumb
/usr/sbin/ifconfig nxge1 plumb
/usr/sbin/route add default 10.10.10.1 -ifp nxge0
/usr/sbin/route add default 10.10.20.1 -ifp nxge1

It seems that SUN has added a "-p" flag to the route command that now makes the statement persistent across reboots, rendering the script moot:

# /usr/sbin/route -p add default 10.10.10.1 -ifp nxge0
add net default: gateway 10.10.10.1: entry exists
add persistent net default: gateway 10.10.10.1

# /usr/sbin/route -p add default 10.10.20.1 -ifp nxge1
add net default: gateway 10.10.20.1: entry exists
add persistent net default: gateway 10.10.20.1

The entries are written to the file: /etc/inet/static_routes

# File generated by route(1M) - do not edit.
default 10.10.10.1 -ifp nxge0
default 10.10.20.1 -ifp nxge1
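
To confirm which default gateways were made persistent, the file can be summarized per interface. This is a sketch that assumes the exact line layout shown above (default, gateway, -ifp, interface); list_persistent_defaults is a hypothetical helper name.

```shell
#!/bin/sh
# Print "interface gateway" for each persistent default route read from stdin.
# Assumes lines of the form: default <gateway> -ifp <interface>
list_persistent_defaults() {
    awk '$1 == "default" && $3 == "-ifp" { print $4, $2 }'
}

# Sample input copied from the file contents shown above.
list_persistent_defaults <<'EOF'
# File generated by route(1M) - do not edit.
default 10.10.10.1 -ifp nxge0
default 10.10.20.1 -ifp nxge1
EOF
```

On the server itself this would be run as `list_persistent_defaults < /etc/inet/static_routes`. The file is only read, in keeping with its "do not edit" header.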

Monday, March 23, 2009

Enable syslog for VMWare ESX3

# vi /etc/syslog.conf

add *.* @IPaddress_of_syslog_server

# esxcfg-firewall -o 514,udp,out,syslog
# service syslog restart
Shutting down kernel logger: [ OK ]
Shutting down system logger: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]

Your syslog server should now be receiving the ESX3 server's syslog information.
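
When the same change is rolled out to several ESX hosts, the edit to syslog.conf can be made repeatable so the forwarding line is never added twice. A minimal sketch, with add_syslog_forward as a hypothetical helper name; the firewall and service restart commands above still have to be run afterwards.

```shell
#!/bin/sh
# Append a "*.* @server" forwarding line to a syslog.conf file unless it is already there.
# add_syslog_forward is a hypothetical helper, not an ESX command.
add_syslog_forward() {
    conf=$1     # path to syslog.conf
    server=$2   # IP address or name of the syslog server
    line="*.* @$server"
    # -F: literal match, -x: whole line, -q: quiet
    if grep -F -x -q -- "$line" "$conf" 2>/dev/null; then
        echo "already present"
    else
        printf '%s\n' "$line" >> "$conf" && echo "added"
    fi
}
```

For example, `add_syslog_forward /etc/syslog.conf 10.0.0.5` (the address is a placeholder) prints "added" the first time and "already present" on later runs.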

Monday, February 23, 2009

ps on Solaris -- zone tip

This is a mini-tip. I was running a ps command to see if snmp was running on a particular host and got back way too much information:


#  ps -ef |grep snmp
   root   938     1   1   Jan 17 ?        2906:56 /usr/local/sbin/snmpd -r
   root 13536     1   0   Jan 25 ?          59:46 /usr/local/sbin/snmpd -r
   root  2545     1   0   Jan 17 ?          19:32 /usr/local/sbin/snmpd -r
   root  6050 26619   0 08:48:36 pts/2       0:00 grep snmp

At a glance I wanted to know what host the command was running on and it wasn't obvious, so I looked at the man page for the ps command and sure enough there was a new option that I didn't know about -- the "Z" option, which tells you which zone it's running on.

Very nice job SUN! Here's the new command:


# ps -efZ |grep snmp
  global    root   938     1   1   Jan 17 ?        2907:03 /usr/local/sbin/snmpd -r
   host1    root 13536     1   0   Jan 25 ?          59:46 /usr/local/sbin/snmpd -r
   host2    root  2545     1   0   Jan 17 ?          19:32 /usr/local/sbin/snmpd -r
  global    root  6189 26619   0 08:50:32 pts/2       0:00 grep snmp
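
With many zones, the same output can be reduced to a count per zone. This is a sketch assuming the ps -efZ layout shown above, where the zone name is the first column; snmpd_per_zone is a hypothetical helper name. Matching on "snmpd" rather than "snmp" also drops the grep line.

```shell
#!/bin/sh
# Count snmpd processes per zone from `ps -efZ` output read on stdin.
# snmpd_per_zone is a hypothetical helper; the zone name is taken from column 1.
snmpd_per_zone() {
    awk '/snmpd/ { count[$1]++ } END { for (z in count) print z, count[z] }' | sort
}

# Sample input copied from the ps -efZ output shown above.
snmpd_per_zone <<'EOF'
  global    root   938     1   1   Jan 17 ?        2907:03 /usr/local/sbin/snmpd -r
   host1    root 13536     1   0   Jan 25 ?          59:46 /usr/local/sbin/snmpd -r
   host2    root  2545     1   0   Jan 17 ?          19:32 /usr/local/sbin/snmpd -r
  global    root  6189 26619   0 08:50:32 pts/2       0:00 grep snmp
EOF
```

On a live system the input would be `ps -efZ | snmpd_per_zone`.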

Friday, February 20, 2009

Upgrading Solaris on hosts without a DVD or CDROM drive

http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh96?a=view

This is the easiest way to upgrade hosts that don't have a local CD-ROM or DVD drive.

One thing to remember is that the two hosts need to be on a subnet that has no other tftpboot servers...

Thursday, January 15, 2009

Solaris SMF import failure

Ever get the following message when trying to import an SMF manifest?

# svccfg import /var/svc/manifest/system/syslog-ng.xml
svccfg: Temporary service "TEMP/system/syslog-ng" must be deleted before this manifest can be imported.
svccfg: Import of /var/svc/manifest/system/syslog-ng.xml failed.  Progress:
svccfg:   Service "system/syslog-ng": not reached.
svccfg:     Instance "default": not reached.

The answer is to delete it using:

# svccfg delete -f TEMP/system/syslog-ng

Then you can re-import it and enable it:

# svccfg import /var/svc/manifest/system/syslog-ng.xml
# svcadm enable syslog-ng
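
The name of the leftover temporary service can be pulled straight out of the error text instead of being retyped. A small sketch, assuming the message wording shown above; temp_service_name is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the TEMP/... service name from svccfg's import error message (read on stdin).
# temp_service_name is a hypothetical helper; it assumes the message wording shown above.
temp_service_name() {
    sed -n 's/.*Temporary service "\(TEMP\/[^"]*\)" must be deleted.*/\1/p'
}

# Sample input copied from the error shown above.
temp_service_name <<'EOF'
svccfg: Temporary service "TEMP/system/syslog-ng" must be deleted before this manifest can be imported.
svccfg: Import of /var/svc/manifest/system/syslog-ng.xml failed.  Progress:
EOF
```

The printed name is what gets passed to `svccfg delete -f`; lines that don't match the pattern produce no output.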
