This page describes an older version of the product. The latest stable version is 14.0.

Troubleshooting


OS-level problems

Initialization errors have been observed with certain devices, notably the Fusion-io ioDrive2. They are caused by the device's sector configuration: the device must be low-level formatted with a 512-byte sector size. This can be done by entering the BIOS at start time and reformatting the device.
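Before reformatting, it can help to check which sector size the kernel currently reports for the device. The following is a minimal sketch, not part of the product: it reads the standard Linux sysfs attribute `queue/logical_block_size`. The function names and the optional sysfs-root argument are hypothetical, and whether the logical size is the value that matters for a given device is an assumption to verify against the vendor's tools.

```shell
# Print the logical sector size (in bytes) that the Linux kernel reports
# for a block device, for example "512" or "4096".
# $1: device name without the /dev/ prefix (for example "sdb")
# $2: optional sysfs root, defaults to /sys (hypothetical, eases testing)
logical_sector_size() {
    cat "${2:-/sys}/block/$1/queue/logical_block_size"
}

# Succeed only when the device reports a 512-byte logical sector.
has_512_byte_sectors() {
    [ "$(logical_sector_size "$1" "$2")" = "512" ]
}
```

For example, `has_512_byte_sectors sdb || echo "sdb is not using 512-byte sectors"`. On most Linux systems `blockdev --getss /dev/sdb` reports the same value.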

ZetaScale issues

A missing library is usually reported by the library that links against it. In this case that is one of the libfdf_jni.so files, or another *.so file.

The command below lists each library's dependencies and shows where each one was found:

ldd lib*.so
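When the output is long, it can be narrowed to the unresolved entries only. The helper below is a small sketch (the function name is hypothetical). It relies on the way `ldd` marks an unresolved dependency with `=> not found`.

```shell
# Read `ldd` output on stdin and print the name of every dependency
# that the dynamic linker could not resolve, one per line, de-duplicated.
missing_dependencies() {
    awk '/=> not found/ { print $1 }' | sort -u
}
```

Typical usage is `ldd lib*.so | missing_dependencies`. Empty output means every dependency was resolved.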

If a dependency cannot be found, check whether it exists anywhere on the system by using the following command:

find / -name lib<>.so

If it is present somewhere on the system, it is often enough to simply create a symlink to the location where the system expects to find it. Recently, libevent has been causing problems of this nature. The solution is as follows:

Run the following command:

find /usr -name 'libevent*'

Then, as root, create a symlink (the version suffix may differ, for example libevent.so.54):

ln -s /usr/lib-x86_64/libevent.so.53 /usr/lib-x86_64/libevent.so

or

ln -s /usr/lib/libevent.so.53 /usr/lib/libevent.so

or whichever path was reported by find.
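The same fix can be sketched as a small helper that links whichever versioned copy is present, so the exact suffix does not have to be typed by hand. This is an illustration only: the function name is hypothetical, and it picks the first match in glob order, which may not be the version you want if several are installed.

```shell
# Create <dir>/<name>.so as a symlink to the first versioned copy
# (<name>.so.*) found in the same directory. Does nothing if the
# unversioned name already exists. Run as root for system directories.
# $1: directory to search (for example a path reported by find)
# $2: library base name without suffix (for example "libevent")
link_unversioned_lib() {
    [ -e "$1/$2.so" ] && return 0
    for candidate in "$1/$2".so.*; do
        if [ -e "$candidate" ]; then
            ln -s "$candidate" "$1/$2.so"
            return $?
        fi
    done
    echo "no $2.so.* found in $1" >&2
    return 1
}
```

For example, `link_unversioned_lib /usr/lib libevent`, using the directory that find reported.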

Blobstore

Statistics error

When receiving the following error:

info stats.c:918 zs_start_admin_thread Starting ZS admin on TCP Port:51350
error stats.c:859 ZSAdminThread Unable to bind admin port 51350

This indicates that two spaces are trying to use the same default ZS admin port, 51350. To overcome this, set the FDF_ADMIN_PORT property so that each instance uses a different port:

<blob-store:properties>
    <props>
        <prop key="FDF_ADMIN_PORT">5135${clusterInfo.runningNumber}</prop>
    </props>
</blob-store:properties>
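The property value is a plain string substitution: the instance's running number is appended to the prefix 5135. The sketch below mimics that substitution to show which ports result. The function names are hypothetical, and the range of values that `clusterInfo.runningNumber` takes in your deployment is an assumption to verify.

```shell
# Mimic the substitution in FDF_ADMIN_PORT: append the running number
# to the fixed prefix "5135".
admin_port_for() {
    printf '5135%s\n' "$1"
}

# Succeed when the argument is a usable TCP port number (1-65535).
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}
```

With this pattern, running numbers 0 through 9 map to ports 51350 through 51359. A running number of 10 would produce 513510, which is not a valid port, so a different pattern is needed if the running number can reach two digits.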

Initialization error

When you receive the following error:

Aug 12 10:24:00 2014 3cfcb700 fatal mcd_rec.c:638 read_label Invalid signature '' read from fd 0

This indicates that the device has not been properly initialized. This step will probably be unnecessary in the future, but for now you should add the following section to the blob-store:sandisk-blob-store declaration:

<blob-store:properties>
    <props>
        <prop key="FDF_REFORMAT">1</prop>
    </props>
</blob-store:properties>
Important

Setting this value also deletes the existing blobstore contents. Remember to comment it out for subsequent deployments.

Flash device malfunction

When a single flash device is not responding or has a hardware malfunction, and you wish to replace it while your application is running, perform the following steps:

  • On the machine hosting the space instance attached to the malfunctioning device, open /tmp/blobstore/devices/device-per-space.properties (the default location) and delete the entry for the space attached to that device.
  • Restart the GSC that hosts that space.

For example, if /dev/sdc has a hardware malfunction, replace the flash device, delete the second row (the one containing /dev/sdc), and restart the GSC that contains mySpace_container1_1-mySpace.

mySpace_container1-mySpace=/dev/sdb@Consistent^Sat Jul 18 10\:47\:48 GMT+02\:00 2015
mySpace_container1_1-mySpace=/dev/sdc@Consistent^Sat Jul 18 10\:47\:56 GMT+02\:00 2015
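Removing the row can also be done with a one-line filter. The sketch below is illustrative (the function name is hypothetical) and assumes every entry has the `space=device@state` layout shown in the example above. It prints the filtered contents rather than editing the file in place, so the result can be reviewed first.

```shell
# Print the contents of a device-per-space properties file without the
# entries that map a space to the given device. The original file is
# left untouched so the result can be reviewed before replacing it.
# $1: device path (for example /dev/sdc)
# $2: properties file
drop_device_entries() {
    grep -v -F -e "=$1@" "$2"
}
```

For example, redirect the output of `drop_device_entries /dev/sdc /tmp/blobstore/devices/device-per-space.properties` to a temporary file, check it, and then move it over the original before restarting the GSC. Note that `grep -v` returns a non-zero status when no lines remain.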

Last Primary

When you see this exception in the backup space log:

    Space recovery failure.; Caused by: com.gigaspaces.internal.server.space.recovery.direct_persistency.DirectPersistencyRecoveryException:

This indicates that the last primary space, whose name is written to a shared file, is not available to the backup space. The backup space will not take over the primary role, because only the last primary holds the most up-to-date data. Therefore, this partition will not be available. To resolve this, find out why the primary space is unavailable; possible causes include a network disconnection or a storage error.