Data Protection Issues Using BookingCenter

    What could be causing data corruption problems?
    How can corruption problems be rectified?

    Contents
    Introduction
    Opportunistic Locking
    Screen Savers and Energy Savers
    Write-Behind Caching
    Repairing Datafiles
    Cabling
    General Tips and Hints
    Backup and Data Corruption Case Study

    Introduction
    The BookingCenter datafile technology is generally safe, but it has one essential weak point: there is no server-side checking of the data, as there is with SQL back ends. BookingCenter relies on a functioning network to write data to a file server, and if the network is defective, packets can get lost and possibly corrupt the datafile. The impression that the network is OK is not always correct, especially under high traffic, when corrupted packets can slip through.

    More often than not, the cause of a corrupted datafile is some network issue. The points below, in no specific order, give an overview of what can cause network problems and what might damage a datafile. Disabling write-behind caching seems to have helped in many cases, though.

    Opportunistic Locking
    Opportunistic locking on NT should be turned off:

    WHAT IS OPPORTUNISTIC LOCKING: Opportunistic locking is used by Windows NT to perform read-ahead, write-behind, and lock caching. Basically, if one client is accessing a block range in a file, that range is marked for opportunistic locking and the client can perform read-ahead, write-behind, and lock caching. If another user attempts to write to that block range, the opportunistic locking has to be switched off for the previous client and the data needs to be synchronized with the server before the second user can access the range.
    SITUATION: Users were seeing regular corruption of their database. All had the package installed on a Windows NT Server (3.51 or 4) and were running Windows 95 at the workstations. Corruption would happen several times a day.
    CAUSE: Windows NT Server uses a feature called Opportunistic Locking to speed up file access. This does not work well with a database.
    RESOLUTION: This fix needs careful attention. We recommend that a responsible network person make this change. Any time that you edit a machine's registry information, you risk bigger problems if it is not done correctly.

    Steps to disable opportunistic locking on an NT Server
    1 Open REGEDT32 on the server machine.
    2 Go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters.
    3 From the menu select Edit > New > DWORD Value
    4 Fill in the blanks (Value Name) EnableOplocks (Data Type) REG_DWORD.
    5 Select OK.
    6 A DWORD editor dialogue box will appear. Type in a zero and leave the radix set to Hex.
    7 Select OK. The new value should appear on the right half of the registry viewer.
    8 Exit the registry editor.
    9 Reboot the server. The value will only go into effect after a reboot.
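
    The same registry change can be written down as a .reg file, which makes it easier to review before anyone touches the server. The sketch below only builds the file text; it does not edit any registry. It assumes the server's registry editor can import REGEDIT4-format files, which the steps above do not state, and the function name build_reg_file is purely illustrative.

```python
def build_reg_file(key_path: str, value_name: str, dword: int) -> str:
    """Return REGEDIT4-format text that sets a single REG_DWORD value."""
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    return (
        "REGEDIT4\n"
        "\n"
        f"[{key_path}]\n"
        f'"{value_name}"=dword:{dword:08x}\n'
    )


# Key path and value name are taken from the steps above.
OPLOCKS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
    r"\Services\LanmanServer\Parameters"
)

# 0 disables opportunistic locking; a reboot is still required afterwards.
reg_text = build_reg_file(OPLOCKS_KEY, "EnableOplocks", 0)
print(reg_text)
```

    Have the responsible network person check the printed text against the steps above before it is saved and imported.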

    Screen Savers and Energy Savers
    Screen savers and energy savers on Windows machines should be turned off, especially on the server. They are supposed to close all connections cleanly once the computer has been idle for some time and reconnect when the computer detects activity, but more often than not they do not work correctly. BookingCenter is very network sensitive: if packets are being lost, it has no control over what data actually reaches the datafile. So if the reconnection to the datafile is not clean, data can be lost and the datafile corrupted.

    Write-Behind Caching
    Turn off write-behind caching on the Win95/98 machines. This type of caching holds information that needs to be written to the hard disk and writes it when the system is idle or after a certain amount of time has elapsed. It is a built-in feature of Windows 95/98 and is provided by the SmartDrv utility under the various versions of Windows 3.x.
    Disabling Write-Behind Caching:

    Using Windows 95:

    1 Right-click on My Computer and select the Properties menu item.
    2 Click on the Performance tab in the System Properties window that appears.
    3 Click on the File system... button at the bottom left hand corner of the window.
    4 Click on the Troubleshooting tab in the File System Properties window that appears.
    5 Place a tick in the "Disable write-behind caching for all drives" check box.
    6 Click OK in the File System Properties window.
    7 Click OK in the System Properties window.
    8 Reboot Windows 95.

    This will reduce the performance of your machine slightly, as writes to disk change from write-behind to write-through caching. The speed hit depends on the performance of your hard drives and their interfaces. If you are reorganising a very large file and want every bit of performance from your system, it is worth clearing this check box (which re-enables write-behind caching) and rebooting before the reorganisation. Tick it again and reboot once the reorganisation has finished.
    The main concern is that popular system optimisation software (e.g. First Aid) tells the user that this setting is bad and tries to turn it off again, re-enabling write-behind caching. So even if you have done the right thing, the user (or a technician trying to improve system performance) may unwittingly undo all your good work.

    Using Windows for Workgroups 3.11 (WFW 3.11):
    N.B. You must upgrade to WFW 3.11 if you are using an earlier version.

    If you are using 32-bit file access on all or some drives:
    There may be one or more lines in the [vcache] section of your system.ini file beginning with ForceLazyOn= or ForceLazyOff=.

    1 If there is a line beginning with ForceLazyOn=, delete the entire line.
    2 If there is a line beginning with ForceLazyOff=, ensure that all the active drives in your system are included in the letters following ForceLazyOff=, e.g. if your system has two drives, C: & D:, make the line read as follows: ForceLazyOff=CD
    3 If there is no line beginning with ForceLazyOff=, add the following line in the [vcache] section: ForceLazyOff=CDEF

    Again, in this instance the letters CDEF refer to the four drives C:, D:, E: & F: and should be changed as required to suit your system. You should also include network drives in deciding what letters to add to the line.

    The [vcache] section of the system.ini file should look something like this when you have finished:

    [vcache]
    MinFileCache=512
    ForceLazyOff=CDEF
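
    The three rules above can be sketched as a small text transformation. This is only an illustration that works on the text of system.ini: the function name fix_vcache is made up, and creating a [vcache] section when none exists is an assumption that goes beyond the rules as written. Keep a copy of the original file before applying anything like this.

```python
def fix_vcache(ini_text: str, drives: str) -> str:
    """Apply the ForceLazyOn/ForceLazyOff rules to the text of system.ini.

    `drives` lists every active drive letter, e.g. "CD" for C: and D:.
    """
    setting = f"ForceLazyOff={drives.upper()}"
    out = []
    in_vcache = False
    found_section = False
    have_setting = False

    def add_setting_if_missing():
        # Rule 3: add the line, placed before blank lines ending the section.
        nonlocal have_setting
        if have_setting:
            return
        pos = len(out)
        while pos > 0 and out[pos - 1].strip() == "":
            pos -= 1
        out.insert(pos, setting)
        have_setting = True

    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            if in_vcache:
                add_setting_if_missing()
            in_vcache = stripped.lower() == "[vcache]"
            found_section = found_section or in_vcache
            out.append(line)
            continue
        if in_vcache:
            key = stripped.split("=", 1)[0].strip().lower()
            if key == "forcelazyon":
                continue  # Rule 1: delete the entire line.
            if key == "forcelazyoff":
                if not have_setting:
                    out.append(setting)  # Rule 2: list every active drive.
                    have_setting = True
                continue
        out.append(line)

    if in_vcache:
        add_setting_if_missing()
    elif not found_section:
        # Assumption: create the section if the file has none.
        out.extend(["[vcache]", setting])
    return "\n".join(out) + "\n"
```

    For a system with drives C: to F:, the call would be fix_vcache(text, "CDEF"), where text is the current content of system.ini. Include network drive letters in the second argument as described above.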

    If you are using 16-bit file access on all or some drives:
    There should be a line in your autoexec.bat file that looks something like this:
    c:\dos\smartdrv.exe

    Add the switch /x to this line so that it reads:
    c:\dos\smartdrv.exe /x

    Using Windows 3.1:
    There should be a line in your autoexec.bat file that looks something like this:
    c:\dos\smartdrv.exe

    Add the switch /x to this line so that it reads:
    c:\dos\smartdrv.exe /x
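
    When many workstations need the same autoexec.bat change, the edit can be sketched as a text transformation. The function name add_smartdrv_x is made up for this illustration. It assumes the switch may simply be appended to the end of the smartdrv line, and it leaves REM-commented lines and lines that already carry /x untouched.

```python
def add_smartdrv_x(autoexec_text: str) -> str:
    """Append /x to every active smartdrv line that does not have it yet."""
    out = []
    for line in autoexec_text.splitlines():
        lowered = line.lower()
        is_comment = lowered.lstrip().startswith(("rem ", "::"))
        has_x = "/x" in lowered.split()
        if "smartdrv" in lowered and not is_comment and not has_x:
            line = line.rstrip() + " /x"
        out.append(line)
    return "\n".join(out) + "\n"
```

    Running the function twice gives the same result as running it once, so it is safe to re-apply after other software has rewritten the file.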

    Repairing Datafiles
    The best way to check or repair a datafile is to run a Full check. The procedure is:

    1 Run the Utilities (BookingCenter | Parameters | Utilities, using password: precious) with the 'Check data file structure', 'Check records', 'Check indexes' and 'Repair data' options all selected. Completely ignore any messages reported in the log.
    2 Repeat step 1 a second time.
    3 (Optional) Clear the check data log and repeat step 1, but without the 'Repair data' option. Any messages that now appear in the log will probably denote irreparable damage.

    A Full Check should fix the great majority of problems. If it doesn't, the only solution is to export and re-import the data. Datafiles often pick up small amounts of damage with regular use, and this generally causes no long-term problems (just as Norton Utilities nearly always seems to find something wrong with a hard disk). So even if a datafile seems to be working fine, it is sensible to perform the Full Check routine described above every month or two. This could usefully be carried out after the network hardware check recommended elsewhere in the document.

    Plan on performing a Full Check every couple of months.

    If a datafile becomes damaged on a regular basis, always check the network for hardware problems before attempting to repair the datafile.

    Plan ahead and assume that problems will happen from time to time. Make sure there is a reliable backup system and plans in place to perform periodic checks and deal with emergencies - it takes a long while to perform a Full Check on a large datafile and even longer to export and re-import data.

    Cabling
    Defective network cable or connectors can be a problem, especially in an Ethernet network. Twisted pair tends to be a lot safer. Even old network cables can be a cause.

    Poor-quality cables and connectors can cause "cross talk". Improper cable radiuses and cable runs too close to electrical lines can cause "reflection". Incorrectly installed or corrupted drivers can cause missed and corrupted packets at the software level. A malfunctioning hard drive can write bad data and/or lose data in selected sectors. The list goes on.

    A 4K Cable tester is an investment worth making. Many sites that are inspected with this tester do not meet category 5 cable guidelines.

    Problems can mysteriously clear up after cabling is upgraded from Cat 3 to Cat 5. Some low-end network cards may not do checksumming very well, in which case a corrupted packet can get through. Even a Category 5 cable with a desk leg resting on it has caused problems: it made one computer run slowly, and that computer then corrupted the data.

    It may be necessary to check for bad cables, cards and hardware by doing a 'ping-a-thon' every once in a while to every piece of network hardware.
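
    A 'ping-a-thon' can be sketched in a few lines. The version below is a minimal illustration, not a replacement for a proper cable tester: it shells out to the operating system's ping command (using -n on Windows and -c elsewhere), counts lost replies per host, and treats a non-zero exit code or a timeout as a loss. The function names and the addresses in the usage comment are made up.

```python
import platform
import subprocess


def ping_once(host: str, timeout_s: float = 3.0) -> bool:
    """Send one echo request with the system ping command."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def ping_a_thon(hosts, rounds=10, ping=ping_once):
    """Ping every host `rounds` times and return {host: lost_count}."""
    lost = {host: 0 for host in hosts}
    for _ in range(rounds):
        for host in hosts:
            if not ping(host):
                lost[host] += 1
    return lost


# Example call (hypothetical addresses):
#   report = ping_a_thon(["192.168.0.1", "192.168.0.20"], rounds=50)
#   Any host with a non-zero count deserves a closer look at its cable and card.
```

    The ping parameter exists so that the loop can be tried out with a stand-in function before it is pointed at real hardware.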

    General Tips and Hints
    Never try to reorganise data if severe data damage is suspected. With current BookingCenter versions this will only make things worse.
    Make especially sure that there is a reliable backup before repairing or reorganising data; otherwise a crash during these operations could be really bad news. If a workstation crashes for any reason whilst BookingCenter is updating the datafile, it can cause corruption and locking problems.

    Make sure the users are educated not to switch off their workstations improperly.

    We always set the NT Server performance setting to "balanced" rather than "maximize for file sharing".

    Backup and Data Corruption Case Study
    This case pertains to an NT site where everything had previously been fixed, but data damage began appearing again despite no apparent changes to the application, the server, or the client network.

      Having spent some considerable time trying to find the problem, and building registry checking into launching of the application, the customer's own IT support person found the problem as follows:

      They are using Backup Exec.

      In the setup there is apparently an option for whether to back up open files. If you choose to back up open files, there is a further option, "with locks".
      Every few weeks (or days), the on-site administration person would forget to swap the backup tapes before going home. Realising the mistake the next morning, they would swap them then. Backup Exec was set up to wait until the correct tape arrived, so the backup started during the working day. When Backup Exec reached the datafile, the file was in use, so it locked portions of the file while backing them up.

      The datafile can get corrupted very quickly, and apparently randomly. The support guy spotted it when he realised that his backup error log corresponded with the log of when damage appeared in the data file.

      So if you use Backup Exec (or any other software with similar settings), don't let it lock portions of your data file. Since finding the cause (we hope), no data damage has appeared.
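
      The comparison that exposed the problem, matching the backup error log against the times damage appeared, can be sketched as follows. This is only an illustration: the function name, the 12-hour window, and the idea that both logs have already been reduced to lists of timestamps are all assumptions, not details from the case.

```python
from datetime import datetime, timedelta


def damage_near_backup_errors(backup_errors, damage_events,
                              window=timedelta(hours=12)):
    """Pair each damage time with the earliest backup error that
    occurred at most `window` before it.

    Both arguments are lists of datetime objects. Damage events with
    no backup error inside the window are left out of the result.
    """
    matches = []
    for damage in sorted(damage_events):
        for err in sorted(backup_errors):
            if timedelta(0) <= damage - err <= window:
                matches.append((damage, err))
                break
    return matches
```

      If most damage events show up in the result, daytime backups are a strong suspect. If few do, the cause probably lies elsewhere, such as the network issues described earlier.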
