Q: What are the differences between database synchronization and database replication?

A: Database synchronization is closely related to database replication. In fact, sometimes people use the terms interchangeably. However, there are big differences, understanding which will help us understand the different approaches used for solving replication and synchronization problems.

Replication is mostly used in situations where identical replicas of the complete data set are maintained on two or more database instances for high availability and performance reasons. Database instances can often work independently as backups for each other. The relationships between the instances are equal, symmetric. Normally the number of DB instances is small.

On the other hand, in a database synchronization system, typically you have a big central database on the server side and a large number of small databases each residing on a device. The central database contains data for all the devices while each device's local database only contains the device's private data and some shared data.

In the non-database world, Synchronization is also often used to describe the data exchange between a more temporal sub-set of data and a more persistent full-set of data. For instance, parts of a file could be buffered in-memory by an operating system and are "synchronized" with the file on hard disk. Another example is the synchronization of the data in a CPU cache memory with the data in the main memory.

Q: What are the different approaches used for database synchronization and replication?

A: In database replication schemes, physical transactions on each node are recorded and played back on all the other nodes. This technique would only work if each node has a replica of the full-set data.

There is also a stability issue with physical transaction based replications when the number of nodes goes up. Transactions on different replicas could conflict with each other. To handle this, normally cross system locking or complicated conflict resolution schemes are needed. In fact, they are used in eager replication and lazy replication respectively.

Eager replication synchronously updates all replicas as part of one atomic transaction. This is also called synchronous replication or pessimistic replication as it incurs global locking and waiting.

In contrast to eager replication, lazy replication allows updates of any replicas without locking others. The local transactions then propagate to other replicas in background. This scheme is also called asynchronous replication or optimistic replication since it is based on the assumption that conflicts will occur rarely. Here each host must apply transactions generated by the rest of the hosts. Conflicts must be detected and resolved. Since the transactions are final at their origin nodes, usually manual or complicated ad hoc conflict resolutions are required at the destination nodes.

Researchers have shown that the traditional transactional replication has unstable behaviors as the workload scales up: a ten-fold increase in node and traffic gives a thousand fold increase in deadlocks or reconciliations/conflict resolutions. A system that performs well on a few nodes may become unstable as the system scales up to even a dozen of nodes. To alleviate this problem, master-slave schemes are created where writes are only allowed on the master server and the slave servers are made read-only.

The traditional database replication schemes are clearly not suited for database synchronization, which involves hundreds or even thousands of nodes in one system. The traditional replication systems are symmetric and each node contains a full-set data, while a synchronization system is asymmetric and the client nodes contain sub-sets of the data on central server. The asymmetry, together with the instability introduced with large number of nodes calls for a different approach than propagating physical transactions to all nodes, as used in replication.

The key to a successful synchronization scheme is server-centric. Replication schemes, except for the special purpose master-slave scheme, treat all the nodes equally, symmetrically. Transactions on each node are propagated to all other nodes and conflict detection and resolution are done on the destination nodes. In contrast, database synchronization employs an asymmetric client-server paradigm. In the server centric synchronization system, the server database is the single source of truth. Client nodes only talk to the server, not each other. Conflict detection and resolution only happen on one node, the server. This ensures the system stability.

Physical transactions applied on the central database are final. In contrast, device database serves as a cache of a sub-set of central database data. Physical transactions applied on the device database are tentative until they are checked in to the central database. At check in time, the changes committed by all the local physical transactions form a single logical transaction, which is applied directly to only the central database, not directly to other devices. The checking in of logical transactions in the sync system is just like the committing of user physical transactions in a traditional client-server DB system.

The checkin of client transactions is pretty straightforward except that you cannot simply record the SQL statements on client and replay them on server. Remember the client has a sub-set of server data and the same SQL statement could do different things on server than on client. Instead, you keep track of changes to records/rows and apply the changes to server DB.

On the other hand, it is trickier to refresh client DB with server transactions. Again, you cannot simply record the SQL statements and should instead track the changes to records/rows. In addition, you need to handle logical inserts and deletes. For example, let’s say you have a tasks table on server containing all the tasks. You want each user/client to sync only the tasks assigned to him or her. In case you re-assign the task from user A to user B, user A should get a logical delete and user B should get a logical insert. It is called a “logical” insert/delete since the task wasn’t physically inserted or deleted on server.

Fortunately a good synchronization system should be able to handle all the intricacies like change tracking, logical inserts/deletes, conflict detection and resolution.

Q: Does database synchronization implement some sort of Two-Phase Commit (2PC)?

We are evaluating Pervasync, and wondered about the internals of the synchronization process: does it implement some sort of 2-phase-commit?

For example:

- Say I Updated five records on the client, and I now ask to synchronize them into the main Oracle server.

- Now, 4 'updates' are successfully transmitted over the network, but just as the last 'update' is transmitted, the network cable is unplugged. The client can't know the status of the last update (was it updated on the server or not). What should it do when network is reconnected? Should it send it again (risking duplicate 'updates'), or drop it (risking a complete loss of the 'update')?

Q: Does Pervasync guarantee that my 'update' will be received exactly once? For example, does it use some form of 2-phase-commit?

A: There are two types of replications, synchronous replication and asynchronous replication. If you have to think of mobile database synchronization as a kind a replication, it belongs to asynchronous replication.

The Two-Phase Commit (2PC) is used in synchronous replication. The 2PC protocol is a blocking protocol – before the distributed transaction is completed successfully, resources remain locked. If a node or the network goes bad during the 2PC process, the system has to wait for them to recover, blocking other conflicting transactions. Due to this reason, synchronous replication is only used in tightly coupled systems, e.g. database clusters. Mobile sync system is characterized by un-reliable network and not always up local DBs. So synchronous replication is not an option.

Asynchronous replication is non-blocking. Local transactions are committed locally and later propagated to remote nodes. Conflict detection and resolution have to be done on the destination nodes.

In your example, the transaction containing the five updates is first committed on client device. At sync time, the transaction is sent to server. Server will commit or roll back the transaction as a whole. If network goes bad in the middle, server normally won’t know until it is done with the transaction and tries to send a response back to client. It is not possible that only four of the five are committed.

Now, if server doesn't receive all five updates, it will roll back and client will re-send, there is no problem. If server receives all five updates and commits and client fails to get the response, client will still re-send. What will happen? There are two possibilities. First, client and server could keep track of the IDs of the transactions and make sure a same transaction is not applied twice. However, more commonly, as adopted by Pervasync too, client would include new changes with the re-sent transaction and server cannot ignore the transaction. Normally, synchronization is primary key based. So this second case won’t cause any duplicates – sync server would convert inserts into updates.

Q: What's wrong with those time stamp based database synchronization schemes?

Why does Pervasync implement a sync engine instead of using a simple time stamp based database synchronization scheme like a lot of others?

A: We know that in synchronization you cannot simply record the SQL statements on one database and replay them on another database. The reason is that the client database typically only has a sub-set of server database data and the same SQL statement could do different things on server than on client. Instead, you should somehow track the changes to records/rows and apply the changes to the destination DB.

The most popular change tracking method on server DB is based on timestamps. It looks very straightforward. You add a timestamp column to your table and update the timestamp column whenever you change a row. This can be done in your SQL statement or through triggers. By the way, deletes have to be tracked separately, maybe in a companion table. Then at sync time, the sync client would come in with a last sync timestamp and you select all the server changes that have a newer timestamp than the last sync timestamp.

As I said, this is a widely used technique when people have to implement their own sync logic. Also, some sync solutions put the burden of change tracking on app developers and this is the number one recommended technique. However, be aware of its pitfalls.

One obvious pitfall is system time. This is timestamp based so be careful with system time. Don’t adjust system time even it is wrong. Do not sync during time shifts between daylight saving time and standard time.

There is a more serious problem with this technique that could cause change loss. Let me try to explain it. The default isolation level for Mysql with innodb and Oracle is “Read Committed”, which means that others cannot see the changes before a transaction is committed. Let’s say at time T1 you modified a record R1. The timestamp column would have a value of T1. Then before you commit, a sync happened at T2 (T2 > T1). This sync could not pickup the change to record R1 since the transaction was not committed and the change was invisible. At T3 you committed the transaction and at T4 (T4>T3>T2>T1) you do another sync. Guess what, you still won’t get the change to R1! It was committed but it has a timestamp T1 that is older than last sync time T2. The client will forever miss the change no matter how many times you sync.

This problem is not so well known and is very hard to workaround in a production environment.

Fortunately Pervasync has an innovatively designed sync engine that can take care of all the sync issues for you. You don’t need to have any timestamp column to worry about. Just do your normal DB DML operations and the system would track the changes for you and guarantee no change loss.

Q: How to generate globally unique key values in distributed databases?

A: A synchronization system contains distributed databases. How do you ensure that new records created on these databases have unique primary key values? In a single DB system, people usually use AUTO_INCREMENT columns (e.g. in Mysql) or sequence objects (e.g. in Oracle) that produce unique sequence numbers. These won’t work in a multi-DB environment where different DBs might generate same primary key values that would conflict with each other when new records are synced from one DB to another.

Pervasync has a "sync sequence" feature that is specifically designed to solve this issue. You publish a sync sequence defining its start value and window size. Each sync client would get a window of a sequence (a range of numbers) from which it can draw globally unique values for unique keys. When the client is about to run out of the numbers, a new range is synced to the client. All the application needs to do is to call the sequenceNextval Java API to retrieve the numbers locally. In addition to Java API, for Oracle one can use the locally created native sequence objects directly and for Mysql, one can use stored procedures to retrieve the sequence numbers.

We believe that sync sequence is the best choice for most situations. Still, there are other options that may fit your specific needs. We list them below.

  1. Only allow transactions to happen on central DB. Device local DBs are made read-only, i.e. for queries only. Believe it or not, there are systems that adopt this model.
  2. Use randomly generated numbers for key values. The length of the random numbers has to be long to reduce the possibility of collisions.
  3. Use a composite key. The sync client API has a method that returns the sync client ID. Apparently this is a globally unique number. You can use the client ID as the first column of a primary key. You then use a second column that takes locally unique values. Let’s call it LUID column. You define the client ID column and the LUID column together as your composite primary key. The values of the composite key are globally unique.
  4. Compute a GUID using client ID and LUID. For example, you could use this formula: GUID = <client ID> * 1000000000 + LUID
  5. Map local UID with GUID. This is the technique used by OMA DS, AKA SyncML, and Activesync. Locally created records are assigned an LUID. During Activesync synchronization, a GUID for the same record is generated on server and synced back to client. For OMA DS, client sends an LUID back to server for every server sent record. A map table of LUID and GUID is maintained on server.

The mapping methods maybe OK for simple PIM sync. However for enterprise applications that have large amount of data, a lot of tables and complex referential relationships between tables, the mapping would cause performance and maintenance problems. Pervasync does not support this method.

Q: I got errors starting Pervasync web application. What to do?

I'm trying to install the sync server web application but keep getting the following error when trying to start it. This is in catalina.out.

Jan 22, 2009 12:34:26 PM org.apache.catalina.core.StandardContext start

SEVERE: Error listenerStart

Jan 22, 2009 12:34:26 PM org.apache.catalina.core.StandardContext start

SEVERE: Context [/pervasync] startup failed due to previous errors

A: First of all, if you are using Tomcat 5.5, switch to Tomcat 6.0 or newer.

If the web app still fails to start, you can look at the log files under <tomcat_home>/logs. One of the files should contain the detailed error messages. Forward us the error messages if you don't know how to fix it.

Q: I got Oracle JDBC Diagnosability Mbean Error. Any ideas?

I got the following in Tomcat log:

Jan 22, 2009 2:33:44 PM oracle.jdbc.driver.OracleDriver registerMBeans

SEVERE: Error while registering Oracle JDBC Diagnosability MBean.

javax.management.MalformedObjectNameException: Invalid character ' ' in value part of property

A: Seems to be a bug in Oracle 11g JDBC driver. See http://forums.oracle.com/forums/thread.jspa?threadID=549705&tstart=-2

Try an older driver or the newest drive from OTN that supposed to contain the fix.

Q: Can we use Pervasync if we don’t have a need of data subsetting?

We have Oracle 10g based application published on Citrix Presentation Server and used in several locations. As some of locations in East Europe have poor Internet connection, we want to install in these locations same databases as in central location and use databases synchronization. All locations will be equal in rights and content. Will your solution allow this configuration?

A: Yes, Pervasync supports the configuration you were describing. Conventional replication schemes won’t work in situations that need data sub-setting. However, synchronization solutions like Pervasync are able to work for replication of full-set of data as well as sub-set of data.

Q: Does your synchronization require good Internet connection?

A: Pervasync uses HTTP streams for sending and receiving data. It does not require good Internet connection. If the connection went bad during sync, it can sync the changes next time when connection is back. No data corruption will occur.

Q: Can Pervasync synchronize the DB of the Internet web app with the DBs of the Intranet web app?

I work for client who is working on medical product, which essentially is a web application. We have hosted it on the web as well as on intranet. Each instance of the application will have its own database. The reason we have it on intranet is because some hospital offices have slow Internet connection. Speed is an important criterion for the application. There may be cases where the doctor does something in the application hosted in the Internet and comes down to his office in hospital where he has to access the web application from the intranet. Ideally he wants what he did on the Internet to immediately available on the intranet also or the vice versa.

Q: Will your Database sync product solve this kind of scenario?

A: Yes, Pervasync is perfect for your scenario. The web apps are just a user interface to display and modify data stored in the database. Pervasync can be used synchronize the DB serving the Internet web app with the DB serving the intranet app.

Install the sync server with the Internet web DB, then publish all the tables, create a user for each of the intranet instance. Then install a sync client with each intranet DB. Do a sync the tables will be created on the intranet DB. A second sync will download all the data to the intranet DB. After that setup a job to sync periodically. Changes made on the Internet DB will be propagated to the intranet DB and vice versa.

It should be easy to set this up. Give a try and let us know if there are any issues.

Q: Does Pervasync need any special ports to be open?

Is there any issue if the some of the nodes (databases) reside inside networks, which can only be accessed via VPN? I am not sure how comfortable the IT team would be to open up ports.

A: No need to open ports. The sync server is just like the Internet web app. If the Internet app can access the central DB, the sync server can. No extra requirements.

The intranet DB and web app cannot be accessed from outside. However we don't need to. The sync client initiates the HTTP connection to sync server from inside the intranet. The only thing needed would be setting the proxy, just like you set the proxy for a browser in order to browse the Internet. Open the config file of the sync client. There are lines for you to set the proxy server host and port.

Q: How much time would a synchronization session take?

What are the bandwidth requirements to run Pervasync? How will huge number of transactions be handled and how they will affect the bandwidth?

A: Do you mean network bandwidth or server scalability? In any case, this shouldn't be a problem. Once you get a sample app working, you will see that the sync process is very fast. Only changed records are exchanged and thousands of records can be exchanged in seconds. Of course this also depends on the record size and network speed.

How many sync clients are you planning? Even if you have a lot of sync clients, they do not sync at exactly the same time. Also, while the client is syncing, the client DB is not locked so the client Db web app can still work.

Q: Can sync client communicate with sync server via HTTPS?

Can we do sync between two database over secured network i.e. Can Sync client communicate with sync server via HTTPS (and not just http)?

A: HTTPS is not a problem. You just need to config the HTTP server to enable HTTPS and then use HTTPS in the sync server URL on sync client. For example, if you use Tomcat, see http://tomcat.apache.org/tomcat-6.0-doc/ssl-howto.html for how to config it for SSL.

Q: Can Pervasync server and client run on a same machine?

Can we run the server side and client side on the same machine or we should use another PC for client side application?

A: Yes, you can run server and client on the same machine.

Q: I am getting “PVS-2023: Sync engine error” when running pvc.sh.

After issuing the sync on the server-console, by:

./pvc.sh sync

We get a sync engine error, as following:

PVS-2023: Sync engine error

java pervasync.client.SyncAgent total time: 0.716 seconds

Exception in thread "main" pervasync.SyncException: PVC-1020: Sync server reported error: PVS-2023: Sync engine error

at pervasync.client.SyncAgent.receiveSyncSummary(SyncAgent.java:1967)

at pervasync.client.SyncAgent.sync(SyncAgent.java:574)

at pervasync.client.SyncAgent.main(SyncAgent.java:128)

Q: We were wondering if you could give us some advice regarding this problem?

A: pvc.sh invokes the sync client to do a sync with the sync server. The error was from the server which has a servlet and a sync engine. The engine encountered an exception.

Try log in to the web admin console. You should see the error status of the sync engine. Try re-start it and if successful, do pvc.sh again.

Look at the log files in <pervasync server home>/log/ for what's causing the error. Send us the file pervasync_server_mysql.log.